path: root/drivers/scsi/lpfc/lpfc.h
author     James Smart <jsmart2021@gmail.com>  2019-11-04 16:57:06 -0800
committer  Martin K. Petersen <martin.petersen@oracle.com>  2019-11-06 00:04:04 -0500
commit     dcaa213679387e95a315dca05c57dbb15273703c (patch)
tree       841877bd096a0e4d1d633e29b5213395604efceb /drivers/scsi/lpfc/lpfc.h
parent     93a4d6f40198dffcca35d9a928c409f9290f1fe0 (diff)
scsi: lpfc: Change default IRQ model on AMD architectures
The current driver attempts to allocate an interrupt vector per cpu using the system's managed IRQ allocator (flag PCI_IRQ_AFFINITY). The system IRQ allocator will either provide the per-cpu vector, or return fewer vectors. When fewer vectors are returned, they are evenly spread between the numa nodes on the system. When run on an AMD architecture, if interrupts occur on a cpu that is not in the same numa node as the adapter generating the interrupt, there are extreme costs and overheads in performance. Thus, whether 1:1 vector allocation is used or the "balanced" vectors land in the other numa nodes, performance can be hit significantly.

A much more performant model is to allocate interrupts only on the cpus that are in the numa node where the adapter resides. I/O completion is still performed by the cpu where the I/O was generated. Unfortunately, there is no flag to request that the managed IRQ subsystem allocate vectors only for the CPUs in the same numa node as the adapter.

On AMD architectures, revert the irq allocation to the normal style (non-managed) and then use irq_set_affinity_hint() to set the cpu affinity and disable user-space rebalancing.

Tie the support into CPU offline/online. If the cpu being offlined owns a vector, the vector is re-affinitized to one of the other CPUs on the same numa node. If there are no more CPUs on the numa node, the vector has all affinity removed, letting the system determine where it is serviced. Similarly, when the cpu that owned a vector comes back online, the vector is re-affinitized to that cpu.

Link: https://lore.kernel.org/r/20191105005708.7399-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
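As a rough illustration of the model the message describes (a minimal sketch only, not the lpfc implementation; the function name, the use of num_online_cpus() as the vector cap, and the error handling are assumptions), non-managed allocation plus explicit affinity hinting looks roughly like this:

#include <linux/pci.h>
#include <linux/interrupt.h>
#include <linux/topology.h>

/* Sketch only: allocate plain (non-managed) MSI-X vectors, then hint
 * every vector onto the CPUs of the adapter's NUMA node.  Without
 * PCI_IRQ_AFFINITY the kernel does not spread or manage the vectors,
 * and the affinity hint discourages user-space rebalancing.
 */
static int example_setup_node_local_irqs(struct pci_dev *pdev)
{
	const struct cpumask *node_mask = cpumask_of_node(dev_to_node(&pdev->dev));
	int nvec, i;

	nvec = pci_alloc_irq_vectors(pdev, 1, num_online_cpus(), PCI_IRQ_MSIX);
	if (nvec < 0)
		return nvec;

	for (i = 0; i < nvec; i++)
		irq_set_affinity_hint(pci_irq_vector(pdev, i), node_mask);

	return nvec;
}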
Diffstat (limited to 'drivers/scsi/lpfc/lpfc.h')
-rw-r--r--  drivers/scsi/lpfc/lpfc.h | 21
1 file changed, 21 insertions(+), 0 deletions(-)
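The diff below adds a small helper, lpfc_next_online_numa_cpu(), which finds the next online CPU in the node mask, returning nr_cpu_ids if none is found. As a hedged usage sketch (the caller and the irq/cpu/numa_mask names are hypothetical, not taken from the driver), the CPU-offline case described in the commit message might use it like this:

/* Hypothetical offline path (names are illustrative, not the
 * driver's): "cpu" owns vector "irq" and is going away.  Search the
 * adapter's node mask starting after it; if another online CPU exists
 * on the node, move the affinity hint there, otherwise clear the hint
 * and let the system place the vector.
 */
static void example_rehome_vector(int irq, unsigned int cpu,
				  const struct cpumask *numa_mask)
{
	unsigned int new_cpu = lpfc_next_online_numa_cpu(numa_mask, cpu + 1);

	if (new_cpu < nr_cpu_ids && new_cpu != cpu)
		irq_set_affinity_hint(irq, cpumask_of(new_cpu));
	else
		irq_set_affinity_hint(irq, NULL);
}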
diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 88d5fd99f5ec..935f98804198 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -837,6 +837,7 @@ struct lpfc_hba {
uint32_t cfg_fcp_mq_threshold;
uint32_t cfg_hdw_queue;
uint32_t cfg_irq_chann;
+ uint32_t cfg_irq_numa;
uint32_t cfg_suppress_rsp;
uint32_t cfg_nvme_oas;
uint32_t cfg_nvme_embed_cmd;
@@ -1312,6 +1313,26 @@ lpfc_phba_elsring(struct lpfc_hba *phba)
}
/**
+ * lpfc_next_online_numa_cpu - Finds next online CPU on NUMA node
+ * @numa_mask: Pointer to phba's numa_mask member.
+ * @start: starting cpu index
+ *
+ * Note: If no valid cpu found, then nr_cpu_ids is returned.
+ *
+ **/
+static inline unsigned int
+lpfc_next_online_numa_cpu(const struct cpumask *numa_mask, unsigned int start)
+{
+ unsigned int cpu_it;
+
+ for_each_cpu_wrap(cpu_it, numa_mask, start) {
+ if (cpu_online(cpu_it))
+ break;
+ }
+
+ return cpu_it;
+}
+/**
* lpfc_sli4_mod_hba_eq_delay - update EQ delay
* @phba: Pointer to HBA context object.
* @q: The Event Queue to update.