author    Caleb Sander Mateos <csander@purestorage.com>  2024-11-07 11:30:51 -0700
committer Jakub Kicinski <kuba@kernel.org>  2024-11-11 14:13:44 -0800
commit    b83db10996f5276f998bb5bb59da2fada560efbd (patch)
tree      c41c6d597af4b07519bc94fb768c0984cb74afe8
parent    f95a392ed43c864578ec21aafd90d835ba5ef3af (diff)
mlx5/core: relax memory barrier in eq_update_ci()
The memory barrier in eq_update_ci() after the doorbell write is a significant
hot spot in mlx5_eq_comp_int(). Under heavy TCP load, we see 3% of CPU time
spent on the mfence instruction.

98df6d5b877c ("net/mlx5: A write memory barrier is sufficient in EQ ci update")
already relaxed the full memory barrier to just a write barrier in
mlx5_eq_update_ci(), which duplicates eq_update_ci(). So replace mb() with
wmb() in eq_update_ci() too.

On strongly ordered architectures, no barrier is actually needed because the
MMIO writes to the doorbell register are guaranteed to appear to the device in
the order they were made. However, the kernel's ordered MMIO primitive
writel() lacks a convenient big-endian interface. Therefore, we opt to stick
with __raw_writel() + a barrier.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241107183054.2443218-1-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-rw-r--r--  drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h | 2
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
index 4b7f7131c560..b1edc71ffc6d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
@@ -72,7 +72,7 @@ static inline void eq_update_ci(struct mlx5_eq *eq, int arm)
 	__raw_writel((__force u32)cpu_to_be32(val), addr);
 	/* We still want ordering, just not swabbing, so add a barrier */
-	mb();
+	wmb();
 }
 
 int mlx5_eq_table_init(struct mlx5_core_dev *dev);
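
For reference, a minimal sketch of what the whole helper looks like after this
change. Only the write, the comment, and the barrier are confirmed by the hunk
above; the doorbell address and consumer-index value computations are
assumptions reconstructed from the function's purpose and from struct field
names that are not shown in this diff.

	/* Sketch of eq_update_ci() after the change; the addr/val lines are
	 * assumed, not taken from the hunk above.
	 */
	static inline void eq_update_ci(struct mlx5_eq *eq, int arm)
	{
		/* Assumed: select the arming or non-arming doorbell word and
		 * pack the 24-bit consumer index with the EQ number.
		 */
		__be32 __iomem *addr = eq->doorbell + (arm ? 0 : 2);
		u32 val = (eq->cons_index & 0xffffff) | (eq->eqn << 24);

		/* __raw_writel() does no byte swapping and implies no ordering,
		 * so the value is swabbed explicitly to the device's
		 * big-endian format.
		 */
		__raw_writel((__force u32)cpu_to_be32(val), addr);
		/* We still want ordering, just not swabbing, so add a barrier.
		 * A write barrier suffices: only the ordering of this store
		 * against later stores matters, not loads, so wmb() replaces
		 * the costlier mb().
		 */
		wmb();
	}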