From: Zijian Zhang

When performing XDP_REDIRECT from one mlnx device to another, using
smp_processor_id() to select the queue may go out-of-range.

Assume eth0 is redirecting a packet to eth1, eth1 is configured
with only 8 channels, while eth0 has its RX queues pinned to
higher-numbered CPUs (e.g. CPU 12). When a packet is received on
such a CPU and redirected to eth1, the driver uses smp_processor_id()
as the SQ index. Since the CPU ID is not smaller than the number of
channels on eth1, sq_num is out of range for priv->channels.c[], so
mlx5e_xdp_xmit() returns -ENXIO and the redirect fails.

This patch fixes the issue by mapping the CPU ID to a valid channel
index using modulo arithmetic.

    sq_num = smp_processor_id() % priv->channels.num;

With this change, XDP_REDIRECT works correctly even when the source
device uses high CPU affinities and the target device has fewer TX
queues.

v2:
As suggested by Jakub Kicinski, add a lock to synchronize TX when
several CPUs redirect packets to the same queue.

Signed-off-by: Zijian Zhang
Reviewed-by: Hariprasad Kelam
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      | 3 +++
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c  | 8 +++-----
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 ++
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 14e3207b14e7..2281154442d9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -516,6 +516,9 @@ struct mlx5e_xdpsq {
 	/* control path */
 	struct mlx5_wq_ctrl wq_ctrl;
 	struct mlx5e_channel *channel;
+
+	/* synchronize simultaneous xdp_xmit on the same ring */
+	spinlock_t xdp_tx_lock;
 } ____cacheline_aligned_in_smp;
 
 struct mlx5e_xdp_buff {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 5d51600935a6..6225734b256a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -855,13 +855,10 @@ int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames,
 	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
 		return -EINVAL;
 
-	sq_num = smp_processor_id();
-
-	if (unlikely(sq_num >= priv->channels.num))
-		return -ENXIO;
-
+	sq_num = smp_processor_id() % priv->channels.num;
 	sq = priv->channels.c[sq_num]->xdpsq;
 
+	spin_lock(&sq->xdp_tx_lock);
 	for (i = 0; i < n; i++) {
 		struct mlx5e_xmit_data_frags xdptxdf = {};
 		struct xdp_frame *xdpf = frames[i];
@@ -942,6 +939,7 @@ int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames,
 	if (flags & XDP_XMIT_FLUSH)
 		mlx5e_xmit_xdp_doorbell(sq);
 
+	spin_unlock(&sq->xdp_tx_lock);
 	return nxmit;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 9c46511e7b43..ced9eefe38aa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1559,6 +1559,8 @@ static int mlx5e_alloc_xdpsq(struct mlx5e_channel *c,
 	if (err)
 		goto err_sq_wq_destroy;
 
+	spin_lock_init(&sq->xdp_tx_lock);
+
 	return 0;
 
 err_sq_wq_destroy:
-- 
2.37.1 (Apple Git-137.1)
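
Illustration (not part of the patch): a minimal userspace sketch of the
modulo mapping described in the commit message, and of why the v2 lock
is needed once several CPUs can land on the same ring. The names
demo_cpu_to_channel() and demo_count_sharers() are hypothetical helpers
made up for this note; they are not mlx5 driver functions, and the
16-CPU / 8-channel figures used in the comments are assumed example
values.

```c
#include <assert.h>

/*
 * Hypothetical helper: map a CPU ID to a valid channel index the same
 * way the patch does, i.e. cpu % num_channels.
 */
unsigned int demo_cpu_to_channel(unsigned int cpu, unsigned int num_channels)
{
	assert(num_channels > 0);	/* modulo by zero is undefined */
	return cpu % num_channels;
}

/*
 * Hypothetical helper: for CPUs 0..num_cpus-1, count how many CPUs map
 * to each channel. sharers[] must have room for num_channels entries.
 * Any entry above 1 means two CPUs may transmit on that ring at the
 * same time, which is the case the per-ring lock serializes.
 */
void demo_count_sharers(unsigned int num_cpus, unsigned int num_channels,
			unsigned int *sharers)
{
	unsigned int i;

	for (i = 0; i < num_channels; i++)
		sharers[i] = 0;

	for (i = 0; i < num_cpus; i++)
		sharers[demo_cpu_to_channel(i, num_channels)]++;
}
```

With 8 channels, CPU 12 and CPU 4 both map to channel 4, so the old
one-CPU-per-ring assumption no longer holds after the modulo change.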