When running XDP forwarding and the interface gets shut down, the kernel
may panic or report SLUB "Poison overwritten" errors due to a race
condition between NAPI polling and resource freeing.

One of the following errors is observed:

- Poison overwritten

[ 1889.547746] eth1: Link is Down
[ 1889.549940] =============================================================================
[ 1889.549954] BUG kmalloc-4k (Tainted: G B ): Poison overwritten
[ 1889.549959] -----------------------------------------------------------------------------
[ 1889.549963] 0xffffff882dcc4d80-0xffffff882dcc4da7 @offset=19840. First byte 0x0 instead of 0x6b
[ 1889.549969] Allocated in __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac] age=169 cpu=7 pid=27759
[ 1889.550020] __kmem_cache_alloc_node+0x100/0x2e8
[ 1889.550032] __kmalloc+0x58/0x1a0
[ 1889.550039] __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac]
[ 1889.550052] alloc_dma_desc_resources+0xec/0x164 [stmmac]
[ 1889.550064] stmmac_setup_dma_desc+0xec/0x1e4 [stmmac]
[ 1889.550076] stmmac_open+0x28/0x94 [stmmac]
[...]

- Wrong memory address

[ 1901.546692] Unable to handle kernel paging request at virtual address dead000000000122
[...]
[ 1902.964068] Call trace:
[ 1902.967193] free_to_partial_list+0x560/0x600
[ 1902.972227] __slab_free+0x1a8/0x420
[ 1902.976480] __kmem_cache_free+0x204/0x218
[ 1902.981254] kfree+0x6c/0x128
[ 1902.984900] kvfree+0x3c/0x4c
[ 1902.988545] page_pool_release+0x234/0x27c
[ 1902.993320] page_pool_destroy+0xcc/0x190
[ 1902.998006] __free_dma_rx_desc_resources+0x100/0x360 [stmmac]
[ 1903.004516] free_dma_desc_resources+0x8c/0xac [stmmac]
[ 1903.010419] stmmac_release+0x1c0/0x2b4 [stmmac]
[...]

The root cause is that stmmac_release() stops DMA and frees the TX/RX
ring buffers and page pools while NAPI/XDP may still be accessing these
resources in the background.

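As a rough illustration of this kind of teardown race, and of the
flag-then-wait-then-free ordering that avoids it, here is a minimal
userspace sketch. It is a model only, not driver code: the reader counter
is a crude stand-in for an RCU grace period, and every name in it
(model_priv, model_open, model_xmit, model_release) is hypothetical and
does not exist in stmmac.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in stmmac. */
struct model_priv {
	atomic_bool down;	/* stands in for the STMMAC_DOWN state bit */
	atomic_int readers;	/* pollers currently inside the ring */
	int *ring;		/* stands in for a TX/RX descriptor ring */
	size_t ring_len;
};

/* Allocate the ring and clear the down flag, as an open path would. */
static int model_open(struct model_priv *p, size_t len)
{
	p->ring = calloc(len, sizeof(*p->ring));
	if (!p->ring)
		return -ENOMEM;
	p->ring_len = len;
	atomic_init(&p->readers, 0);
	atomic_init(&p->down, false);
	return 0;
}

/* Poller path: refuse to touch the ring once teardown has started. */
static int model_xmit(struct model_priv *p, size_t idx, int val)
{
	int ret = 0;

	atomic_fetch_add(&p->readers, 1);	/* enter read-side section */
	if (atomic_load(&p->down))
		ret = -ENETDOWN;		/* drop: interface going down */
	else if (idx >= p->ring_len)
		ret = -EINVAL;
	else
		p->ring[idx] = val;
	atomic_fetch_sub(&p->readers, 1);	/* leave read-side section */
	return ret;
}

/* Teardown path: signal, wait for in-flight pollers, then free. */
static void model_release(struct model_priv *p)
{
	atomic_store(&p->down, true);		/* 1. tell pollers to stop */
	while (atomic_load(&p->readers) > 0)	/* 2. wait for pollers to leave */
		;
	free(p->ring);				/* 3. only now free the ring */
	p->ring = NULL;
	p->ring_len = 0;
}
```

In this model, a poller that entered before the flag was set is waited
for, and a poller that enters afterwards sees the flag and drops the
packet, so the ring is never touched after it has been freed.
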
Fix this as follows:
- Set the STMMAC_DOWN flag before stopping DMA to signal XDP to stop.
- Call synchronize_rcu() after stopping DMA but before freeing resources
  to ensure that all ongoing NAPI operations complete.
- Add STMMAC_DOWN flag checks to the XDP code paths (XDP_TX and
  XDP_REDIRECT) to drop packets when the interface is going down. This
  is already done in stmmac_xdp_xmit(), so make the other paths
  consistent with it.
- Clear the STMMAC_DOWN flag in __stmmac_open() to restore normal
  operation. Previously it was cleared only in stmmac_reset_subtask()
  during abnormal operation, which is not enough.

This does not affect normal operation, as the flag is used only for XDP
applications.

Co-developed-by: Chang-Sub Lee
Signed-off-by: Chang-Sub Lee
Signed-off-by: Jakub Raczynski
---
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3591755ea30b..3b7b7b0cab9b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4147,6 +4147,9 @@ static int __stmmac_open(struct net_device *dev,
 
 	stmmac_reset_queues_param(priv);
 
+	/* Clear DOWN flag when opening the interface */
+	clear_bit(STMMAC_DOWN, &priv->state);
+
 	ret = stmmac_hw_setup(dev);
 	if (ret < 0) {
 		netdev_err(priv->dev, "%s: Hw setup failed\n", __func__);
@@ -4251,9 +4254,18 @@ static void __stmmac_release(struct net_device *dev)
 	/* Free the IRQ lines */
 	stmmac_free_irq(dev, REQ_IRQ_ERR_ALL, 0);
 
+	/* Set DOWN flag to prevent XDP from processing new packets */
+	set_bit(STMMAC_DOWN, &priv->state);
+
 	/* Stop TX/RX DMA and clear the descriptors */
 	stmmac_stop_all_dma(priv);
 
+	/* Ensure NAPI has finished before freeing resources.
+	 * This prevents use-after-free when NAPI is mid-execution
+	 * accessing TX/RX ring buffers and page pool during ifconfig down.
+	 */
+	synchronize_rcu();
+
 	/* Release and free the Rx/Tx resources */
 	free_dma_desc_resources(priv, &priv->dma_conf);
 
@@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
 	if (unlikely(!xdpf))
 		return STMMAC_XDP_CONSUMED;
 
+	if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
+		return -ENETDOWN;
+
 	queue = stmmac_xdp_get_tx_queue(priv, cpu);
 	nq = netdev_get_tx_queue(priv->dev, queue);
 
@@ -5308,7 +5323,9 @@ static int __stmmac_xdp_run_prog(struct stmmac_priv *priv,
 		res = stmmac_xdp_xmit_back(priv, xdp);
 		break;
 	case XDP_REDIRECT:
-		if (xdp_do_redirect(priv->dev, xdp, prog) < 0)
+		if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
+			res = STMMAC_XDP_CONSUMED;
+		else if (xdp_do_redirect(priv->dev, xdp, prog) < 0)
 			res = STMMAC_XDP_CONSUMED;
 		else
 			res = STMMAC_XDP_REDIRECT;
-- 
2.34.1