The CPU receives frames from the MAC through conventional DMA: the CPU allocates buffers for the MAC, then the MAC fills them and returns ownership to the CPU. For each hardware RX queue, the CPU and MAC coordinate through a shared ring array of DMA descriptors: one descriptor per DMA buffer. Each descriptor includes the buffer's physical address and a status flag ("OWN") indicating which side owns the buffer: OWN=0 for CPU, OWN=1 for MAC. The CPU is only allowed to set the flag and the MAC is only allowed to clear it, and both must move through the ring in sequence: thus the ring is used for both "submissions" and "completions." In the stmmac driver, stmmac_rx() bookmarks its position in the ring with the `cur_rx` index. The main receive loop in that function checks for rx_descs[cur_rx].own=0, gives the corresponding buffer to the network stack (NULLing the pointer), and increments `cur_rx` modulo the ring size. After the loop exits, stmmac_rx_refill(), which bookmarks its position with `dirty_rx`, allocates fresh buffers and rearms the descriptors (setting OWN=1). If it fails any allocation, it simply stops early (leaving OWN=0) and will retry where it left off when next called. This means descriptors have a three-stage lifecycle (terms my own): - `empty` (OWN=1, buffer valid) - `full` (OWN=0, buffer valid and populated) - `dirty` (OWN=0, buffer NULL) But because stmmac_rx() only checks OWN, it confuses `full`/`dirty`. In the past (see 'Fixes:'), there was a bug where the loop could cycle `cur_rx` all the way back to the first descriptor it dirtied, resulting in a NULL dereference when mistaken for `full`. The aforementioned commit resolved that *specific* failure by capping the loop's iteration limit at `dma_rx_size - 1`, but this is only a partial fix: if the previous stmmac_rx_refill() didn't complete, then there are leftover `dirty` descriptors that the loop might encounter without needing to cycle fully around. The current code therefore panics (see 'Closes:') when stmmac_rx_refill() is memory-starved long enough for `cur_rx` to catch up to `dirty_rx`. Fix this by further tightening the clamp from `dma_rx_size - 1` to `dma_rx_size - stmmac_rx_dirty() - 1`, subtracting any remnant dirty entries and limiting the loop so that `cur_rx` cannot catch back up to `dirty_rx`. This carries no risk of arithmetic underflow: since the maximum possible return value of stmmac_rx_dirty() is `dma_rx_size - 1`, the worst the clamp can do is prevent the loop from running at all. Fixes: b6cb4541853c7 ("net: stmmac: avoid rx queue overrun") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221010 Cc: stable@vger.kernel.org Signed-off-by: Sam Edwards --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 6827c99bde8c..f98b070073c0 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -5609,7 +5609,8 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue) dma_dir = page_pool_get_dma_dir(rx_q->page_pool); bufsz = DIV_ROUND_UP(priv->dma_conf.dma_buf_sz, PAGE_SIZE) * PAGE_SIZE; - limit = min(priv->dma_conf.dma_rx_size - 1, (unsigned int)limit); + limit = min(priv->dma_conf.dma_rx_size - stmmac_rx_dirty(priv, queue) - 1, + (unsigned int)limit); if (netif_msg_rx_status(priv)) { void *rx_head; -- 2.52.0 The stmmac driver handles interrupts in the usual NAPI way: an interrupt arrives, the NAPI instance is scheduled and interrupts are masked, and the actual work occurs in the NAPI polling function. Once no further work remains, interrupts are unmasked and the NAPI instance is put to sleep to await a future interrupt. In the receive case, the MAC only sends the interrupt when a DMA operation completes; thus the driver must make sure a usable RX DMA descriptor exists before expecting a future interrupt. The main receive loop in stmmac_rx() exits under one of 3 conditions: 1) It encounters a DMA descriptor with OWN=1, indicating that no further pending data exists. The MAC will use this descriptor for the next RX DMA operation, so the driver can expect a future interrupt. 2) It exhausts the NAPI budget. In this case, the driver doesn't know whether the MAC has any usable DMA descriptors. But when the driver consumes its full budget, that signals NAPI to keep polling, so the question is moot. 3) It runs out of (non-dirty) descriptors in the RX ring. In this case, the MAC will only have a usable descriptor if stmmac_rx_refill() succeeds (at least partially). Currently, stmmac_rx() lacks any check against scenario #3 and stmmac_rx_refill() failing: it will stop NAPI polling and unmask interrupts to await an interrupt that will never arrive, stalling the receive pipeline indefinitely. Fix this by checking stmmac_rx_dirty(): it will return 0 if stmmac_rx_refill() fully succeeded and we can safely await an interrupt. Any nonzero value means some allocations failed, in which case we risk dropping frames if a large traffic burst exhausts the surviving non-dirties. Therefore, simply return the full budget (to keep polling) until all allocations succeed. Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers.") Cc: stable@vger.kernel.org Signed-off-by: Sam Edwards --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index f98b070073c0..81f764352f3d 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -5604,6 +5604,7 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue) unsigned int desc_size; struct sk_buff *skb = NULL; struct stmmac_xdp_buff ctx; + int budget = limit; int xdp_status = 0; int bufsz; @@ -5870,6 +5871,10 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue) priv->xstats.rx_dropped += rx_dropped; priv->xstats.rx_errors += rx_errors; + /* If stmmac_rx_refill() failed, keep trying until it doesn't. */ + if (unlikely(stmmac_rx_dirty(priv, queue) > 0)) + return budget; + return count; } -- 2.52.0