On netns teardown, fqdir_pre_exit() walks the fqdir rhashtable and
flushes every fragment queue that is not yet complete using
inet_frag_queue_flush(). That helper frees all the skbs queued on the
fragment queue but does not set INET_FRAG_COMPLETE, and leaves
q->fragments_tail and q->last_run_head pointing at the freed skbs. The
queue itself stays in the rhashtable.

fqdir_pre_exit() first lowers high_thresh to 0 to stop new queue
lookups, but it cannot stop a fragment that already obtained the queue
through inet_frag_find() earlier and stalled just before taking the
queue lock. Once that fragment resumes after the flush and takes the
queue lock, it passes the INET_FRAG_COMPLETE check and then dereferences
the freed fragments_tail. inet_frag_queue_insert() reads FRAG_CB() and
->len of that pointer and, on the append path, writes ->next_frag,
causing a slab use-after-free.

IPv6, nf_conntrack_reasm6 and 6lowpan reassembly share the same flush
path and are affected as well.

Reset rb_fragments, fragments_tail and last_run_head in
inet_frag_queue_flush() so a flushed queue no longer points at the freed
skbs. A fragment that resumes after the flush and takes the queue lock
then finds an empty queue and starts a new run instead of dereferencing
the freed fragments_tail.

ip_frag_reinit() already performed this reset after its own flush, so
drop the now duplicate code there.

Cc: stable@vger.kernel.org
Fixes: 006a5035b495 ("inet: frags: flush pending skbs in fqdir_pre_exit()")
Suggested-by: Eric Dumazet
Signed-off-by: Hyunwoo Kim
---
Changes in v2:
- Move the queue pointer reset into inet_frag_queue_flush() to remove
  the duplicate reset in ip_frag_reinit().
- Drop the INET_FRAG_COMPLETE setting since it leaks the queue on the
  fqdir_pre_exit() path.
- v1: https://lore.kernel.org/all/ah1Sw2g-I89BRRiT@v4bel/
---
 net/ipv4/inet_fragment.c | 3 +++
 net/ipv4/ip_fragment.c   | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 393770920abd..1127519b8416 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -328,6 +328,9 @@ void inet_frag_queue_flush(struct inet_frag_queue *q,
 	reason = reason ?: SKB_DROP_REASON_FRAG_REASM_TIMEOUT;
 	sum = inet_frag_rbtree_purge(&q->rb_fragments, reason);
 	sub_frag_mem_limit(q->fqdir, sum);
+	q->rb_fragments = RB_ROOT;
+	q->fragments_tail = NULL;
+	q->last_run_head = NULL;
 }
 EXPORT_SYMBOL(inet_frag_queue_flush);
 
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 56b0f738d2f2..c790d2f49487 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -250,9 +250,6 @@ static int ip_frag_reinit(struct ipq *qp)
 	qp->q.flags = 0;
 	qp->q.len = 0;
 	qp->q.meat = 0;
-	qp->q.rb_fragments = RB_ROOT;
-	qp->q.fragments_tail = NULL;
-	qp->q.last_run_head = NULL;
 	qp->iif = 0;
 	qp->ecn = 0;
 
-- 
2.43.0
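
Illustration only, not part of the patch: a minimal userspace sketch of
the invariant the fix restores, assuming a singly linked list in place
of the rbtree. struct frag, struct frag_queue, queue_append() and
queue_flush() are hypothetical stand-ins for sk_buff, inet_frag_queue,
the append path of inet_frag_queue_insert() and inet_frag_queue_flush();
none of them are kernel APIs, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb queued on a fragment queue. */
struct frag {
	struct frag *next_frag;
	int len;
};

/* Hypothetical stand-in for struct inet_frag_queue. */
struct frag_queue {
	struct frag *head;		/* models rb_fragments */
	struct frag *fragments_tail;
	struct frag *last_run_head;
};

/* Append path: writes ->next_frag of the current tail, if there is one. */
static int queue_append(struct frag_queue *q, int len)
{
	struct frag *f = calloc(1, sizeof(*f));

	if (!f)
		return -1;
	f->len = len;

	if (!q->fragments_tail) {
		/* Empty queue: start a new run. */
		q->head = f;
		q->last_run_head = f;
	} else {
		/* Would be a use-after-free if the tail were stale. */
		q->fragments_tail->next_frag = f;
	}
	q->fragments_tail = f;
	return 0;
}

/* Free every queued fragment and reset all pointers into the list. */
static int queue_flush(struct frag_queue *q)
{
	struct frag *f = q->head;
	int freed = 0;

	while (f) {
		struct frag *next = f->next_frag;

		free(f);
		f = next;
		freed++;
	}

	/* The reset this patch moves into the flush helper. */
	q->head = NULL;
	q->fragments_tail = NULL;
	q->last_run_head = NULL;
	return freed;
}

/* Returns 1 if a fragment arriving after a flush starts a new run. */
static int late_fragment_starts_new_run(void)
{
	struct frag_queue q = { 0 };
	int ok;

	queue_append(&q, 100);
	queue_append(&q, 200);
	queue_flush(&q);

	if (queue_append(&q, 50))
		return 0;
	ok = q.head == q.fragments_tail && q.last_run_head == q.head &&
	     q.head->len == 50;
	queue_flush(&q);
	return ok;
}
```

Without the three assignments at the end of queue_flush(), the append
after the flush would write through a dangling fragments_tail, which is
the shape of the bug described in the commit message above.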