Blamed commit moved the TIME_WAIT-derived ISN from the skb control
block to a per-CPU variable, assuming the value would always be
consumed by tcp_conn_request() for the same packet that wrote it.

That assumption is violated by multiple drop paths between the
producer (__this_cpu_write(tcp_tw_isn, isn) in tcp_v{4,6}_rcv())
and the consumer (tcp_conn_request()):

- min_ttl / min_hopcount check
- xfrm policy check
- tcp_inbound_hash() MD5/AO mismatch
- tcp_filter() eBPF/SO_ATTACH_FILTER drop
- th->syn && th->fin discard in tcp_rcv_state_process() TCP_LISTEN
- psp_sk_rx_policy_check() in tcp_v{4,6}_do_rcv()
- tcp_checksum_complete() in tcp_v{4,6}_do_rcv()
- tcp_v{4,6}_cookie_check() returning NULL

When a packet is dropped on any of these paths, tcp_tw_isn is left
set. The next SYN processed on the same CPU then consumes the stale
non-zero value in tcp_conn_request() and receives a predictable ISN.

We could fix this by clearing tcp_tw_isn at the start of
tcp_v{4,6}_do_rcv(), at the expense of a slower fast path.
This patch instead clears tcp_tw_isn only when a drop happens.

It also adds two DEBUG_NET_WARN_ON_ONCE() checks to catch any drop
paths missed in the future: they verify that tcp_tw_isn is clear at
the beginning of processing a packet that could be a new SYN.

Fixes: 41eecbd712b7 ("tcp: replace TCP_SKB_CB(skb)->tcp_tw_isn with a per-cpu field")
Reported-by: Jakub Kicinski
Signed-off-by: Eric Dumazet
---
 net/ipv4/tcp_ipv4.c | 3 +++
 net/ipv6/tcp_ipv6.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index c0526cc0398049fb34b5de20a1175d54942e80cd..529f10a768183f74b85fcfdbee94d35c88c55122 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1878,6 +1878,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 reset:
 	tcp_v4_send_reset(sk, skb, sk_rst_convert_drop_reason(reason));
 discard:
+	__this_cpu_write(tcp_tw_isn, 0);
 	sk_skb_reason_drop(sk, skb, reason);
 	/* Be careful here. If this function gets more complicated and
 	 * gcc suffers from register pressure on the x86, sk (in %ebx)
@@ -2198,6 +2199,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		}
 	}

+	DEBUG_NET_WARN_ON_ONCE(__this_cpu_read(tcp_tw_isn));
 process:
 	if (static_branch_unlikely(&ip4_min_ttl)) {
 		/* min_ttl can be changed concurrently from do_ip_setsockopt() */
@@ -2274,6 +2276,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	}

 discard_it:
+	__this_cpu_write(tcp_tw_isn, 0);
 	SKB_DR_OR(drop_reason, NOT_SPECIFIED);
 	/* Discard frame. */
 	sk_skb_reason_drop(sk, skb, drop_reason);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index d13d49bfef19457cc5902cb556605a80f4c0ab2c..079f2324a3a777cfe30dd59777c9d3d6d9a6bae0 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1641,6 +1641,7 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 reset:
 	tcp_v6_send_reset(sk, skb, sk_rst_convert_drop_reason(reason));
 discard:
+	__this_cpu_write(tcp_tw_isn, 0);
 	if (opt_skb)
 		__kfree_skb(opt_skb);
 	sk_skb_reason_drop(sk, skb, reason);
@@ -1839,6 +1840,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		}
 	}

+	DEBUG_NET_WARN_ON_ONCE(__this_cpu_read(tcp_tw_isn));
 process:
 	if (static_branch_unlikely(&ip6_min_hopcount)) {
 		/* min_hopcount can be changed concurrently from do_ipv6_setsockopt() */
@@ -1913,6 +1915,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 	}

 discard_it:
+	__this_cpu_write(tcp_tw_isn, 0);
 	SKB_DR_OR(drop_reason, NOT_SPECIFIED);
 	sk_skb_reason_drop(sk, skb, drop_reason);
 	return 0;
--
2.54.0.563.g4f69b47b94-goog
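
P.S. For reviewers unfamiliar with the hazard: below is a minimal
userspace sketch of the producer/drop/consumer pattern this patch
guards against. A plain static stands in for the per-CPU field, and
all function names are invented for illustration; this is not kernel
code.

#include <stdio.h>

/* Stand-in for the kernel's per-CPU tcp_tw_isn. */
static unsigned int percpu_tw_isn;

/* Producer: tcp_v{4,6}_rcv() stores the TIME_WAIT-derived ISN. */
static void rx_from_timewait(unsigned int isn)
{
	percpu_tw_isn = isn;
}

/* Pre-fix drop path: the packet is freed, but the slot is not
 * cleared, so percpu_tw_isn still holds the old ISN.
 */
static void buggy_drop(void)
{
}

/* Post-fix drop path: clear the slot before freeing the packet,
 * mirroring the __this_cpu_write(tcp_tw_isn, 0) added above.
 */
static void fixed_drop(void)
{
	percpu_tw_isn = 0;
}

/* Consumer: an unrelated SYN on the same CPU reads whatever is
 * left; a non-zero value means "reuse the TIME_WAIT-derived ISN".
 */
static unsigned int conn_request_isn(void)
{
	unsigned int isn = percpu_tw_isn;

	percpu_tw_isn = 0;
	return isn;
}

int main(void)
{
	rx_from_timewait(0x12345678);
	buggy_drop();
	printf("stale ISN leaked to next SYN: %#x\n", conn_request_isn());

	rx_from_timewait(0x12345678);
	fixed_drop();
	printf("clean ISN after fixed drop:  %#x\n", conn_request_isn());
	return 0;
}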