This patch enhances GSO segment handling by properly checking the SKB_GSO_DODGY flag for frag_list GSO packets, addressing low throughput issues observed when a station accesses IPv4 servers via hotspots with an IPv6-only upstream interface. Specifically, it fixes a bug in GSO segmentation when forwarding GRO packets containing a frag_list. The function skb_segment_list cannot correctly process GRO skbs that have been converted by XLAT, since XLAT only translates the header of the head skb. Consequently, skbs in the frag_list may remain untranslated, resulting in protocol inconsistencies and reduced throughput. To address this, the patch explicitly sets the SKB_GSO_DODGY flag for GSO packets in XLAT's IPv4/IPv6 protocol translation helpers (bpf_skb_proto_4_to_6 and bpf_skb_proto_6_to_4). This marks GSO packets as potentially modified after protocol translation. As a result, GSO segmentation will avoid using skb_segment_list and instead falls back to skb_segment for packets with the SKB_GSO_DODGY flag. This ensures that only safe and fully translated frag_list packets are processed by skb_segment_list, resolving protocol inconsistencies and improving throughput when forwarding GRO packets converted by XLAT. Signed-off-by: Jibin Zhang --- v4: Change according to Willem de Bruijn's suggestions 1. Set SKB_GSO_DODGY when XLAT modifies headers for GSO packets. 2. GSO segmentation is downgraded to use skb_segment instead of skb_segment_list to ensure robust handling of potentially modified packets when the SKB_GSO_DODGY flag is set on a GSO packet. v3: Apply the same fix to tcp6_gso_segment(), as suggested. v2: To apply the added condition to a narrower scop In this version, the condition (skb_has_frag_list(gso_skb) && (gso_skb->protocol == skb_shinfo(gso_skb)->frag_list->protocol)) is moved into inner 'if' statement to a narrower scope. Send out the patch again for further discussion because: 1. This issue has a significant impact and has occurred in many countries and regions. 2. Currently, modifying BPF is not a good option, because BPF code cannot access the header of skb on the fraglist, and the required changes would affect a wide range of code. 3. Directly disabling GRO aggregation for XLAT flows is also not a good solution, as this change would disable GRO even when forwarding is not needed, and it would also require cooperation from all device drivers. [3]: https://patchwork.kernel.org/patch/14397344 [2]: https://patchwork.kernel.org/patch/14375646 [1]: https://patchwork.kernel.org/patch/14350844 --- net/core/filter.c | 2 ++ net/ipv4/tcp_offload.c | 3 ++- net/ipv4/udp_offload.c | 3 ++- net/ipv6/tcpv6_offload.c | 3 ++- 4 files changed, 8 insertions(+), 3 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index 616e0520a0bb..bcd73d9bd764 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3353,6 +3353,7 @@ static int bpf_skb_proto_4_to_6(struct sk_buff *skb) shinfo->gso_type &= ~SKB_GSO_TCPV4; shinfo->gso_type |= SKB_GSO_TCPV6; } + shinfo->gso_type |= SKB_GSO_DODGY; } bpf_skb_change_protocol(skb, ETH_P_IPV6); @@ -3383,6 +3384,7 @@ static int bpf_skb_proto_6_to_4(struct sk_buff *skb) shinfo->gso_type &= ~SKB_GSO_TCPV6; shinfo->gso_type |= SKB_GSO_TCPV4; } + shinfo->gso_type |= SKB_GSO_DODGY; } bpf_skb_change_protocol(skb, ETH_P_IP); diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c index fdda18b1abda..942a948f1a31 100644 --- a/net/ipv4/tcp_offload.c +++ b/net/ipv4/tcp_offload.c @@ -107,7 +107,8 @@ static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb, if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST) { struct tcphdr *th = tcp_hdr(skb); - if (skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size) + if ((skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size) && + !(skb_shinfo(skb)->gso_type & SKB_GSO_DODGY)) return __tcp4_gso_segment_list(skb, features); skb->ip_summed = CHECKSUM_NONE; diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index 19d0b5b09ffa..589456bd8b5f 100644 --- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -514,7 +514,8 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) { /* Detect modified geometry and pass those to skb_segment. */ - if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size) + if ((skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size) && + !(skb_shinfo(gso_skb)->gso_type & SKB_GSO_DODGY)) return __udp_gso_segment_list(gso_skb, features, is_ipv6); ret = __skb_linearize(gso_skb); diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c index effeba58630b..5670d32c27f8 100644 --- a/net/ipv6/tcpv6_offload.c +++ b/net/ipv6/tcpv6_offload.c @@ -170,7 +170,8 @@ static struct sk_buff *tcp6_gso_segment(struct sk_buff *skb, if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST) { struct tcphdr *th = tcp_hdr(skb); - if (skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size) + if ((skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size) && + !(skb_shinfo(skb)->gso_type & SKB_GSO_DODGY)) return __tcp6_gso_segment_list(skb, features); skb->ip_summed = CHECKSUM_NONE; -- 2.45.2