Padding TCP options with NOPs is optional, so it is legal to send an MSS option that is not aligned to a word boundary and therefore not aligned for checksum calculation. The current TCPMSS target is not robust to this: when the MSS option is unaligned it produces an invalid checksum, and the packet is dropped. This has not been observed in any real environment. Senders place the MSS at the beginning of the options block, where it is naturally aligned, but the spec allows unaligned options and the kernel shouldn't silently drop legal packets. When the changed word is not aligned, the modified bytes straddle two checksum words, and using the standard incremental update helper (which assumes alignment) produces an invalid checksum: | w1 | w2 | OLD | a b | c d | NEW | a b' | c' d | Since b' and c' sit across w1 and w2, we could compute the incremental checksum in two operations by recalculating w1 and then w2: C' = C - w1 + w1' - w2 + w2' But working it out: C' = C - w1 - w2 + w1' + w2' = C - (a * 2^8 + b) - (c * 2^8 + d) + (a * 2^8 + b') + (c' * 2^8 + d) = C + 2^8 * (a - a + c' - c) + (b' - b + d - d) = C + 2^8 * (c' - c) + (b' - b) = C - (2^8 * c + b) + (2^8 * c' + b') So an unaligned incremental checksum can be done in a single operation by byteswapping the changed bytes before passing them to the helper. This patch implements that trick for unaligned MSS option updates. Signed-off-by: Kacper Kokot --- I decided to go with the get_unaligned_be16 suggestion because it's idiomatic and it produces shorter assembly on x86-64 (6 instructions vs 9). SYN processing is a cold path so I didn't look into it further. v2: - Use get_unaligned_be16 (Fernando's suggestion) - Fix alignment check expression (David) - Mention it's a theoretical bug in the commit message - Drop cc stable, the bug is only theoretical net/netfilter/xt_TCPMSS.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/net/netfilter/xt_TCPMSS.c b/net/netfilter/xt_TCPMSS.c index 80e1634bc51f..32c87a520361 100644 --- a/net/netfilter/xt_TCPMSS.c +++ b/net/netfilter/xt_TCPMSS.c @ -117,8 +117,9 @@ tcpmss_mangle_packet(struct sk_buff *skb, for (i = sizeof(struct tcphdr); i <= tcp_hdrlen - TCPOLEN_MSS; i += optlen(opt, i)) { if (opt[i] == TCPOPT_MSS && opt[i+1] == TCPOLEN_MSS) { u_int16_t oldmss; + u16 csum_oldmss, csum_newmss; - oldmss = (opt[i+2] << 8) | opt[i+3]; + oldmss = get_unaligned_be16(&opt[i+2]); /* Never increase MSS, even when setting it, as * doing so results in problems for hosts that rely @@ -130,8 +131,19 @@ tcpmss_mangle_packet(struct sk_buff *skb, opt[i+2] = (newmss & 0xff00) >> 8; opt[i+3] = newmss & 0x00ff; + csum_oldmss = htons(oldmss); + csum_newmss = htons(newmss); + + /* MSS may be unaligned; fix up the incremental checksum + * to avoid an invalid checksum and a dropped packet. + */ + if (((char *)&opt[i + 2] - (char *)tcph) & 0x1) { + csum_oldmss = swab16(csum_oldmss); + csum_newmss = swab16(csum_newmss); + } + inet_proto_csum_replace2(&tcph->check, skb, - htons(oldmss), htons(newmss), + csum_oldmss, csum_newmss, false); return 0; } -- 2.43.0