When using sockmap to forward TCP traffic to the application layer of the peer socket, the peer socket's tcp_bpf_recvmsg_parser processing flow will synchronously update the tp->copied_seq field. This causes tp->rcv_nxt to become less than tp->copied_seq. Later, when this socket receives SKB packets from the protocol stack, in the call chain tcp_data_ready → tcp_epollin_ready, the function tcp_epollin_ready will return false, preventing the socket from being woken up to receive new packets. Therefore, it is necessary to synchronously update the tp->rcv_nxt information in sk_psock_skb_ingress. Signed-off-by: GuoYong Zheng --- net/core/skmsg.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 9becadd..e9d841c 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -576,6 +576,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb, struct sock *sk = psock->sk; struct sk_msg *msg; int err; + u32 seq; /* If we are receiving on the same sock skb->sk is already assigned, * skip memory accounting and owner transition seeing it already set @@ -595,8 +596,15 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb, */ skb_set_owner_r(skb, sk); err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg, true); - if (err < 0) + if (err < 0) { kfree(msg); + } else { + bh_lock_sock_nested(sk); + seq = READ_ONCE(tcp_sk(sk)->rcv_nxt) + len; + WRITE_ONCE(tcp_sk(sk)->rcv_nxt, seq); + bh_unlock_sock(sk); + } + return err; } -- 1.8.3.1