Unlike SOCKMAP, BPF_SOCK_OPS_RCVQ_CB does not iterate existing skbs
in the receive queue when it is enabled for the first time.

In practical production use cases, this behavior is usually not a
problem. We can safely assume that the upper-layer protocol is
designed with specific synchronisation points where the connection
is temporarily quiet. At these points, the application can
completely drain the receive queue and safely enable
BPF_SOCK_OPS_RCVQ_CB while no skbs are pending.

A prime example is an application transitioning from HTTP to an
RPC protocol:

  Client                                  Server
    |                                     |
    | --- HTTP Upgrade request ---------> |
    |                                     | [Drain all skbs]
    |                                     | [Enable BPF_SOCK_OPS_RCVQ_CB]
    | <-- HTTP 200/Switching protocol --- |
    |                                     |
    | --- RPC Frame 1 ------------------> |

However, to strictly prevent any potential race conditions arising
from unconventional upper-layer protocol designs, let's explicitly
signal a failure if BPF_SOCK_OPS_RCVQ_CB is enabled while the
receive queue is not empty.

-EUCLEAN is chosen to indicate that the caller needs to clean up
the receive queue before enabling the feature.

Signed-off-by: Kuniyuki Iwashima
---
 net/core/filter.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 5913b3be9f1d..883e4aaed49e 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5395,6 +5395,9 @@ static int bpf_sock_ops_check_rcvq_cb(struct sock *sk, int val)
 
 		if (unlikely(sk_is_mptcp(sk)))
 			return -EOPNOTSUPP;
+
+		if (!skb_queue_empty(&sk->sk_receive_queue))
+			return -EUCLEAN;
 	}
 
 	return 0;
-- 
2.54.0.746.g67dd491aae-goog
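
For illustration only, the "[Drain all skbs]" step in the diagram could
look roughly like the userspace sketch below. It is not part of this
patch. drain_rcvq() and its consume callback are hypothetical names, and
the sketch assumes a connected stream socket whose peer is quiet at the
synchronisation point. How BPF_SOCK_OPS_RCVQ_CB is enabled afterwards is
deliberately not shown.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, not part of the patch: read everything that is
 * currently queued on fd without blocking and hand it to consume()
 * (which may be NULL if the data can be discarded).
 *
 * Returns the number of bytes drained once recv() reports that nothing
 * is pending (or the peer has closed), or -1 with errno set on error.
 */
static ssize_t drain_rcvq(int fd, void (*consume)(const char *buf, size_t len))
{
	char buf[4096];
	ssize_t total = 0;

	for (;;) {
		ssize_t n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);

		if (n > 0) {
			if (consume)
				consume(buf, (size_t)n);
			total += n;
			continue;
		}
		if (n == 0)		/* peer closed the connection */
			return total;
		if (errno == EINTR)	/* interrupted, try again */
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return total;	/* nothing pending right now */
		return -1;		/* real error, errno is set */
	}
}
```

A successful drain only shows that the queue was empty at that moment.
If the peer sends data before the feature is enabled, the enable attempt
would fail with -EUCLEAN under this patch, and the caller would
presumably drain again and retry.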