From: Chuck Lever

While lock_sock is held, incoming TCP segments land on
sk->sk_backlog rather than sk->sk_receive_queue.
tls_rx_rec_wait() inspects only sk_receive_queue, so backlog
data remains invisible. For non-blocking callers (read_sock,
and recvmsg or splice_read with MSG_DONTWAIT) this causes a
spurious -EAGAIN. For blocking callers it forces an unnecessary
sleep/wakeup cycle.

Flush the backlog inside tls_rx_rec_wait() before checking
sk_receive_queue so the strparser can parse newly-arrived
segments immediately. On the next loop iteration
tls_read_flush_backlog() may redundantly flush, but this path
is cold and the cost is negligible.

Backlog processing can run tcp_reset(), which calls
tcp_done_with_error() to set sk->sk_err = ECONNRESET and then
tcp_done() to set sk->sk_shutdown = SHUTDOWN_MASK. The
pre-existing top-of-loop sk_err check already ran before the
flush, so the freshly-set error would be masked by the
next-line sk_shutdown test returning 0 (EOF). Re-check sk_err
immediately before the sk_shutdown test so a connection abort
surfaces as -ECONNRESET rather than a clean EOF.

Commit f508262ae9f2 ("tls: Preserve sk_err across recvmsg()
when data has been copied") gave the top-of-loop sk_err check
a has_copied split. The recheck applies the same handling:
when the caller has already copied bytes, sk_err is reported
but preserved so the error surfaces on the next call;
otherwise sock_error() consumes it so the error is reported
exactly once.

Suggested-by: Sabrina Dubroca
Link: https://lore.kernel.org/netdev/ahgHgQ84RCc8uYrG@krikkit/
Reviewed-by: Hannes Reinecke
Signed-off-by: Chuck Lever
---
 net/tls/tls_sw.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index df4cdf11f784..5a4300c943a1 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1400,6 +1400,8 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
 		if (ret < 0)
 			return ret;
 
+		if (sk_flush_backlog(sk))
+			released = true;
 		if (!skb_queue_empty(&sk->sk_receive_queue)) {
 			/* Defer notification to the exit point; this thread
 			 * will consume the record directly.
@@ -1409,6 +1411,16 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
 				break;
 		}
 
+		/* sk_flush_backlog() can run tcp_reset(), which sets
+		 * sk_err and then sk_shutdown via tcp_done(). Recheck
+		 * sk_err here so a connection abort surfaces as the
+		 * actual error rather than a clean EOF.
+		 */
+		if (sk->sk_err) {
+			if (has_copied)
+				return -READ_ONCE(sk->sk_err);
+			return sock_error(sk);
+		}
 		if (sk->sk_shutdown & RCV_SHUTDOWN)
 			return 0;
 
-- 
2.54.0
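
The ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct model_sock, model_flush_backlog(), model_wait_step(), MODEL_RCV_SHUTDOWN and the recheck flag are all hypothetical names invented here, and the model covers only the sk_err / sk_shutdown sequencing around the flush, not the strparser or receive queue.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's RCV_SHUTDOWN bit. */
#define MODEL_RCV_SHUTDOWN 1u

/* Hypothetical userspace model of the few socket fields involved. */
struct model_sock {
	int err;               /* models sk->sk_err */
	unsigned int shutdown; /* models sk->sk_shutdown */
	bool reset_pending;    /* a reset segment is sitting in the backlog */
};

/* Models backlog processing that handles a queued reset: the error
 * is set first, then the socket is marked shut down.
 */
static void model_flush_backlog(struct model_sock *sk)
{
	if (sk->reset_pending) {
		sk->reset_pending = false;
		sk->err = ECONNRESET;
		sk->shutdown |= MODEL_RCV_SHUTDOWN;
	}
}

/* Models one pass through the wait loop. With recheck == false the
 * error set by the flush is hidden behind the shutdown test; with
 * recheck == true it is reported, and consumed only when no bytes
 * have been copied yet.
 */
static int model_wait_step(struct model_sock *sk, bool has_copied,
			   bool recheck)
{
	int err;

	/* Top-of-loop check: runs before the flush. */
	if (sk->err)
		return -sk->err;

	model_flush_backlog(sk);

	if (recheck && sk->err) {
		if (has_copied)
			return -sk->err; /* report, but keep for next call */
		err = sk->err;
		sk->err = 0;             /* report exactly once */
		return -err;
	}

	if (sk->shutdown & MODEL_RCV_SHUTDOWN)
		return 0;                /* looks like a clean EOF */

	return -EAGAIN;
}
```

In this model, a pending reset without the recheck yields 0 (EOF) while the error stays latched; with the recheck the same state yields -ECONNRESET, and the has_copied argument decides whether the error remains set for a later call.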