sk_psock_backlog() recovers the psock it operates on from the delayed
work item, but it takes its lifetime reference with
sk_psock_get(psock->sk). That reloads sk->sk_user_data and can therefore
return a replacement psock after the old psock was detached and a new
one was attached to the same socket.

In that case the worker locks and drains the old psock, but the
reference it acquired belongs to the replacement psock. The exit path
then puts the detached old psock, which can underflow its refcount after
the last unlink while the replacement psock keeps the leaked reference.

Take the reference on the delayed-work psock directly with
refcount_inc_not_zero(). If that fails, the old psock is already being
dropped, so skip the detached backlog instead of processing or putting
it. This keeps the worker's get/put pair on the same psock whose
work_state, ingress queue and state bits it manipulates.

The buggy scenario involves two paths, with each column showing the
order within that path:

path A: detach and reattach path        path B: old backlog worker

1. The last unlink drops the old        1. Delayed work resumes from the
   psock into sk_psock_drop().             old psock embedded in work.
2. sk_psock_drop() clears               2. The worker still sees
   sk->sk_user_data before the old         SK_PSOCK_TX_ENABLED on that
   TX state is cleared.                    old psock.
3. A new attach publishes a             3. sk_psock_get(psock->sk)
   replacement psock on the same           reloads sk->sk_user_data and
   socket.                                 refs the replacement psock.
4. The old psock is still queued for    4. The worker locks, drains and
   delayed backlog work.                   finally puts the detached old psock.

Sanitizer validation reported:

  Non-fatal target warning: refcount_t underflow/use-after-free warning
  from refcount_warn_saturate triggered by sk_psock_backlog putting the
  detached old psock after last_old_ref_before_put reached 0.
use-after-free

Signed-off-by: Zhang Cen
---
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -684,12 +684,12 @@ static void sk_psock_backlog(struct work_struct *work)
 	if (!sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
 		return;
 
-	/* Increment the psock refcnt to synchronize with close(fd) path in
-	 * sock_map_close(), ensuring we wait for backlog thread completion
-	 * before sk_socket freed. If refcnt increment fails, it indicates
-	 * sock_map_close() completed with sk_socket potentially already freed.
+	/* Hold the delayed-work psock itself so teardown synchronizes with
+	 * the same object whose work_state, queues and state bits we touch.
+	 * If the refcnt is already zero, this psock is being dropped and its
+	 * detached backlog must no longer run.
 	 */
-	if (!sk_psock_get(psock->sk))
+	if (!refcount_inc_not_zero(&psock->refcnt))
 		return;
 	mutex_lock(&psock->work_mutex);
 	while ((skb = skb_peek(&psock->ingress_skb))) {
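
The mismatched get/put pair described in the commit message can be
replayed in a small single-threaded userspace model. This is only a
sketch under stated assumptions: every model_* name, worker_old(),
worker_fixed() and run_scenario() are hypothetical stand-ins invented
for illustration and are not kernel API. The counters are plain ints, so
an underflow shows up as -1 instead of the saturation warning that
refcount_t would raise, and there is no real concurrency, locking or RCU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct sock and struct sk_psock. */
struct model_sock;

struct model_psock {
	int refcnt;                     /* models psock->refcnt (plain int) */
	struct model_sock *sk;          /* models psock->sk */
};

struct model_sock {
	struct model_psock *user_data;  /* models sk->sk_user_data */
};

/* Models refcount_inc_not_zero(): only succeeds while a ref is held. */
static bool model_inc_not_zero(int *refcnt)
{
	if (*refcnt == 0)
		return false;
	(*refcnt)++;
	return true;
}

/* Models sk_psock_get(): reloads the socket's currently attached psock. */
static struct model_psock *model_psock_get(struct model_sock *sk)
{
	struct model_psock *p = sk->user_data;

	if (p && !model_inc_not_zero(&p->refcnt))
		p = NULL;
	return p;
}

/* Models the put; a negative value here stands for a refcount underflow. */
static void model_psock_put(struct model_psock *p)
{
	p->refcnt--;
}

/* Pre-patch shape: get through the socket, put on the work psock. */
static void worker_old(struct model_psock *work_psock)
{
	if (!model_psock_get(work_psock->sk))
		return;
	/* ... the backlog of work_psock would be drained here ... */
	model_psock_put(work_psock);
}

/* Post-patch shape: get and put both act on the work psock. */
static void worker_fixed(struct model_psock *work_psock)
{
	if (!model_inc_not_zero(&work_psock->refcnt))
		return;
	/* ... the backlog of work_psock would be drained here ... */
	model_psock_put(work_psock);
}

/*
 * Replays the scenario: the old psock has lost its last reference, a
 * replacement with one reference is attached to the same socket, and
 * the old psock's backlog worker then runs.
 */
void run_scenario(bool fixed, int *old_ref, int *new_ref)
{
	struct model_sock sk = { .user_data = NULL };
	struct model_psock old_psock = { .refcnt = 0, .sk = &sk };
	struct model_psock new_psock = { .refcnt = 1, .sk = &sk };

	assert(old_ref && new_ref);
	sk.user_data = &new_psock;      /* path A, steps 2 and 3 */

	if (fixed)
		worker_fixed(&old_psock);
	else
		worker_old(&old_psock);

	*old_ref = old_psock.refcnt;
	*new_ref = new_psock.refcnt;
}
```

In this model, run_scenario(false, ...) leaves the old counter at -1 and
the replacement counter at 2, which corresponds to the underflow and the
leaked reference. run_scenario(true, ...) leaves them at 0 and 1 because
the worker bails out before touching either psock.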