A passive-open child inherits the listener's smc_clcsock_data_ready().
sk_clone_lock() clears its sk_user_data to NULL because the listener tagged
it SK_USER_DATA_NOCOPY. Until accept restores the callback, a BPF sock_ops
program can add the established child to a sockmap, and sk_psock_init()
installs a sk_psock into the NULL sk_user_data. The inherited callback then
reads it back through smc_clcsock_user_data(), which strips only
SK_USER_DATA_NOCOPY, mistakes the sk_psock for an smc_sock, and
dereferences a clcsk_* field past its end:

BUG: KASAN: slab-out-of-bounds in smc_clcsock_data_ready+0x84/0x200 net/smc/af_smc.c:2637
Read of size 8 at addr ffff8880013b8674 by task syz.6.12484/67930
smc_clcsock_data_ready+0x84/0x200 net/smc/af_smc.c:2637
tcp_urg+0x24d/0x360 net/ipv4/tcp_input.c:6264
tcp_rcv_state_process+0x280d/0x4940 net/ipv4/tcp_input.c:7336
tcp_child_process+0x371/0xa50 net/ipv4/tcp_minisocks.c:1002
tcp_v4_rcv+0x1eaa/0x2a00 net/ipv4/tcp_ipv4.c:2186
[...]
Allocated by task 67930:
sk_psock_init+0x142/0x740 net/core/skmsg.c:766
sock_hash_update_common+0xd3/0x990 net/core/sock_map.c:1010
bpf_sock_hash_update+0x114/0x170 net/core/sock_map.c:1229
__cgroup_bpf_run_filter_sock_ops+0x74/0xa0 kernel/bpf/cgroup.c:1727
tcp_init_transfer+0x1085/0x1100 net/ipv4/tcp_input.c:6693
[...]

Resolve the conflict on the write path. Reserve the child's sk_user_data
with a NULL pointer tagged SK_USER_DATA_NOCOPY so that sk_psock_init()
sees a non-zero slot and returns -EBUSY, and release the reservation at
accept. smc_clcsock_user_data() still strips the tag and yields NULL, so
the inherited callback stays a no-op.

Fixes: a60a2b1e0af1 ("net/smc: reduce active tcp_listen workers")
Signed-off-by: Sechang Lim
---
v3:
- reserve sk_user_data on the write path instead of the read-side check (D. Wythe)
v2:
- https://lore.kernel.org/netdev/20260619150342.3626224-1-rhkrqnwk98@gmail.com/
v1:
- https://lore.kernel.org/netdev/20260614120931.4041687-1-rhkrqnwk98@gmail.com/
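
For reviewers, the reservation idea can be sketched as a small userspace
model. This is not kernel code: struct model_sock and the model_* helpers
are hypothetical stand-ins, and only the two flag values are meant to
mirror include/net/sock.h. The sketch assumes sk_psock_init() rejects any
non-zero sk_user_data and that smc_clcsock_user_data() masks off only
SK_USER_DATA_NOCOPY, as described in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values intended to mirror include/net/sock.h; the rest is a toy. */
#define SK_USER_DATA_NOCOPY 1UL
#define SK_USER_DATA_PSOCK  4UL

/* Hypothetical stand-in for the one struct sock field that matters here. */
struct model_sock {
	uintptr_t sk_user_data;
};

/* Models storing a pointer with tag bits in its low bits. */
static void model_assign_with_flags(struct model_sock *sk, void *ptr,
				    uintptr_t flags)
{
	sk->sk_user_data = (uintptr_t)ptr | flags;
}

/* Models the sockmap side: any non-zero slot is treated as busy. */
static int model_psock_init(struct model_sock *sk, void *psock)
{
	if (sk->sk_user_data)
		return -EBUSY;
	model_assign_with_flags(sk, psock,
				SK_USER_DATA_NOCOPY | SK_USER_DATA_PSOCK);
	return 0;
}

/* Models the SMC read side: only the NOCOPY bit is masked off. */
static void *model_smc_user_data(const struct model_sock *sk)
{
	return (void *)(sk->sk_user_data & ~SK_USER_DATA_NOCOPY);
}

/* Walks through the unreserved and the reserved case; returns 0 if both
 * behave as the commit message describes.
 */
static int model_demo(void)
{
	static _Alignas(8) char fake_psock[16];
	struct model_sock unreserved = { 0 };
	struct model_sock reserved = { 0 };

	/* Before the fix: the slot is plain NULL, so sockmap claims it and
	 * the SMC read side gets back a non-NULL, mistyped pointer.
	 */
	assert(model_psock_init(&unreserved, fake_psock) == 0);
	assert(model_smc_user_data(&unreserved) != NULL);

	/* With the fix: a tagged NULL keeps sockmap out, while the SMC read
	 * side still sees NULL.
	 */
	model_assign_with_flags(&reserved, NULL, SK_USER_DATA_NOCOPY);
	assert(model_psock_init(&reserved, fake_psock) == -EBUSY);
	assert(model_smc_user_data(&reserved) == NULL);

	/* Accept releases the reservation; the slot is free again. */
	model_assign_with_flags(&reserved, NULL, 0);
	assert(reserved.sk_user_data == 0);
	return 0;
}
```

The model leaves out locking entirely; in the patch both the reservation
and the release happen under sk_callback_lock.
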
net/smc/af_smc.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index b5db69073e20..78f162344fe3 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -154,7 +154,11 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
own_req, opt_child_init);
/* child must not inherit smc or its ops */
if (child) {
- rcu_assign_sk_user_data(child, NULL);
+ /* reserve sk_user_data so sockmap cannot claim the slot */
+ write_lock_bh(&child->sk_callback_lock);
+ __rcu_assign_sk_user_data_with_flags(child, NULL,
+ SK_USER_DATA_NOCOPY);
+ write_unlock_bh(&child->sk_callback_lock);
/* v4-mapped sockets don't inherit parent ops. Don't restore. */
if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
@@ -1773,6 +1777,7 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
/* new clcsock has inherited the smc listen-specific sk_data_ready
* function; switch it back to the original sk_data_ready function
*/
+ write_lock_bh(&new_clcsock->sk->sk_callback_lock);
new_clcsock->sk->sk_data_ready = lsmc->clcsk_data_ready;
/* if new clcsock has also inherited the fallback-specific callback
@@ -1786,6 +1791,9 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
if (lsmc->clcsk_error_report)
new_clcsock->sk->sk_error_report = lsmc->clcsk_error_report;
}
+ /* release the slot reserved in smc_tcp_syn_recv_sock() */
+ rcu_assign_sk_user_data(new_clcsock->sk, NULL);
+ write_unlock_bh(&new_clcsock->sk->sk_callback_lock);
(*new_smc)->clcsock = new_clcsock;
out:
--
2.43.0