From: Lang Xu

The root cause of this bug is that when `bpf_link_put` drops the
refcount of `shim_link->link.link` to zero, the link is already being
released, yet `cgroup_shim_find` can still reach it through
`tr->progs_hlist`. The actual removal from `tr->progs_hlist` happens in
`bpf_shim_tramp_link_release`, which is deferred. During this window,
another process calling `bpf_trampoline_link_cgroup_shim` can find the
dying shim, take a reference on it, and trigger a use-after-free.

Based on Martin KaFai Lau's suggestions, I have created a simple patch.

To fix this:
Use an atomic inc-not-zero in `bpf_trampoline_link_cgroup_shim`:
increment the refcount only if it is not already zero. If it is zero,
the existing shim is not reused.

Testing:
I used a non-rigorous method to verify the fix: adding a delay in
`bpf_shim_tramp_link_release` widens the race window and makes the bug
easier to trigger:

static void bpf_shim_tramp_link_release(struct bpf_link *link)
{
	...
	if (!shim_link->trampoline)
		return;

+	msleep(100);
	WARN_ON_ONCE(bpf_trampoline_unlink_prog(&shim_link->link,
		shim_link->trampoline, NULL));
	bpf_trampoline_put(shim_link->trampoline);
}

Before the patch, running a PoC reproduced the crash almost 100% of the
time, with a call trace similar to KaiyanM's report. After the patch,
the bug no longer occurs even after millions of iterations.

Fixes: 69fd337a975c ("bpf: per-cgroup lsm flavor")
Signed-off-by: Lang Xu
---
 kernel/bpf/trampoline.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index dbe7754b4f4e..894cd6f205f5 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -749,10 +749,8 @@ int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
 	mutex_lock(&tr->mutex);
 
 	shim_link = cgroup_shim_find(tr, bpf_func);
-	if (shim_link) {
+	if (shim_link && !IS_ERR(bpf_link_inc_not_zero(&shim_link->link.link))) {
 		/* Reusing existing shim attached by the other program. */
-		bpf_link_inc(&shim_link->link.link);
-
 		mutex_unlock(&tr->mutex);
 		bpf_trampoline_put(tr); /* bpf_trampoline_get above */
 		return 0;
-- 
2.51.0
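
For readers unfamiliar with the inc-not-zero pattern the fix relies on,
here is a minimal userspace sketch using C11 atomics. It is not the
kernel implementation: `struct demo_ref`, `demo_ref_inc_not_zero` and
`demo_ref_put` are hypothetical names made up for illustration, and the
helper returns a bool, whereas the diff above checks the result of
`bpf_link_inc_not_zero` with `IS_ERR`.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a link. */
struct demo_ref {
	atomic_long refcnt;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns true on success, false if the count has already hit zero,
 * i.e. the object is being torn down and must not be reused.
 */
static bool demo_ref_inc_not_zero(struct demo_ref *r)
{
	long old = atomic_load(&r->refcnt);

	while (old != 0) {
		/* On failure, 'old' is updated to the current value. */
		if (atomic_compare_exchange_weak(&r->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this call released the last one. */
static bool demo_ref_put(struct demo_ref *r)
{
	return atomic_fetch_sub(&r->refcnt, 1) == 1;
}
```

A plain unconditional increment would move the count from 0 back to 1
and hand out a reference to an object whose release is already in
progress; the compare-and-swap loop refuses to do that, which is the
property the patch depends on.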