kmemleak_scan() walks every thread and scans its kernel stack under a single rcu_read_lock() with no reschedule point. On a host with very many threads -- amplified by KASAN/lockdep in debug builds -- this loop can hog a CPU long enough to trip the soft lockup watchdog: watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [kmemleak:537] scan_block kmemleak_scan kmemleak_scan_thread kthread A cond_resched() cannot be added directly: the loop runs inside an RCU read-side critical section. Borrow the rcu_lock_break() pattern from kernel/hung_task.c: when a reschedule is needed, pin the two iteration cursors, drop the RCU read lock, cond_resched(), then re-acquire it and continue only if both cursors are still hashed. If a cursor was unhashed while the lock was dropped, the thread list cannot be walked further, so the round is aborted. Such a round scans only part of the task stacks, which would make live objects look unreferenced, so reuse the existing "scan interrupted" path to skip reporting; the next full scan reports the real leaks. Fixes: c4b28963fd79 ("mm/kmemleak: rely on rcu for task stack scanning") Cc: stable@vger.kernel.org Signed-off-by: Breno Leitao --- Changes in v2: - Do not create the nasty array, but use the same pattern as kernel/hung_task.c. - Link to v1: https://lore.kernel.org/r/20260611-kmemleak-stack-resched-v1-1-d6248ade5f4a@debian.org --- mm/kmemleak.c | 42 ++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 40 insertions(+), 2 deletions(-) diff --git a/mm/kmemleak.c b/mm/kmemleak.c index 7c7ba17ce7af0..d88274dc0c605 100644 --- a/mm/kmemleak.c +++ b/mm/kmemleak.c @@ -1695,6 +1695,32 @@ static void kmemleak_cond_resched(struct kmemleak_object *object) put_object(object); } +/* + * Briefly drop the RCU read lock to reschedule during the task stack scan. + * Both cursors are pinned across the gap; return false if either one was + * unhashed meanwhile, so the caller stops this round instead of walking a + * stale list. + */ +static bool kmemleak_stack_scan_break(struct task_struct *g, + struct task_struct *p) +{ + bool can_cont; + + get_task_struct(g); + get_task_struct(p); + + rcu_read_unlock(); + cond_resched(); + rcu_read_lock(); + + can_cont = pid_alive(g) && pid_alive(p); + + put_task_struct(p); + put_task_struct(g); + + return can_cont; +} + /* * Print one leak inline. The hex dump is gated on OBJECT_ALLOCATED so it * does not touch user memory that was freed concurrently; the rest of the @@ -1804,6 +1830,7 @@ static void kmemleak_scan(void) int __maybe_unused i; struct xarray dedup; int new_leaks = 0; + bool aborted = false; jiffies_last_scan = jiffies; @@ -1890,11 +1917,21 @@ static void kmemleak_scan(void) rcu_read_lock(); for_each_process_thread(g, p) { void *stack = try_get_task_stack(p); + if (stack) { scan_block(stack, stack + THREAD_SIZE, NULL); put_task_stack(p); } + /* + * This is an expensive loop, we must to call the + * scheduler to avoid lockups + */ + if (need_resched() && !kmemleak_stack_scan_break(g, p)) { + aborted = true; + goto unlock; + } } +unlock: rcu_read_unlock(); } @@ -1937,9 +1974,10 @@ static void kmemleak_scan(void) scan_gray_list(); /* - * If scanning was stopped do not report any new unreferenced objects. + * If scanning was stopped or a stack scan round was aborted, do not + * report any new unreferenced objects. */ - if (scan_should_stop()) + if (scan_should_stop() || aborted) return; /* --- base-commit: abe651837cb394f76d738a7a747322fca3bf17ba change-id: 20260611-kmemleak-stack-resched-01ed72858a7f Best regards, -- Breno Leitao