flush_rcu_sheaves_on_cache() calls queue_work_on() in a
for_each_online_cpu() loop, which requires the cpu to stay online.
But cpus_read_lock() is not held in kvfree_rcu_barrier_on_cache() and
the set of "online cpus" is subject to change.

There are two paths that call flush_rcu_sheaves_on_cache():

  // has cpus_read_lock()
  flush_all_rcu_sheaves()
    -> flush_rcu_sheaves_on_cache()

  // no cpus_read_lock()
  kvfree_rcu_barrier_on_cache()
    -> flush_rcu_sheaves_on_cache()

Fix this by holding cpus_read_lock() in kvfree_rcu_barrier_on_cache().

Why not move cpus_read_lock() from flush_all_rcu_sheaves() into
flush_rcu_sheaves_on_cache()? The reason is it would introduce a new
lock order (slab_mutex -> cpu_hotplug_lock). The reverse order
(cpu_hotplug_lock -> slab_mutex) is established by
  - cpuhp_setup_state_nocalls(..., slub_cpu_setup, ...)
  - kmem_cache_destroy()
The two orders together would form an AB-BA deadlock.

Finally, add lockdep_assert_cpus_held() in flush_rcu_sheaves_on_cache()
to catch the same problem in the future.

Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction")
Signed-off-by: Qing Wang
---
Changes in v2:
- Deleted the unnecessary comment.
- Added "Fixes" field in the commit message.

Changes in v3:
- Deleted the unnecessary comment.

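For illustration only, the contract this patch establishes (the caller takes the hotplug read lock, the callee asserts it) can be sketched as a single-threaded userspace model. Every name prefixed with model_ is hypothetical and is not a kernel API; the depth counter merely stands in for "cpus_read_lock() is held", and the -1 return stands in for the warning lockdep_assert_cpus_held() would raise.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-ins; none of these are kernel APIs.
 * hotplug_read_depth models "this context holds cpus_read_lock()".
 */
static int hotplug_read_depth;

static void model_cpus_read_lock(void)   { hotplug_read_depth++; }
static void model_cpus_read_unlock(void) { hotplug_read_depth--; }
static bool model_cpus_held(void)        { return hotplug_read_depth > 0; }

/*
 * Callee: walks the "online CPUs" only if the caller holds the lock.
 * Returns the number of CPUs visited, or -1 if the contract is broken.
 */
static int model_flush_on_cache(int nr_online)
{
	int cpu, queued = 0;

	if (!model_cpus_held())
		return -1;	/* where lockdep_assert_cpus_held() would warn */

	for (cpu = 0; cpu < nr_online; cpu++)
		queued++;	/* stands in for queue_work_on(cpu, ...) */

	return queued;
}

/* Caller: takes the lock around the callee, as the fixed path does. */
static int model_barrier_on_cache(int nr_online)
{
	int ret;

	model_cpus_read_lock();
	ret = model_flush_on_cache(nr_online);
	model_cpus_read_unlock();

	return ret;
}
```

In this model, calling model_flush_on_cache() directly fails the check, while going through model_barrier_on_cache() visits every CPU and leaves the lock released afterwards. Keeping the lock in the caller rather than the callee mirrors the patch's choice to avoid a slab_mutex -> cpu_hotplug_lock ordering.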
 mm/slab_common.c | 2 ++
 mm/slub.c        | 1 +
 2 files changed, 3 insertions(+)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index d5a70a831a2a..8b661fff5eed 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -2110,7 +2110,9 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
 void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
 {
 	if (cache_has_sheaves(s)) {
+		cpus_read_lock();
 		flush_rcu_sheaves_on_cache(s);
+		cpus_read_unlock();
 		rcu_barrier();
 	}
 
diff --git a/mm/slub.c b/mm/slub.c
index 161079ac5ba1..2a005d1e3a74 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4024,6 +4024,7 @@ void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
 	struct slub_flush_work *sfw;
 	unsigned int cpu;
 
+	lockdep_assert_cpus_held();
 	mutex_lock(&flush_lock);
 
 	for_each_online_cpu(cpu) {
-- 
2.34.1