The kernel test robot has reported a regression in the patch "slab:
refill sheaves from all nodes". Taken in isolation, there is indeed a
tradeoff: we prefer to use remote objects before allocating new local
slabs. This replicates a behavior that existed before sheaves for
replenishing cpu (partial) slabs, now called get_from_any_partial() to
allocate a single object. So the possibility of allocating remote
objects is intended, even if remote accesses are then slower.

However, the profiles in the report also suggested contention on the
list_lock spinlock. That is something we can try to avoid without much
tradeoff. If someone else holds the spinlock, it is more likely they
are allocating from the node than freeing to it, so we can skip the
node even if that means allocating a new local slab. Contributing to
that lock's contention isn't worth it. It should not result in partial
slabs accumulating on the remote node.

Thus add an allow_spin parameter to __refill_objects_node() and
get_partial_node_bulk() to make the attempts from
__refill_objects_any() use only a trylock.

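For illustration only, the trylock-or-skip idea can be sketched in
userspace with a pthread mutex standing in for list_lock. This is not
the kernel code: struct node, nr_free and refill_from_node() are
hypothetical names made up for this sketch, and a mutex only
approximates the irqsave spinlock semantics.

```c
#include <pthread.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for struct kmem_cache_node: a lock plus a count
 * of objects available on the node's partial list.
 */
struct node {
	pthread_mutex_t list_lock;
	unsigned int nr_free;
};

/*
 * Take up to 'max' objects from the node. With allow_spin == false the
 * function gives up immediately when the lock is already held, which
 * mirrors the early "return false" path in get_partial_node_bulk().
 */
static unsigned int refill_from_node(struct node *n, unsigned int max,
				     bool allow_spin)
{
	unsigned int taken;

	if (allow_spin)
		pthread_mutex_lock(&n->list_lock);
	else if (pthread_mutex_trylock(&n->list_lock) != 0)
		return 0;	/* contended: skip this node */

	taken = n->nr_free < max ? n->nr_free : max;
	n->nr_free -= taken;
	pthread_mutex_unlock(&n->list_lock);

	return taken;
}
```

In this sketch a caller would pass allow_spin = true for the local node
and false when walking remote nodes, falling back to a new local slab
when every remote attempt returns 0.
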
Reported-by: kernel test robot
Link: https://lore.kernel.org/oe-lkp/202601132136.77efd6d7-lkp@intel.com
Signed-off-by: Vlastimil Babka
---
To be applied on top of:
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/log/?h=slab/for-7.0/sheaves
---
 mm/slub.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index eb1f52a79999..ca3db3ae1afb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3378,7 +3378,8 @@ static inline bool pfmemalloc_match(struct slab *slab, gfp_t gfpflags);
 
 static bool get_partial_node_bulk(struct kmem_cache *s,
 				  struct kmem_cache_node *n,
-				  struct partial_bulk_context *pc)
+				  struct partial_bulk_context *pc,
+				  bool allow_spin)
 {
 	struct slab *slab, *slab2;
 	unsigned int total_free = 0;
@@ -3390,7 +3391,10 @@ static bool get_partial_node_bulk(struct kmem_cache *s,
 
 	INIT_LIST_HEAD(&pc->slabs);
 
-	spin_lock_irqsave(&n->list_lock, flags);
+	if (allow_spin)
+		spin_lock_irqsave(&n->list_lock, flags);
+	else if (!spin_trylock_irqsave(&n->list_lock, flags))
+		return false;
 
 	list_for_each_entry_safe(slab, slab2, &n->partial, slab_list) {
 		struct freelist_counters flc;
@@ -6544,7 +6548,8 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);
 
 static unsigned int
 __refill_objects_node(struct kmem_cache *s, void **p, gfp_t gfp, unsigned int min,
-		      unsigned int max, struct kmem_cache_node *n)
+		      unsigned int max, struct kmem_cache_node *n,
+		      bool allow_spin)
 {
 	struct partial_bulk_context pc;
 	struct slab *slab, *slab2;
@@ -6556,7 +6561,7 @@ __refill_objects_node(struct kmem_cache *s, void **p, gfp_t gfp, unsigned int mi
 	pc.min_objects = min;
 	pc.max_objects = max;
 
-	if (!get_partial_node_bulk(s, n, &pc))
+	if (!get_partial_node_bulk(s, n, &pc, allow_spin))
 		return 0;
 
 	list_for_each_entry_safe(slab, slab2, &pc.slabs, slab_list) {
@@ -6650,7 +6655,8 @@ __refill_objects_any(struct kmem_cache *s, void **p, gfp_t gfp, unsigned int min
 			    n->nr_partial <= s->min_partial)
 				continue;
 
-			r = __refill_objects_node(s, p, gfp, min, max, n);
+			r = __refill_objects_node(s, p, gfp, min, max, n,
+						  /* allow_spin = */ false);
 			refilled += r;
 
 			if (r >= min) {
@@ -6691,7 +6697,8 @@ refill_objects(struct kmem_cache *s, void **p, gfp_t gfp, unsigned int min,
 		return 0;
 
 	refilled = __refill_objects_node(s, p, gfp, min, max,
-					 get_node(s, local_node));
+					 get_node(s, local_node),
+					 /* allow_spin = */ true);
 	if (refilled >= min)
 		return refilled;
 
---
base-commit: 6f1912181ddfcf851a6670b4fa9c7dfdaf3ed46d
change-id: 20260129-b4-refill_any_trylock-160a31224193

Best regards,
-- 
Vlastimil Babka