When alloc_slab_obj_exts() is called later in time (instead of at slab
allocation & initialization), slab->stride and slab->obj_exts are set
while the slab is already accessible to multiple CPUs. The current
implementation does not enforce memory ordering between the writes to
slab->stride and slab->obj_exts.

However, for correctness, slab->stride must be visible before
slab->obj_exts; otherwise concurrent readers may observe slab->obj_exts
as non-zero while stride is still stale, leading to incorrect reference
counting of object cgroups. There has been a bug report [1] that showed
symptoms of incorrect reference counting of object cgroups, which could
be triggered by this memory ordering issue.

Fix this by unconditionally initializing slab->stride in
alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check. In
the SLAB_OBJ_EXT_IN_OBJ case, it is overridden later in the same
function. This ensures stride is set before the slab becomes visible to
other CPUs via the per-node partial slab list (protected by a spinlock
with acquire/release semantics), preventing them from observing an
inconsistent stride value.

Thanks to Shakeel Butt for pointing out this issue [2].

Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
Reported-by: Venkat Rao Bagalkote
Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
Signed-off-by: Harry Yoo
---
I tested this patch, but I could not confirm that it actually fixes the
issue reported in [1]. It would be nice if Venkat could help confirm,
though it may be challenging to reproduce reliably. Since the fix
logically makes sense, it is worth applying anyway.
 mm/slub.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 18c30872d196..afa98065d74f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 retry:
 	old_exts = READ_ONCE(slab->obj_exts);
 	handle_failed_objexts_alloc(old_exts, vec, objects);
-	slab_set_stride(slab, sizeof(struct slabobj_ext));
 
 	if (new_slab) {
 		/*
@@ -2272,6 +2271,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 	void *addr;
 	unsigned long obj_exts;
 
+	/* Initialize stride early to avoid memory ordering issues */
+	slab_set_stride(slab, sizeof(struct slabobj_ext));
+
 	if (!need_slab_obj_exts(s))
 		return;
 
@@ -2288,7 +2290,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 		obj_exts |= MEMCG_DATA_OBJEXTS;
 #endif
 		slab->obj_exts = obj_exts;
-		slab_set_stride(slab, sizeof(struct slabobj_ext));
 	} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
 		unsigned int offset = obj_exts_offset_in_object(s);
 
-- 
2.43.0