On Wed, Jun 25, 2025 at 4:38 AM Harry Yoo <harry.yoo@xxxxxxxxxx> wrote:
>
> On Tue, Jun 24, 2025 at 10:13:49AM -0700, Alexei Starovoitov wrote:
> > On Thu, May 8, 2025 at 6:03 PM Harry Yoo <harry.yoo@xxxxxxxxxx> wrote:
> > > > +	s = kmalloc_slab(size, NULL, alloc_gfp, _RET_IP_);
> > > > +
> > > > +	if (!(s->flags & __CMPXCHG_DOUBLE))
> > > > +		/*
> > > > +		 * kmalloc_nolock() is not supported on architectures that
> > > > +		 * don't implement cmpxchg16b.
> > > > +		 */
> > > > +		return NULL;
> > >
> > > Hmm when someone uses slab debugging flags (e.g., passing boot
> > > parameter slab_debug=FPZ as a hardening option on production [1], or
> > > just for debugging), __CMPXCHG_DOUBLE is not set even when the arch
> > > supports it.
> > >
> > > Is it okay to fail all kmalloc_nolock() calls in such cases?
> >
> > I studied the code and the git history.
> > Looks like slub doesn't have to disable cmpxchg mode when slab_debug is on.
>
> A slight correction; Debug caches do not use cmpxchg mode at all by
> design. If a future change enables cmpxchg mode for them, it will cause
> the same consistency issue.
>
> > The commit 41bec7c33f37 ("mm/slub: remove slab_lock() usage for debug
> > operations")
> > removed slab_lock from debug validation checks.
>
> An excellent point!
>
> Yes, SLUB does not maintain cpu slab and percpu partial slabs on
> debug caches.

Ohh. That was a crucial detail I was missing.
Now I see that 90% of ___slab_alloc() logic is not
executed for debug slabs :)

> Alloc/free is done under n->list_lock, so no need for
> cmpxchg double and slab_lock() at all :)
>
> > So right now slab_lock() only serializes slab->freelist/counter update.
> > It's still necessary on arch-s that don't have cmpxchg, but that's it.
> > Only __update_freelist_slow() is using it.
>
> Yes.
>
> > The comment next to SLAB_NO_CMPXCHG is obsolete as well.
> > It's been there since days that slab_lock() was taken during
> > consistency checks.
>
> Yes.
>
> > I think the following diff is appropriate:
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 044e43ee3373..9d615cfd1b6f 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -286,14 +286,6 @@ static inline bool
> > kmem_cache_has_cpu_partial(struct kmem_cache *s)
> >  #define DEBUG_DEFAULT_FLAGS (SLAB_CONSISTENCY_CHECKS | SLAB_RED_ZONE | \
> >  				SLAB_POISON | SLAB_STORE_USER)
> > -/*
> > - * These debug flags cannot use CMPXCHG because there might be consistency
> > - * issues when checking or reading debug information
> > - */
> > -#define SLAB_NO_CMPXCHG (SLAB_CONSISTENCY_CHECKS | SLAB_STORE_USER | \
> > -			 SLAB_TRACE)
> > -
> > -
> >  /*
> >  * Debugging flags that require metadata to be stored in the slab. These get
> >  * disabled when slab_debug=O is used and a cache's min order increases with
> > @@ -6654,7 +6646,7 @@ int do_kmem_cache_create(struct kmem_cache *s,
> > const char *name,
> >  	}
> >  #ifdef system_has_freelist_aba
> > -	if (system_has_freelist_aba() && !(s->flags & SLAB_NO_CMPXCHG)) {
> > +	if (system_has_freelist_aba()) {
> >  		/* Enable fast mode */
> >  		s->flags |= __CMPXCHG_DOUBLE;
> >  	}
> >
> > It survived my stress tests.
> > Thoughts?
>
> Perhaps it's better to change the condition in
> kmalloc_nolock_noprof() from;
>
> if (!(s->flags & __CMPXCHG_DOUBLE))
> 	return NULL;
>
> to;
>
> if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
> 	return NULL;
>
> Because debug caches do not use cmpxchg double (and that's why
> it survived your test), it is not accurate to set __CMPXCHG_DOUBLE.
>
> And it'll get into trouble anyway if debug caches use cmpxchg double.

All makes sense. Tested this suggestion.
Indeed behaves as you described :)

It's quite a relief, since I was also worried that
__CMPXCHG_DOUBLE limitation of kmalloc_nolock()
might hurt debug_slab cases.

Only slub_tiny case left to do before I can post v2.