On Tue, Aug 12, 2025 at 10:08 AM Harry Yoo <harry.yoo@xxxxxxxxxx> wrote:
> Sorry for the delay. I addressed all other comments and will respin soon.
> Only below question remains..
>
> > > > > {
> > > > > @@ -3732,9 +3808,13 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> > > > >         if (unlikely(!node_match(slab, node))) {
> > > > >                 /*
> > > > >                  * same as above but node_match() being false already
> > > > > -                * implies node != NUMA_NO_NODE
> > > > > +                * implies node != NUMA_NO_NODE.
> > > > > +                * Reentrant slub cannot take locks necessary to
> > > > > +                * deactivate_slab, hence ignore node preference.
> > > >
> > > > Now that we have defer_deactivate_slab(), we need to either update the
> > > > code or comment?
> > > >
> > > > 1. Deactivate slabs when node / pfmemalloc mismatches
> > > > or 2. Update comments to explain why it's still undesirable
> > >
> > > Well, defer_deactivate_slab() is a heavy hammer.
> > > In !SLUB_TINY it pretty much never happens.
> > >
> > > This bit:
> > >
> > > retry_load_slab:
> > >
> > >         local_lock_cpu_slab(s, flags);
> > >         if (unlikely(c->slab)) {
> > >
> > > is very rare. I couldn't trigger it at all in my stress test.
> > >
> > > But in this hunk the node mismatch is not rare, so ignoring node preference
> > > for kmalloc_nolock() is a much better trade off.
>
> But users would have requested that specific node instead of
> NUMA_NO_NODE because (at least) they think it's worth it.
> (e.g., allocating kernel data structures tied to specified node)
>
> I don't understand why kmalloc()/kmem_cache_alloc() try harder
> (by deactivating cpu slab) to respect the node parameter,
> but kmalloc_nolock() does not.

Because kmalloc_nolock() tries to be as unintrusive as possible to the
kmalloc slabs that the rest of the kernel is using.

There won't be a kmem_cache_alloc_nolock() version, because the algorithm
retries from a different bucket when the primary one is locked.
So it's only the kmalloc_nolock() flavor, and it takes from the generic
kmalloc slab buckets, with or without memcg.

My understanding is that c->slab is effectively a cache, and in the long
run all c->slab-s should be stable: a given cpu should be kmalloc-ing
memory suitable for that local cpu. In that sense deactivate_slab() is a
heavy hammer.

kmalloc_nolock() is for users who cannot control their running context.
imo such users shouldn't affect the cache property of c->slab, hence
ignoring node preference for !allow_spin is not great, but imo it's a
better trade off than defer_deactivate_slab().

defer_deactivate_slab() is there for a rare race in retry_load_slab.
It could be done for !node_match(c->slab, node) too, but it feels like
the worse evil, especially since kmalloc_nolock() doesn't support
__GFP_THISNODE.
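To make the trade-off concrete, here is a toy userspace model of the
node-mismatch branch. This is not the actual SLUB code; the names
(toy_cpu_slab, toy_alloc) and the way allow_spin is threaded through are
illustrative only. The point it models: a spinning caller deactivates the
cpu slab on a node mismatch and refills from the requested node, while a
!allow_spin (kmalloc_nolock) caller cannot take the locks deactivation
needs and simply ignores the node preference:

```c
#include <stdbool.h>

#define TOY_NUMA_NO_NODE (-1)

/* Toy model of a per-cpu slab: only tracks which NUMA node it came from. */
struct toy_cpu_slab {
	int node;          /* node the current c->slab belongs to */
	int deactivations; /* how many times the cpu slab was dropped */
};

/*
 * Toy version of the node_match() branch:
 * - allow_spin callers (regular kmalloc) deactivate the cpu slab on a
 *   node mismatch and refill from the requested node (the heavy hammer);
 * - !allow_spin callers (kmalloc_nolock) keep c->slab and ignore the
 *   node preference, preserving the cache property of c->slab.
 * Returns the node the allocation actually came from.
 */
static int toy_alloc(struct toy_cpu_slab *c, int node, bool allow_spin)
{
	if (node != TOY_NUMA_NO_NODE && c->node != node) {
		if (allow_spin) {
			c->deactivations++; /* drop the cpu slab */
			c->node = node;     /* refill from requested node */
		}
		/* else: reentrant path, allocation stays on c->node */
	}
	return c->node;
}
```

In this model the !allow_spin caller never perturbs c->slab, which is the
"least intrusive" property argued for above; the cost is that its
allocation may land on the wrong node.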