On Tue, Aug 26, 2025 at 07:31:34PM -0700, Alexei Starovoitov wrote:
> On Sun, Aug 24, 2025 at 9:46 PM Harry Yoo <harry.yoo@xxxxxxxxxx> wrote:
> >
> > On Tue, Jul 15, 2025 at 07:29:49PM -0700, Alexei Starovoitov wrote:
> > > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> > >
> > > kmalloc_nolock() relies on the ability of local_lock to detect the situation
> > > when it's locked.
> > > In !PREEMPT_RT local_lock_is_locked() is true only when an NMI happened in
> > > the irq-saved region that protects _that specific_ per-cpu kmem_cache_cpu.
> > > In that case retry the operation in a different kmalloc bucket.
> > > The second attempt will likely succeed, since this cpu locked
> > > a different kmem_cache_cpu.
> > >
> > > Similarly, in PREEMPT_RT local_lock_is_locked() returns true when the
> > > per-cpu rt_spin_lock is locked by the current task. In this case re-entrance
> > > into the same kmalloc bucket is unsafe, and kmalloc_nolock() tries
> > > a different bucket that is most likely not locked by the current
> > > task. Though it may be locked by a different task, it's safe to
> > > rt_spin_lock() on it.
> > >
> > > Similar to alloc_pages_nolock(), kmalloc_nolock() returns NULL
> > > immediately if called from hard irq or NMI in PREEMPT_RT.
> > >
> > > kfree_nolock() defers freeing to irq_work when local_lock_is_locked()
> > > and in_nmi(), or in PREEMPT_RT.
> > >
> > > The SLUB_TINY config doesn't use local_lock_is_locked() and relies on
> > > spin_trylock_irqsave(&n->list_lock) to allocate, while kfree_nolock()
> > > always defers to irq_work.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > ---
> > >  include/linux/kasan.h |  13 +-
> > >  include/linux/slab.h  |   4 +
> > >  mm/Kconfig            |   1 +
> > >  mm/kasan/common.c     |   5 +-
> > >  mm/slab.h             |   6 +
> > >  mm/slab_common.c      |   3 +
> > >  mm/slub.c             | 454 +++++++++++++++++++++++++++++++++++++-----
> > >  7 files changed, 434 insertions(+), 52 deletions(-)
> >
> > > +static void defer_free(struct kmem_cache *s, void *head)
> > > +{
> > > +	struct defer_free *df = this_cpu_ptr(&defer_free_objects);
> > > +
> > > +	if (llist_add(head + s->offset, &df->objects))
> > > +		irq_work_queue(&df->work);
> > > +}
> > > +
> > > +static void defer_deactivate_slab(struct slab *slab)
> > > +{
> > > +	struct defer_free *df = this_cpu_ptr(&defer_free_objects);
> > > +
> > > +	if (llist_add(&slab->llnode, &df->slabs))
> > > +		irq_work_queue(&df->work);
> > > +}
> > > +
> > > +void defer_free_barrier(void)
> > > +{
> > > +	int cpu;
> > > +
> > > +	for_each_possible_cpu(cpu)
> > > +		irq_work_sync(&per_cpu_ptr(&defer_free_objects, cpu)->work);
> > > +}
> >
> > I think it should also initiate deferred frees if kfree_nolock() freed
> > the last object on some CPUs?
>
> I don't understand the question. "the last object in some CPU"?
> Are you asking about the need of defer_free_barrier()?

My bad, it slipped my mind. I thought objects freed via kfree_nolock()
were not actually freed until a subsequent kfree(), but since we've
switched to irq_work, that's no longer the case.

> PS
> I just got back from 2+ week PTO. Going through backlog.

Hope you enjoyed your PTO!

--
Cheers,
Harry / Hyeonggon
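
P.S. For the archive, this is roughly the irq_work handler I now have in
mind when reading defer_free() above. It is only a sketch of my
understanding; the handler name free_deferred_objects() and the way the
object pointer is recovered are my guesses, not copied from the patch.
The point is that once an object is queued, it is freed from the handler
itself, so irq_work_sync() in defer_free_barrier() is enough to guarantee
the free has actually happened:

/*
 * Sketch only: defer_free() linked the llist_node at object + s->offset,
 * so recover the object and its kmem_cache from the slab that the node
 * sits in and free it right here, in irq_work context.  The df->slabs
 * list (deferred slab deactivation) is left out of this sketch.
 */
static void free_deferred_objects(struct irq_work *work)
{
	struct defer_free *df = container_of(work, struct defer_free, work);
	struct llist_node *pos, *t;

	llist_for_each_safe(pos, t, llist_del_all(&df->objects)) {
		struct slab *slab = virt_to_slab(pos);
		struct kmem_cache *s = slab->slab_cache;
		void *obj = (void *)pos - s->offset;

		kfree(obj);
	}
}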