On Thu, Jul 10, 2025 at 8:05 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 7/10/25 12:21, Harry Yoo wrote:
> > On Thu, Jul 10, 2025 at 11:36:02AM +0200, Vlastimil Babka wrote:
> >> On 7/9/25 03:53, Alexei Starovoitov wrote:
> >>
> >> Hm but this is leaking the slab we allocated and have in the "slab"
> >> variable, we need to free it back in that case.

Ohh, sorry for the silly mistake.

Re-reading the diff again I realized that I made a similar mistake
in alloc_single_from_new_slab(). It has this bit:

	if (!alloc_debug_processing(...))
		return NULL;

so I assumed that doing:

	if (!spin_trylock_irqsave(&n->list_lock, ...))
		return NULL;

is ok too. Now I see that the !alloc_debug_processing path is
purposefully leaking memory. Should we add:

@@ -2841,6 +2841,7 @@ static void *alloc_single_from_new_slab(struct kmem_cache *s, struct slab *slab,
 	 * It's not really expected that this would fail on a
 	 * freshly allocated slab, but a concurrent memory
 	 * corruption in theory could cause that.
+	 * Leak the newly allocated slab.
 	 */
 	return NULL;

so the next person doesn't make the same mistake?

Also help me understand... slab->objects is never equal to 1, right?
/proc/slabinfo agrees, but I cannot decipher it from the slab init
code. Logically it makes sense. If that's the case, why does
alloc_single_from_new_slab() have this part:

	if (slab->inuse == slab->objects)
		add_full(s, n, slab);
	else
		add_partial(n, slab, DEACTIVATE_TO_HEAD);

Shouldn't it call add_partial() only, since slab->inuse == 1 and
slab->objects != 1?

> >
> > But it might be a partial slab taken from the list?
>
> True.
>
> > Then we need to trylock n->list_lock and if that fails, oh...
>
> So... since we succeeded taking it from the list and thus the spin_trylock,
> it means it's safe to spinlock n->list_lock again - we might be waiting on
> another cpu to unlock it but we know we didn't NMI on our own cpu holding
> the lock, right? But we'd probably need to convince lockdep about this
> somehow, and also remember if we allocated a new slab or took one from the
> partial list... or just deal with this unlikely situation in another irq
> work :/

irq_work might be the least mind-bending.

Good point about partial vs new slab. For a partial slab we can indeed
proceed with deactivate_slab(), and if I'm reading the code correctly,
it won't have new.inuse == 0, so it won't go to discard_slab() (which
wouldn't be safe in this path).

But teaching lockdep that the bit below in deactivate_slab() is safe:

	} else if (new.freelist) {
		spin_lock_irqsave(&n->list_lock, flags);
		add_partial(n, slab, tail);

is a challenge.

Since defer_free_work is already there, I'm leaning toward reusing it
for deactivate_slab() too. It will process

	static DEFINE_PER_CPU(struct llist_head, defer_free_objects);

and

	static DEFINE_PER_CPU(struct llist_head, defer_deactivate_slabs);

Shouldn't be too ugly. Better ideas?
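
Roughly something like this, maybe (completely untested sketch;
slab->llnode and slab->flush_freelist would be new fields reusing
otherwise-unused words in struct slab, and I'm assuming
defer_free_work is the existing per-CPU irq_work from the patch):

/* drained by the existing defer_free_work irq_work */
static DEFINE_PER_CPU(struct llist_head, defer_free_objects);
static DEFINE_PER_CPU(struct llist_head, defer_deactivate_slabs);

/*
 * Called from the spin_trylock(&n->list_lock) failure path instead
 * of deactivating the slab in place.
 */
static void defer_deactivate_slab(struct slab *slab, void *flush_freelist)
{
	/* stash the freelist so the irq_work can finish the job */
	slab->flush_freelist = flush_freelist;
	llist_add(&slab->llnode, this_cpu_ptr(&defer_deactivate_slabs));
	irq_work_queue(this_cpu_ptr(&defer_free_work));
}

static void free_deferred_objects(struct irq_work *work)
{
	struct llist_node *llnode, *pos, *t;

	/* ... existing draining of defer_free_objects ... */

	llnode = llist_del_all(this_cpu_ptr(&defer_deactivate_slabs));
	llist_for_each_safe(pos, t, llnode) {
		struct slab *slab = container_of(pos, struct slab, llnode);

		/*
		 * irq_work runs in a context where taking n->list_lock
		 * is safe again, so the regular deactivation slowpath
		 * applies and lockdep has nothing to complain about.
		 */
		deactivate_slab(slab->slab_cache, slab, slab->flush_freelist);
	}
}

The nice part is that the NMI/trylock-failure path then only does an
llist_add() plus irq_work_queue(), and all list_lock taking happens
in one well-defined context.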