Re: [PATCH slab v5 5/6] slab: Reuse first bit for OBJEXTS_ALLOC_FAIL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 12, 2025 at 12:27 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
>
> +Suren, Roman
>
> On Mon, Sep 08, 2025 at 06:00:06PM -0700, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> >
> > Since the combination of valid upper bits in slab->obj_exts with
> > OBJEXTS_ALLOC_FAIL bit can never happen,
> > use OBJEXTS_ALLOC_FAIL == (1ull << 0) as a magic sentinel
> > instead of (1ull << 2) to free up bit 2.
> >
> > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
>
> Are we low on bits that we need to do this or is this good to have
> optimization but not required?

That's a good question. After this change MEMCG_DATA_OBJEXTS and
OBJEXTS_ALLOC_FAIL will have the same value and they are used with the
same field (page->memcg_data and slab->obj_exts are aliases). Even if
page_memcg_data_flags can never be used for slab pages I think
overlapping these bits is not a good idea and creates additional
risks. Unless there is a good reason to do this I would advise against
it.

>
> I do have some questions on the state of slab->obj_exts even before this
> patch for Suren, Roman, Vlastimil and others:
>
> Suppose we newly allocate struct slab for a SLAB_ACCOUNT cache and tried
> to allocate obj_exts for it which failed. The kernel will set
> OBJEXTS_ALLOC_FAIL in slab->obj_exts (Note that this can only be set for
> new slab allocation and only for SLAB_ACCOUNT caches i.e. vec allocation
> failure for memory profiling does not set this flag).
>
> Now in the post alloc hook, either through memory profiling or through
> memcg charging, we will try again to allocate the vec and before that we
> will call slab_obj_exts() on the slab which has:
>
>         unsigned long obj_exts = READ_ONCE(slab->obj_exts);
>
>         VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS), slab_page(slab));
>
> It seems like the above VM_BUG_ON_PAGE() will trigger because obj_exts
> will have OBJEXTS_ALLOC_FAIL but it should not, right? Or am I missing
> something? After the following patch we will aliasing be MEMCG_DATA_OBJEXTS
> and OBJEXTS_ALLOC_FAIL and will avoid this trigger though which also
> seems unintended.

You are correct. Current VM_BUG_ON_PAGE() will trigger if
OBJEXTS_ALLOC_FAIL is set and that is wrong. When
alloc_slab_obj_exts() fails to allocate the vector it does
mark_failed_objexts_alloc() and exits without setting
MEMCG_DATA_OBJEXTS (which it would have done if the allocation
succeeded). So, any further calls to slab_obj_exts() will generate a
warning because MEMCG_DATA_OBJEXTS is not set. I believe the proper
fix would not be to set MEMCG_DATA_OBJEXTS along with
OBJEXTS_ALLOC_FAIL because the pointer does not point to a valid
vector but to modify the warning to:

VM_BUG_ON_PAGE(obj_exts && !(obj_exts & (MEMCG_DATA_OBJEXTS |
OBJEXTS_ALLOC_FAIL)), slab_page(slab));

IOW, we expect the obj_ext to be either NULL or have either
MEMCG_DATA_OBJEXTS or OBJEXTS_ALLOC_FAIL set.
>
> Next question: OBJEXTS_ALLOC_FAIL is for memory profiling and we never
> set it when memcg is disabled and memory profiling is enabled or even
> with both memcg and memory profiling are enabled but cache does not have
> SLAB_ACCOUNT. This seems unintentional as well, right?

I'm not sure why you think OBJEXTS_ALLOC_FAIL is not set by memory
profiling (independent of CONFIG_MEMCG state).
__alloc_tagging_slab_alloc_hook()->prepare_slab_obj_exts_hook()->alloc_slab_obj_exts()
will attempt to allocate the vector and set OBJEXTS_ALLOC_FAIL if that
fails.

>
> Also I think slab_obj_exts() needs to handle OBJEXTS_ALLOC_FAIL explicitly.

Agree, so is my proposal to update the warning sounds right to you?

>
>
> > ---
> >  include/linux/memcontrol.h | 10 ++++++++--
> >  mm/slub.c                  |  2 +-
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 785173aa0739..d254c0b96d0d 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -341,17 +341,23 @@ enum page_memcg_data_flags {
> >       __NR_MEMCG_DATA_FLAGS  = (1UL << 2),
> >  };
> >
> > +#define __OBJEXTS_ALLOC_FAIL MEMCG_DATA_OBJEXTS
> >  #define __FIRST_OBJEXT_FLAG  __NR_MEMCG_DATA_FLAGS
> >
> >  #else /* CONFIG_MEMCG */
> >
> > +#define __OBJEXTS_ALLOC_FAIL (1UL << 0)
> >  #define __FIRST_OBJEXT_FLAG  (1UL << 0)
> >
> >  #endif /* CONFIG_MEMCG */
> >
> >  enum objext_flags {
> > -     /* slabobj_ext vector failed to allocate */
> > -     OBJEXTS_ALLOC_FAIL = __FIRST_OBJEXT_FLAG,
> > +     /*
> > +      * Use bit 0 with zero other bits to signal that slabobj_ext vector
> > +      * failed to allocate. The same bit 0 with valid upper bits means
> > +      * MEMCG_DATA_OBJEXTS.
> > +      */
> > +     OBJEXTS_ALLOC_FAIL = __OBJEXTS_ALLOC_FAIL,
> >       /* the next bit after the last actual flag */
> >       __NR_OBJEXTS_FLAGS  = (__FIRST_OBJEXT_FLAG << 1),
> >  };
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 212161dc0f29..61841ba72120 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2051,7 +2051,7 @@ static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
> >        * objects with no tag reference. Mark all references in this
> >        * vector as empty to avoid warnings later on.
> >        */
> > -     if (obj_exts & OBJEXTS_ALLOC_FAIL) {
> > +     if (obj_exts == OBJEXTS_ALLOC_FAIL) {
> >               unsigned int i;
> >
> >               for (i = 0; i < objects; i++)
> > --
> > 2.47.3
> >





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux