On Fri, Sep 12, 2025 at 12:27 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
>
> +Suren, Roman
>
> On Mon, Sep 08, 2025 at 06:00:06PM -0700, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> >
> > Since the combination of valid upper bits in slab->obj_exts with
> > OBJEXTS_ALLOC_FAIL bit can never happen,
> > use OBJEXTS_ALLOC_FAIL == (1ull << 0) as a magic sentinel
> > instead of (1ull << 2) to free up bit 2.
> >
> > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
>
> Are we low on bits that we need to do this or is this good to have
> optimization but not required?

That's a good question. After this change MEMCG_DATA_OBJEXTS and
OBJEXTS_ALLOC_FAIL will have the same value and they are used with the
same field (page->memcg_data and slab->obj_exts are aliases). Even if
page_memcg_data_flags can never be used for slab pages I think
overlapping these bits is not a good idea and creates additional
risks. Unless there is a good reason to do this I would advise against
it.

>
> I do have some questions on the state of slab->obj_exts even before this
> patch for Suren, Roman, Vlastimil and others:
>
> Suppose we newly allocate struct slab for a SLAB_ACCOUNT cache and tried
> to allocate obj_exts for it which failed. The kernel will set
> OBJEXTS_ALLOC_FAIL in slab->obj_exts (Note that this can only be set for
> new slab allocation and only for SLAB_ACCOUNT caches i.e. vec allocation
> failure for memory profiling does not set this flag).
>
> Now in the post alloc hook, either through memory profiling or through
> memcg charging, we will try again to allocate the vec and before that we
> will call slab_obj_exts() on the slab which has:
>
>         unsigned long obj_exts = READ_ONCE(slab->obj_exts);
>
>         VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS), slab_page(slab));
>
> It seems like the above VM_BUG_ON_PAGE() will trigger because obj_exts
> will have OBJEXTS_ALLOC_FAIL but it should not, right? Or am I missing
> something?
> After the following patch we will be aliasing MEMCG_DATA_OBJEXTS
> and OBJEXTS_ALLOC_FAIL and will avoid this trigger though which also
> seems unintended.

You are correct. The current VM_BUG_ON_PAGE() will trigger if
OBJEXTS_ALLOC_FAIL is set, and that is wrong. When
alloc_slab_obj_exts() fails to allocate the vector it calls
mark_failed_objexts_alloc() and exits without setting
MEMCG_DATA_OBJEXTS (which it would have done if the allocation
succeeded). So any further call to slab_obj_exts() will generate a
warning because MEMCG_DATA_OBJEXTS is not set. I believe the proper
fix is not to set MEMCG_DATA_OBJEXTS along with OBJEXTS_ALLOC_FAIL,
because the pointer does not point to a valid vector, but to modify
the warning to:

        VM_BUG_ON_PAGE(obj_exts && !(obj_exts & (MEMCG_DATA_OBJEXTS |
                       OBJEXTS_ALLOC_FAIL)), slab_page(slab));

IOW, we expect obj_exts to be either NULL or to have either
MEMCG_DATA_OBJEXTS or OBJEXTS_ALLOC_FAIL set.

>
> Next question: OBJEXTS_ALLOC_FAIL is for memory profiling and we never
> set it when memcg is disabled and memory profiling is enabled or even
> with both memcg and memory profiling are enabled but cache does not have
> SLAB_ACCOUNT. This seems unintentional as well, right?

I'm not sure why you think OBJEXTS_ALLOC_FAIL is not set by memory
profiling (independent of CONFIG_MEMCG state).
__alloc_tagging_slab_alloc_hook()->prepare_slab_obj_exts_hook()->alloc_slab_obj_exts()
will attempt to allocate the vector and set OBJEXTS_ALLOC_FAIL if that
fails.

>
> Also I think slab_obj_exts() needs to handle OBJEXTS_ALLOC_FAIL explicitly.

Agreed. Does my proposal to update the warning sound right to you?
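
To make the difference between the two conditions concrete, here is a
minimal userspace sketch (not kernel code). It assumes the pre-patch
CONFIG_MEMCG bit values mentioned in this thread: MEMCG_DATA_OBJEXTS at
bit 0 and OBJEXTS_ALLOC_FAIL at bit 2. The PRE_* macros and the helper
functions are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed pre-patch values with CONFIG_MEMCG (illustrative copies only). */
#define PRE_MEMCG_DATA_OBJEXTS  (1UL << 0)
#define PRE_OBJEXTS_ALLOC_FAIL  (1UL << 2)

/* Models the current condition passed to VM_BUG_ON_PAGE(). */
static inline bool current_check_fires(unsigned long obj_exts)
{
        return obj_exts && !(obj_exts & PRE_MEMCG_DATA_OBJEXTS);
}

/* Models the proposed condition, which also accepts the failure flag. */
static inline bool proposed_check_fires(unsigned long obj_exts)
{
        return obj_exts &&
               !(obj_exts & (PRE_MEMCG_DATA_OBJEXTS | PRE_OBJEXTS_ALLOC_FAIL));
}

/* Hypothetical helper: the state left behind by a failed vector allocation. */
static inline void sketch_self_check(void)
{
        unsigned long failed = PRE_OBJEXTS_ALLOC_FAIL;

        assert(current_check_fires(failed));   /* spurious trigger today */
        assert(!proposed_check_fires(failed)); /* accepted by the proposal */
}
```

With these values, an obj_exts that holds only the failure flag trips
the current condition but passes the proposed one, while NULL passes
both.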
> >
> > ---
> >  include/linux/memcontrol.h | 10 ++++++++--
> >  mm/slub.c                  |  2 +-
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 785173aa0739..d254c0b96d0d 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -341,17 +341,23 @@ enum page_memcg_data_flags {
> >          __NR_MEMCG_DATA_FLAGS = (1UL << 2),
> >  };
> >
> > +#define __OBJEXTS_ALLOC_FAIL MEMCG_DATA_OBJEXTS
> >  #define __FIRST_OBJEXT_FLAG __NR_MEMCG_DATA_FLAGS
> >
> >  #else /* CONFIG_MEMCG */
> >
> > +#define __OBJEXTS_ALLOC_FAIL (1UL << 0)
> >  #define __FIRST_OBJEXT_FLAG (1UL << 0)
> >
> >  #endif /* CONFIG_MEMCG */
> >
> >  enum objext_flags {
> > -        /* slabobj_ext vector failed to allocate */
> > -        OBJEXTS_ALLOC_FAIL = __FIRST_OBJEXT_FLAG,
> > +        /*
> > +         * Use bit 0 with zero other bits to signal that slabobj_ext vector
> > +         * failed to allocate. The same bit 0 with valid upper bits means
> > +         * MEMCG_DATA_OBJEXTS.
> > +         */
> > +        OBJEXTS_ALLOC_FAIL = __OBJEXTS_ALLOC_FAIL,
> >          /* the next bit after the last actual flag */
> >          __NR_OBJEXTS_FLAGS = (__FIRST_OBJEXT_FLAG << 1),
> >  };
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 212161dc0f29..61841ba72120 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2051,7 +2051,7 @@ static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
> >           * objects with no tag reference. Mark all references in this
> >           * vector as empty to avoid warnings later on.
> >           */
> > -        if (obj_exts & OBJEXTS_ALLOC_FAIL) {
> > +        if (obj_exts == OBJEXTS_ALLOC_FAIL) {
> >                  unsigned int i;
> >
> >                  for (i = 0; i < objects; i++)
> > --
> > 2.47.3
> >
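
The switch from '&' to '==' in the mm/slub.c hunk matters once the two
flags share bit 0. Below is a small userspace sketch (not kernel code)
of why, assuming the post-patch CONFIG_MEMCG values where
OBJEXTS_ALLOC_FAIL aliases MEMCG_DATA_OBJEXTS. The POST_* macros, the
helper names and the example address are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed post-patch values with CONFIG_MEMCG: both names share bit 0. */
#define POST_MEMCG_DATA_OBJEXTS (1UL << 0)
#define POST_OBJEXTS_ALLOC_FAIL POST_MEMCG_DATA_OBJEXTS

/* Old-style mask test: true for any value that has bit 0 set. */
static inline bool alloc_failed_mask_test(unsigned long obj_exts)
{
        return (obj_exts & POST_OBJEXTS_ALLOC_FAIL) != 0;
}

/* Sentinel test from the patch: true only for bit 0 with no upper bits. */
static inline bool alloc_failed_sentinel_test(unsigned long obj_exts)
{
        return obj_exts == POST_OBJEXTS_ALLOC_FAIL;
}

/* Hypothetical encoding of a valid vector: aligned address plus bit 0. */
static inline unsigned long post_valid_obj_exts(unsigned long vec_addr)
{
        return vec_addr | POST_MEMCG_DATA_OBJEXTS;
}

/* The mask test cannot tell a valid vector from a failed allocation. */
static inline void post_self_check(void)
{
        unsigned long valid = post_valid_obj_exts(0x1000UL);

        assert(alloc_failed_mask_test(valid));
        assert(!alloc_failed_sentinel_test(valid));
}
```

Under this aliasing, a valid vector pointer tagged with
MEMCG_DATA_OBJEXTS would be misread as a failed allocation by the mask
test, whereas the equality test matches only the bare sentinel value.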