On 2025/7/23 19:31, Kent Overstreet wrote:
>> In some scenarios, we may choose not to delve into SLUB allocation
>> details if initial triage indicates that SLUB memory usage is within
>> acceptable limits. To support this, a control knob is introduced to enable
>> or disable SLUB object tracking.
>> The "noslub" knob disables SLUB tracking, preventing further allocation of
>> slabobj_ext structures.
>
> ...Have there been actual scenarios where this would be useful?
> We've already got a knob for memory allocation profiling as a whole;
> most allocations are slub allocations, so if you're looking at memory
> allocation profiling you probably want slub.
Hi Kent,
Let me elaborate a bit on the work we're doing. Some OEMs are interested
in enabling this lightweight debug feature to help identify potential
memory leaks on their devices. In the past, we depended on mechanisms
such as page owner for tracking, but due to their overhead, they were
not suitable for deployment on production devices. In response, our team
is developing a post-processing script (which may also need to parse the
source code) to classify memory usage accordingly.
Here is one example of its output, FYI:
version: 1.0
MemInfo        :  Size_KB   Size_MB
slab           :   440088    429.77
vmalloc        :    71416     69.74
pgd            :      888      0.87
pte            :   104492    102.04
pmd            :    12732     12.43
pageowner      :   437760    427.50
module         :        0      0.00
kernelStack    :    54346     53.07
shmem          :    18284     17.86
KDA            :   188516    184.10
anon           :   867120    846.80
ion            :   420576    410.72
kgsl           :    70328     68.68
CMA            :   130992    127.92
file           :  2037140   1989.39
zram           :   156532    152.86
binder         :        0      0.00
migrate        :        0      0.00
Couldn't Parse :       17      0.02
slab_alone     :   478939    467.71
In this case, we may not need to dive into slab-level details. Instead,
our initial focus should be on checking KDA (that is, pages that are
allocated but not tracked by any statistics). In other words, for a
quick snapshot, it's unnecessary to analyze slab internals. If we need
to debug specific slab leaks, we can even afford to enable slab_debug=U.
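For illustration only, a first-pass check of this kind could be sketched
roughly as below. It reads lines in the "name : KB MB" layout of the
report above and lists the categories that exceed a budget. The budget
values and the helper names (parse_report, over_budget) are hypothetical
and are not taken from the actual script.

```python
# Hypothetical first-pass triage over a report in the "name : KB MB" layout.
# The budgets below are made-up numbers used only for illustration.

def parse_report(text):
    """Return {category: size_kb} from lines shaped like 'name : KB MB'."""
    sizes = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip the header, the version line, and anything without an integer KB value.
        if not sep or not fields or not fields[0].isdigit():
            continue
        sizes[name.strip()] = int(fields[0])
    return sizes

def over_budget(sizes, budgets_kb):
    """List categories above their (assumed) budget, largest excess first."""
    excess = {name: sizes[name] - limit
              for name, limit in budgets_kb.items()
              if sizes.get(name, 0) > limit}
    return sorted(excess, key=excess.get, reverse=True)

# Two rows copied from the report above, plus its header line.
REPORT = """\
MemInfo : Size_KB Size_MB
slab : 440088 429.77
KDA : 188516 184.10
"""

# Assumed limits: 512 MB for slab, 64 MB for untracked pages (KDA).
BUDGETS_KB = {"slab": 512 * 1024, "KDA": 64 * 1024}

sizes = parse_report(REPORT)
print(over_budget(sizes, BUDGETS_KB))
```

With these assumed budgets, slab stays under its limit and only KDA is
flagged, which is the situation where slab-level detail is not needed
for the first look.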
The key requirement is to make this feature suitable for deployment on
production devices, as requested by OEMs. The 16-byte per-object
overhead is the largest cost of the feature in its current form, and we
are exploring options to reduce it.
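As a back-of-the-envelope illustration of why the 16 bytes matter, the
sketch below computes the relative overhead for a few object sizes and a
rough total for the slab figure in the report above. The object sizes
and the 128-byte average are assumptions chosen for the arithmetic, not
measurements, and the estimate ignores slab layout and unused objects.

```python
# Rough cost of 16 bytes of per-object tracking metadata.
# Object sizes are hypothetical; real caches differ.

EXT_BYTES = 16  # per-object overhead quoted in this mail

def overhead_percent(object_size):
    """Tracking overhead relative to the object's own size, in percent."""
    return 100.0 * EXT_BYTES / object_size

def overhead_kb(slab_kb, avg_object_size):
    """Estimated metadata size in KB if slab_kb were filled with
    objects of avg_object_size bytes (an assumed average)."""
    objects = slab_kb * 1024 // avg_object_size
    return objects * EXT_BYTES / 1024

for size in (32, 64, 128, 256):
    print(f"{size:4d}-byte objects: {overhead_percent(size):5.1f}% extra")

# 440088 KB is the slab row of the report; 128 bytes is an assumed average.
print(f"estimated metadata: {overhead_kb(440088, 128):.0f} KB")
```

The smaller the objects, the larger the relative cost, so caches
dominated by small objects pay the most for tracking.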