Re: [PATCH v3] alloc_tag: add per-NUMA node stats

Kent Overstreet <kent.overstreet@xxxxxxxxx> · Thu, 10 Jul 2025 20:53:45 -0400

On Thu, Jul 10, 2025 at 05:42:05PM -0700, Casey Chen wrote:
> Hi All,
> 
> Thanks for reviewing my previous patches. I am replying to some comments
> from our previous discussion:
> https://lore.kernel.org/all/CAJuCfpHhSUhxer-6MP3503w6520YLfgBTGp7Q9Qm9kgN4TNsfw@xxxxxxxxxxxxxx/T/#u
> 
> Most people care about the motivation for and usage of this feature.
> Internally, we used to have systems whose memory usage was asymmetric
> across NUMA nodes. Node 0 uses a lot of memory but node 1 is pretty empty.
> Requests to allocate memory on node 0 always fail. With this patch, we
> can find the imbalance and optimize the memory usage. Also, David
> Rientjes and Sourav Panda provide their scenarios in which this patch
> would be very useful. It is easy to turn on and off so I think it is
> nice to have, enabling more scenarios in the future.
> 
> Andrew / Kent,
> * I agree with Kent on using for_each_possible_cpu rather than
> for_each_online_cpu, considering CPU online/offline.
> * When failing to allocate counters for in-kernel alloc_tag, panic()
> is better than WARN(), since the kernel would eventually panic on an
> invalid memory access anyway.
> * percpu stats would bloat data structures quite a bit.
> 
> David Wang,
> I don't really understand what 'granularity of calling sites' means. If
> NUMA imbalance is found, the calling site could request memory
> allocation from different nodes. Other factors can affect NUMA
> balance; that information can be added in a separate patch.

Let's get this functionality in.

We've already got userspace parsing and consuming /proc/allocinfo, so we
just need to do it without changing that format.