On Thu, Jul 10, 2025 at 05:42:05PM -0700, Casey Chen wrote:
> Hi All,
>
> Thanks for reviewing my previous patches. I am replying to some comments
> in our previous discussion
> https://lore.kernel.org/all/CAJuCfpHhSUhxer-6MP3503w6520YLfgBTGp7Q9Qm9kgN4TNsfw@xxxxxxxxxxxxxx/T/#u
>
> Most people care about the motivations and usages of this feature.
> Internally, we used to have systems having asymmetric memory to NUMA
> nodes. Node 0 uses a lot of memory but node 1 is pretty empty.
> Requests to allocate memory on node 0 always fail. With this patch, we
> can find the imbalance and optimize the memory usage. Also, David
> Rientjes and Sourav Panda provide their scenarios in which this patch
> would be very useful. It is easy to turn on and off so I think it is
> nice to have, enabling more scenarios in the future.
>
> Andrew / Kent,
> * I agree with Kent on using for_each_possible_cpu rather than
> for_each_online_cpu, considering CPU online/offline.
> * When failing to allocate counters for in-kernel alloc_tag, panic()
> is better than WARN(), eventually the kernel would panic at invalid
> memory access.
> * percpu stats would bloat data structures quite a bit.
>
> David Wang,
> I don't really understand what is 'granularity of calling sites'. If
> NUMA imbalance is found, the calling site could request memory
> allocation from different nodes. Other factors can affect NUMA
> balance, that information can be implemented in a different patch.

Let's get this functionality in.

We've already got userspace parsing and consuming /proc/allocinfo, so we
just need to do it without changing that format.