On Thu, Jul 10, 2025 at 5:54 PM Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
>
> On Thu, Jul 10, 2025 at 05:42:05PM -0700, Casey Chen wrote:
> > Hi All,
> >
> > Thanks for reviewing my previous patches. I am replying to some comments
> > in our previous discussion
> > https://lore.kernel.org/all/CAJuCfpHhSUhxer-6MP3503w6520YLfgBTGp7Q9Qm9kgN4TNsfw@xxxxxxxxxxxxxx/T/#u
> >
> > Most people care about the motivations and usages of this feature.
> > Internally, we used to have systems with memory distributed asymmetrically
> > across NUMA nodes. Node 0 uses a lot of memory but node 1 is pretty empty.
> > Requests to allocate memory on node 0 always fail. With this patch, we
> > can find the imbalance and optimize the memory usage. Also, David
> > Rientjes and Sourav Panda provide their scenarios in which this patch
> > would be very useful. It is easy to turn on and off so I think it is
> > nice to have, enabling more scenarios in the future.
> >
> > Andrew / Kent,
> > * I agree with Kent on using for_each_possible_cpu rather than
> > for_each_online_cpu, considering CPU online/offline.
> > * When failing to allocate counters for in-kernel alloc_tag, panic()
> > is better than WARN(), since eventually the kernel would panic at an
> > invalid memory access.
> > * percpu stats would bloat data structures quite a bit.
> >
> > David Wang,
> > I don't really understand what 'granularity of calling sites' means. If
> > NUMA imbalance is found, the calling site could request memory
> > allocation from different nodes. Other factors can affect NUMA
> > balance; that information can be implemented in a different patch.
> > Let's get this functionality in.
>
> We've already got userspace parsing and consuming /proc/allocinfo, so we
> just need to do it without changing that format.

You mean keep the format without per-NUMA info the same as before?
My patch v3 changed the header and the alignment of bytes and calls.
I can restore them.

- seq_buf_printf(buf, "# <size> <calls> <tag info>\n");
+ seq_buf_printf(buf, "<size> <calls> <tag info>\n");

- seq_buf_printf(out, "%12lli %8llu ", bytes, counter.calls);
+ seq_buf_printf(out, "%-12lli %-8llu ", bytes, counter.calls);
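
To make the alignment difference concrete, here is a small userspace sketch of the two record formats from the diff above. It is only an illustration: format_record() is a hypothetical helper, snprintf() stands in for seq_buf_printf(), and the trailing tag string is a placeholder rather than real /proc/allocinfo output.

```c
#include <stdio.h>

/*
 * Hypothetical userspace helper (not kernel code): format one record
 * with either the right-justified "%12lli %8llu " layout or the
 * left-justified "%-12lli %-8llu " layout, followed by a tag string.
 * Returns what snprintf() returns.
 */
static int format_record(char *out, size_t len, int left_justify,
                         long long bytes, unsigned long long calls,
                         const char *tag)
{
    if (left_justify)
        return snprintf(out, len, "%-12lli %-8llu %s", bytes, calls, tag);

    return snprintf(out, len, "%12lli %8llu %s", bytes, calls, tag);
}
```

Both layouts produce lines of the same width, and a parser that splits on whitespace reads the same values from either one. A consumer that relies on fixed column offsets, or that skips the header by checking for a leading '#', would see a difference, which is presumably the concern with changing the format.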