Re: [PATCH v3] alloc_tag: add per-NUMA node stats

On Fri, Jul 11, 2025 at 10:41:36AM -0700, Casey Chen wrote:
> On Thu, Jul 10, 2025 at 8:09 PM Kent Overstreet
> <kent.overstreet@xxxxxxxxx> wrote:
> >
> > On Thu, Jul 10, 2025 at 06:07:13PM -0700, Casey Chen wrote:
> > > On Thu, Jul 10, 2025 at 5:54 PM Kent Overstreet
> > > <kent.overstreet@xxxxxxxxx> wrote:
> > > >
> > > > On Thu, Jul 10, 2025 at 05:42:05PM -0700, Casey Chen wrote:
> > > > > Hi All,
> > > > >
> > > > > Thanks for reviewing my previous patches. I am replying to some
> > > > > comments from our previous discussion:
> > > > > https://lore.kernel.org/all/CAJuCfpHhSUhxer-6MP3503w6520YLfgBTGp7Q9Qm9kgN4TNsfw@xxxxxxxxxxxxxx/T/#u
> > > > >
> > > > > Most people care about the motivation for and usage of this feature.
> > > > > Internally, we used to have systems with memory distributed
> > > > > asymmetrically across NUMA nodes. Node 0 uses a lot of memory but
> > > > > node 1 is pretty empty. Requests to allocate memory on node 0 always
> > > > > fail. With this patch, we can find the imbalance and optimize the
> > > > > memory usage. Also, David Rientjes and Sourav Panda described
> > > > > scenarios in which this patch would be very useful. It is easy to
> > > > > turn on and off, so I think it is nice to have, and it enables more
> > > > > scenarios in the future.
> > > > >
> > > > > Andrew / Kent,
> > > > > * I agree with Kent on using for_each_possible_cpu rather than
> > > > > for_each_online_cpu, considering CPU online/offline.
> > > > > * When failing to allocate counters for in-kernel alloc_tag, panic()
> > > > > is better than WARN(), since the kernel would eventually panic on an
> > > > > invalid memory access anyway.
> > > > > * percpu stats would bloat data structures quite a bit.
> > > > >
> > > > > David Wang,
> > > > > I don't really understand what 'granularity of calling sites' means.
> > > > > If a NUMA imbalance is found, the calling site could request memory
> > > > > allocation from different nodes. Other factors can affect NUMA
> > > > > balance; that information can be added in a separate patch.
> > > >
> > > > Let's get this functionality in.
> > > >
> > > > We've already got userspace parsing and consuming /proc/allocinfo, so we
> > > > just need to do it without changing that format.
> > >
> > > You mean keep the format without per-NUMA info the same as before?
> > > My patch v3 changed the header and the alignment of bytes and calls. I
> > > can restore them.
> >
> > I mean an ioctl interface - so we can have a userspace program with
> > different switches for getting different types of output.
> >
> > Otherwise the existing programs people have already written for
> > consuming /proc/allocinfo are going to break.
> 
> What does this ioctl interface do? Get bytes/calls per allocating
> site? Or get total bytes/calls per module? Or per-NUMA bytes/calls
> for each allocating site or module?
> Would it be too much work for this patch? If you can show me an
> example, it would be useful. I can try implementing it.

Since we're adding optional features, the ioctl needs to take a flags
argument saying which features we want: per-NUMA-node stats for now, but
I suspect more will come up (maybe we'll want to revisit number of calls
per callsite).

Return -EINVAL if we ask for something the running kernel doesn't
support...
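
To make the flags idea concrete, here is a minimal userspace sketch of the
check only. It is not taken from the patch: the flag name, the request
struct, and the helper are all hypothetical, and no such ioctl is defined
in this thread. It only illustrates rejecting unknown feature bits with
-EINVAL.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical feature bits; these names are assumptions for illustration. */
#define ALLOCINFO_F_PER_NODE   (1ULL << 0)  /* per-NUMA-node bytes/calls */
#define ALLOCINFO_F_SUPPORTED  (ALLOCINFO_F_PER_NODE)

/* Hypothetical request argument: the caller sets the features it wants. */
struct allocinfo_req {
	uint64_t flags;
};

/*
 * Return 0 if every requested feature bit is known to this build,
 * -EINVAL if the caller asked for anything it does not support.
 */
static int allocinfo_check_flags(const struct allocinfo_req *req)
{
	if (req->flags & ~ALLOCINFO_F_SUPPORTED)
		return -EINVAL;
	return 0;
}
```

With a check like this, a newer tool asking for a feature bit that an older
kernel does not know about gets a clean error and can fall back, while
existing /proc/allocinfo parsers are left untouched.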