On Mon, Apr 28, 2025 at 11:11:35AM -0700, Wengang Wang wrote:
> This patch introduces new fields to per-mount and global stats,
> and export them to user space.
>
> @page_alloc -- number of pages allocated from buddy to buffer cache
> @page_free -- number of pages freed to buddy from buffer cache
> @kbb_alloc -- number of BBs allocated from kmalloc slab to buffer cache
> @kbb_free -- number of BBs freed to kmalloc slab from buffer cache
> @vbb_alloc -- number of BBs allocated from vmalloc system to buffer cache
> @vbb_free -- number of BBs freed to vmalloc system from buffer cache

This forms a permanent user API once created, so exposing internal
implementation details like this doesn't make me feel good. We've
changed how we allocate memory for buffers quite a bit recently to do
things like support large folios and minimise vmap usage, then to use
vmalloc instead of vmap, etc. e.g. we don't use pages at all in the
buffer cache anymore.

I'm actually looking at further simplifying the implementation - I
think the custom folio/vmalloc stuff can be replaced entirely by a
single call to kvmalloc() now, which means some memory will come from
slabs, some from the buddy allocator and some from vmalloc. We won't
know where it comes from at all, and if this stats interface already
existed then such a change would render it completely useless.

> By looking at above stats fields, user space can easily know the buffer
> cache usage.

Not easily - the implementation only aggregates alloc/free values, so
the user has to manually do the (alloc - free) calculation to determine
how much memory is currently in use. And then we don't really know what
size buffers are actually using that memory...

i.e. buffers for everything other than xattrs are fixed sizes (single
sector, single block, directory block, inode cluster), so it makes more
sense to me to dump a buffer size histogram for memory usage. We can
infer things like inode cluster memory usage from such output, so not
only would we get memory usage, we'd also get some insight into what is
consuming the memory.

Hence I think it would be better to track a set of buffer size based
buckets so we get output something like:

buffer size    count    Total Bytes
-----------    -----    -----------
 < 4kB         <n>      <aggregate count of b_length>
 4kB
 <= 8kB
 <= 16kB
 <= 32kB
 <= 64kB

I also think that it might be better to dump this in a separate sysfs
file rather than add it to the existing stats file.

With this information on any given system, we can infer what is
allocated from slab based on the buffer sizes and the system PAGE_SIZE.
However, my main point is that for the general case of "how much memory
is in use by the buffer cache", we really don't want to tie it to the
internal allocation implementation. A histogram output like the above
is not tied to the internal implementation, whilst giving additional
insight into what size allocations are generating all the memory
usage...

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
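
To make the kvmalloc() conversion above concrete, here's a minimal
sketch of the idea - the struct and function names are invented for
illustration, and this is not the actual XFS buffer code:

#include <linux/slab.h>         /* kvmalloc(), kvfree() */

/* Cut-down, hypothetical buffer - for illustration only. */
struct buf {
        void    *b_addr;
        size_t  b_length;       /* length of b_addr in bytes */
};

static int buf_alloc_mem(struct buf *bp, size_t size)
{
        /*
         * kvmalloc() tries kmalloc() first (slab or buddy pages) and
         * falls back to vmalloc() for large or fragmented requests, so
         * the caller no longer knows which allocator backs b_addr.
         */
        bp->b_addr = kvmalloc(size, GFP_KERNEL);
        if (!bp->b_addr)
                return -ENOMEM;
        bp->b_length = size;
        return 0;
}

static void buf_free_mem(struct buf *bp)
{
        kvfree(bp->b_addr);     /* handles kmalloc and vmalloc memory alike */
        bp->b_addr = NULL;
}

Once the backing allocator is an internal detail of kvmalloc() like
this, per-allocator counters such as @kbb_alloc and @vbb_alloc have
nothing meaningful to count.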
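
Similarly, here's a sketch of how the size-bucket accounting behind the
histogram above might be wired up - again with invented names, and the
real bucket boundaries and counter types are implementation details:

#include <linux/atomic.h>
#include <linux/log2.h>
#include <linux/minmax.h>
#include <linux/sizes.h>

/* Buckets matching the table above: <4k, 4k, <=8k, <=16k, <=32k, <=64k. */
#define BUF_SIZE_BUCKETS        6

struct buf_size_stats {
        atomic64_t      count[BUF_SIZE_BUCKETS];
        atomic64_t      bytes[BUF_SIZE_BUCKETS];
};

static unsigned int buf_size_bucket(size_t len)
{
        if (len < SZ_4K)
                return 0;
        if (len == SZ_4K)
                return 1;
        /* (4k,8k] -> 2, (8k,16k] -> 3, (16k,32k] -> 4, larger -> 5 */
        return min_t(unsigned int, ilog2(len - 1) - 10,
                        BUF_SIZE_BUCKETS - 1);
}

/*
 * Called on buffer allocation; a matching decrement on free keeps the
 * counters at "currently in use", so userspace never has to do the
 * (alloc - free) calculation itself.
 */
static void buf_size_stats_add(struct buf_size_stats *s, size_t len)
{
        unsigned int b = buf_size_bucket(len);

        atomic64_inc(&s->count[b]);
        atomic64_add(len, &s->bytes[b]);
}

A sysfs show method would then just walk the two arrays and print one
line per bucket in the format shown above.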