Hi Dave,

Thanks for the advice. I will try to dump a size histogram in the next drop.

Wengang

> On Apr 29, 2025, at 7:38 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Mon, Apr 28, 2025 at 11:11:35AM -0700, Wengang Wang wrote:
>> This patch introduces new fields to per-mount and global stats,
>> and exports them to user space.
>>
>> @page_alloc -- number of pages allocated from buddy to buffer cache
>> @page_free -- number of pages freed to buddy from buffer cache
>> @kbb_alloc -- number of BBs allocated from kmalloc slab to buffer cache
>> @kbb_free -- number of BBs freed to kmalloc slab from buffer cache
>> @vbb_alloc -- number of BBs allocated from vmalloc system to buffer cache
>> @vbb_free -- number of BBs freed to vmalloc system from buffer cache
>
> This forms a permanent user API once created, so exposing internal
> implementation details like this doesn't make me feel good. We've
> changed how we allocate memory for buffers quite a bit recently
> to do things like support large folios and minimise vmap usage,
> then to use vmalloc instead of vmap, etc. e.g. we don't use pages
> at all in the buffer cache anymore.
>
> I'm actually looking at further simplifying the implementation - I
> think the custom folio/vmalloc stuff can be replaced entirely by a
> single call to kvmalloc() now, which means some stuff will come from
> slabs, some from the buddy and some from vmalloc. We won't know
> where it comes from at all, and if this stats interface already
> existed then such a change would render it completely useless.
>
>> By looking at the above stats fields, user space can easily know the
>> buffer cache usage.
>
> Not easily - the implementation only aggregates alloc/free values, so
> the user has to manually do the (alloc - free) calculation to
> determine how much memory is currently in use. And then we don't
> really know what size buffers are actually using that memory...
>
> i.e. buffers for everything other than xattrs are fixed sizes (single
> sector, single block, directory block, inode cluster), so it makes
> more sense to me to dump a buffer size histogram for memory usage.
> We can infer things like inode cluster memory usage from such
> output, so not only would we get memory usage, we would also get some
> insight into what is consuming the memory.
>
> Hence I think it would be better to track a set of buffer size based
> buckets so we get output something like:
>
> buffer size	count	Total Bytes
> -----------	-----	-----------
>  < 4kB		<n>	<aggregate count of b_length>
>    4kB
> <= 8kB
> <= 16kB
> <= 32kB
> <= 64kB
>
> I also think that it might be better to dump this in a separate
> sysfs file rather than add it to the existing stats file.
>
> With this information on any given system, we can infer what was
> allocated from slab based on the buffer sizes and system PAGE_SIZE.
>
> However, my main point is that for the general case of "how much
> memory is in use by the buffer cache", we really don't want to tie
> it to the internal allocation implementation. A histogram output like
> the above is not tied to the internal implementation, whilst giving
> additional insight into what size allocations are generating all the
> memory usage...
>
> -Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
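
For the next drop, the bucket accounting I have in mind looks roughly like
the sketch below. It is only a sketch: the names, the exact bucket
boundaries and the use of plain atomic64_t counters are placeholders (the
real code would probably live alongside the existing per-mount stats),
and the size fed in would be BBTOB(bp->b_length) at the points where
buffer memory is allocated and freed.

	#include <linux/atomic.h>
	#include <linux/sizes.h>

	enum {
		XB_HIST_4K,	/* <= 4kB  */
		XB_HIST_8K,	/* <= 8kB  */
		XB_HIST_16K,	/* <= 16kB */
		XB_HIST_32K,	/* <= 32kB */
		XB_HIST_64K,	/* <= 64kB */
		XB_HIST_LARGE,	/* > 64kB  */
		XB_HIST_NR,
	};

	struct xfs_buf_hist {
		atomic64_t	count[XB_HIST_NR];	/* live buffers per bucket */
		atomic64_t	bytes[XB_HIST_NR];	/* aggregate buffer bytes */
	};

	static int xfs_buf_hist_bucket(size_t bytes)
	{
		if (bytes <= SZ_4K)
			return XB_HIST_4K;
		if (bytes <= SZ_8K)
			return XB_HIST_8K;
		if (bytes <= SZ_16K)
			return XB_HIST_16K;
		if (bytes <= SZ_32K)
			return XB_HIST_32K;
		if (bytes <= SZ_64K)
			return XB_HIST_64K;
		return XB_HIST_LARGE;
	}

	/* called where buffer memory is allocated */
	static void xfs_buf_hist_add(struct xfs_buf_hist *hist, size_t bytes)
	{
		int b = xfs_buf_hist_bucket(bytes);

		atomic64_inc(&hist->count[b]);
		atomic64_add(bytes, &hist->bytes[b]);
	}

	/* called where buffer memory is freed */
	static void xfs_buf_hist_sub(struct xfs_buf_hist *hist, size_t bytes)
	{
		int b = xfs_buf_hist_bucket(bytes);

		atomic64_dec(&hist->count[b]);
		atomic64_sub(bytes, &hist->bytes[b]);
	}

The separate sysfs file would then just walk the two arrays and print one
line per bucket in the format you suggested.

Wengang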