On Fri, Aug 15, 2025 at 04:40:31PM -0700, Boris Burkov wrote:
> Btrfs currently tracks its metadata pages in the page cache, using a
> fake inode (fs_info->btree_inode) with offsets corresponding to where
> the metadata is stored in the filesystem's full logical address space.
>
> A consequence of this is that when btrfs uses filemap_add_folio(), this
> usage is charged to the cgroup of whichever task happens to be running
> at the time. These folios don't belong to any particular user cgroup, so
> I don't think it makes much sense for them to be charged in that way.
> Some negative consequences as a result:
> - A task can be holding some important btrfs locks, then need to lookup
>   some metadata and go into reclaim, extending the duration it holds
>   that lock for, and unfairly pushing its own reclaim pain onto other
>   cgroups.
> - If that cgroup goes into reclaim, it might reclaim these folios a
>   different non-reclaiming cgroup might need soon. This is naturally
>   offset by LRU reclaim, but still.
>
> A very similar proposal to use the root cgroup was previously made by
> Qu, where he eventually proposed the idea of setting it per
> address_space. This makes good sense for the btrfs use case, as the
> uncharged behavior should apply to all use of the address_space, not
> select allocations. I.e., if someone adds another filemap_add_folio()
> call using btrfs's btree_inode, we would almost certainly want the
> uncharged behavior.
>
> Link: https://lore.kernel.org/linux-mm/b5fef5372ae454a7b6da4f2f75c427aeab6a07d6.1727498749.git.wqu@xxxxxxxx/
> Suggested-by: Qu Wenruo <wqu@xxxxxxxx>
> Signed-off-by: Boris Burkov <boris@xxxxxx>

Acked-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>