On Wed, Aug 27, 2025 at 5:08 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> On Fri, Aug 15, 2025 at 11:38 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> >
> > On Thu, Aug 14, 2025 at 9:38 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Jul 31, 2025 at 05:21:31PM -0700, Joanne Koong wrote:
> > > > Add granular dirty and writeback accounting for large folios. These
> > > > stats are used by the mm layer for dirty balancing and throttling.
> > > > Having granular dirty and writeback accounting helps prevent
> > > > over-aggressive balancing and throttling.
> > > >
> > > > There are 4 places in iomap this commit affects:
> > > > a) filemap dirtying, which now calls filemap_dirty_folio_pages()
> > > > b) writeback_iter with setting the wbc->no_stats_accounting bit and
> > > >    calling clear_dirty_for_io_stats()
> > > > c) starting writeback, which now calls __folio_start_writeback()
> > > > d) ending writeback, which now calls folio_end_writeback_pages()
> > > >
> > > > This relies on using the ifs->state dirty bitmap to track dirty pages in
> > > > the folio. As such, this can only be utilized on filesystems where the
> > > > block size >= PAGE_SIZE.
> > >
> > > Apologies for my slow responses this month. :)
> >
> > No worries at all, thanks for looking at this.
> >
> > > I wonder, does this cause an observable change in the writeback
> > > accounting and throttling behavior for non-fuse filesystems like XFS
> > > that use large folios? I *think* this does actually reduce throttling
> > > for XFS, but it might not be so noticeable because the limits are much
> > > more generous outside of fuse?
> >
> > I haven't run any benchmarks on non-fuse filesystems yet but that's
> > what I would expect too. Will run some benchmarks to see!
>
> I ran some benchmarks on xfs for the contrived test case I used for
> fuse (eg writing 2 GB in 128 MB chunks and then doing 50k 50-byte
> random writes) and I don't see any noticeable performance difference.
>
> I re-tested it on fuse but this time with strictlimiting disabled and
> didn't notice any difference on that either, probably because with
> strictlimiting off we don't run into the upper limit in that test so
> there's no extra throttling that needs to be mitigated.
>
> It's unclear to me how often (if at all?) real workloads run up
> against their dirty/writeback limits.
>

I benchmarked it again today, this time manually setting
/proc/sys/vm/dirty_bytes to 20% of 16 GiB and
/proc/sys/vm/dirty_background_bytes to 10% of 16 GiB, and testing a
more intense workload (the original test scenario but on 10+ threads).
I now see results on xfs: around 3 seconds (with some variability,
ranging from 0.3 seconds to 5 seconds) for the writes prior to this
patchset vs. a pretty consistent 0.14 seconds with this patchset. I ran
the test scenario a few times, but it'd be great if someone else could
also run it to verify it shows up on their system too.
I set up xfs by following the instructions in the xfstests readme:

  # xfs_io -f -c "falloc 0 10g" test.img
  # xfs_io -f -c "falloc 0 10g" scratch.img
  # mkfs.xfs test.img
  # losetup /dev/loop0 ./test.img
  # losetup /dev/loop1 ./scratch.img
  # mkdir -p /mnt/test && mount /dev/loop0 /mnt/test

and then ran:

  sudo sysctl -w vm.dirty_bytes=$((3276 * 1024 * 1024))             # roughly 20% of 16 GiB
  sudo sysctl -w vm.dirty_background_bytes=$((1638 * 1024 * 1024))  # roughly 10% of 16 GiB

and then ran this test program (ai-generated):
https://pastebin.com/CbcwTXjq

I'll send out an updated v2 of this series.

Thanks,
Joanne
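For reference, below is a rough, single-threaded sketch of the workload
described above (write 2 GiB sequentially in 128 MiB chunks, then issue
50k 50-byte writes at random offsets). It is only an approximation of
the test scenario, not the actual program at the pastebin link, and the
default output path and program name are placeholders.

/*
 * Rough sketch of the workload described above: write 2 GiB sequentially
 * in 128 MiB chunks, then do 50,000 50-byte writes at random offsets.
 * This is an approximation of the described scenario, not the actual
 * benchmarked program (which ran the scenario on 10+ threads).
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define FILE_SIZE   (2ULL << 30)    /* 2 GiB */
#define CHUNK_SIZE  (128ULL << 20)  /* 128 MiB */
#define NR_RANDOM   50000
#define RANDOM_LEN  50

int main(int argc, char **argv)
{
        /* Placeholder path; point this at a file on the mounted test fs. */
        const char *path = argc > 1 ? argv[1] : "/mnt/test/testfile";
        unsigned long long off;
        char small[RANDOM_LEN];
        char *chunk;
        int fd, i;

        fd = open(path, O_CREAT | O_RDWR, 0644);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        chunk = malloc(CHUNK_SIZE);
        if (!chunk) {
                perror("malloc");
                return 1;
        }
        memset(chunk, 'a', CHUNK_SIZE);
        memset(small, 'b', RANDOM_LEN);

        /* Phase 1: 2 GiB written sequentially in 128 MiB chunks. */
        for (off = 0; off < FILE_SIZE; off += CHUNK_SIZE) {
                if (pwrite(fd, chunk, CHUNK_SIZE, off) != (ssize_t)CHUNK_SIZE) {
                        perror("pwrite");
                        return 1;
                }
        }

        /* Phase 2: 50k 50-byte writes at random offsets within the file. */
        for (i = 0; i < NR_RANDOM; i++) {
                off = (((unsigned long long)rand() << 16) | rand()) %
                      (FILE_SIZE - RANDOM_LEN);
                if (pwrite(fd, small, RANDOM_LEN, off) != RANDOM_LEN) {
                        perror("pwrite");
                        return 1;
                }
        }

        free(chunk);
        close(fd);
        return 0;
}

Build it with something like "gcc -O2 -o dirty_test dirty_test.c" and
time a run against a file on the mounted loop device; the binary name
and target path here are arbitrary.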