On Mon, Mar 03, 2025 at 08:35:57AM +1100, Dave Chinner wrote:
> This does comparison one folio at a time and does no readahead.
> Hence if the data isn't already in cache, it is doing synchronous
> small reads and waiting for every single one of them. This really
> should use an internal interface that is capable of issuing
> readahead...

Yes, I noticed that if I do a dummy read() of each extent first,
it becomes _massively_ faster. I'm not sure if I trust posix_fadvise()
to just do MADV_WILLNEED given the manpage; would it work (and give
roughly the same readahead that read() seems to be doing)?

After 12 hours or so of this massive I/O, the page cache seemingly
fragments really hard and I'm left using 99% CPU in xas_* functions
(on read()) until I do drop_caches and it clears up again.
I'm not sure if this is deduplication-related or not. :-)

/* Steinar */
-- 
Homepage: https://www.sesse.net/