Re: Slow deduplication

On Mon, Mar 03, 2025 at 08:35:57AM +1100, Dave Chinner wrote:
> This does comparison one folio at a time and does no readahead.
> Hence if the data isn't already in cache, it is doing synchronous
> small reads and waiting for every single one of them. This really
> should use an internal interface that is capable of issuing
> readahead...

Yes, I noticed that if I do dummy read() of each extent first,
it becomes _massively_ faster. I'm not sure if I trust posix_fadvise()
to just do POSIX_FADV_WILLNEED given the manpage; would it work (and give
roughly the same readahead that read() seems to be doing)?
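
For reference, the two approaches being compared could look roughly like
the sketch below. It is only an illustration, not the code of the actual
tool: the helper names prefetch_range() and dummy_read_range() and the
1 MiB chunk size are assumptions. os.posix_fadvise() and os.pread() are
the standard Python wrappers around the corresponding syscalls. The
fadvise call is only a hint, and whether it triggers as much readahead as
the dummy read() does is exactly the open question.

```python
import os


def prefetch_range(fd, offset, length):
    """Ask the kernel to start reading [offset, offset + length) into the
    page cache. This is advisory: it returns without waiting for the I/O,
    and the kernel may read less than requested."""
    os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)


def dummy_read_range(fd, offset, length, chunk=1 << 20):
    """Synchronously read the range and throw the data away, so that it
    ends up in the page cache. Returns the number of bytes actually read,
    which is less than length if the range extends past end of file."""
    total = 0
    while total < length:
        buf = os.pread(fd, min(chunk, length - total), offset + total)
        if not buf:
            break  # hit end of file
        total += len(buf)
    return total
```

Either helper would be called once per extent before handing that extent
to the dedupe ioctl; timing both variants on a cold cache would show
whether the hint alone is enough.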

After 12 hours or so of this massive I/O, the page cache seemingly fragments
really hard, and I'm left spending 99% of the time in xas_* functions (on
read()) until I do drop_caches, at which point it clears up again. I'm not
sure whether this is deduplication-related or not. :-)

/* Steinar */
-- 
Homepage: https://www.sesse.net/



