Re: [PATCH v3 6/7] iomap: remove old partial eof zeroing optimization

On Mon, Jul 14, 2025 at 10:34:17PM -0700, Darrick J. Wong wrote:
> On Mon, Jul 14, 2025 at 04:41:21PM -0400, Brian Foster wrote:
> > iomap_zero_range() optimizes the partial eof block zeroing use case
> > by force zeroing if the mapping is dirty. This is to avoid frequent
> > flushing on file extending workloads, which hurts performance.
> > 
> > Now that the folio batch mechanism provides a more generic solution
> > and is used by the only real zero range user (XFS), this isolated
> > optimization is no longer needed. Remove the unnecessary code and
> > let callers use the folio batch or fall back to flushing by default.
> > 
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> 
> Heh, I was staring at this last Friday chasing fuse+iomap bugs in
> fallocate zerorange and straining to remember what this does.
> Is this chunk still needed if the ->iomap_begin implementation doesn't
> (or forgets to) grab the folio batch for iomap?
> 

No, the hunk removed by this patch is just an optimization. The fallback
code here flushes the range if it's dirty and retries the lookup (i.e.
picking up unwritten conversions that were pending via dirty pagecache).
That flush logic caused a performance regression in a particular
workload, so this was introduced to mitigate that regression by just
doing the zeroing for the first block or so if the folio is dirty. [1]

The reason for removing it is mostly maintainability. XFS is
really the only user here and it is changing over to the more generic
batch mechanism, which effectively provides the same optimization, so
this basically becomes dead/duplicate code. If an fs doesn't use the
batch mechanism it will just fall back to the flush and retry approach,
which can be slower but is functionally correct.
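
To illustrate the flush and retry fallback, here is a rough userspace
model. It is only a sketch, not the kernel code: the toy_* names, the
per-block state and the flush counter are all made up for illustration,
and writeback is reduced to "a dirty unwritten block becomes mapped".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block states, loosely analogous to iomap mapping types. */
enum toy_type { TOY_HOLE, TOY_UNWRITTEN, TOY_MAPPED };

struct toy_block {
	enum toy_type type;	/* on-disk mapping state */
	bool dirty;		/* dirty pagecache covers this block */
	bool zeroed;		/* zero range explicitly zeroed this block */
};

struct toy_file {
	struct toy_block *blocks;
	size_t nr_blocks;
	unsigned int flushes;	/* how many times the range was flushed */
};

/* Is any block in [start, end) dirty in the modelled pagecache? */
bool toy_range_dirty(const struct toy_file *f, size_t start, size_t end)
{
	for (size_t i = start; i < end; i++)
		if (f->blocks[i].dirty)
			return true;
	return false;
}

/* Model writeback: dirty data over an unwritten block converts it. */
void toy_flush(struct toy_file *f, size_t start, size_t end)
{
	for (size_t i = start; i < end; i++) {
		if (f->blocks[i].dirty && f->blocks[i].type == TOY_UNWRITTEN)
			f->blocks[i].type = TOY_MAPPED;
		f->blocks[i].dirty = false;
	}
	f->flushes++;
}

/*
 * Zero [start, end). Holes and unwritten blocks already read as zeroes
 * and can be skipped, but only if no dirty pagecache could still convert
 * them. In that case flush once and retry the lookup of the same block.
 */
void toy_zero_range(struct toy_file *f, size_t start, size_t end)
{
	assert(start <= end && end <= f->nr_blocks);

	bool range_dirty = toy_range_dirty(f, start, end);
	size_t i = start;

	while (i < end) {
		struct toy_block *b = &f->blocks[i];

		if (b->type != TOY_MAPPED) {
			if (range_dirty) {
				toy_flush(f, i, end);
				range_dirty = false;
				continue;	/* retry with updated mapping */
			}
			i++;			/* clean and zero on disk: skip */
			continue;
		}
		b->zeroed = true;
		i++;
	}
}
```

In this model a fully clean range never flushes, while a range with a
dirty unwritten block flushes exactly once and then zeroes the block
that converted. The removed hunk and the folio batch both exist to avoid
that flush in the common partial eof case.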

> My bug turned out to be a bug in my fuse+iomap design -- with the way
> iomap_zero_range does things, you have to flush+unmap, punch the range
> and zero the range.  If you punch and realloc the range and *then* try
> to zero the range, the new unwritten extents cause iomap to miss dirty
> pages that fuse should've unmapped.  Ooops.
> 

I don't quite follow. How do you mean it misses dirty pages?

Brian

[1] Details described in the commit log of fde4c4c3ec1c ("iomap: elide
flush from partial eof zero range").

> --D
> 
> > ---
> >  fs/iomap/buffered-io.c | 24 ------------------------
> >  1 file changed, 24 deletions(-)
> > 
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 194e3cc0857f..d2bbed692c06 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1484,33 +1484,9 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> >  		.private	= private,
> >  	};
> >  	struct address_space *mapping = inode->i_mapping;
> > -	unsigned int blocksize = i_blocksize(inode);
> > -	unsigned int off = pos & (blocksize - 1);
> > -	loff_t plen = min_t(loff_t, len, blocksize - off);
> >  	int ret;
> >  	bool range_dirty;
> >  
> > -	/*
> > -	 * Zero range can skip mappings that are zero on disk so long as
> > -	 * pagecache is clean. If pagecache was dirty prior to zero range, the
> > -	 * mapping converts on writeback completion and so must be zeroed.
> > -	 *
> > -	 * The simplest way to deal with this across a range is to flush
> > -	 * pagecache and process the updated mappings. To avoid excessive
> > -	 * flushing on partial eof zeroing, special case it to zero the
> > -	 * unaligned start portion if already dirty in pagecache.
> > -	 */
> > -	if (!iter.fbatch && off &&
> > -	    filemap_range_needs_writeback(mapping, pos, pos + plen - 1)) {
> > -		iter.len = plen;
> > -		while ((ret = iomap_iter(&iter, ops)) > 0)
> > -			iter.status = iomap_zero_iter(&iter, did_zero);
> > -
> > -		iter.len = len - (iter.pos - pos);
> > -		if (ret || !iter.len)
> > -			return ret;
> > -	}
> > -
> >  	/*
> >  	 * To avoid an unconditional flush, check pagecache state and only flush
> >  	 * if dirty and the fs returns a mapping that might convert on
> > -- 
> > 2.50.0
> > 
> > 
> 




