Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 18, 2025 at 08:17:03AM -0400, Jeff Layton wrote:
> On Wed, 2025-06-11 at 05:34 +0100, Matthew Wilcox wrote:
> > On Mon, Jun 09, 2025 at 08:59:53PM -0700, Christoph Hellwig wrote:
> > > On Mon, Jun 09, 2025 at 10:14:44AM -0700, Darrick J. Wong wrote:
> > > > > Where "folio laundering" means calling ->launder_folio, right?
> > > > 
> > > > What does fuse use folio laundering for, anyway?  It looks to me like
> > > > the primary users are invalidate_inode_pages*.  Either the caller cares
> > > > about flushing dirty data and has called filemap_write_and_wait_range;
> > > > or it doesn't and wants to tear down the pagecache ahead of some other
> > > > operation that's going to change the file contents and doesn't care.
> > > > 
> > > > I suppose it could be useful as a last-chance operation on a dirty folio
> > > > that was dirtied after a filemap_write_and_wait_range but before
> > > > invalidate_inode_pages*?  Though for xfs we just return EBUSY and let
> > > > the caller try again (or not).  Is there a subtlety to fuse here that I
> > > > don't know about?
> > > 
> > > My memory might be betraying me, but I think willy once launched an
> > > attempt to see if we can kill launder_folio.  Adding him, and the
> > > mm and nfs lists to check if I have a point :)
> > 
> > I ... got distracted with everything else.
> > 
> > Looking at the original addition of ->launder_page (e3db7691e9f3), I
> > don't understand why we need it.  invalidate_inode_pages2() isn't
> > supposed to invalidate dirty pages, so I don't understand why nfs
> > found it necessary to do writeback from ->releasepage() instead
> > of just returning false like iomap does.
> > 
> > There's now a new question of what the hell btrfs is up to with
> > ->launder_folio, which they just added recently.
> 
> IIRC...
> 
> The problem was a race where a task could could dirty a page in a
> mmap'ed file after it had been written back but before it was unmapped
> from the pagecache.
> 
> Bear in mind that the NFS client may need write back and then
> invalidate the pagecache for a file that is still in use if it
> discovers that the inode's attributes have changed on the server.
> 
> Trond's solution was to write the page out while holding the page lock
> in this situation. I think we'd all welcome a way to avoid this race
> that didn't require launder_folio().

I think the problem is that we've left it up to the filesystem to handle
"what do we do if we've dirtied a page^W folio between, say, calling
filemap_write_and_wait_range() and calling filemap_release_folio()".
Just to make sure we're all on the same page here, this is the sample
path I'm looking at:

__iomap_dio_rw
  kiocb_invalidate_pages
    filemap_invalidate_pages
      filemap_write_and_wait_range
      invalidate_inode_pages2_range
        unmap_mapping_pages
	folio_lock
	folio_wait_writeback
	folio_unmap_invalidate
	  unmap_mapping_folio
	folio_launder
	filemap_release_folio
	if (folio_test_dirty(folio))
	  return -EBUSY;

So some filesystems opt to write back the folio which has been dirtied
(by implementing ->launder_folio) and others opt to fail (and fall back to
buffered I/O when the user has requested direct I/O).  iomap filesystems
all just "return false" for dirty folios, so it's clearly an acceptable
outcome as far as xfstests go.

The question is whether this is acceptable for all the filesystem
which implement ->launder_folio today.  Because we could just move the
folio_test_dirty() to after the folio_lock() and remove all the testing
of folio dirtiness from individual filesystems.

Or have I missed something and picked the wrong sample path for
analysing why we do/don't need to writeback folios in
invalidate_inode_pages2_range()?




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux