On Thu, Jul 3, 2025 at 12:25 PM Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> wrote: > > FUSE relies on invalidate_inode_pages2() / invalidate_inode_pages2_range() > doing writeback by calling fuse_launder_folio(). While this works, it > is inefficient as each page is written back and waited for individually. > Far better to call filemap_invalidate_pages() which will do a bulk write > first, then remove the page cache. This logic makes sense to me but what about the case where writeback errors out? With invalidate_inode_pages2() / invalidate_inode_pages2_range(), the pages still get unmapped, but with filemap_invalidate_pages(), this will return an error without unmapping. AFAICT, I think we would still want the unmapping to be done if there's an error in writeback. I'm a bit confused about the difference between "each page is written back and waited for individually" vs bulk write. AFAICT, with the bulk write, which calls ->writepages(), in fuse this will still write each page/folio individually through write_cache_pages() and AFAICT, each page/folio is also waited for individually in __filemap_fdatawait_range(). With doing the bulk write, I wonder if this sometimes could be more inefficient than doing the writeback only after it's been unmapped, eg if there's writes after the filemap_invalidate_pages() writeback, we'd end up doing 2 writebacks. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> > --- > fs/fuse/dax.c | 16 ++++------------ > fs/fuse/dir.c | 12 +++++++----- > fs/fuse/file.c | 16 +++++----------- > fs/fuse/inode.c | 17 +++++------------ > 4 files changed, 21 insertions(+), 40 deletions(-) > > diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c > index 2d817d7cab26..0151343d8393 100644 > --- a/fs/fuse/dir.c > +++ b/fs/fuse/dir.c > @@ -718,7 +718,8 @@ static int fuse_create_open(struct mnt_idmap *idmap, struct inode *dir, > if (fm->fc->atomic_o_trunc && trunc) > truncate_pagecache(inode, 0); > else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) > - invalidate_inode_pages2(inode->i_mapping); > + filemap_invalidate_pages(inode->i_mapping, 0, > + OFFSET_MAX, false); nit (here and below): in fuse, the line break spacing for args is aligned to be right underneath the first arg > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > index 2b04a142b493..eaa659c08132 100644 > --- a/fs/fuse/file.c > +++ b/fs/fuse/file.c > @@ -1566,7 +1567,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > return -ENOMEM; > > if (fopen_direct_io && fc->direct_io_allow_mmap) { > - res = filemap_write_and_wait_range(mapping, pos, pos + count - 1); > + res = filemap_invalidate_pages(mapping, pos, (pos + count - 1), nit: don't need the parentheses around pos + count - 1 > + false); > if (res) { > fuse_io_free(ia); > return res; > @@ -1580,14 +1582,6 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > inode_unlock(inode); > } > > - if (fopen_direct_io && write) { > - res = invalidate_inode_pages2_range(mapping, idx_from, idx_to); > - if (res) { > - fuse_io_free(ia); > - return res; > - } > - } I'm having trouble understanding why we can get rid of this logic here. Is this because we now do filemap_invalidate_pages() for the "if (fopen_direct_io && fc->direct_io_allow_mmap)" part above? AFAIS if fc->direct_io_allow_mmap is false and we're handling a write, we still need to call invalidate_inode_pages2_range()/filemap_invalidate_pages() here, no? > - > io->should_dirty = !write && user_backed_iter(iter); > while (count) { > ssize_t nres; > @@ -2358,7 +2352,7 @@ static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma) > if ((vma->vm_flags & VM_MAYSHARE) && !fc->direct_io_allow_mmap) > return -ENODEV; > > - invalidate_inode_pages2(file->f_mapping); > + filemap_invalidate_pages(file->f_mapping, 0, OFFSET_MAX, false); > > if (!(vma->vm_flags & VM_MAYSHARE)) { > /* MAP_PRIVATE */ > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c > index ecb869e895ab..905b192fa12e 100644 > --- a/fs/fuse/inode.c > +++ b/fs/fuse/inode.c > @@ -559,8 +560,6 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid, > { > struct fuse_inode *fi; > struct inode *inode; > - pgoff_t pg_start; > - pgoff_t pg_end; > > inode = fuse_ilookup(fc, nodeid, NULL); > if (!inode) > @@ -573,15 +572,9 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid, > > fuse_invalidate_attr(inode); > forget_all_cached_acls(inode); > - if (offset >= 0) { > - pg_start = offset >> PAGE_SHIFT; > - if (len <= 0) > - pg_end = -1; Should we preserve this len <= 0 logic? The len value comes from the server. If they pass in -1, that means they want the entire file invalidated. Thanks, Joanne > - else > - pg_end = (offset + len - 1) >> PAGE_SHIFT; > - invalidate_inode_pages2_range(inode->i_mapping, > - pg_start, pg_end); > - } > + if (offset >= 0) > + filemap_invalidate_pages(inode->i_mapping, offset, > + offset + len - 1, false); > iput(inode); > return 0; > } > -- > 2.47.2 >