Re: [PATCH 06/13] fuse: implement buffered IO with iomap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 18, 2025 at 05:10:14PM +0200, Amir Goldstein wrote:
> On Fri, Jul 18, 2025 at 1:32 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> >
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> >
> > Implement pagecache IO with iomap, complete with hooks into truncate and
> > fallocate so that the fuse server needn't implement disk block zeroing
> > of post-EOF and unaligned punch/zero regions.
> >
> > Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
> > ---
> >  fs/fuse/fuse_i.h          |   46 +++
> >  fs/fuse/fuse_trace.h      |  391 ++++++++++++++++++++++++
> >  include/uapi/linux/fuse.h |    5
> >  fs/fuse/dir.c             |   23 +
> >  fs/fuse/file.c            |   90 +++++-
> >  fs/fuse/file_iomap.c      |  723 +++++++++++++++++++++++++++++++++++++++++++++
> >  fs/fuse/inode.c           |   14 +
> >  7 files changed, 1268 insertions(+), 24 deletions(-)
> >
> >
> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > index 67e428da4391aa..f33b348d296d5e 100644
> > --- a/fs/fuse/fuse_i.h
> > +++ b/fs/fuse/fuse_i.h
> > @@ -161,6 +161,13 @@ struct fuse_inode {
> >
> >                         /* waitq for direct-io completion */
> >                         wait_queue_head_t direct_io_waitq;
> > +
> > +#ifdef CONFIG_FUSE_IOMAP
> > +                       /* pending io completions */
> > +                       spinlock_t ioend_lock;
> > +                       struct work_struct ioend_work;
> > +                       struct list_head ioend_list;
> > +#endif
> >                 };
> 
> This union member you are changing is declared for
> /* read/write io cache (regular file only) */
> but actually it is also for parallel dio and passthrough mode
> 
> IIUC, there should be zero intersection between these io modes and
>  /* iomap cached fileio (regular file only) */
> 
> Right?

Right.  iomap will get very very confused if you switch file IO paths on
a live file.  I think it's /possible/ to switch if you flush and
truncate the whole page cache while holding inode_lock() but I don't
think anyone has ever tried.

> So it can use its own union member without increasing fuse_inode size.
> 
> Just need to be carefull in fuse_init_file_inode(), fuse_evict_inode() and
> fuse_file_io_release() which do not assume a specific inode io mode.

Yes, I think it's possible to put the iomap stuff in a separate struct
within that union so that we're not increasing the fuse_inode size
unnecessarily.  That's desirable for something to do before merging,
but for now prototyping is /much/ easier if I don't have to do that.

Making that change will require a lot of careful auditing, first I want
to make sure you all agree with the iomap approach because it's much
different from what I see in the other fuse IO paths. :)

Eeeyiks, struct fuse_inode shrinks from 1272 bytes to 1152 if I push the
iomap stuff into its own union struct.

> Was it your intention to allow filesystems to configure some inodes to be
> in file_iomap mode and other inodes to be in regular cached/direct/passthrough
> io modes?

That was a deliberate design decision on my part -- maybe a fuse server
would be capable of serving up some files from a local disk, and others
from (say) a network filesystem.  Or maybe it would like to expose an
administrative fd for the filesystem (like the xfs_healer event stream)
that isn't backed by storage.

> I can't say that I see a big benefit in allowing such setups.
> It certainly adds a lot of complication to the test matrix if we allow that.
> My instinct is for initial version, either allow only opening files in
> FILE_IOMAP or
> DIRECT_IOMAP to inodes for a filesystem that supports those modes.

I was thinking about combining FUSE_ATTR_IOMAP_(DIRECTIO|FILEIO) for the
next RFC because I can't imagine any scenario where you don't want
directio support if you already use iomap for the pagecache.  fuse iomap
requires directio write support for writeback, so the server *must*
support IOMAP_WRITE|IOMAP_DIRECT.

> Perhaps later we can allow (and maybe fallback to) FOPEN_DIRECT_IO
> (without parallel dio) if a server does not configure IOMAP to some inode
> to allow a server to provide the data for a specific inode directly.

Hrmm.  Is FOPEN_DIRECT_IO the magic flag that fuse passes to the fuse
server to tell it that a file is open in directio mode?  There's a few
fstests that initiate aio+dio writes to a dm-error device that currently
fail in non-iomap mode because fuse2fs writes everything to the bdev
pagecache.

> fuse_file_io_open/release() can help you manage those restrictions and
> set ff->iomode = IOM_FILE_IOMAP when a file is opened for file iomap.
> I did not look closely enough to see if file_iomap code ends up setting
> ff->iomode = IOM_CACHED/UNCACHED or always remains IOM_NONE.

I don't touch ff->iomode because iomap is a per-inode property, not a
per-file property... but I suppose that would be a good place to look.

Thank you for the feedback!

--D




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux