Re: [PATCH 06/13] fuse: implement buffered IO with iomap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 18, 2025 at 09:45:17PM +0200, Amir Goldstein wrote:
> On Fri, Jul 18, 2025 at 8:01 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> >
> > On Fri, Jul 18, 2025 at 05:10:14PM +0200, Amir Goldstein wrote:
> > > On Fri, Jul 18, 2025 at 1:32 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> > > >
> > > > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > >
> > > > Implement pagecache IO with iomap, complete with hooks into truncate and
> > > > fallocate so that the fuse server needn't implement disk block zeroing
> > > > of post-EOF and unaligned punch/zero regions.
> > > >
> > > > Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
> > > > ---
> > > >  fs/fuse/fuse_i.h          |   46 +++
> > > >  fs/fuse/fuse_trace.h      |  391 ++++++++++++++++++++++++
> > > >  include/uapi/linux/fuse.h |    5
> > > >  fs/fuse/dir.c             |   23 +
> > > >  fs/fuse/file.c            |   90 +++++-
> > > >  fs/fuse/file_iomap.c      |  723 +++++++++++++++++++++++++++++++++++++++++++++
> > > >  fs/fuse/inode.c           |   14 +
> > > >  7 files changed, 1268 insertions(+), 24 deletions(-)
> > > >
> > > >
> > > > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > > > index 67e428da4391aa..f33b348d296d5e 100644
> > > > --- a/fs/fuse/fuse_i.h
> > > > +++ b/fs/fuse/fuse_i.h
> > > > @@ -161,6 +161,13 @@ struct fuse_inode {
> > > >
> > > >                         /* waitq for direct-io completion */
> > > >                         wait_queue_head_t direct_io_waitq;
> > > > +
> > > > +#ifdef CONFIG_FUSE_IOMAP
> > > > +                       /* pending io completions */
> > > > +                       spinlock_t ioend_lock;
> > > > +                       struct work_struct ioend_work;
> > > > +                       struct list_head ioend_list;
> > > > +#endif
> > > >                 };
> > >
> > > This union member you are changing is declared for
> > > /* read/write io cache (regular file only) */
> > > but actually it is also for parallel dio and passthrough mode
> > >
> > > IIUC, there should be zero intersection between these io modes and
> > >  /* iomap cached fileio (regular file only) */
> > >
> > > Right?
> >
> > Right.  iomap will get very very confused if you switch file IO paths on
> > a live file.  I think it's /possible/ to switch if you flush and
> > truncate the whole page cache while holding inode_lock() but I don't
> > think anyone has ever tried.
> >
> > > So it can use its own union member without increasing fuse_inode size.
> > >
> > > Just need to be carefull in fuse_init_file_inode(), fuse_evict_inode() and
> > > fuse_file_io_release() which do not assume a specific inode io mode.
> >
> > Yes, I think it's possible to put the iomap stuff in a separate struct
> > within that union so that we're not increasing the fuse_inode size
> > unnecessarily.  That's desirable for something to do before merging,
> > but for now prototyping is /much/ easier if I don't have to do that.
> >
> 
> understood. you can deal with that later. I just wanted to leave a TODO note.

<nod> I'll leave an XXX comment then.

> > Making that change will require a lot of careful auditing, first I want
> > to make sure you all agree with the iomap approach because it's much
> > different from what I see in the other fuse IO paths. :)
> >
> 
> Indeed a good audit will be required, but
> *if* you can guarantee to configure iomap alway at inode initiation
> then in fuse_init_file_inode() it is clear, which member of the union
> is being initialized and this mode has to stick with the inode until
> evict anyway.
> 
> So basically, all you need to do is never allow configuring iomap on an
> already initialized inode.

Right.  iomap has to be initialized at INEW time and cannot be changed.

> > Eeeyiks, struct fuse_inode shrinks from 1272 bytes to 1152 if I push the
> > iomap stuff into its own union struct.
> >
> > > Was it your intention to allow filesystems to configure some inodes to be
> > > in file_iomap mode and other inodes to be in regular cached/direct/passthrough
> > > io modes?
> >
> > That was a deliberate design decision on my part -- maybe a fuse server
> > would be capable of serving up some files from a local disk, and others
> > from (say) a network filesystem.  Or maybe it would like to expose an
> > administrative fd for the filesystem (like the xfs_healer event stream)
> > that isn't backed by storage.
> >
> 
> Understood.
> 
> But the filesystem should be able to make the decision on inode
> initiation time (lookup)
> and once made, this decision sticks throughout the inode lifetime. Right?

Correct.

> > > I can't say that I see a big benefit in allowing such setups.
> > > It certainly adds a lot of complication to the test matrix if we allow that.
> > > My instinct is for initial version, either allow only opening files in
> > > FILE_IOMAP or
> > > DIRECT_IOMAP to inodes for a filesystem that supports those modes.
> >
> > I was thinking about combining FUSE_ATTR_IOMAP_(DIRECTIO|FILEIO) for the
> > next RFC because I can't imagine any scenario where you don't want
> > directio support if you already use iomap for the pagecache.  fuse iomap
> > requires directio write support for writeback, so the server *must*
> > support IOMAP_WRITE|IOMAP_DIRECT.
> >
> > > Perhaps later we can allow (and maybe fallback to) FOPEN_DIRECT_IO
> > > (without parallel dio) if a server does not configure IOMAP to some inode
> > > to allow a server to provide the data for a specific inode directly.
> >
> > Hrmm.  Is FOPEN_DIRECT_IO the magic flag that fuse passes to the fuse
> > server to tell it that a file is open in directio mode?  There's a few
> > fstests that initiate aio+dio writes to a dm-error device that currently
> > fail in non-iomap mode because fuse2fs writes everything to the bdev
> > pagecache.
> >
> 
> Not exactly, but nevermind, you can use a much simpler logic for what
> you described:
> iomap has to be configured on inode instantiation and never changed afterwards.
> Other inodes are not going to be affected by iomap at all from that point on.

<nod>

> > > fuse_file_io_open/release() can help you manage those restrictions and
> > > set ff->iomode = IOM_FILE_IOMAP when a file is opened for file iomap.
> > > I did not look closely enough to see if file_iomap code ends up setting
> > > ff->iomode = IOM_CACHED/UNCACHED or always remains IOM_NONE.
> >
> > I don't touch ff->iomode because iomap is a per-inode property, not a
> > per-file property... but I suppose that would be a good place to look.
> >
> 
> Right, with cached/direct/passthrough the inode may change the iomode
> after all files are closed, but we *do* keep the mode in the inode,
> so we know that files cannot be opened in conflicting modes on the same inode.
> 
> The purpose of ff->iomode is to know if the file contributes to cached mode
> positive iocachectr or to a negative passthrough mode refcount.
> 
> So setting ff->iomode = IOM_IOMAP just helps for annotating how the
> file was opened, in case we are tracing it. There is no functional need to
> define and set this mode on the file when the mode of the inode is const.

Ah ok.  I'll go add that for the next rfc, thanks!

--D

> Thanks,
> Amir.
> 




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux