Re: [PATCH 08/14] libfuse: connect high level fuse library to fuse_reply_attr_iflags

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jul 21, 2025 at 06:51:00PM +0000, Bernd Schubert wrote:
> On 7/18/25 17:55, Darrick J. Wong wrote:
> > On Fri, Jul 18, 2025 at 04:27:50PM +0200, Amir Goldstein wrote:
> >> On Fri, Jul 18, 2025 at 1:36 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> >>>
> >>> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> >>>
> >>> Create a new ->getattr_iflags function so that iomap filesystems can set
> >>> the appropriate in-kernel inode flags on instantiation.
> >>>
> >>> Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
> > 
> > <snip for brevity>
> > 
> >>> diff --git a/lib/fuse.c b/lib/fuse.c
> >>> index 8dbf88877dd37c..685d0181e569d0 100644
> >>> --- a/lib/fuse.c
> >>> +++ b/lib/fuse.c
> >>> @@ -3710,14 +3832,19 @@ static int readdir_fill_from_list(fuse_req_t req, struct fuse_dh *dh,
> >>>                         if (de->flags & FUSE_FILL_DIR_PLUS &&
> >>>                             !is_dot_or_dotdot(de->name)) {
> >>>                                 res = do_lookup(dh->fuse, dh->nodeid,
> >>> -                                               de->name, &e);
> >>> +                                               de->name, &e, &iflags);
> >>>                                 if (res) {
> >>>                                         dh->error = res;
> >>>                                         return 1;
> >>>                                 }
> >>>                         }
> >>>
> >>> -                       thislen = fuse_add_direntry_plus(req, p, rem,
> >>> +                       if (f->want_iflags)
> >>> +                               thislen = fuse_add_direntry_plus_iflags(req, p,
> >>> +                                                        rem, de->name, iflags,
> >>> +                                                        &e, pos);
> >>> +                       else
> >>> +                               thislen = fuse_add_direntry_plus(req, p, rem,
> >>>                                                          de->name, &e, pos);
> >>
> >>
> >> All those conditional statements look pretty moot.
> >> Can't we just force iflags to 0 if (!f->want_iflags)
> >> and always call the *_iflags functions?
> > 
> > Heh, it already is zero, so yes, this could be a straight call to
> > fuse_add_direntry_plus_iflags without the want_iflags check.  Will fix
> > up this and the other thing you mentioned in the previous patch.
> > 
> > Thanks for the code review!
> > 
> > Having said that, the significant difficulties with iomap and the
> > upper level fuse library still exist.  To summarize -- upper libfuse has
> > its own nodeids which don't necssarily correspond to the filesystem's,
> > and struct node/nodeid are duplicated for hardlinked files.  As a
> > result, the kernel has multiple struct inodes for an ondisk ext4 inode,
> > which completely breaks the locking for the iomap file IO model.
> > 
> > That forces me to port fuse2fs to the lowlevel library, so I might
> > remove the lib/fuse.c patches entirely.  Are there plans to make the
> > upper libfuse handle hardlinks better?
> 
> I don't have plans for high level improvements. To be honest, I didn't
> know about the hard link issue at all.

Assuming "I didn't know" means you're not familiar with what I'm
talking about, let me provide a brief overview:

So you know how fuse.c implements a directory entry cache in
fuse::name_table?  Every time someone uses the cache to walk a path and
misses a path, it'll alloc_node() a new struct node, hash it, and add it
to the name_table.

Allocating a node assigns a new nodeid, which is then passed into the
kernel and the kernel uses the nodeid to index the struct fuse_inode
objects.

Unfortunately, if the filesystem supports hardlinks, the name_table
creates two nodeids for the same ondisk inode.  IOWs, if the directory
tree is:

$ <mount fuse server>
$ mkdir /mnt/a /mnt/b
$ touch /mnt/a/foo
$ ln /mnt/a/foo /mnt/b/bar
$ umount /mnt
$ <mount fuse server>
$ ls /mnt/a/foo /mnt/b/bar

Then the fuse library will create one struct node for foo and another
one for bar.  They both refer to the same ondisk inode, but in memory
they have separate nodeids and hence separate struct fuse_inodes in the
kernel.

For a regular fuse server (no writeback caching, no iomap) this works
out because all the file IO requests get forwarded to the fuse server.
If the server is sane it'll coordinate access to its internal inode
structure to process the requests.  fuse is careful enough to revalidate
the cached file attributes very frequently, so out of date metadata is
barely noticeable.

For a fuse+iomap server, having separate fuse_inodes for the same ondisk
inode isn't going to work because iomap relies on i_rwsem in the kernel
struct fuse_inode to coordinate writes among all writer threads, no
matter what path they used to open the file.

> Also a bit surprising to see all your lowlevel work and then fuse high
> level coming ;)

Right now fuse2fs is a high level fuse server, so I hacked whatever I
needed into fuse.c to make it sort of work, awkwardly.  That stuff
doesn't need to live forever.

In the long run, the lowlevel server will probably have better
performance because fuse2fs++ can pass ext2 inode numbers to the kernel
as the nodeids, and libext2fs can look up inodes via nodeid.  No more
path construction overhead!

> Btw, I will go on vacation on Wednesday and still other things queued,
> going to try to review in the evenings (but not before next Saturday).

<nod> Enjoy your vacation!

--D

> 
> 
> Cheers,
> Bernd




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux