Re: [PATCH 2/7] fuse: flush pending fuse events before aborting the connection

Amir Goldstein <amir73il@xxxxxxxxx> · Sat, 19 Jul 2025 09:18:38 +0200

On Sat, Jul 19, 2025 at 12:23 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> On Thu, Jul 17, 2025 at 4:26 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> >
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> >
> > generic/488 fails with fuse2fs in the following fashion:
> >
> > generic/488       _check_generic_filesystem: filesystem on /dev/sdf is inconsistent
> > (see /var/tmp/fstests/generic/488.full for details)
> >
> > This test opens a large number of files, unlinks them (which really just
> > renames them to fuse hidden files), closes the program, unmounts the
> > filesystem, and runs fsck to check that there aren't any inconsistencies
> > in the filesystem.
> >
> > Unfortunately, the 488.full file shows that there are a lot of hidden
> > files left over in the filesystem, with incorrect link counts.  Tracing
> > fuse_request_* shows that there are a large number of FUSE_RELEASE
> > commands that are queued up on behalf of the unlinked files at the time
> > that fuse_conn_destroy calls fuse_abort_conn.  Had the connection not
> > aborted, the fuse server would have responded to the RELEASE commands by
> > removing the hidden files; instead they stick around.
>
> Tbh it's still weird to me that FUSE_RELEASE is asynchronous instead
> of synchronous. For example for fuse servers that cache their data and
> only write the buffer out to some remote filesystem when the file gets
> closed, it seems useful for them to (like nfs) be able to return an
> error to the client for close() if there's a failure committing that
> data; that also has clearer API semantics imo, eg users are guaranteed
> that when close() returns, all the processing/cleanup for that file
> has been completed.  Async FUSE_RELEASE also seems kind of racy, eg if
> the server holds local locks that get released in FUSE_RELEASE, if a
> subsequent FUSE_OPEN happens before FUSE_RELEASE then depends on
> grabbing that lock, then we end up deadlocked if the server is
> single-threaded.
>

There is a very good reason for keeping FUSE_FLUSH and FUSE_RELEASE
(as well as the corresponding vfs ops, ->flush() and ->release())
separate.

A filesystem can decide whether it needs a synchronous close(), and
that is done via flush, not release. With FOPEN_NOFLUSH, the filesystem
can make that decision per open file (unless it conflicts with a config
like writeback cache).
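
To make the split concrete, here is a minimal userspace model of that
decision. It is a sketch, not the kernel code: every sketch_* name and
the flag value are stand-ins I made up for illustration, and it assumes
a single fd per open file so that close() is also the last reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the FOPEN_NOFLUSH open flag (illustrative value). */
#define SKETCH_FOPEN_NOFLUSH (1u << 5)

/* Assumed per-connection config: only the writeback cache setting matters here. */
struct sketch_conn {
	bool writeback_cache;
};

/* Assumed per-open-file state: flags chosen by the server at open time. */
struct sketch_file {
	unsigned int open_flags;
	int flush_reply;	/* what the server would answer to FUSE_FLUSH */
};

/* True when close() has to send FUSE_FLUSH and wait for the reply. */
static bool sketch_needs_sync_flush(const struct sketch_conn *conn,
				    const struct sketch_file *file)
{
	assert(conn != NULL && file != NULL);

	/* NOFLUSH is honoured only when it does not conflict with writeback cache. */
	if ((file->open_flags & SKETCH_FOPEN_NOFLUSH) && !conn->writeback_cache)
		return false;
	return true;
}

/*
 * Model of close(): only the flush step can report an error to the caller.
 * The release is merely queued; nobody waits for its reply.
 */
static int sketch_close(const struct sketch_conn *conn,
			const struct sketch_file *file,
			int *releases_queued)
{
	int err = 0;

	if (sketch_needs_sync_flush(conn, file))
		err = file->flush_reply;

	(*releases_queued)++;
	return err;
}
```

With writeback cache off, a file opened with the NOFLUSH stand-in closes
without waiting on the server; without the flag, close() returns whatever
the flush returned. In both cases the release is queued and not waited for.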

I have a filesystem which can do very slow I/O, and some clients
can get stuck doing open;fstat;close if close is always synchronous.
I actually found the libfuse feature of async flush (FUSE_RELEASE_FLUSH)
quite useful for my filesystem, so I carry a kernel patch to support it.
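
For reference, the client pattern I mean looks roughly like this. It is
a sketch using plain POSIX calls; probe_size() is a name I made up for
illustration and is not taken from any real client.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * open;fstat;close on one path. Returns 0 and fills *size on success,
 * or a negative errno value from whichever step failed first.
 */
static int probe_size(const char *path, off_t *size)
{
	struct stat st;
	int fd, err = 0;

	assert(path != NULL && size != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	if (fstat(fd, &st) < 0)
		err = -errno;
	else
		*size = st.st_size;

	/*
	 * On a fuse mount this is where the caller waits if a synchronous
	 * flush is sent, even though nothing was written through this fd.
	 */
	if (close(fd) < 0 && err == 0)
		err = -errno;

	return err;
}
```

The client only wanted the stat data, yet with an always-synchronous
close it pays for a full round trip to a slow server on every call.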

The racing issue that you mentioned sounds odd.
First of all, who runs a single-threaded fuse server?
Second, what does it matter whether release is sync or async?
FUSE_RELEASE will not be triggered by the same task that is calling
FUSE_OPEN, so if there is a deadlock, it will happen with sync
release as well.

Thanks,
Amir.