Re: [PATCH 2/7] fuse: flush pending fuse events before aborting the connection

Joanne Koong <joannelkoong@xxxxxxxxx> · Mon, 21 Jul 2025 13:05:02 -0700

On Sat, Jul 19, 2025 at 12:18 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Sat, Jul 19, 2025 at 12:23 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> >
> > On Thu, Jul 17, 2025 at 4:26 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> > >
> > > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > >
> > > generic/488 fails with fuse2fs in the following fashion:
> > >
> > > Unfortunately, the 488.full file shows that there are a lot of hidden
> > > files left over in the filesystem, with incorrect link counts.  Tracing
> > > fuse_request_* shows that there are a large number of FUSE_RELEASE
> > > commands that are queued up on behalf of the unlinked files at the time
> > > that fuse_conn_destroy calls fuse_abort_conn.  Had the connection not
> > > aborted, the fuse server would have responded to the RELEASE commands by
> > > removing the hidden files; instead they stick around.
> >
> > Tbh it's still weird to me that FUSE_RELEASE is asynchronous instead
> > of synchronous. For example for fuse servers that cache their data and
> > only write the buffer out to some remote filesystem when the file gets
> > closed, it seems useful for them to (like nfs) be able to return an
> > error to the client for close() if there's a failure committing that
> > data; that also has clearer API semantics imo, eg users are guaranteed
> > that when close() returns, all the processing/cleanup for that file
> > has been completed.  Async FUSE_RELEASE also seems kind of racy, e.g. if
> > the server holds local locks that get released in FUSE_RELEASE and a
> > subsequent FUSE_OPEN that arrives before the FUSE_RELEASE depends on
> > grabbing that lock, then we end up deadlocked if the server is
> > single-threaded.
> >
>
> There is a very good reason for keeping FUSE_FLUSH and FUSE_RELEASE
> (as well as those vfs ops) separate.

Oh interesting, I didn't realize FUSE_FLUSH also gets sent on the
release path. I had assumed FUSE_FLUSH was for the sync()/fsync()
case. But I see now that you're right: close() calls filp_flush()
in the vfs layer (and I now see there's FUSE_FSYNC for the fsync()
case).
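
For anyone following along, here is a minimal userspace sketch of why
this matters to callers: since close() goes through the flush path, an
error from the filesystem's flush handler can surface as the return
value of close(), so it is worth checking. The helper name
write_and_close() is made up for illustration and is not from any
patch in this series.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `data` to `path`, then close the file and
 * report the first error seen. Returns 0 on success or a negative errno.
 */
int write_and_close(const char *path, const char *data)
{
	assert(path != NULL && data != NULL);

	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -errno;

	size_t len = strlen(data);
	size_t off = 0;
	int err = 0;

	/* Handle short writes and EINTR; stop at the first real error. */
	while (off < len) {
		ssize_t n = write(fd, data + off, len - off);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			err = -errno;
			break;
		}
		off += (size_t)n;
	}

	/*
	 * close() can fail as well, for example with an error returned by
	 * the filesystem's flush handler. The fd is gone either way, so
	 * close() is not retried here.
	 */
	if (close(fd) < 0 && err == 0)
		err = -errno;

	return err;
}
```

A caller would treat a nonzero return the same way it would treat a
failed write(), rather than assuming the data reached the server once
write() succeeded.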

>
> A filesystem can decide if it needs synchronous close() (not release).
> And with FOPEN_NOFLUSH, the filesystem can decide that per open file,
> (unless it conflicts with a config like writeback cache).
>
> I have a filesystem which can do very slow io and some clients
> can get stuck doing open;fstat;close if close is always synchronous.
> I actually found the libfuse feature of async flush (FUSE_RELEASE_FLUSH)
> quite useful for my filesystem, so I carry a kernel patch to support it.
>
> The issue of racing that you mentioned sounds odd.
> First of all, who runs a single threaded fuse server?
> Second, what does it matter if release is sync or async,
> FUSE_RELEASE will not be triggered by the same
> task calling FUSE_OPEN, so if there is a deadlock, it will happen
> with sync release as well.

If the server is single-threaded, I think the FUSE_RELEASE would have
to happen on the same task as FUSE_OPEN, so if the release is
synchronous, this would avoid the deadlock because that guarantees the
FUSE_RELEASE happens before the next FUSE_OPEN.
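
To make the ordering argument concrete, here is a toy model (not real
FUSE code) of a single-threaded server that takes one local lock in
OPEN and drops it in RELEASE, handling requests strictly in queue
order. The enum and function names are made up for illustration, and
the model assumes the OPEN handler blocks on the lock instead of
failing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy request types; these only mirror the opcode names in this thread. */
enum toy_req { TOY_OPEN, TOY_RELEASE };

/*
 * Hypothetical model: one lock, taken in OPEN and dropped in RELEASE,
 * with a single thread serving the queue in order. Returns the index of
 * the request that would block forever, or -1 if the queue drains.
 */
int toy_first_stuck_request(const enum toy_req *queue, size_t n)
{
	assert(queue != NULL || n == 0);

	bool lock_held = false;

	for (size_t i = 0; i < n; i++) {
		switch (queue[i]) {
		case TOY_OPEN:
			/*
			 * The only thread would block here, so any RELEASE
			 * queued behind this request is never processed.
			 */
			if (lock_held)
				return (int)i;
			lock_held = true;
			break;
		case TOY_RELEASE:
			lock_held = false;
			break;
		}
	}
	return -1;
}
```

With the queue OPEN, RELEASE, OPEN, RELEASE (what a synchronous release
would guarantee for one client) the function returns -1. With OPEN,
OPEN, RELEASE, RELEASE (the second open overtaking the async release)
it returns 1, i.e. the second OPEN is where the server gets stuck.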

However, now that you've pointed out that FUSE_FLUSH gets sent on the
release path, that addresses my worry about async FUSE_RELEASE returning
before the server has had a chance to write out its local buffer
cache.

Thanks,
Joanne
>
> Thanks,
> Amir.