Re: [PATCH 2/7] fuse: flush pending fuse events before aborting the connection

On Wed, Jul 23, 2025 at 01:27:44PM -0700, Joanne Koong wrote:
> On Wed, Jul 23, 2025 at 10:06 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> >
> > On Mon, Jul 21, 2025 at 01:05:02PM -0700, Joanne Koong wrote:
> > > On Sat, Jul 19, 2025 at 12:18 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > > >
> > > > On Sat, Jul 19, 2025 at 12:23 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Jul 17, 2025 at 4:26 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > > > >
> > > > > > generic/488 fails with fuse2fs in the following fashion:
> > > > > >
> > > > > > Unfortunately, the 488.full file shows that there are a lot of hidden
> > > > > > files left over in the filesystem, with incorrect link counts.  Tracing
> > > > > > fuse_request_* shows that there are a large number of FUSE_RELEASE
> > > > > > commands that are queued up on behalf of the unlinked files at the time
> > > > > > that fuse_conn_destroy calls fuse_abort_conn.  Had the connection not
> > > > > > aborted, the fuse server would have responded to the RELEASE commands by
> > > > > > removing the hidden files; instead they stick around.
> > > > >
> > > > > Tbh it's still weird to me that FUSE_RELEASE is asynchronous instead
> > > > > of synchronous. For example, for fuse servers that cache their data and
> > > > > only write the buffer out to some remote filesystem when the file gets
> > > > > closed, it seems useful for them to be able (like nfs) to return an
> > > > > error to the client from close() if there's a failure committing that
> > > > > data; that also has clearer API semantics imo, eg users are guaranteed
> > > > > that when close() returns, all the processing/cleanup for that file
> > > > > has been completed.  Async FUSE_RELEASE also seems kind of racy, eg if
> > > > > the server holds local locks that get released in FUSE_RELEASE, and a
> > > > > subsequent FUSE_OPEN arrives before the FUSE_RELEASE and depends on
> > > > > grabbing one of those locks, then we end up deadlocked if the server
> > > > > is single-threaded.
> > > > >
> > > >
> > > > There is a very good reason for keeping FUSE_FLUSH and FUSE_RELEASE
> > > > (as well as those vfs ops) separate.
> > >
> > > Oh interesting, I didn't realize FUSE_FLUSH also gets sent on the
> > > release path. I had assumed FUSE_FLUSH was for the sync()/fsync()
> >
> > (That's FUSE_FSYNC)
> >
> > > case. But I see now that you're right, close() makes a call to
> > > filp_flush() in the vfs layer. (and I now see there's FUSE_FSYNC for
> > > the fsync() case)
> >
> > Yeah, flush-on-close (FUSE_FLUSH) is generally a good idea for
> > "unreliable" filesystems -- either because they're remote, or because
> > the local storage they're on could get yanked at any time.  It's slow,
> > but it papers over a lot of bugs and "bad" usage.
> >
> > > > A filesystem can decide if it needs synchronous close() (not release).
> > > > And with FOPEN_NOFLUSH, the filesystem can decide that per open file
> > > > (unless it conflicts with a config like writeback cache).
> > > >
> > > > I have a filesystem which can do very slow io and some clients
> > > > can get stuck doing open;fstat;close if close is always synchronous.
> > > > I actually found the libfuse feature of async flush (FUSE_RELEASE_FLUSH)
> > > > quite useful for my filesystem, so I carry a kernel patch to support it.
> > > >
> > > > The issue of racing that you mentioned sounds odd.
> > > > First of all, who runs a single-threaded fuse server?
> > > > Second, what does it matter if release is sync or async?
> > > > FUSE_RELEASE will not be triggered by the same
> > > > task calling FUSE_OPEN, so if there is a deadlock, it will happen
> > > > with sync release as well.
> > >
> > > If the server is single-threaded, I think the FUSE_RELEASE would have
> > > to happen on the same task as FUSE_OPEN, so if the release is
> > > synchronous, this would avoid the deadlock because that guarantees the
> > > FUSE_RELEASE happens before the next FUSE_OPEN.
> >
> > On a single-threaded server(!) I would hope that the release would be
> > issued to the fuse server before the open.  (I'm not sure I understand
> 
> I don't think this is 100% guaranteed if fuse sends the release
> request asynchronously rather than synchronously (eg the request gets
> stalled on the bg queue if active_background >= max_background)

Humm, that /is/ a weird one.  I guess there's nothing to prevent an OPEN
from racing with a RELEASE, since those two operations concern
themselves with *files*.  I suppose that means that if a fuse server
wants to hold a lock across fuse commands, then it had better be really
careful about that.
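
Just to sketch out what I mean -- here's a totally made-up
single-threaded libfuse lowlevel server (the handler names and the lock
are hypothetical, not from any real filesystem) that takes an exclusive
lock at open time and only drops it at release time.  If the kernel
queues the RELEASE asynchronously (e.g. it's sitting on the bg queue)
and the next OPEN gets dispatched first, the one worker thread blocks on
a lock that only the still-queued RELEASE would drop:

/*
 * Hypothetical sketch only: a single-threaded server that serializes
 * an exclusive resource between open and release.
 */
#define FUSE_USE_VERSION 34
#include <fuse_lowlevel.h>
#include <pthread.h>

static pthread_mutex_t resource_lock = PTHREAD_MUTEX_INITIALIZER;

static void hypo_open(fuse_req_t req, fuse_ino_t ino,
		      struct fuse_file_info *fi)
{
	/* Blocks forever if the matching RELEASE is still queued,
	 * because this is the only thread that could service it. */
	pthread_mutex_lock(&resource_lock);
	fuse_reply_open(req, fi);
}

static void hypo_release(fuse_req_t req, fuse_ino_t ino,
			 struct fuse_file_info *fi)
{
	pthread_mutex_unlock(&resource_lock);
	fuse_reply_err(req, 0);
}

static const struct fuse_lowlevel_ops hypo_ops = {
	.open    = hypo_open,
	.release = hypo_release,
};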

> > where this part of the thread went, because why would that happen?  And
> > why would the fuse server hold a lock across requests?)
> 
> The fuse server holding a lock across requests example was a contrived
> one to illustrate that an async release could be racy if a fuse server
> implementation has the (standard?) expectation that releases and opens
> are always received in order.

<nod> I think it's quite common, since each open() call in userspace
creates a new struct file, even though they all point to the same inode.
That might be why you can't normally open-and-lock a resource.  Opens
shouldn't stall indefinitely...(?)
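
For example (made-up path, purely for illustration), each of these
opens gets its own struct file, so on a fuse mount the server sees two
FUSE_OPEN requests and, after the close()s, two FUSE_RELEASE requests,
even though both fds refer to the same inode:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	int fd1 = open("/mnt/fuse/foo", O_RDONLY);	/* FUSE_OPEN #1 */
	int fd2 = open("/mnt/fuse/foo", O_RDONLY);	/* FUSE_OPEN #2 */

	close(fd2);	/* triggers FUSE_FLUSH + FUSE_RELEASE for fd2 */
	close(fd1);	/* triggers FUSE_FLUSH + FUSE_RELEASE for fd1 */
	return 0;
}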

--D

> >
> > > However, now that you pointed out FUSE_FLUSH gets sent on the release
> > > path, that addresses my worry about async FUSE_RELEASE returning
> > > before the server has gotten a chance to write out its local buffer
> > > cache.
> >
> > <nod>
> >
> > --D
> >
> > > Thanks,
> > > Joanne
> > > >
> > > > Thanks,
> > > > Amir.
> > >
> 