On Thu, Jul 17, 2025 at 4:26 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
>
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
>
> generic/488 fails with fuse2fs in the following fashion:
>
> generic/488 _check_generic_filesystem: filesystem on /dev/sdf is inconsistent
> (see /var/tmp/fstests/generic/488.full for details)
>
> This test opens a large number of files, unlinks them (which really just
> renames them to fuse hidden files), closes the program, unmounts the
> filesystem, and runs fsck to check that there aren't any inconsistencies
> in the filesystem.
>
> Unfortunately, the 488.full file shows that there are a lot of hidden
> files left over in the filesystem, with incorrect link counts. Tracing
> fuse_request_* shows that there are a large number of FUSE_RELEASE
> commands that are queued up on behalf of the unlinked files at the time
> that fuse_conn_destroy calls fuse_abort_conn. Had the connection not
> aborted, the fuse server would have responded to the RELEASE commands by
> removing the hidden files; instead they stick around.

Tbh it's still weird to me that FUSE_RELEASE is asynchronous instead of
synchronous. For example, for fuse servers that cache their data and
only write the buffer out to some remote filesystem when the file gets
closed, it seems useful for them to (like nfs) be able to return an
error to the client for close() if there's a failure committing that
data. That also has clearer API semantics imo, e.g. users are
guaranteed that when close() returns, all the processing/cleanup for
that file has been completed. Async FUSE_RELEASE also seems kind of
racy: e.g. if the server holds local locks that get released in
FUSE_RELEASE, and a subsequent FUSE_OPEN that depends on grabbing that
lock arrives before the FUSE_RELEASE, then we end up deadlocked if the
server is single-threaded.

I saw in your first patch that sending FUSE_RELEASE synchronously leads
to a deadlock under AIO, but AFAICT that happens because we execute
req->args->end() in fuse_request_end() synchronously; I think if we
execute that release asynchronously on a worker thread then that gets
rid of the deadlock. If FUSE_RELEASE must be asynchronous though, then
your approach makes sense to me.

>
> Create a function to push all the background requests to the queue and
> then wait for the number of pending events to hit zero, and call this
> before fuse_abort_conn. That way, all the pending events are processed
> by the fuse server and we don't end up with a corrupt filesystem.
>
> Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
> ---
>  fs/fuse/fuse_i.h | 6 ++++++
>  fs/fuse/dev.c    | 38 ++++++++++++++++++++++++++++++++++++++
>  fs/fuse/inode.c  | 1 +
>  3 files changed, 45 insertions(+)
>
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> +/*
> + * Flush all pending requests and wait for them. Only call this function when
> + * it is no longer possible for other threads to add requests.
> + */
> +void fuse_flush_requests(struct fuse_conn *fc, unsigned long timeout)

It might be worth renaming this to something like
'fuse_flush_bg_requests' to make it clearer that this only flushes
background requests.
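
For illustration only, the flush-before-abort ordering described in the
commit message can be modeled in userspace roughly as below. This is a
toy sketch, not the kernel code: struct model_conn and every model_*
function are hypothetical names, the server is simulated by a function
call, and the timeout is approximated by a budget of server replies
rather than a real wait on a waitqueue.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of a connection; these are not the kernel's fields. */
struct model_conn {
	unsigned int bg_queued;      /* background requests not yet sent */
	unsigned int pending;        /* requests sent, awaiting a reply */
	unsigned int max_background; /* cap on in-flight background requests */
};

/* Move queued background requests to pending, up to the cap. */
void model_flush_bg_queue(struct model_conn *c)
{
	while (c->bg_queued > 0 && c->pending < c->max_background) {
		c->bg_queued--;
		c->pending++;
	}
}

/* Simulated server: answer one pending request, then refill from the queue. */
void model_server_reply(struct model_conn *c)
{
	assert(c->pending > 0);
	c->pending--;
	model_flush_bg_queue(c);
}

/*
 * Lift the cap, push every queued background request, then let the
 * simulated server reply until nothing is pending or the budget (our
 * stand-in for a timeout) runs out. Returns true if fully drained.
 */
bool model_flush_bg_requests(struct model_conn *c, unsigned int budget)
{
	c->max_background = UINT_MAX;
	model_flush_bg_queue(c);
	while (c->pending > 0 && budget > 0) {
		model_server_reply(c);
		budget--;
	}
	return c->pending == 0;
}

/* Abort: everything still queued or pending is dropped unanswered. */
unsigned int model_abort(struct model_conn *c)
{
	unsigned int dropped = c->bg_queued + c->pending;

	c->bg_queued = 0;
	c->pending = 0;
	return dropped;
}
```

In this model, calling model_abort() on a connection with 100 queued
releases drops all 100, while calling model_flush_bg_requests() first
with a large enough budget leaves nothing for the abort to drop. With
too small a budget, the flush returns false and the abort still drops
whatever remains, which is the case the timeout argument would cover.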