Re: [RFC] Another take at restarting FUSE servers

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Thu, 31 Jul 2025 10:29:46 -0700

On Thu, Jul 31, 2025 at 01:33:09PM +0200, Christian Brauner wrote:
> On Wed, Jul 30, 2025 at 03:04:00PM +0100, Luis Henriques wrote:
> > Hi Darrick,
> > 
> > On Tue, Jul 29 2025, Darrick J. Wong wrote:
> > 
> > > On Tue, Jul 29, 2025 at 02:56:02PM +0100, Luis Henriques wrote:
> > >> Hi!
> > >> 
> > >> I know this has been discussed several times in several places, and the
> > >> recent(ish) addition of NOTIFY_RESEND is an important step towards being
> > >> able to restart a user-space FUSE server.
> > >> 
> > >> While looking at how to restart a server that uses the libfuse lowlevel
> > >> API, I've created an RFC pull request [1] to understand whether adding
> > >> support for this operation would be something acceptable in the project.
> > >
> > > Just speaking for fuse2fs here -- that would be kinda nifty if libfuse
> > > could restart itself.  It's unclear if doing so will actually enable us
> > > to clear the condition that caused the failure in the first place, but I
> > > suppose fuse2fs /does/ have e2fsck -fy at hand.  So maybe restarts
> > > aren't totally crazy.
> > 
> > Maybe my PR lacks a bit of ambition -- its goal wasn't to have libfuse do
> > the restart itself.  Instead, it simply adds some visibility into the
> > opaque data structures so that a FUSE server could re-initialise a session
> > without having to go through a full remount.
> > 
> > But sure, there are other things that could be added to the library as
> > well.  For example, in my current experiments, the FUSE server needs to start
> > some sort of "file descriptor server" to keep the fd alive for the
> > restart.  This daemon could be optionally provided in libfuse itself,
> > which could also be used to store all sorts of blobs needed by the file
> > system after recovery is done.
> 
> Fwiw, for most use-cases you really just want to use systemd's file
> descriptor store to persist the /dev/fuse connection:
> https://systemd.io/FILE_DESCRIPTOR_STORE/

Very nice!  This is exactly what I was looking for to handle the initial
setup, so I'm glad I don't have to go design a protocol around that.
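
For anyone following along, here is a rough sketch of what pushing the
/dev/fuse fd into the store looks like on the wire.  It hand-rolls the
sd_notify datagram (FDSTORE=1 plus FDNAME=, with the fd attached via
SCM_RIGHTS) rather than calling libsystemd.  The helper names store_fd()
and build_fdstore_message() are made up for illustration, the "fusedev"
label is just the name suggested further down, and I'm assuming the unit
has FileDescriptorStoreMax= set so that the manager accepts the fd.

```python
import os
import socket


def build_fdstore_message(name):
    # FDSTORE=1 asks the service manager to hold on to the attached fds;
    # FDNAME= labels them so they can be told apart after a restart.
    return f"FDSTORE=1\nFDNAME={name}".encode()


def store_fd(fd, name, notify_path=None):
    """Send fd to the notify socket. Returns False if no socket is known."""
    path = notify_path or os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not running under a service manager
    addr = path.encode()
    if addr.startswith(b"@"):
        addr = b"\0" + addr[1:]  # Linux abstract socket namespace
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(addr)
        # send_fds() attaches the descriptor as SCM_RIGHTS ancillary data.
        socket.send_fds(sock, [build_fdstore_message(name)], [fd])
    return True
```

A fuse server would call something like store_fd(fuse_fd, "fusedev") once
the mount is up, and again for any other fd it wants to survive a restart.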

> > 
> > >> The PR doesn't do anything sophisticated, it simply hacks into the opaque
> > >> libfuse data structures so that a server could set some of the sessions'
> > >> fields.
> > >> 
> > >> So, a FUSE server simply has to save the /dev/fuse file descriptor and
> > >> pass it to libfuse while recovering, after a restart or a crash.  The
> > >> mentioned NOTIFY_RESEND should be used so that no requests are lost, of
> > >> course.  And there are probably other data structures that user-space file
> > >> systems will have to keep track of as well, so that everything can be
> > >> restored.  (The parameters set in the INIT phase, for example.)
> > >
> > > Yeah, I don't know how that would work in practice.  Would the kernel
> > > send back the old connection flags and whatnot via some sort of
> > > FUSE_REINIT request, and the fuse server can either decide that it will
> > > try to recover, or just bail out?
> > 
> > That would be an option.  But my current idea would be that the server
> > would need to store those somewhere and simply assume they are still OK
> 
> The fdstore currently allows associating a name with a file descriptor
> in the fdstore. That name would allow you to associate the options with
> the fuse connection. However, I would not rule it out that additional
> metadata could be attached to file descriptors in the fdstore if that's
> something that's needed.

Names are useful; I'd at least want "fusedev", "fsopen", and "device".

If someone passed "journal_dev=/dev/sdaX" to fuse2fs then I'd want it to
be able to tell mountfsd "Hey, can you also open /dev/sdaX and put it in
the store as 'journal_dev'?" Then it just has to wait until the fd shows
up, and it can continue with the mount process.

Though the "device" argument needn't be a path, so to be fully general
mountfsd and the fuse server would have to handshake that as well.
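
To make the "wait until the fd shows up" part concrete, here is a small
sketch of the receiving side after a restart.  It parses the LISTEN_PID,
LISTEN_FDS and LISTEN_FDNAMES environment variables that systemd sets when
it hands stored fds back, and reports which of the names above are still
missing.  named_fds() and missing_names() are hypothetical helpers, not
anything libfuse or fuse2fs provides today, and the fallback label for an
unnamed fd is only a placeholder.

```python
import os

SD_LISTEN_FDS_START = 3  # passed fds start right after stdin/stdout/stderr


def named_fds(environ=None, pid=None):
    """Map each fd name to the list of inherited fd numbers carrying it."""
    env = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The fds are only meant for us if LISTEN_PID matches our own pid.
    if env.get("LISTEN_PID") != str(pid):
        return {}
    count = int(env.get("LISTEN_FDS", "0"))
    raw = env.get("LISTEN_FDNAMES", "")
    names = raw.split(":") if raw else []
    result = {}
    for i in range(count):
        # Placeholder label when no name was supplied for this fd.
        name = names[i] if i < len(names) else "unknown"
        result.setdefault(name, []).append(SD_LISTEN_FDS_START + i)
    return result


def missing_names(fds, wanted=("fusedev", "fsopen", "device")):
    """Names the server still has to ask for before it can continue."""
    return [name for name in wanted if name not in fds]
```

With LISTEN_FDNAMES=fusedev:device:journal_dev this yields fds 3, 4 and 5
under those names and reports "fsopen" as missing, which is the point where
the server would go back to mountfsd (or bail out).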

--D