Re: [PATCH] fanotify: support custom default close response

Amir Goldstein <amir73il@xxxxxxxxx> · Wed, 2 Jul 2025 18:56:27 +0200

On Wed, Jul 2, 2025 at 6:15 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Mon 30-06-25 17:56:00, Amir Goldstein wrote:
> > On Mon, Jun 30, 2025 at 5:36 PM Jan Kara <jack@xxxxxxx> wrote:
> > > On Tue 24-06-25 08:30:03, Amir Goldstein wrote:
> > > > On Mon, Jun 23, 2025 at 9:26 PM Ibrahim Jirdeh <ibrahimjirdeh@xxxxxxxx> wrote:
> > > > >
> > > > > Currently the default response for pending events is FAN_ALLOW.
> > > > > This makes default close response configurable. The main goal
> > > > > of these changes would be to provide better handling for pending
> > > > > events for lazy file loading use cases which may back fanotify
> > > > > events by a long-lived daemon. For earlier discussion see:
> > > > > https://lore.kernel.org/linux-fsdevel/6za2mngeqslmqjg3icoubz37hbbxi6bi44canfsg2aajgkialt@c3ujlrjzkppr/
> > > >
> > > > These lore links are typically placed at the commit message tail block
> > > > if related to a suggestion you would typically use:
> > > >
> > > > Suggested-by: Amir Goldstein <amir73il@xxxxxxxxx>
> > > > Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxi6PvAcT1vL0d0e+7YjvkfU-kwFVVMAN-tc-FKXe1wtSg@xxxxxxxxxxxxxx/
> > > > Signed-off-by: Ibrahim Jirdeh <ibrahimjirdeh@xxxxxxxx>
> > > >
> > > > This way reviewers whose response is "what a terrible idea!" can
> > > > point their arrows at me instead of you ;)
> > > >
> > > > Note that this is a more accurate link to the message where the default
> > > > response API was proposed, so readers won't need to sift through
> > > > this long thread to find the reference.
> > >
> > > I've reread that thread to remember how this is supposed to be used. After
> > > thinking about it now maybe we could just modify how pending fanotify
> > > events behave in case of group destruction? Instead of setting FAN_ALLOW in
> > > fanotify_release() we'd set a special event state that will make fanotify
> > > group iteration code bubble up back to fsnotify() and restart the event
> > > generation loop there?
> > >
> > > In the usual case this would behave the same way as setting FAN_ALLOW (just
> > > in case of multiple permission event watchers some will get the event two
> > > times which shouldn't matter). In case of careful design with fd store
> > > etc., the daemon can setup the new notification group as needed and then
> > > close the fd from the old notification group at which point it would
> > > receive all the pending events in the new group. I can even see us adding
> > > ioctl to the fanotify group which when called will result in the same
> > > behavior (i.e., all pending permission events will get the "retry"
> > > response). That way the new daemon could just take the old fd from the fd
> > > store and call this ioctl to receive all the pending events again.
> > >
> > > No need for the new default response. We probably need to provide a group
> > > feature flag for this so that userspace can safely query kernel support for
> > > this behavior but otherwise even that wouldn't be really needed.
> > >
> > > What do you guys think?
> >
> > With proper handover I am not sure why this is needed, because:
> > - new group gets fd from store and signals old group
> > - old group does not take any new event, completes in-flight events,
> >   signals back new group and exists
> > - new group starts processing events
> > - so why do we need a complex mechanism in kernel to do what can easily
> >   be done in usersapce?
>
> This works for clean handover (say service update) - no need for any
> mechanism (neither retry nor default response there). We are in agreement
> here. If retry is supported, it will make the handover somewhat simpler for
> userspace but that's not really substantial.
>
> > Also, regardless I think that we need the default response, because:
> > - groups starts, set default response to DENY and handsover fd to store
> > - if group crashes unexpectedly, access to all files is denied, which is
> >   exactly what we needed to do with the "moderated" mount
> > - it is similar to access to FUSE mount when server is down
>
> Yes, crashes are what I had in mind. With crashes you have nobody to
> cleanly handle events still pending for the old group and you have to solve
> it. Reporting FAN_DENY (through default response) is one way, what I
> suggest with retry has the advantage that userspace doesn't have to deal
> with spurious FAN_DENY errors in case of daemon crash. It is not a huge
> benefit (crashes should better be rare ;)) but it is IMO a benefit.
>
> Now regarding your comment about moderated mount: You are somewhat terse on
> details so let me try to fill in. First let's differentiate between service
> (daemon) and the notification group because they may have a different
> lifetime. So the service starts, creates a notification group, places mark
> on the sb with pre-content events. You didn't mention a mark in your
> description but until the mark is set, the group receives no events so
> there's nothing to respond to. Now if the service crashes there are two
> options:
>
> 1) You didn't save your group fd anywhere. In that case the group and mark
> is gone with the crash of the service, all accesses happening after this
> moment are allowed => not good. Whether we have default response or not
> doesn't really matter in this case for those few events that were possibly
> pending. Agreed so far?

Yes.

>
> 2) You've saved group fd to fd store. In this case the group is (from
> kernel POV) fully alive even after the service crash and the default
> response does not activate. The events will be queued to the group and
> waiting for reply. No access to the fs is allowed to happen which is good.

Well this was my mistake.
I somehow forgot that the events would be blocked in this case.
I had imagined the desired "moderated mount" behavior where
events would be auto denied.

But that would require more work.
It would require decoupling the "control" fd which identifies the group
and can be used to setup marks and for ioctls from the "queue" fd that
is used to reading events and writing responses.

> Eventually the new service starts and we are in the situation I describe 3
> paragraphs above about handling pending events.
>
> So if we'd implement resending of pending events after group closure, I
> don't see how default response (at least in its current form) would be
> useful for anything.
>
> Why I like the proposal of resending pending events:
> a) No spurious FAN_DENY errors in case of service crash
> b) No need for new concept (and API) for default response, just a feature
>    flag.
> c) With additional ioctl to trigger resending pending events without group
>    closure, the newly started service can simply reuse the
>    same notification group (even in case of old service crash) thus
>    inheriting all placed marks (which is something Ibrahim would like to
>    have).

Yes I see the benefits now.

>
> Regarding complexity of the approach I propose the attached (untested)
> patch should implement it and I don't think it is very complex logic...
> So what do you think now? :)
>

Does not look complex at all :)
so no objections.

Thanks,
Amir.