On Wed, Jul 16, 2025 at 10:55 AM Jan Kara <jack@xxxxxxx> wrote: > > On Tue 15-07-25 16:50:00, Amir Goldstein wrote: > > On Tue, Jul 15, 2025 at 2:11 PM Jan Kara <jack@xxxxxxx> wrote: > > > On Mon 14-07-25 21:59:22, Amir Goldstein wrote: > > > > On Mon, Jul 14, 2025 at 7:25 PM Jan Kara <jack@xxxxxxx> wrote: > > > > > > I don't think there is much to lose from this retry behavior. > > > > > > The only reason we want to opt-in for it is to avoid surprises of > > > > > > behavior change in existing deployments. > > > > > > > > > > > > While we could have FAN_RETRY_UNANSWERED as an > > > > > > independent feature without a handover ioctl, > > > > > > In order to avoid test matrix bloat, at least for a start (we can relax later), > > > > > > I don't think that we should allow it as an independent feature > > > > > > and especially not for legacy modes (i.e. for Anti-Virus) unless there > > > > > > is a concrete user requesting/testing these use cases. > > > > > > > > > > With queue-fd design I agree there's no reason not to have the "resend > > > > > pending events" behavior from the start. > > > > > > > > > > > Going on about feature dependency. > > > > > > > > > > > > Practically, a handover ioctl is useless without > > > > > > FAN_REPORT_RESPONSE_ID, so for sure we need to require > > > > > > FAN_REPORT_RESPONSE_ID for the handover ioctl feature. > > > > > > > > > > > > Because I do not see an immediate use case for > > > > > > FAN_REPORT_RESPONSE_ID without handover, > > > > > > I would start by only allowing them together and consider relaxing > > > > > > later if such a use case is found. > > > > > > > > > > We can tie them together but I think queue-fd design doesn't require > > > > > FAN_REPORT_RESPONSE_ID. Since we resend events anyway, we'd generate new > > > > > fds for events as well and things would just work AFAICT. > > > > > > > > > > > > > Right. hmm. > > > > I'd still like to consider the opportunity of the new-ish API for > > > > deprecating some old legacy API baggage. > > > > > > > > For example: do you consider the fact that async events are mixed > > > > together with permission/pre-content events in the same queue > > > > an historic mistake or a feature? > > > > > > So far I do consider it a feature although not a particularly useful one. > > > Given fanotify doesn't guarantee ordering of events and async events can > > > arbitrarily skip over the permission events there's no substantial > > > difference to having two notification groups. > > > > I am not sure I understand your claim. > > It sounds like you are arguing for my case, because mixing > > mergeable async events and non-mergeable permission events > > in the same queue can be very confusing. > > Therefore, I consider it an anti-feature. > > Users can get the same functionality from having two groups, > > with much more sane semantics. > > I'd say with the same semantics but less expectations about possibly nicer > semantics :). But we agree two groups are a cleaner way to achieve the > same and give userspace more flexibility with handling the events at the > same time. > > > > > I'm tempted, as we split the control fd from the queue fd to limit > > > > a queue to events of the same "type". > > > > Maybe an O_RDONLY async queue fd > > > > or an O_RDWR permission events queue fd > > > > or only allow the latter with the new API? > > > > > > There's value in restricting unneeded functionality and there's also value > > > in having features not influencing one another so the balance has to be > > > found :). > > > > <nod> > > > > > So far I don't find that disallowing async events with queue fd > > > or restricting fd mode for queue fd would simplify our life in a > > > significant way but maybe I'm missing something. > > > > > > > It only simplifies our life if we are going to be honest about test coverage. > > When we return a pending permission events to the head of the queue > > when queue fd is closed, does it matter if the queue has async events > > or other permission events or a mix of them? > > > > Logically, it shouldn't matter, but assuming that for tests is not the > > best practice. > > Agreed. > > > Thus, reducing the test matrix for a new feature by removing a > > configuration that I consider misguided feels like a good idea, but I am > > also feeling that my opinion on this matter is very biased, so I'd be > > happy if you can provide a more objective decision about the > > restrictions. > > Heh, nobody is objective :). We are all biased by our past experiences. > Which is why discussing things together is so beneficial. I agree about > the benefit of simplier testing and that hardly anybody is going to miss > the functionality, what still isn't 100% clear to me how the restrictions > would be implemented in the API. So user creates fanotify_group() with > control fd feature flag. Now will he have to specify in advance whether the > group is for permission events or not? Already when creating the group > (i.e., another feature flag? Or infered from group priority?)? Or when > creating queue fd? Or will the group switch based on marks being placed to > it? I think explicit selection is less confusing than implicit by placed > marks. Using group priority looks appealing at first sight but currently > group priorities also have implications about order in which we deliver > events (both async and permission) to groups and it would be also somewhat > confusing that FAN_CLASS_PRE_CONTENT groups may still likely use > FAN_OPEN_PERM event. So maybe it's not that great fit. Hum... Or we can > have a single feature flag (good name needed!) that will create group with > control fd *and* restricted to permission events only. That actually seems > like the least confusing option to me now. I was thinking inferred from group priority along with control fd feature flag, but I agree that being more explicit is better. How about tying the name to your FAN_RETRY patch and including its functionality along with the control fd: FAN_RESTARTABLE_EVENTS Events for fanotify groups initialized with this flag will be restarted if the group file descriptor is closed after reading the permission events and before responding to the event. This means that if an event handler program gets stuck, a new event handler can be started before killing the old event handler, without missing the permission event. The file descriptor returned from fanotify_init() cannot be used to read and respond to permission events, only to configure marks. The IOC_FAN_OPEN_QUEUE_FD ioctl is required to acquire a secondary event queue file descriptor. This event queue file descriptor is used to read the permission events and respond to them. When the queue file descriptor is closed, events that were read from the queue but not responded to, are returned to the queue and can be handled by another instance of the program that opens a new queue file descriptor. The use of either FAN_CLASS_CONTENT or FAN_CLASS_PRE_CONTENT is required along with this flag. Only permission events and pre-content events are allowed to be set in mark masks of a group that was initialized with this flag. Thanks, Amir.