On Wed 16-07-25 12:31:41, Amir Goldstein wrote: > On Wed, Jul 16, 2025 at 10:55 AM Jan Kara <jack@xxxxxxx> wrote: > > > > On Tue 15-07-25 16:50:00, Amir Goldstein wrote: > > > On Tue, Jul 15, 2025 at 2:11 PM Jan Kara <jack@xxxxxxx> wrote: > > > > On Mon 14-07-25 21:59:22, Amir Goldstein wrote: > > > > > On Mon, Jul 14, 2025 at 7:25 PM Jan Kara <jack@xxxxxxx> wrote: > > > > > > > I don't think there is much to lose from this retry behavior. > > > > > > > The only reason we want to opt-in for it is to avoid surprises of > > > > > > > behavior change in existing deployments. > > > > > > > > > > > > > > While we could have FAN_RETRY_UNANSWERED as an > > > > > > > independent feature without a handover ioctl, > > > > > > > In order to avoid test matrix bloat, at least for a start (we can relax later), > > > > > > > I don't think that we should allow it as an independent feature > > > > > > > and especially not for legacy modes (i.e. for Anti-Virus) unless there > > > > > > > is a concrete user requesting/testing these use cases. > > > > > > > > > > > > With queue-fd design I agree there's no reason not to have the "resend > > > > > > pending events" behavior from the start. > > > > > > > > > > > > > Going on about feature dependency. > > > > > > > > > > > > > > Practically, a handover ioctl is useless without > > > > > > > FAN_REPORT_RESPONSE_ID, so for sure we need to require > > > > > > > FAN_REPORT_RESPONSE_ID for the handover ioctl feature. > > > > > > > > > > > > > > Because I do not see an immediate use case for > > > > > > > FAN_REPORT_RESPONSE_ID without handover, > > > > > > > I would start by only allowing them together and consider relaxing > > > > > > > later if such a use case is found. > > > > > > > > > > > > We can tie them together but I think queue-fd design doesn't require > > > > > > FAN_REPORT_RESPONSE_ID. Since we resend events anyway, we'd generate new > > > > > > fds for events as well and things would just work AFAICT. > > > > > > > > > > > > > > > > Right. hmm. > > > > > I'd still like to consider the opportunity of the new-ish API for > > > > > deprecating some old legacy API baggage. > > > > > > > > > > For example: do you consider the fact that async events are mixed > > > > > together with permission/pre-content events in the same queue > > > > > an historic mistake or a feature? > > > > > > > > So far I do consider it a feature although not a particularly useful one. > > > > Given fanotify doesn't guarantee ordering of events and async events can > > > > arbitrarily skip over the permission events there's no substantial > > > > difference to having two notification groups. > > > > > > I am not sure I understand your claim. > > > It sounds like you are arguing for my case, because mixing > > > mergeable async events and non-mergeable permission events > > > in the same queue can be very confusing. > > > Therefore, I consider it an anti-feature. > > > Users can get the same functionality from having two groups, > > > with much more sane semantics. > > > > I'd say with the same semantics but less expectations about possibly nicer > > semantics :). But we agree two groups are a cleaner way to achieve the > > same and give userspace more flexibility with handling the events at the > > same time. > > > > > > > I'm tempted, as we split the control fd from the queue fd to limit > > > > > a queue to events of the same "type". > > > > > Maybe an O_RDONLY async queue fd > > > > > or an O_RDWR permission events queue fd > > > > > or only allow the latter with the new API? > > > > > > > > There's value in restricting unneeded functionality and there's also value > > > > in having features not influencing one another so the balance has to be > > > > found :). > > > > > > <nod> > > > > > > > So far I don't find that disallowing async events with queue fd > > > > or restricting fd mode for queue fd would simplify our life in a > > > > significant way but maybe I'm missing something. > > > > > > > > > > It only simplifies our life if we are going to be honest about test coverage. > > > When we return a pending permission events to the head of the queue > > > when queue fd is closed, does it matter if the queue has async events > > > or other permission events or a mix of them? > > > > > > Logically, it shouldn't matter, but assuming that for tests is not the > > > best practice. > > > > Agreed. > > > > > Thus, reducing the test matrix for a new feature by removing a > > > configuration that I consider misguided feels like a good idea, but I am > > > also feeling that my opinion on this matter is very biased, so I'd be > > > happy if you can provide a more objective decision about the > > > restrictions. > > > > Heh, nobody is objective :). We are all biased by our past experiences. > > Which is why discussing things together is so beneficial. I agree about > > the benefit of simplier testing and that hardly anybody is going to miss > > the functionality, what still isn't 100% clear to me how the restrictions > > would be implemented in the API. So user creates fanotify_group() with > > control fd feature flag. Now will he have to specify in advance whether the > > group is for permission events or not? Already when creating the group > > (i.e., another feature flag? Or infered from group priority?)? Or when > > creating queue fd? Or will the group switch based on marks being placed to > > it? I think explicit selection is less confusing than implicit by placed > > marks. Using group priority looks appealing at first sight but currently > > group priorities also have implications about order in which we deliver > > events (both async and permission) to groups and it would be also somewhat > > confusing that FAN_CLASS_PRE_CONTENT groups may still likely use > > FAN_OPEN_PERM event. So maybe it's not that great fit. Hum... Or we can > > have a single feature flag (good name needed!) that will create group with > > control fd *and* restricted to permission events only. That actually seems > > like the least confusing option to me now. > > I was thinking inferred from group priority along with control fd feature flag, > but I agree that being more explicit is better. > > How about tying the name to your FAN_RETRY patch > and including its functionality along with the control fd: > > FAN_RESTARTABLE_EVENTS > Events for fanotify groups initialized with this flag will be restarted > if the group file descriptor is closed after reading the > permission events > and before responding to the event. > This means that if an event handler program gets stuck, a new event > handler can be started before killing the old event handler, > without missing > the permission event. > The file descriptor returned from fanotify_init() cannot be > used to read and > respond to permission events, only to configure marks. > The IOC_FAN_OPEN_QUEUE_FD ioctl is required to acquire a secondary > event queue file descriptor. This event queue file descriptor is used to > read the permission events and respond to them. > When the queue file descriptor is closed, events that were read from the > queue but not responded to, are returned to the queue and can be handled > by another instance of the program that opens a new queue file > descriptor. > The use of either FAN_CLASS_CONTENT or FAN_CLASS_PRE_CONTENT > is required along with this flag. > Only permission events and pre-content events are allowed to be > set in mark > masks of a group that was initialized with this flag. Sounds very nice to me. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR