On Fri, Jul 4, 2025 at 11:56 AM Jan Kara <jack@xxxxxxx> wrote: > > On Thu 03-07-25 15:05:37, Miklos Szeredi wrote: > > This is to make it easier to debug issues with AV software, which time and > > again deadlocks with no indication of where the issue comes from, and the > > kernel being blamed for the deadlock. Then we need to analyze dumps to > > prove that the kernel is not in fact at fault. > > I share the pain. I had to do quite some of these analyses myself :). > Luckily our support guys have trained to do the analysis themselves over > the years so it rarely reaches my table anymore. > > > With this patch a warning is printed when permission event is received by > > userspace but not answered for more than 20 seconds. > > > > The timeout is very coarse (20-40s) but I guess it's good enough for the > > purpose. > > I'm not opposed to the idea (although I agree with Amir that it should be > tunable - we have /proc/sys/fs/fanotify/ for similar things). Just I'm not > sure it will have the desired deterring effect for fanotify users wanting > to blame the kernel. What usually convinces them is showing where their > tasks supposed to write reply to permission event (i.e., those that have > corresponding event fds in their fdtable) are blocked and hence they cannot > reply. But with some education I suppose it can work. After all the > messages you print contain the task responsible to answer which is already > helpful. > > > +config FANOTIFY_PERM_WATCHDOG > > + bool "fanotify permission event watchdog" > > + depends on FANOTIFY_ACCESS_PERMISSIONS > > + default n > > As Amir wrote, I don't think we need a kconfig for this, configuration > through /proc/sys/fs/fanotify/ will be much more flexible. > > > diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h > > index b44e70e44be6..8b60fbb9594f 100644 > > --- a/fs/notify/fanotify/fanotify.h > > +++ b/fs/notify/fanotify/fanotify.h > > @@ -438,10 +438,14 @@ FANOTIFY_ME(struct fanotify_event *event) > > struct fanotify_perm_event { > > struct fanotify_event fae; > > struct path path; > > - const loff_t *ppos; /* optional file range info */ > > + union { > > + const loff_t *ppos; /* optional file range info */ > > + pid_t pid; /* pid of task processing the event */ > > + }; > > I think Amir complained about the generic 'pid' name already. Maybe > processing_pid? Also I'd just get rid of the union. We don't have *that* > many permission events that 4 bytes would matter and possible interactions > between pre-content events and this watchdog using the same space make me > somewhat uneasy. fae.pid is not used after copy_event_to_user() so it can be reused to hold struct pid of processing task while waiting for response. But anyway, I think there is a hole in the struct after int fd. Thanks, Amir.