On Mon 26-05-25 23:12:20, Sergey Senozhatsky wrote:
> On (25/05/26 14:52), Jan Kara wrote:
> > > > We don't use exclusive waits with access_waitq so wake_up() and
> > > > wake_up_all() should do the same thing?
> > >
> > > Oh, non-exclusive waiters, I see. I totally missed that, thanks.
> > >
> > > So... the problem is somewhere else then. I'm currently looking
> > > at some crashes (across all LTS kernels) where group owner just
> > > gets stuck and then hung-task watchdog kicks in and panics the
> > > system. Basically just a single backtrace in the kernel logs:
> > >
> > >  schedule+0x534/0x2540
> > >  fsnotify_destroy_group+0xa7/0x150
> > >  fanotify_release+0x147/0x160
> > >  ____fput+0xe4/0x2a0
> > >  task_work_run+0x71/0xb0
> > >  do_exit+0x1ea/0x800
> > >  do_group_exit+0x81/0x90
> > >  get_signal+0x32d/0x4e0
> > >
> > > My assumption was that it's this wait:
> > >  wait_event(group->notification_waitq, !atomic_read(&group->user_waits));
> >
> > Well, you're likely correct we are sleeping in this wait. But likely
> > there's some process that's indeed waiting for response to fanotify event
> > from userspace. Do you have a reproducer? Can you dump all blocked tasks
> > when this happens?
>
> Unfortunately, no. This happens on consumer devices, which are
> not available for any sort of debugging, due to various privacy
> protection reasons. We only get anonymized kernel ramoops/dmesg
> on crashes.
>
> So my only option is to add something to the kernel, then roll-out
> the patched kernel to the fleet and wait for new crash reports. The
> problem is, all that I can think of sort of fixes the crash as far as
> the hung-task watchdog is concerned. Let me think more about it.
>
> Another silly question: what decrements group->user_waits in case of
> that race-condition?
>
> ---
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 9dac7f6e72d2b..38b977fe37a71 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -945,8 +945,10 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>  	if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS)) {
>  		fsid = fanotify_get_fsid(iter_info);
>  		/* Racing with mark destruction or creation? */
> -		if (!fsid.val[0] && !fsid.val[1])
> -			return 0;
> +		if (!fsid.val[0] && !fsid.val[1]) {
> +			ret = 0;
> +			goto finish;
> +		}
>  	}

This code is not present in the current upstream kernel. This seems to have
been inadvertently fixed by commit 30ad1938326b ("fanotify: allow "weak" fsid
when watching a single filesystem") which you likely don't have in your
kernel.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR