Re: [RFC PATCH] fanotify: wake-up all waiters on release

Jan Kara <jack@xxxxxxx> · Mon, 26 May 2025 14:52:50 +0200

On Fri 23-05-25 16:18:19, Sergey Senozhatsky wrote:
> On (25/05/21 12:18), Jan Kara wrote:
> > On Tue 20-05-25 21:35:12, Sergey Senozhatsky wrote:
> > > Once reply response is set for all outstanding requests
> > > wake_up_all() of the ->access_waitq waiters so that they
> > > can finish user-wait.  Otherwise fsnotify_destroy_group()
> > > can wait forever for ->user_waits to reach 0 (which it
> > > never will.)
> > > 
> > > Signed-off-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>
> > 
> > We don't use exclusive waits with access_waitq so wake_up() and
> > wake_up_all() should do the same thing?
> 
> Oh, non-exclusive waiters, I see.  I totally missed that, thanks.
> 
> So... the problem is somewhere else then.  I'm currently looking
> at some crashes (across all LTS kernels) where group owner just
> gets stuck and then hung-task watchdog kicks in and panics the
> system.  Basically just a single backtrace in the kernel logs:
> 
>  schedule+0x534/0x2540
>  fsnotify_destroy_group+0xa7/0x150
>  fanotify_release+0x147/0x160
>  ____fput+0xe4/0x2a0
>  task_work_run+0x71/0xb0
>  do_exit+0x1ea/0x800
>  do_group_exit+0x81/0x90
>  get_signal+0x32d/0x4e0
> 
> My assumption was that it's this wait:
> 	wait_event(group->notification_waitq, !atomic_read(&group->user_waits));

Well, you're likely correct that we are sleeping in this wait. But most
likely there is some process that is indeed still waiting for a response
to a fanotify event from userspace, so ->user_waits never drops to 0. Do
you have a reproducer? Can you dump all blocked tasks when this happens?
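
For the dump, "echo w > /proc/sysrq-trigger" writes the blocked
(uninterruptible) tasks with their kernel stacks to the kernel log, if
sysrq is enabled. Where that is not an option, a rough userspace sketch
along the following lines could at least list the tasks in D state. The
function names are hypothetical, it only walks thread-group leaders
under /proc, and it shows no kernel stacks, so treat it as a starting
point rather than a finished tool:

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the state character from one /proc/<pid>/stat line, or 0 if
 * the line cannot be parsed.  The comm field may itself contain spaces
 * and parentheses, so look for the last ')' rather than the first.
 */
char task_state_from_stat(const char *line)
{
	const char *p;

	assert(line != NULL);
	p = strrchr(line, ')');
	if (!p || p[1] != ' ' || p[2] == '\0')
		return 0;
	return p[2];
}

/*
 * Print "pid (comm) D" for every thread-group leader currently in
 * uninterruptible sleep.  Returns the number of tasks printed, or -1
 * if /proc cannot be opened.
 */
int dump_blocked_tasks(FILE *out)
{
	DIR *proc = opendir("/proc");
	struct dirent *de;
	int count = 0;

	if (!proc)
		return -1;

	while ((de = readdir(proc)) != NULL) {
		char path[300], line[512];
		FILE *f;

		/* Only numeric directory names are PIDs. */
		if (!isdigit((unsigned char)de->d_name[0]))
			continue;

		snprintf(path, sizeof(path), "/proc/%s/stat", de->d_name);
		f = fopen(path, "r");
		if (!f)
			continue;	/* task exited in the meantime */

		if (fgets(line, sizeof(line), f) &&
		    task_state_from_stat(line) == 'D') {
			/* Print up to and including the state field. */
			int len = (int)(strrchr(line, ')') - line) + 3;

			fprintf(out, "%.*s\n", len, line);
			count++;
		}
		fclose(f);
	}
	closedir(proc);
	return count;
}
```

Calling dump_blocked_tasks(stdout) from a small main() while the group
owner is stuck in fsnotify_destroy_group() should show which other tasks
are blocked at that moment.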

									Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR