On Tue 25-03-25 22:36:56, James Bottomley wrote:
> On Tue, 2025-03-25 at 14:42 +0100, Jan Kara wrote:
> [...]
> > If I remember correctly, the problem in the past was, that if you
> > leave userspace running while freezing filesystems, some processes
> > may enter uninterruptible sleep waiting for fs to be thawed and in
> > the past suspend code was not able to hibernate such processes. But I
> > think this obstacle has been removed couple of years ago as now we
> > could use TASK_FREEZABLE flag in sb_start_write() ->
> > percpu_rwsem_wait and thus allow tasks blocked on frozen filesystem
> > to be hibernated.
>
> I tested this and we do indeed deadlock hibernation on the processes
> touching the filesystem (systemd-journald actually). But if I make
> this change:
>
> diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
> index 6083883c4fe0..720418720bbc 100644
> --- a/kernel/locking/percpu-rwsem.c
> +++ b/kernel/locking/percpu-rwsem.c
> @@ -156,7 +156,7 @@ static void percpu_rwsem_wait(struct percpu_rw_semaphore *sem, bool reader)
>  	spin_unlock_irq(&sem->waiters.lock);
>
>  	while (wait) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> +		set_current_state(TASK_UNINTERRUPTIBLE|TASK_FREEZABLE);
>  		if (!smp_load_acquire(&wq_entry.private))
>  			break;
>  		schedule();
>
> Then everything will work, with no lockdep problems (thanks,
> Christian). Is that the change you want me to make or should
> sb_start_write be using a special freezable version of
> percpu_rwsem_wait()?

I was thinking about this. The possible problem is that a task waiting
in percpu_rwsem_wait() gets hibernated while it holds another lock (e.g.
some mutex). If another task is waiting for that mutex, hibernation fails
because that other task cannot be hibernated. With sb_start_write()
specifically, this is usually not a problem because it is the outermost
lock we take.
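
As a rough illustration of the failure mode described above, here is a toy
userspace model. It is not kernel code: the TOY_* flags, struct toy_task and
the toy_* helpers are made up for this sketch, and it assumes that a running
task always reaches a freeze point while a sleeping task can be frozen only
if it carries the freezable flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task state bits; the values are illustrative. */
enum {
    TOY_RUNNING         = 0x0,
    TOY_UNINTERRUPTIBLE = 0x2,
    TOY_FREEZABLE       = 0x2000,
};

struct toy_task {
    const char *name;
    unsigned int state;
};

/*
 * Assumption of this model: a task that is not in uninterruptible sleep
 * reaches a freeze point on its own; a sleeping task needs TOY_FREEZABLE.
 */
static bool toy_can_freeze(const struct toy_task *t)
{
    if (!(t->state & TOY_UNINTERRUPTIBLE))
        return true;
    return (t->state & TOY_FREEZABLE) != 0;
}

/* Count the tasks that would block the freeze; 0 means it succeeds. */
static size_t toy_count_unfreezable(const struct toy_task *tasks, size_t n)
{
    size_t stuck = 0;

    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (!toy_can_freeze(&tasks[i]))
            stuck++;
    return stuck;
}

/*
 * "writer" sleeps on the frozen filesystem with the freezable flag set.
 * If it also holds another lock, "other" sleeps on that lock in plain
 * uninterruptible sleep and cannot be frozen.
 */
static size_t toy_scenario(bool writer_holds_other_lock)
{
    struct toy_task tasks[2] = {
        { "writer", TOY_UNINTERRUPTIBLE | TOY_FREEZABLE },
        { "other",  writer_holds_other_lock ? TOY_UNINTERRUPTIBLE
                                            : TOY_RUNNING },
    };

    return toy_count_unfreezable(tasks, 2);
}
```

In this model toy_scenario(false) reports no stuck tasks, which corresponds
to the sb_start_write() case where nothing else is held, while
toy_scenario(true) reports one stuck task, the waiter on the second lock.
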
The only catch here would be if a process is blocked in a write page
fault for a frozen filesystem. Then we are holding mmap_sem for the
process, so hibernation could fail this way. But I'd guess this is rare
enough that we could live with that possibility. So to summarize, I think
we may need to introduce a freezable variant of percpu_rwsem_down_read()
and use it in sb_start_write().

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR