On Tue, 2025-03-25 at 07:50 +1100, Dave Chinner wrote:
> On Mon, Mar 24, 2025 at 12:38:20PM +0100, Jan Kara wrote:
> > On Fri 21-03-25 13:00:24, James Bottomley via Lsf-pc wrote:
> > > On Fri, 2025-03-21 at 08:34 -0400, James Bottomley wrote:
> > > [...]
> > > > Let me digest all that and see if we have more hope this time
> > > > around.
> > >
> > > OK, I think I've gone over it all. The biggest problem with
> > > resurrecting the patch was bugs in ext3, which isn't a problem
> > > now. Most of the suspend system has been rearchitected to
> > > separate suspending user space processes from kernel ones. The
> > > sync it currently does occurs before even user processes are
> > > frozen. I think (as most of the original proposals did) that we
> > > just do freeze all supers (using the reverse list) after user
> > > processes are frozen but just before kernel threads are (this
> > > shouldn't perturb the image allocation in hibernate, which was
> > > another source of bugs in xfs).
> >
> > So as far as my memory serves the fundamental problem with this
> > approach was FUSE - once userspace is frozen, you cannot write to
> > FUSE filesystems so filesystem freezing of FUSE would block if
> > userspace is already suspended. You may even have a setup like:
> >
> > bdev <- fs <- FUSE filesystem <- loopback file <- loop device <-
> > another fs
> >
> > So you really have to be careful to freeze this stack without
> > causing deadlocks. So you need to be freezing userspace after
> > filesystems are frozen but then you have to deal with the fact that
> > parts of your userspace will be blocked in the kernel (trying to do
> > some write) waiting for the filesystem to thaw. But it might be
> > tractable these days since I have a vague recollection that system
> > suspend is now able to gracefully handle even tasks in
> > uninterruptible sleep.
>
> I thought we largely solved this problem with userspace flusher
> threads being able to call prctl(PR_IO_FLUSHER) to tell the kernel
> they are part of the IO stack and so need to be considered
> special from the POV of memory allocation and write (dirty page)
> throttling.
>
> Maybe hibernate needs to be aware of these userspace flusher
> tasks and only suspend them after filesystems are frozen instead
> of when userspace is initially halted?

I can confirm it's not. Its check for kernel thread is in
kernel/power/process.c:try_to_freeze_tasks(). It really only uses the
PF_KTHREAD flag in differentiating between user and kernel threads.
But what I heard in the session was that we should freeze filesystems
before any tasks because that means tasks touching the frozen fs freeze
themselves.

Regards,

James
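
To make the PF_KTHREAD point concrete, here is a rough userspace toy
model of a two-pass freeze. It is not the kernel code: struct toy_task,
toy_freeze_pass() and both TOY_* flags are made up for illustration, and
the flag values are arbitrary. The only thing it is meant to show is
that if the user/kernel split is decided by a single kthread flag, a
task marked as an IO flusher gets frozen in the first (user) pass like
any other user task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flags: names echo the discussion, values are arbitrary. */
#define TOY_PF_KTHREAD	0x1u	/* stands in for the kernel's PF_KTHREAD */
#define TOY_IO_FLUSHER	0x2u	/* hypothetical marker for a flusher task */

struct toy_task {
	const char *comm;	/* task name, for readability only */
	unsigned int flags;
	bool frozen;
};

/*
 * One freeze pass over the task list.  With user_only set, kernel
 * threads are skipped; no other flag is consulted, so a userspace IO
 * flusher is frozen together with every other user task.
 * Returns the number of tasks newly frozen by this pass.
 */
static size_t toy_freeze_pass(struct toy_task *tasks, size_t n,
			      bool user_only)
{
	size_t newly_frozen = 0;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct toy_task *t = &tasks[i];

		if (t->frozen)
			continue;	/* already handled by an earlier pass */
		if (user_only && (t->flags & TOY_PF_KTHREAD))
			continue;	/* left for the second pass */
		t->frozen = true;
		newly_frozen++;
	}
	return newly_frozen;
}
```

Calling toy_freeze_pass(tasks, n, true) and then
toy_freeze_pass(tasks, n, false) mimics the user-then-kernel ordering.
Deferring flusher tasks, as suggested above, would in this model mean
adding a second skip condition on TOY_IO_FLUSHER in the first pass and
freezing filesystems between the two passes; whether that is workable
in the real code is exactly the open question.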