On Wed, 2025-04-02 at 09:46 +0200, Christian Brauner wrote:
> On Tue, Apr 01, 2025 at 01:02:07PM -0400, James Bottomley wrote:
> > On Tue, 2025-04-01 at 02:32 +0200, Christian Brauner wrote:
> > > The whole shebang can also be found at:
> > > https://web.git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=work.freeze
> > >
> > > I know nothing about power or hibernation. I've tested it as best
> > > as I could. Works for me (TM).
> >
> > I'm testing the latest you have in work.freeze and it doesn't
> > currently work for me. Patch 7b315c39b67d ("power: freeze
> > filesystems during suspend/resume") doesn't set
> > filesystems_freeze_ptr so it ends up being NULL and tripping over
> > this check
>
> I haven't pushed the new version there. Sorry about that. I only have
> it locally.
>
> >
> > +static inline bool may_unfreeze(struct super_block *sb, enum
> > freeze_holder who,
> > +                                const void *freeze_owner)
> > +{
> > +       WARN_ON_ONCE((who & ~FREEZE_FLAGS));
> > +       WARN_ON_ONCE(hweight32(who & FREEZE_HOLDERS) > 1);
> > +
> > +       if (who & FREEZE_EXCL) {
> > +               if (WARN_ON_ONCE(sb->s_writers.freeze_owner ==
> > NULL))
> > +                       return false;
> >
> >
> > in f15a9ae05a71 ("fs: add owner of freeze/thaw") and failing to
> > resume from hibernate. Setting it to __builtin_return_address(0)
> > in filesystems_freeze() makes everything work as expected, so
> > that's what I'm testing now.
>
> +1
>
> I'll send the final version out in a bit.

I've now done some extensive testing on loop nested filesystems with
fio load on the upper. I've tried xfs on ext4 and ext4 on ext4.
Hibernate/Resume has currently worked on these without a hitch (and the
fio load burps a bit but then starts running at full speed within a few
seconds). What I'm doing is a single round of hibernate/resume followed
by a reboot. I'm relying on the fschecks to detect any filesystem
corruption.
I've also tried doing a couple of fresh starts of the hibernated image
to check that we did correctly freeze the filesystems.

The problems I've noticed are:

   1. I'm using 9p to push host directories through and that
      completely hangs after a resume. This is expected because the
      virtio server is out of sync, but it does indicate a need to
      address Jeff's question of what we should be doing for network
      filesystems (and is also the reason I have to reboot after
      resuming).
   2. Top doesn't show any CPU activity after resume even though fio is
      definitely running. This seems to be a suspend issue and
      unrelated to filesystems, but I'll continue investigating.

Regards,

James