On Thu, 2025-03-27 at 19:20 +0100, Jan Kara wrote: > On Thu 27-03-25 10:06:13, James Bottomley wrote: > > Introduce a freeze function, which iterates superblocks in reverse > > order freezing filesystems. The indicator a filesystem is > > freezable is either possessing a s_bdev or a freeze_super method. > > So this can be used in efivarfs, whether the freeze is for > > hibernate is also passed in via the new FREEZE_FOR_HIBERNATE flag. > > > > Thawing is done opposite to freezing (so superblock traversal in > > regular order) and the whole thing is plumbed into power > > management. > > The original ksys_sync() is preserved so the whole freezing step is > > optional (if it fails we're no worse off than we are today) so it > > doesn't inhibit suspend/hibernate if there's a failure. > > > > Signed-off-by: James Bottomley > > <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> > > OK, I've seen you are setting the new FREEZE_FOR_HIBERNATE flag but I > didn't find anything using that flag. What do you plan to use it for? > Does you efivars usecase need it? I find passing down this detail > about the caller down to all filesystems a bit awkward. Isn't it > possible to extract the information "hibernate is ongoing" from PM > subsystem? That's right. I'm happy to post my patch below, but it depends on Al accepting the simple_next_child() proposal, so it doesn't apply to any tree. > > +/* > > + * Kernel freezing and thawing is only done in the power > > management > > + * subsystem and is thus single threaded (so we don't have to > > worry > > + * here about multiple calls to filesystems_freeze/thaw(). > > + */ > > + > > +static int freeze_flags; > > Frankly, the global variable to propagate flags is pretty ugly... If > we really have to propagate some context into the iterator callback, > rather do it explicitly like iterate_supers() does it. Christian said the same thing. I can do it, but if you look in the power management subsystem, it relies on single threading and has a lot of global variables like this, so I thought of this as a simplification. > > +static void filesystems_freeze_callback(struct super_block *sb) > > +{ > > + /* errors don't fail suspend so ignore them */ > > + if (sb->s_op->freeze_super) > > + sb->s_op->freeze_super(sb, FREEZE_MAY_NEST > > + | FREEZE_HOLDER_KERNEL > > + | freeze_flags); > > + else if (sb->s_bdev) > > + freeze_super(sb, FREEZE_MAY_NEST | > > FREEZE_HOLDER_KERNEL > > + | freeze_flags); > > Style nit - braces around above blocks would be IMHO appropriate. > > > + else { > > + pr_info("Ignoring filesystem %s\n", sb->s_type- > > >name); > > + return; > > + } > > + > > + pr_info("frozen %s, now syncing block ...", sb->s_type- > > >name); > > + sync_blockdev(sb->s_bdev); > > + pr_info("done."); > > +} > > Generally this callback is not safe because it can race with > filesystem unmount and calling ->freeze_super() after the > filesystem's ->put_super() was called may have all sorts of > interesting effects (freeze_super() itself will just bail with a > warning, which is better but not great either). > > The cleanest way I see how to make the iteration safe is to grab > active sb reference (like grab_super() does it) for the duration of > freeze_super() calls. Another possibility would be to grab sb- > >s_umount rwsem exclusively as Luis does it in his series but that > requires a bit of locking surgery and ->freeze_super() handlers make > this particularly nasty these days so I think active sb reference is > going to be nicer these days. Before getting into the how of this, could you just confirm my understanding of what the various locks do: https://lore.kernel.org/cd5c3d8aab9c5fb37fa018cb3302ecf7d2bdb140.camel@xxxxxxxxxxxxxxxxxxxxx Is correct? Regards, James