Re: [RFC PATCH 4/4] vfs: add filesystem freeze/thaw callbacks for power management

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Fri, 28 Mar 2025 10:14:22 -0400

On Fri, 2025-03-28 at 11:08 +0100, Christian Brauner wrote:
> On Thu, Mar 27, 2025 at 10:06:13AM -0400, James Bottomley wrote:
[...]
> > +
> > +static void filesystems_freeze_callback(struct super_block *sb)
> > +{
> > +	/* errors don't fail suspend so ignore them */
> > +	if (sb->s_op->freeze_super)
> > +		sb->s_op->freeze_super(sb, FREEZE_MAY_NEST
> > +				       | FREEZE_HOLDER_KERNEL
> > +				       | freeze_flags);
> > +	else if (sb->s_bdev)
> > +		freeze_super(sb, FREEZE_MAY_NEST |
> > FREEZE_HOLDER_KERNEL
> > +			     | freeze_flags);
> > +	else {
> > +		pr_info("Ignoring filesystem %s\n", sb->s_type-
> > >name);
> > +		return;
> > +	}
> > +
> > +	pr_info("frozen %s, now syncing block ...", sb->s_type-
> > >name);
> > +	sync_blockdev(sb->s_bdev);
> > +	pr_info("done.");
> > +}
> > +
> > +/**
> > + * filesystems_freeze - freeze callback for power management
> > + *
> > + * Freeze all active filesystems (in reverse superblock order)
> > + */
> > +void filesystems_freeze(bool for_hibernate)
> > +{
> > +	freeze_flags = for_hibernate ? FREEZE_FOR_HIBERNATE : 0;
> > +	__iterate_supers_rev(filesystems_freeze_callback);
> > +}
> > +
> > +static void filesystems_thaw_callback(struct super_block *sb)
> > +{
> > +	if (sb->s_op->thaw_super)
> > +		sb->s_op->thaw_super(sb, FREEZE_MAY_NEST
> > +				     | FREEZE_HOLDER_KERNEL
> > +				     | freeze_flags);
> > +	else if (sb->s_bdev)
> > +		thaw_super(sb,	FREEZE_MAY_NEST |
> > FREEZE_HOLDER_KERNEL
> > +			   | freeze_flags);
> > +}
> > +
> > +/**
> > + * filesystems_thaw - thaw callback for power management
> > + *
> > + * Thaw all active filesystems (in forward superblock order)
> > + */
> > +void filesystems_thaw(bool for_hibernate)
> > +{
> > +	freeze_flags = for_hibernate ? FREEZE_FOR_HIBERNATE : 0;
> > +	__iterate_supers(filesystems_thaw_callback);
> 
> This doesn't work and I've explained in my reply to Luis how this
> doesn't work and what the alternative are:
> 
> A concurrent umount() can wipe the filesystem behind your back. So
> you either need an active superblock reference or you need to
> communicate that the superblock is locked through the new flag I
> proposed (naming irrelevant for now).

Since this is a hybrid thread between power management and VFS, could I
just summarize what I think the various superblock locks are before
discussing the actual problem (important because the previous threads
always gave the impression of petering out for fear of vfs locking).

s_count: outermost of the superblock locks refcounting the superblock
structure itself, making no guarantee that any of the underlying
filesystem superblock structures are attached (i.e. kill_sb() may have
been called).  Taken by incrementing under the global sb_lock and
decremented using a put_super() variant.

s_active: an atomic reference counting the underlying filesystem
specific superblock structures.  if you hold s_active, kill_sb cannot
be called.  Acquired by atomic_inc_not_zero() with a possible failure
if it is zero and released by deactivate_super() and its variants.

s_umount: rwsem and innermost of the superblock locks. Used to protect
various operations from races.  Taken exclusively with down_write and
shared with down_read. Private functions internal to super.c wrap this
with grab_super and super_lock_shared/excl() wrappers.

The explicit freeze/thaw_super() functions require the s_umount rwsem
in down_write or exclusive mode and take it as the first step in their
operation.  Looking at the locking in fs_bdev_freeze/thaw() implies
that the super_operations freeze_super/thaw_super *don't* need this
taken (presumably they handle it internally).

Regards,

James