Re: [PATCH] btrfs: scrub: wip, pause on fs freeze

Christian Brauner <brauner@xxxxxxxxxx> · Mon, 14 Jul 2025 12:27:37 +0200

On Sat, Jul 12, 2025 at 08:39:51AM +0930, Qu Wenruo wrote:
> 
> 
> 在 2025/7/12 08:21, Qu Wenruo 写道:
> > 
> > 
> > 在 2025/7/12 04:45, David Sterba 写道:
> > > On Wed, Jul 09, 2025 at 07:33:56AM +0930, Qu Wenruo wrote:
> > > > 
> > > > 
> > > > 在 2025/7/8 22:55, David Sterba 写道:
> > > > > Implement sb->freeze_super that can instruct our threads to pause
> > > > > themselves. In case of (read-write) scrub this means to undo
> > > > > mnt_want_write, implemented as sb_start_write()/sb_end_write().
> > > > > The freeze_super callback is necessary otherwise the call
> > > > > sb_want_write() inside the generic implementation hangs.
> > > > 
> > > > I don't this this is really going to work.
> > > > 
> > > > The main problem is out of btrfs, it's all about the s_writer.rw_sem.
> > > > 
> > > > If we have a running scrub, it holds the mnt_want_write_file(), which is
> > > > a read lock on the rw_sem.
> > > > 
> > > > Then we start freezing, which will call sb_wait_write(), which will do a
> > > > write lock on the rw_sem, waiting for the scrub to finish.
> > > > 
> > > > However the ->freeze() callback is only called when freeze_super() got
> > > > the write lock on the rw_sem.
> > > 
> > > Note there are 2 callbacks for freeze, sb::freeze_super and
> > > sb::freeze_fs.
> > > 
> > > ioctl_fsfreeze
> > >    if fs->freeze_super
> > >      call fs_freeze_super()
> > 
> > Oh I forgot you implemented a new callback, ->freeze_super() to do the
> > extra btrfs specific handling.
> > 
> > But this means, we're exposing ourselves to all the extra VFS handling,
> > I'm not sure if this is the preferred way, nor if the extra btrfs
> > handling is good enough to cover what's done inside freeze_super()
> > function (I guess not though).
> 
> My bad, it's calling back into freeze_super() function, so it's fine.

You have to expose yourself to all the VFS handling and you must call
freeze_super() at some point. You can't just do freeze on your own. You
have roughly the following classes of freeze calls:

(1) initiated from userspace
    (1.1) initiated from the block layer => upcall into the filesystem
    (1.2) initiated from the filesystem
(2) initiated from the kernel
    (2.1) initiated from the filesystem
    (2.2) suspend/hibernation

All of that requires synchronization and you cannot do this without
going through the VFS layer's locking safely. At least not without
causing royal pain for yourself and the block and VFS layer.

(Fwiw, I would still really like to see support for fs_holder_ops so
btrfs freezing works correctly when called from the block layer.)

> 
> The remaining concern is just should we do this by using
> ->freeze_super() callback, or do the f2fs' way by poking into
> s_writer.frozen.
> 
> I'd prefer the later one, as that's already exposed to filesystems and we do
> not need extra callback functions especially when it's not widely used.

This would be simpler. You should probably add a helper that you can
share with gfs2 to sample the state instead of raw accesses to
s_writers->frozen. I find the latter a bit ugly and error prone if we
ever change this. I'd rather have it encapsulated in the VFS layer as
best as we can.