Re: Should seed device be allowed to be mounted multiple times?

Christian Brauner <brauner@xxxxxxxxxx> · Fri, 8 Aug 2025 16:14:55 +0200

On Wed, Aug 06, 2025 at 07:50:06AM +0930, Qu Wenruo wrote:
> 
> 
> On 2025/8/5 22:13, Christian Brauner wrote:
> > On Tue, Aug 05, 2025 at 10:22:49AM +0930, Qu Wenruo wrote:
> > > 
> > > 
> > > On 2025/8/5 10:06, Anand Jain wrote:
> > > > 
> > > > 
> > > > > > Thanks for the comments.
> > > > > > Our seed block device use-case doesn’t fall under the kind of risk that
> > > > > > BLK_OPEN_RESTRICT_WRITES is meant to guard against—it’s not a typical
> > > > > > multi-FS RW setup. Seed devices are readonly, so it might be reasonable
> > > > > > to handle this at the block layer—or maybe it’s not feasible.
> > > > 
> > > > 
> > > > > Read-only doesn't prevent the device from being removed suddenly.
> > > > 
> > > > I don't see how this is related to the BLK_OPEN_RESTRICT_WRITES flag.
> > > > Can you clarify?
> > > 
> > > It's not related to that flag, I'm talking about the fs_bdev_mark_dead(),
> > > and the remaining 3 callbacks.
> > > 
> > > Those callbacks all depend on the bdev holder to grab a super block.
> > > 
> > > Thus a block device should not, and cannot, have multiple super blocks.
> > 
> > I'm pretty sure you can't just break the seed device sharing use-case
> > without causing a lot of regressions...
> 
> The impact is not that wide: we can still share the same seed device among
> all the different sprout fses, it's just that only one of them can be
> mounted at a time.
> 
> And even with that limitation, it won't affect most (or any) real world use
> cases.
> 
> Even in the most complex case, where a seed device is used as rootfs and we
> want to sprout the rootfs again, one can just remove the seed device from
> the current rootfs and then mount the seed device again.
> 
> > 
> > If you know what the seed devices are then you can change the code to
> > simply use the btrfs filesystem type as the holder without any holder
> > operations but just for seed devices. Then seed devices can be opened
> > by/shared with any btrfs filesystem.
> 
> But we will lose all the bdev related events.
> 
> We still want to sync/freeze/thaw the real sprouted fs in the end.
> 
> > 
> > The only restriction is that you cannot use a device as a seed device
> > that another btrfs filesystem uses as a non-seed device because then it
> > will be fully owned by the other btrfs filesystem. But Josef tells me
> > you can only use it as a seed device anyway.
> > 
> > IOW, if you have a concept of shareable devices between different btrfs
> > filesystems then it's fine to reflect that in the code. If really needed
> > you can later add custom block holder ops for seed devices so you can
> > e.g., iterate through all filesystems that share the device.
> 
> Sure, it's possible, but with a lot of extra code to look up which
> filesystems the seed device belongs to, plus all the extra bdev event
> proxying.
> 
> 
> But I'd say the seed device behavior was not well specified in the very
> beginning, which resulted in a lot of "creative" but not practical use cases.
> 
> Yes, this will result in some regressions, but I'd prefer sounder and
> simpler logic for the whole seed device, with minimal impact on the most
> common existing use cases.

Ok, I'm not in a position to argue this effectively. If you think you can
reasonably get away with this regression, so be it. But if this ends up
in a total revert of the conversion even though we'd have an alternative
solution I'm not going to be happy...