On 2025/8/5 10:06, Anand Jain wrote:
Thanks for the comments.
Our seed block device use-case doesn’t fall under the kind of risk that
BLK_OPEN_RESTRICT_WRITES is meant to guard against—it’s not a typical
multi-FS RW setup. Seed devices are readonly, so it might be reasonable
to handle this at the block layer—or maybe it’s not feasible.
Read-only doesn't prevent the device from being removed suddenly.
I don't see how this is related to the BLK_OPEN_RESTRICT_WRITES flag.
Can you clarify?
It's not related to that flag, I'm talking about the
fs_bdev_mark_dead(), and the remaining 3 callbacks.
Those callbacks all depend on the bdev holder to grab a super block.
Thus a block device should not, and cannot, have multiple super blocks.
------
/* open is exclusive wrt all other BLK_OPEN_WRITE opens to the device */
#define BLK_OPEN_RESTRICT_WRITES ((__force blk_mode_t)(1 << 5))
------
You still didn't know that the whole fs_holder_ops is based on the
assumption that one block device should only belong to one mounted fs.
You're missing the point: after a sprout, Btrfs internally becomes a new
filesystem with a new FSID. Some may call it insane—but it's different,
useful, and it works.
I know that perfectly well. It's you who doesn't understand how the bdev
holder works, and who isn't willing to spend any time reading the details
of bdev_open().
Just search for @holder inside that function: even without the
RESTRICT_WRITES flag, it will still fail at bd_may_claim() due to the
holder (super block) mismatch.
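To make the holder rule concrete, here is a minimal userspace sketch of the
check described above. It is not kernel code: struct model_bdev,
model_may_claim(), model_open() and model_release() are hypothetical names
invented for illustration, and the real bd_may_claim() handles more cases
(for example partitions versus the whole device). The only point it shows is
that a second claim with a different holder fails regardless of any open-mode
flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified userspace model, NOT kernel code. All names are hypothetical. */
struct model_bdev {
	const void *holder;	/* current claimant (e.g. a super block), NULL if none */
	int holders;		/* number of opens made by that claimant */
};

/*
 * A claim is allowed only when the device is unclaimed or is already
 * claimed by the very same holder. Any other holder gets -EBUSY.
 */
static int model_may_claim(const struct model_bdev *bdev, const void *holder)
{
	if (bdev->holder == NULL || bdev->holder == holder)
		return 0;
	return -EBUSY;
}

/* Open the device on behalf of @holder, recording the claim on success. */
static int model_open(struct model_bdev *bdev, const void *holder)
{
	int ret = model_may_claim(bdev, holder);

	if (ret)
		return ret;
	bdev->holder = holder;
	bdev->holders++;
	return 0;
}

/* Drop one open; the device becomes claimable again after the last one. */
static void model_release(struct model_bdev *bdev)
{
	assert(bdev->holders > 0);
	if (--bdev->holders == 0)
		bdev->holder = NULL;
}
```

In this model, opening the seed device with the sprouted fs's super block as
holder and then opening it again with a different super block returns -EBUSY
until the first holder has released every open, which mirrors the
"unmount the sprout fs first" workflow mentioned later in this thread.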
During that transition, fs_holder (or equivalent) needs to be updated to
reflect the change. If that's not currently possible, we may need to add
support for it.
The problem is that fs_holder_ops still sees it as a seed device, which
is risky—we don’t know what else could break if the FSID change isn’t
properly handled.
Nope, it's super simple: you just cannot have a block device with
two different holders.
And I see that assumption as completely valid.
I don't see any reason why any sane person would want to mount the
sprouted fs and the seed device at the same time.
Neither of us has data on how it’s being used.
Just read all the other filesystems' code.
Either they pass the super block as the bdev holder, so that the fs can
easily be grabbed from the bdev through bdev_super_lock(), or, as in
bcachefs, they do a similar thing without using the existing helpers.
And as I’ve hinted, it
does violate kABI from a technical standpoint.
If the use case is to sprout a fs based on the seed device multiple
times, it's still possible, just unmount the sprout fs before mounting
the seed device again.
In a datacenter environment, unmounting isn’t always a viable option.
If you can mount the fs, why can you suddenly not unmount it?
If you're talking about rootfs, it's no deal breaker, just remove the
seed device from the sprout fs, then mount the seed device again.
Now that there’s a regression and a feature has been broken, let’s not
shift the discussion to whether that feature was useful. I prefer to
keep things technical—not personal—and I expect respectful communication
to be mutual, not taken for granted.
I have explained the technical details enough. If you are not willing to
understand, sure, call it whatever you want.
Btrfs has some unique behaviors, and it’s possible we’ll need changes in
the block layer or fs_holder_ops. That still needs to be figured out.
Unique doesn't mean correct or sane.
And a seed device is nothing special. If you don't want to accept that
one mounted block device should only belong to one mounted fs, sure, go
ahead and see what everyone else thinks.
Thanks, Anand