Re: md regression caused by commit 9e59d609763f70a992a8f3808dabcce60f14eb5c

On 2025/08/08 13:28, Xiao Ni wrote:
On Thu, Aug 7, 2025 at 10:18 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:



On Thu, 7 Aug 2025, Luca Boccassi wrote:

On Thu, 7 Aug 2025 at 01:04, Xiao Ni <xni@xxxxxxxxxx> wrote:

Hi all

You need to use the latest upstream mdadm,
https://github.com/md-raid-utilities/mdadm/, which has fixed this
problem. Fedora has not yet updated to the latest upstream, so it
still has this problem. I'll update Fedora's mdadm to the latest upstream.

Best Regards
Xiao

Thank you for looking into it and providing a solution - however,
isn't it against the rules to break existing released userspace
components and require new versions to be released in order to use a
new kernel version? Is there any way this kernel patch could be
amended to avoid breaking the existing userspace as it is?

Thanks

I also think that the misbehavior should be fixed in the kernel.

We shouldn't use arbitrary timeouts to clean up the sysfs entries, because
it would introduce race conditions.

What about destroying the sysfs entries when the file descriptor is
closed? (instead of on the STOP_ARRAY ioctl) That wouldn't interfere with
other code trying to stop the array and it would make it work with the
buggy mdadm that calls STOP_ARRAY and then tries to find the sysfs entries
and then calls SET_ARRAY_INFO.

Mikulas


Hi all

The assemble process is:
1. create array
2. stop it (STOP_ARRAY). Before the kernel change, del_gendisk was
called at the last release of mddev rather than in the STOP_ARRAY ioctl
3. access /sys/block/md0/md

The kernel change calls del_gendisk in STOP_ARRAY, so that /dev/md0
is removed and no one can access it. Otherwise, the array can be
created again, because md supports create on open.

After the kernel change, the assemble process is:
1. create array
2. stop it (del_gendisk runs and /sys/block/md0 is removed)
3. access /sys/block/md0/xx (it fails)
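
The two sequences above can be sketched with a small toy model. This is
only an illustration of the ordering, not kernel or mdadm code: the class
ToyArray, its methods, and the del_gendisk_in_stop flag are hypothetical
names made up for this sketch.

```python
# Toy model of the assemble sequence; every name here is hypothetical
# and does not correspond to a real kernel or mdadm interface.
class ToyArray:
    def __init__(self, del_gendisk_in_stop):
        # True models the behaviour after the kernel change,
        # False models the behaviour before it.
        self.del_gendisk_in_stop = del_gendisk_in_stop
        self.open_count = 0
        self.sysfs_present = False

    def create(self):
        # Step 1: creating the array opens the device and exposes sysfs.
        self.open_count += 1
        self.sysfs_present = True

    def stop_array(self):
        # Step 2: after the change, sysfs is removed right here.
        if self.del_gendisk_in_stop:
            self.sysfs_present = False

    def release(self):
        # Before the change, sysfs went away only at the last release.
        self.open_count -= 1
        if self.open_count == 0:
            self.sysfs_present = False


def assemble(array):
    """Run create, stop, then report whether step 3 (sysfs access) works."""
    array.create()
    array.stop_array()
    sysfs_ok = array.sysfs_present  # Step 3: access /sys/block/md0/...
    array.release()
    return sysfs_ok


old_ok = assemble(ToyArray(del_gendisk_in_stop=False))
new_ok = assemble(ToyArray(del_gendisk_in_stop=True))
print("old kernel, sysfs visible after stop:", old_ok)
print("new kernel, sysfs visible after stop:", new_ok)
```

In this model the old ordering leaves the sysfs entries visible until the
last release, which is what the unpatched mdadm relies on.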

So del_gendisk destroys the sysfs entries. If we destroy the sysfs
entries at the last release of mddev, we return to the old state, in
which /dev/md0 can be opened after stop. I don't want to go back to
that, because some customers have hit bugs where shutdown gets stuck
because /dev/md0 can't be stopped, and the regression tests often fail
for the same reason.

Yes, from the kernel side, we think that after a successful STOP_ARRAY
ioctl the gendisk should be removed. We used to call del_gendisk
asynchronously, which left a race window in which the sysfs entries were
still visible to userspace.

We decided to fix this in the last merge window; however, it is true
that mdadm has to be fixed along with it.


I know it's not good to break mdadm with a kernel change. But sometimes
the userspace tool and the kernel need to work together to fix a
problem, right?
Sorry for causing the problem, and thanks for the suggestions. Are
there any other good suggestions?


Ideally, we should have fixed mdadm first and then, after a release,
fixed the kernel. Sadly, that transition stage is missing now. :(

If we just want to avoid this problem in the kernel, what I can think of
is adding a switch and marking it deprecated for now. New mdadm releases
would enable that switch; after some time, the legacy mdadm code for
stopping the array would be removed, and finally the deprecated switch
would be removed from the kernel. Then everyone will be happy :)
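
One possible reading of the switch idea, as a toy model only. The switch
name sync_del_gendisk, the ToyKernel class, and both mdadm functions are
hypothetical; no such kernel parameter or mdadm code is implied to exist.
The assumption here is that the switch defaults to the legacy behaviour
and that only new mdadm turns it on.

```python
# Toy model of a deprecated compatibility switch; all names are hypothetical.
from dataclasses import dataclass


@dataclass
class ToyKernel:
    # Hypothetical switch: False keeps the legacy (deferred) removal,
    # True removes the sysfs entries directly in stop_array().
    sync_del_gendisk: bool = False
    sysfs_present: bool = False

    def create(self):
        self.sysfs_present = True

    def stop_array(self):
        if self.sync_del_gendisk:
            self.sysfs_present = False

    def last_release(self):
        # Legacy path: entries disappear only at the last release.
        self.sysfs_present = False


def legacy_mdadm(kernel):
    """Old tool: never touches the switch, reads sysfs after stopping."""
    kernel.create()
    kernel.stop_array()
    sysfs_ok = kernel.sysfs_present
    kernel.last_release()
    return sysfs_ok


def new_mdadm(kernel):
    """New tool: enables the switch and expects sysfs to be gone after stop."""
    kernel.sync_del_gendisk = True
    kernel.create()
    kernel.stop_array()
    device_gone = not kernel.sysfs_present
    kernel.last_release()
    return device_gone


legacy_ok = legacy_mdadm(ToyKernel())
new_ok = new_mdadm(ToyKernel())
print("legacy mdadm still works with switch off:", legacy_ok)
print("new mdadm gets synchronous removal:", new_ok)
```

Under these assumptions, existing mdadm releases keep working on a new
kernel, while new releases opt in to the stricter behaviour until the
switch can be dropped.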

Thanks,
Kuai

Best Regards
Xiao

