Re: md regression caused by commit 9e59d609763f70a992a8f3808dabcce60f14eb5c

On Fri, Aug 8, 2025 at 2:41 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2025/08/08 13:28, Xiao Ni wrote:
> > On Thu, Aug 7, 2025 at 10:18 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On Thu, 7 Aug 2025, Luca Boccassi wrote:
> >>
> >>> On Thu, 7 Aug 2025 at 01:04, Xiao Ni <xni@xxxxxxxxxx> wrote:
> >>>>
> >>>> Hi all
> >>>>
> >>>> You need to use the latest upstream mdadm
> >>>> https://github.com/md-raid-utilities/mdadm/ which has fixed this
> >>>> problem. Fedora hasn't updated to the latest upstream yet, so
> >>>> it has this problem. I'll update Fedora's mdadm to the latest upstream.
> >>>>
> >>>> Best Regards
> >>>> Xiao
> >>>
> >>> Thank you for looking into it and providing a solution - however,
> >>> isn't it against the rules to break existing released userspace
> >>> components and require new versions to be released in order to use a
> >>> new kernel version? Is there any way this kernel patch could be
> >>> amended to avoid breaking the existing userspace as it is?
> >>>
> >>> Thanks
> >>
> >> I also think that the misbehavior should be fixed in the kernel.
> >>
> >> We shouldn't use arbitrary timeouts to clean up the sysfs entries, because
> >> it would introduce race conditions.
> >>
> >> What about destroying the sysfs entries when the file descriptor is
> >> closed? (instead of on the STOP_ARRAY ioctl) That wouldn't interfere with
> >> other code trying to stop the array and it would make it work with the
> >> buggy mdadm that calls STOP_ARRAY and then tries to find the sysfs entries
> >> and then calls SET_ARRAY_INFO.
> >>
> >> Mikulas
> >>
> >
> > Hi all
> >
> > The assemble process is:
> > 1. create array
> > 2. stop it (STOP_ARRAY). Before the kernel change, del_gendisk is
> > called at the last release of mddev rather than in STOP_ARRAY ioctl
> > 3. access /sys/block/md0/md
> >
> > The kernel change calls del_gendisk in STOP_ARRAY, so that /dev/md0
> > is removed and no one can access it. Otherwise, the array could be
> > created again, because md supports create on open.
> >
> > After the kernel change, the assemble process is:
> > 1. create array
> > 2. stop it (del_gendisk runs and /sys/block/md0 is removed)
> > 3. access /sys/block/md0/xx (it fails)
> >
> > So del_gendisk destroys the sysfs entries. If we destroy the sysfs entries
> > at the last release of mddev, we return to the old state in which
> > /dev/md0 can be opened after stop. I don't want to go back to that,
> > because some customers hit bugs where shutdown gets stuck because
> > /dev/md0 can't be stopped, and the regression tests often fail
> > because of this too.
>
> Yes, from the kernel side, we think that after a successful STOP_ARRAY
> ioctl the kernel disk should be gone. We used to call del_gendisk
> asynchronously, which left a race window in which the sysfs entries were
> still visible to the user.
>
> We decided to fix this in the last merge window; however, it's true that
> mdadm has to be fixed together with it.
>
> >
> > I know it's not good to break mdadm with a kernel change. But sometimes
> > a userspace tool and the kernel need to work together to fix a problem,
> > right?
> > Sorry for causing the problem, and thanks for the suggestions. Any
> > other good suggestions?
> >
>
> Ideally, we should fix mdadm first and then, after a release, fix the kernel.
> Sadly, that transition stage is missing now. :(
>
> If we want to just avoid this problem in the kernel, what I can think of is
> adding a switch and marking it deprecated for now. New mdadm releases
> would enable that switch; after some time, we remove the legacy mdadm
> code for stopping an array, and finally remove the deprecated switch from
> the kernel. Then everyone will be happy :)
>
> Thanks,
> Kuai

Hi Kuai

Thanks for the suggestion. I'll go with this approach.

Regards
Xiao
>
> > Best Regards
> > Xiao
> >
> >
> > .
> >
>
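
For readers following the thread, the difference between the two kernel
behaviours described above can be replayed with a toy model. This is only a
sketch under stated assumptions, not kernel or mdadm code: MdArrayModel,
del_gendisk_in_stop and legacy_mdadm_sees_sysfs are hypothetical names made up
for illustration, and the replay assumes the old mdadm still holds its file
descriptor open when it looks for the sysfs entries after STOP_ARRAY.

```python
class MdArrayModel:
    """Toy model of one md array's lifetime (illustration only, not kernel code)."""

    def __init__(self, del_gendisk_in_stop):
        # True  -> behaviour after the kernel change: del_gendisk runs in STOP_ARRAY
        # False -> behaviour before it: del_gendisk runs at the last release of mddev
        self.del_gendisk_in_stop = del_gendisk_in_stop
        self.open_count = 0
        self.stopped = False
        self.sysfs_visible = True  # stands in for /sys/block/md0/md

    def open(self):
        self.open_count += 1

    def stop_array(self):
        # Models the STOP_ARRAY ioctl.
        self.stopped = True
        if self.del_gendisk_in_stop:
            self.sysfs_visible = False

    def release(self):
        # Models closing one file descriptor on the array device.
        self.open_count -= 1
        if self.open_count == 0 and self.stopped:
            self.sysfs_visible = False


def legacy_mdadm_sees_sysfs(del_gendisk_in_stop):
    """Replay the old mdadm sequence: open, STOP_ARRAY, then look for sysfs."""
    array = MdArrayModel(del_gendisk_in_stop)
    array.open()
    array.stop_array()
    visible = array.sysfs_visible  # assumption: the fd is still open at this point
    array.release()
    return visible
```

In this model the old mdadm sequence finds the sysfs entries only when
del_gendisk is deferred to the last release, which is the regression reported
in this thread. The model leaves out the SET_ARRAY_INFO step and says nothing
about how the proposed deprecated switch would be exposed.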





