Re: [PATCH 00/15] Improve mpathpersist's unavailable path handling

On 8/25/25 21:56, Benjamin Marzinski wrote:
On Mon, Aug 25, 2025 at 08:38:38AM +0200, Hannes Reinecke wrote:
On 7/10/25 20:10, Benjamin Marzinski wrote:
[ .. ]
This patchset deals with both of these problems. libmpathpersist always
had code to handle releasing a reservation held by an unavailable path,
but the existing method is broken. It relies on poorly supported
optional features of SCSI Persistent Reservations: the READ FULL STATUS
command and specifying Initiator Ports with the REGISTER command
(SIP_C). Also, fixing its current issues would additionally require
supporting the All Target Ports option (ATP_C). This existing workaround
has been redesigned to use the PREEMPT command instead. Key changes
where the path holding the reservation is unavailable were not
previously handled by libmpathpersist. This patchset also handles them
using the PREEMPT command.

I wish we had a testcase for all of that. Persistent reservation
handling is tricky at the best of times, but throwing in multipathing
it really gets into the arcane knowledge area.
Ben, do you have something which we could turn into some blktest
scenarios?

It wouldn't be hard to use the LIO target to set up these scenarios, and
verify that mpathpersist is handling them. The bigger issue is that I'm
still occasionally running into new ones. I've got a couple more
patches to send to deal with them, but what this actually wants (and
what I plan to write after I think I've handled all the issues) is a
test that will write to the devices while randomly failing and restoring
paths and doing various PR commands, both to check that commands succeed
and fail when expected given the state of the devices when they were
run, and that we don't end up with active paths that either don't have
reservations when they should, or do have them when they shouldn't.
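
The shape of such a test can be sketched as a small model-based loop. This
is only a toy illustration, not code from the patchset or from blktests:
the ModelDevice class, the path names, and the run() helper are all
hypothetical, and the "commands" only update an in-memory model. A real
test would issue mpathpersist commands against LIO-backed paths and compare
the observed results with what the model predicts.

```python
import random


class ModelDevice:
    """Toy model of a multipath device (hypothetical, for illustration only)."""

    def __init__(self, paths):
        self.up = {p: True for p in paths}           # path availability
        self.registered = {p: False for p in paths}  # key registered via this path?
        self.want_registered = False                 # state the last successful command asked for

    def fail_path(self, path):
        self.up[path] = False

    def restore_path(self, path):
        # A restored path has to be brought in line with the intended state
        # before it counts as active again.
        self.up[path] = True
        self.registered[path] = self.want_registered

    def _apply(self, want):
        active = [p for p, is_up in self.up.items() if is_up]
        if not active:
            return False  # no usable path: the command is expected to fail
        for p in active:
            self.registered[p] = want
        self.want_registered = want
        return True

    def register(self):
        return self._apply(True)

    def unregister(self):
        return self._apply(False)

    def check(self):
        # Invariant: every active path is registered exactly when it should be.
        for p, is_up in self.up.items():
            if is_up:
                assert self.registered[p] == self.want_registered, p


def run(steps, seed, paths=("sda", "sdb", "sdc")):
    """Randomly fail/restore paths and issue commands, checking after each step."""
    rng = random.Random(seed)
    dev = ModelDevice(paths)
    for _ in range(steps):
        op = rng.choice(("fail", "restore", "register", "unregister"))
        if op == "fail":
            dev.fail_path(rng.choice(paths))
        elif op == "restore":
            dev.restore_path(rng.choice(paths))
        else:
            # In a real test, `ok` would be the exit status of the actual command.
            expected_ok = any(dev.up.values())
            ok = dev.register() if op == "register" else dev.unregister()
            assert ok == expected_ok
        dev.check()
    return dev
```

Running run(1000, seed=1) exercises the model with a reproducible sequence;
keeping the seed in the failure report would make a failing sequence easy
to replay.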

I can look at adding something like that to blktest.

That would be awesome. I am looking into updating/modifying PR handling
in qemu by using the in-kernel generic PR support, but to validate that,
testcases would be really helpful.

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@xxxxxxx                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich



