On Mon, 2025-08-25 at 20:51 -0400, Benjamin Marzinski wrote:
> On Sun, Aug 24, 2025 at 05:26:50PM +0200, Martin Wilck wrote:
> > > +	/*
> > > +	 * Cannot free the reservation because the path that is holding it
> > > +	 * is not usable. Workaround this by:
> > > +	 * 1. Suspending the device
> > > +	 * 2. Preempting the reservation to move it to a usable path
> > > +	 *    (this removes the registered keys on all paths except the
> > > +	 *    preempting one. Since the device is suspended, no IO can
> > > +	 *    go to these unregistered paths and fail).
> > > +	 * 3. Releasing the reservation on the path that now holds it.
> > > +	 * 4. Resuming the device (since it no longer matters that most of
> > > +	 *    that paths no longer have a registered key)
> > > +	 * 5. Reregistering keys on all the paths
> > > +	 */
> > > +
> > > +	if (!dm_simplecmd_noflush(DM_DEVICE_SUSPEND, mpp->alias, 0)) {
> > > +		condlog(0, "%s: release: failed to suspend dm device.",
> >
> > Why do you use dm_simplecmd_noflush() here? Shouldn't queued IO be
> > flushed from the dm device to avoid it being sent to paths that are
> > going to be unregistered?
> >
>
> I'm pretty certain that DM will still flush all the IO from the target
> to DM core before suspending, even with dm_simplecmd_noflush() set. In
> request based multipath, queued IOs are never stored in the target. In
> bio based multipath, they are, but they will get flushed back up to DM
> core when suspending and queued there. No IO should happen through the
> target after the suspend, until the resume. dm_simplecmd_noflush() just
> keeps multipath from failing any IO that it had queueing, and it's only
> really necessary when we resize the device, because if we shrink the
> device, outstanding IO might be outside the new bounds.

OK, thanks for the clarification. I guess I've never fully understood
the way queueing works in dm.

What about queueing in the path devices? We'll be removing registration
keys, so IO sent by the SCSI layer may end up with RESERVATION CONFLICT
errors. To my understanding, without the DM_NOFLUSH_FLAG the kernel
will freeze the queue and flush everything, as if the device was closed
during shutdown. If DM_NOFLUSH_FLAG is set, this won't happen. What's
preventing the SCSI layer from sending IO while we're modifying the
registrations?

Martin