Hi,

This patch series improves NVMe multipath handling by refining the
removal behavior of the multipath head node and simplifying
configuration options. The idea/POC for this change was originally
proposed by Christoph[1] and Keith[2]. I built on their original
idea/POC to implement this series.

The first patch in the series addresses an issue where the multipath
head node of a PCIe NVMe disk is removed immediately when all disk
paths are lost. This can cause problems in scenarios such as:
- Hot removal and re-addition of a disk.
- Transient PCIe link failures that trigger re-enumeration, briefly
  removing and restoring the disk.

In such cases, premature removal of the head node may result in a
device node name change, requiring applications to reopen device
handles if they were performing I/O during the failure.

To mitigate this, we introduce a delayed removal mechanism. Instead of
removing the head node immediately, the system waits for a configurable
timeout, allowing the disk to recover. If the disk comes back online
within this window, the head node remains unchanged, ensuring
uninterrupted workloads.

A new sysfs attribute, delayed_removal_secs, allows users to configure
this timeout. By default, it is set to 0 seconds, preserving the
existing behavior unless explicitly changed.

The second patch in the series introduces the multipath_head_always
module param. When this option is set, it forces creation of the
multipath head disk node even for single-ported NVMe disks or private
namespaces, and thus allows delayed head node removal. This helps
handle transient PCIe link failures transparently even for a
single-ported NVMe disk or a private namespace.

The third patch in the series makes no functional changes; it only
renames a few functions, which improves code readability and better
aligns function names with their actual roles.

These changes should help improve NVMe multipath reliability and
simplify configuration.
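To make the intended semantics of delayed_removal_secs concrete, here is
a minimal user-space sketch of the behavior described above. It is only
an illustrative model, not the code in this series: struct head_model
and the head_*() helpers are hypothetical names, time is a plain seconds
counter, and treating "elapsed == timeout" as expired is an assumption
of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a multipath head node; not the kernel structure. */
struct head_model {
	unsigned int delayed_removal_secs; /* 0 = remove immediately (default) */
	bool paths_present;                /* at least one path is live */
	bool removed;                      /* head node has been torn down */
	unsigned long lost_at;             /* time (secs) the last path was lost */
};

static void head_init(struct head_model *h, unsigned int secs)
{
	h->delayed_removal_secs = secs;
	h->paths_present = true;
	h->removed = false;
	h->lost_at = 0;
}

/* Called when the last remaining path disappears. */
static void head_last_path_lost(struct head_model *h, unsigned long now)
{
	h->paths_present = false;
	h->lost_at = now;
	if (h->delayed_removal_secs == 0)
		h->removed = true; /* existing behavior: immediate removal */
}

/* Periodic check standing in for a delayed-work timer expiring. */
static void head_tick(struct head_model *h, unsigned long now)
{
	if (h->removed || h->paths_present)
		return;
	assert(now >= h->lost_at);
	if (now - h->lost_at >= h->delayed_removal_secs)
		h->removed = true; /* grace period over, give up on the disk */
}

/*
 * Called when a path (re)appears. Returns true if the existing head
 * node was reused, false if it is already gone and a new head node
 * (possibly with a different name) would have to be created.
 */
static bool head_path_added(struct head_model *h)
{
	if (h->removed)
		return false;
	h->paths_present = true;
	return true;
}
```

With a timeout of 0 the model removes the head as soon as the last path
is lost, matching the default. With a non-zero timeout, a path that
returns before the window closes finds the same head node still in
place.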
Feedback and testing are welcome!

[1] https://lore.kernel.org/linux-nvme/Y9oGTKCFlOscbPc2@xxxxxxxxxxxxx/
[2] https://lore.kernel.org/linux-nvme/Y+1aKcQgbskA2tra@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Changes from v1:
    - Renamed delayed_shutdown_sec to delayed_removal_secs, as
      "shutdown" has a special meaning when used with an NVMe device
      (Martin Petersen)
    - Instead of always adding the mpath head disk node by default,
      added a new module option, nvme_core.multipath_head_always, which
      when set creates the mpath head disk node (even for a private
      namespace or a namespace backed by a single-ported NVMe disk).
      This preserves the old default behavior. (hch)
    - Renamed the nvme_mpath_shutdown_disk function to
      nvme_mpath_remove_disk, as in the NVMe context the term
      "shutdown" has a specific technical meaning. (hch)
    - Reverted the changes that removed the multipath module param, as
      this param is still useful and used for many different things.

Link to v1: https://lore.kernel.org/all/20250321063901.747605-1-nilay@xxxxxxxxxxxxx/

Nilay Shroff (3):
  nvme-multipath: introduce delayed removal of the multipath head node
  nvme: introduce multipath_head_always module param
  nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk

 drivers/nvme/host/core.c      |  18 ++--
 drivers/nvme/host/multipath.c | 193 ++++++++++++++++++++++++++++++----
 drivers/nvme/host/nvme.h      |  20 +++-
 drivers/nvme/host/sysfs.c     |  13 +++
 4 files changed, 211 insertions(+), 33 deletions(-)

-- 
2.49.0