On 4/28/25 09:39, Nilay Shroff wrote:
On 4/28/25 12:27 PM, Hannes Reinecke wrote:
On 4/25/25 12:33, Nilay Shroff wrote:
Currently, a multipath head disk node is not created for single-ported
NVMe adapters or private namespaces. However, creating a head node in
these cases can help transparently handle transient PCIe link failures.
Without a head node, features like delayed removal cannot be leveraged,
making it difficult to tolerate such link failures. To address this,
this commit introduces the nvme_core module parameter
multipath_head_always. When this parameter is set to true, it forces the
creation of a multipath head node regardless of the NVMe disk or
namespace type. This allows delayed removal of the head node to be used
even for single-ported NVMe disks and private namespaces, and thus helps
transparently handle transient PCIe link failures.
By default multipath_head_always is set to false, thus preserving the
existing behavior. Setting it to true enables improved fault tolerance
in PCIe setups. Note that enabling this option also implicitly enables
nvme_core.multipath.
Signed-off-by: Nilay Shroff <nilay@xxxxxxxxxxxxx>
---
drivers/nvme/host/multipath.c | 70 +++++++++++++++++++++++++++++++----
1 file changed, 63 insertions(+), 7 deletions(-)
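As a rough illustration of the policy described in the commit message
(this is not the code from the patch), the decision could be modelled in
plain C as below. The struct and function names are hypothetical, and the
two booleans in struct ns_info are assumed stand-ins for whatever the
driver actually checks to tell a multi-ported subsystem and a shared
namespace apart.

```c
#include <stdbool.h>

/* Hypothetical model of the two nvme_core parameters discussed above. */
struct mpath_params {
	bool multipath;             /* nvme_core.multipath */
	bool multipath_head_always; /* parameter introduced by this commit */
};

/* Hypothetical summary of a namespace; not a kernel structure. */
struct ns_info {
	bool multi_ported; /* adapter/subsystem exposes more than one port */
	bool shared_ns;    /* namespace is shared rather than private */
};

/* Enabling multipath_head_always implicitly enables multipath. */
static bool multipath_effective(const struct mpath_params *p)
{
	return p->multipath || p->multipath_head_always;
}

/* Should a multipath head disk node be created for this namespace? */
static bool want_head_node(const struct mpath_params *p,
			   const struct ns_info *ns)
{
	/* New behaviour: forced regardless of disk or namespace type. */
	if (p->multipath_head_always)
		return true;

	if (!p->multipath)
		return false;

	/* Existing behaviour: no head node for single-ported adapters
	 * or private namespaces. */
	return ns->multi_ported && ns->shared_ns;
}
```

With the default (multipath_head_always = false), a single-ported disk
with a private namespace gets no head node in this model; with the
parameter set to true it does, which is the case where delayed removal
becomes usable.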
I really would model this according to dm-multipath where we have the
'fail_if_no_path' flag.
This can be set for PCIe devices to retain the current behaviour
(which we need for things like 'md' on top of NVMe) whenever
this flag is set.
Okay, so you mean that when the sysfs attribute "delayed_removal_secs"
under the head disk node is _NOT_ configured (or is set to zero), the
internal flag "fail_if_no_path" is set to true, and when
"delayed_removal_secs" is set to a non-zero value, "fail_if_no_path" is
set to false. Is that correct?
Don't make it overly complicated.
'fail_if_no_path' (and the inverse 'queue_if_no_path') can both be
mapped onto delayed_removal_secs; if the value is '0' then the head
disk is immediately removed (the 'fail_if_no_path' case), and if it's
-1 it is never removed (the 'queue_if_no_path' case).
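A minimal sketch of that mapping in plain C, purely to illustrate the
proposal: the enum and function names are hypothetical, the timed case
for positive values is assumed from the meaning of delayed_removal_secs,
and treating every negative value like -1 is a simplification of mine.

```c
#include <stdbool.h>

/* Hypothetical policy names; not taken from the patch or the driver. */
enum no_path_policy {
	NO_PATH_FAIL,          /* 'fail_if_no_path': remove head disk at once */
	NO_PATH_QUEUE_TIMED,   /* keep the head disk for a bounded time */
	NO_PATH_QUEUE_FOREVER, /* 'queue_if_no_path': never remove it */
};

/* Map a delayed_removal_secs value onto one of the policies above. */
static enum no_path_policy policy_from_delayed_removal(int secs)
{
	if (secs == 0)
		return NO_PATH_FAIL;
	if (secs < 0) /* the proposal only names -1 */
		return NO_PATH_QUEUE_FOREVER;
	return NO_PATH_QUEUE_TIMED;
}

/* Would the head disk be gone 'elapsed' seconds after the last path
 * was lost? */
static bool head_removed_after(int secs, unsigned int elapsed)
{
	switch (policy_from_delayed_removal(secs)) {
	case NO_PATH_FAIL:
		return true;
	case NO_PATH_QUEUE_FOREVER:
		return false;
	default:
		return elapsed >= (unsigned int)secs;
	}
}
```

For example, head_removed_after(30, 10) is false and
head_removed_after(30, 30) is true in this model, while a value of -1
never reports removal.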
Question, though: How does it interact with the existing
'ctrl_loss_tmo'? Both describe essentially the same situation...
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich