On 7/1/25 08:48, Damien Le Moal wrote:
On 7/1/25 3:23 PM, Hannes Reinecke wrote:
On 6/30/25 08:26, Damien Le Moal wrote:
In ata_eh_revalidate_and_attach(), the link LPM policy is always
set to ATA_LPM_MAX_POWER before calling ata_dev_revalidate() to ensure
that the call to ata_phys_link_offline() does not return true, which
would cause an unnecessary device reset. This change was introduced
with commit 71d7b6e51ad3 ("ata: libata-eh: avoid needless hard reset
when revalidating link").
However, the effect of setting the link LPM policy to ATA_LPM_MAX_POWER
may become visible only after some time, depending on the power state
the link was in. E.g. transitioning out of the Partial state should take
no longer than a few microseconds, but transitioning out of the Slumber
or DevSleep state may take several milliseconds. So despite the changes
introduced with commit 71d7b6e51ad3 ("ata: libata-eh: avoid needless
hard reset when revalidating link"), we can still end up with
ata_phys_link_offline() seeing a link SCR_STATUS register signaling that
the device is present (DET is equal to 1h) but that the link PHY is
still in a low power mode (e.g. IPM is 2h, signaling "Interface in
Partial power management state"). In such cases, ata_phys_link_offline()
returns true, causing an EIO return for ata_eh_revalidate_and_attach()
and a device reset.
Avoid such unnecessary device resets by introducing a relaxed version
of the link offline test implemented by ata_phys_link_offline() with
the new helper function ata_eh_link_established(). This function
returns true if the link SCR_STATUS register shows that:
- A device is still present, that is, the DET field is 1h (Device
presence detected but Phy communication not established) or 3h
(Device presence detected and Phy communication established).
- Communication is established, that is, the IPM field is not 0h,
indicating that the PHY is online or in a low power state.
Signed-off-by: Damien Le Moal <dlemoal@xxxxxxxxxx>
---
drivers/ata/libata-eh.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index f98d5123e1e4..7f5d13f9ca73 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2071,6 +2071,33 @@ static void ata_eh_get_success_sense(struct ata_link *link)
ata_eh_done(link, dev, ATA_EH_GET_SUCCESS_SENSE);
}
+/*
+ * Check if a link is established. This is a relaxed version of
+ * ata_phys_link_online() which accounts for the fact that this is potentially
+ * called after changing the link power management policy, which may not be
+ * reflected immediately in the SStatus register (e.g., we may still be seeing
+ * the PHY in the Partial, Slumber or DevSleep power management state).
+ * So check that:
+ * - A device is still present, that is, DET is 1h (Device presence detected
+ * but Phy communication not established) or 3h (Device presence detected and
+ * Phy communication established)
+ * - Communication is established, that is, IPM is not 0h, indicating that PHY
+ * is online or in a low power state.
+ */
+static bool ata_eh_link_established(struct ata_link *link)
+{
+	u32 sstatus;
+	u8 det, ipm;
+
+	if (sata_scr_read(link, SCR_STATUS, &sstatus))
+		return false;
+
+	det = sstatus & 0x0f;
+	ipm = (sstatus >> 8) & 0x0f;
+
+	return (det & 0x01) && ipm;
+}
+
+
/**
* ata_eh_link_set_lpm - configure SATA interface power management
* @link: link to configure
@@ -3275,7 +3302,7 @@ static int ata_eh_revalidate_and_attach(struct ata_link *link,
goto err;
}
- if (ata_phys_link_offline(ata_dev_phys_link(dev))) {
+ if (!ata_eh_link_established(ata_dev_phys_link(dev))) {
rc = -EIO;
goto err;
}
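For readers following along, here is a minimal user-space sketch of the difference between the strict test and the relaxed test added by the patch. It assumes the SStatus layout used in the patch (DET in bits 3:0, IPM in bits 11:8) and assumes that the strict test treats a link as online only when DET is 3h. The helper names below are hypothetical and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* SStatus field extraction: DET is bits 3:0, IPM is bits 11:8. */
static uint8_t sstatus_det(uint32_t sstatus)
{
	return sstatus & 0x0f;
}

static uint8_t sstatus_ipm(uint32_t sstatus)
{
	return (sstatus >> 8) & 0x0f;
}

/*
 * Strict test (assumption): the link is considered online only if
 * DET is 3h, i.e. device detected and Phy communication established.
 */
static bool link_online_strict(uint32_t sstatus)
{
	return sstatus_det(sstatus) == 0x3;
}

/*
 * Relaxed test, using the same expression as ata_eh_link_established():
 * DET has bit 0 set (1h or 3h) and IPM is not 0h.
 */
static bool link_established_relaxed(uint32_t sstatus)
{
	return (sstatus_det(sstatus) & 0x01) && sstatus_ipm(sstatus);
}
```

With these helpers, an SStatus value of 0x221 (DET 1h, IPM 2h, i.e. device present with the interface still in Partial) fails the strict test but passes the relaxed one, which is the case described in the commit message. A value of 0x123 (DET 3h, IPM 1h) passes both, and 0x000 fails both.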
Makes me wonder: if the phy is taking some time, don't we need to wait
at some point for the transition to complete?
There is a 10ms wait already in sata_link_scr_lpm() but it seems to not always
be enough. The specs say that transitions out of HIPM "shall not take more than
10ms", but hey, we all know how devices always follow the specs, right? :)
From a cursory glance we just continue, and (apparently) hope that
everything will be well eventually.
Hmm?
It is fine to continue because transitions out of DIPM/HIPM/DevSleep are
automatic if you send a command. So we actually do not need to wait at all and
can probably remove that 10ms sleep in sata_link_scr_lpm(). But I have not done
so for now.
Ah. Maybe add that to the description.
... or maybe not, as we seem to be the only ones caring about this
kinda stuff :-)
Reviewed-by: Hannes Reinecke <hare@xxxxxxx>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich