I have reached out several times about issues caused by the pcie_failed_link_retrain() quirk. Initially we made some additional changes to try to reduce the occurrences, but I have continued to observe hot-plug slots ending up at Gen1 speed when they should not, or the quirk being invoked when the link is not actually training at all. Realistically speaking, this quirk is a large regression to hot-plug functionality in the kernel, and I am therefore submitting this patch to restrict the quirk to the ASMedia device where the LTSSM problem was actually observed in the first place.

The comment above the quirk states that the bad behavior was observed when the ASMedia ASM2824 switch was upstream of some other device, and further asserts that the issue could in theory happen if the ASM2824 were the downstream device. I believe this is why it was concluded that the quirk should be invoked on any device: if the ASM2824 is the downstream device, its device ID cannot be used to trigger the quirk action. This reasoning is flawed in the sense that the downstream case was never actually observed and may not even exist as a real configuration in any device. It may well be that the ASM2824 cannot have this issue as the downstream device; as far as I can tell the issue was never root-caused, and there is no analyzer trace.

The author had a noble goal, but it seems quite difficult to capture the correct trigger and sequence of actions for every device, considering we are trying to address an issue beyond compliance with the spec, as far as I know. In my testing I have encountered alarming rates of the quirk being invoked when it should not be, frequently degrading a link that would otherwise have trained to full speed/width. The impact to hot-plug reliability is observed to be extreme, and we therefore believe broad application of the quirk cannot be justified.
In the case of hot-insert, the rate of links being incorrectly forced to Gen1 has been observed to be as high as 15% with some U.2 NVMe drives, across several different system configurations and several different U.2 NVMe drives from different vendors. All of the systems on which we have reproduced this issue comply with PCI Express® Base Specification Revision 6.0, Appendix I, "Async Hot-Plug Reference Model for OS controlled DPC", or are very near to it. None of the systems I have tested implement the Power Controller capability. The largest occurrence of this issue has been observed on systems with out-of-band presence detect (OOB PD), but it has also been observed on systems without OOB PD (using in-band PD or DLL State Changed). Actions likely to trigger the condition where the quirk forces the link to Gen1 include physical hot-insert, slot power cycle, slot power on, and toggling of fundamental reset.

By observation I believe there are several timing hazards with DPC, especially when using EDR. DPC's expectation that the link should recover before returning from DPC handling work is additionally questionable when the port is hot-plug capable. Further, it appears the quirk can be invoked twice by the DPC driver. With HotPlug- (DPC without hot-plug), invocation of the quirk appears even more likely due to different handling around Surprise Down Errors (SDES). In my mind this makes a very complicated set of interactions even more complicated.

In the case of hot-insert, the power-up sequencing of drives and their boot times directly contribute to the quirk being invoked and the link being forced to Gen1. For example, the presence interrupt comes quickly (presence pins are first-to-mate in U.2), while the power pins are last-to-mate (ground pins second-to-mate). Presence can therefore be seen before power-up sequencing in the drive is complete.
If the drive powers up, boots, and the link becomes active just after the quirk has written Target Link Speed (TLS) to Gen1, then the drive is forced to Gen1. If the sequence takes even longer, the log shows "broken device" and "retraining failed", but DLLSC later initiates the pciehp device-add sequence again, which creates extreme confusion for most readers.

In the case of power cycling the slot (without Power Controller capability), behavior differs between OOB-PD systems and systems using DLL State Change interrupts. With OOB-PD the kernel will declare "Link Down", set the ctrl state to OFF_STATE, and remove the device, but then immediately declare "Card present" and run down the pciehp_enable_slot() path, where it runs into the quirk: slot power is off, so it does not see the link train before timing out. Disabling OOB-PD and using the recently deprecated in-band PD avoids the trap more often, since presence is synthesized by the LTSSM and only asserted when the link is active; however, link degradation was still observed in PCIe resilience torture testing. Unfortunately I don't have a meaningful characterization of the in-band PD reproductions.

With and without HotPlug capability, the quirk becomes harder to hit after pulling in the pci/pcie/bwctrl.c changes, but it is still observable in several circumstances, mainly around the handling of DPC with and without EDR. The bottom line, from my perspective, is that even with bwctrl.c we still observe a significant regression in hot-plug reliability in terms of arriving at the correct speed. In my experience, the link issue observed by the author of the quirk is most likely an incompatibility between specific devices, as opposed to something that could result from degraded link integrity or device aging, and the quirk should therefore be restricted to the particular device where it was observed.
Matthew W Carlis (1):
  PCI: pcie_failed_link_retrain() return if dev is not ASM2824

 drivers/pci/quirks.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

-- 
2.46.0