[PATCH v2 0/1] PCI: pcie_failed_link_retrain() return if dev is not ASM2824

Matthew W Carlis <mattc@xxxxxxxxxxxxxxx> · Wed, 9 Jul 2025 12:52:57 -0600

On Wed, 9 Jul 2025, Ilpo Järvinen wrote:
> Are you saying there's still a problem in hpc? Since the introduction of 
> bwctrl, remove_board() in pciehp has had pcie_reset_lbms() (or it's 
> equivalent).
I think my concern with hpc or the current mechanism in general is that the
condition is basically binary. Across a large fleet I expect to see momentary
issues. For example a device might start to link up, have an issue & then
try to link up again and from there be working correctly. However if that
were to trigger an LBMS it might result in the quirk forcing the link to Gen1.

For example if the quirk first guided the link to Gen1 & then if the device
linked up at Gen1 it tried to guide it to Gen2 & then if it linked up at Gen2
it continued towards the maximum speed falling back down when it found the
device not able to achieve a certain higher speed that would be more ideal.
Or perhaps starting at the second highest speed & working its way down.
Its quite a large fall in performance for a device to go from Gen4/5 to Gen1
whereas the ASMedia/Pericom combination was only capable of Gen2 as a pair.
If the SI is marginal for Gen4/5 I would tend to think the device has a fairly
high chance of being able to run at the next lower speed for example.

Actually I also wonder in the case of the ASMedia & Pericom combination
would we just see a LBMS interrupt every time the device loop between
speeds? Maybe the quirk should have been invoked by bwctrl.c when a certain
rate of LBMS assertions is detected instead? Is it better to give a device
a few chances or to catch it right away on the first issue? (some value
judgements here)

> As I already mentioned, for DPC I agree, it likely should reset LBMS 
> somewhere.
...
> If you'd try this on different generations of Intel RP, you'd likely see 
> variations there too, that's my experience when testing bwctrl.

Yes agree about DPC especially given that there is likely vendor/device specific
variations in assertions of the bit. There is another patch that came into the
DPC driver which suppresses surprise down error reporting which I would like to
challenge/remove. My feeling is that the DPC driver should clear LBMS in all cases
before clearing DPC status.

>> Should it not matter how long ago LBMS
>> was asserted before we invoke a TLS modification?
>
> To some extent, yes, which is why we call pcie_reset_lbms() in a few 
> places.

Maybe there should even be a config or sysfs file to disable the quirk because
it kind of takes control away from users in some ways. i.e - doesn't obviously
interact well with callers of setpci etc.

>> I wonder if it shouldn't have to see some kind of actual link activity 
>> as a prereq to entering the quirk.
>
> How would you observe that "link activity"? Doesn't LBMS itself imply 
> "link activity" occurred?

I was thinking literally not entering the quirk function unless the kernel
had witnessed LNKSTA_DLLLA or LNKSTA_LT in the last second.

Does this preclude us from declaring a device as "broken" as done by the quirk
without having seen DLLA within 1s after DLLSC Event?
* PCI Express Base Revision - 6.7.3.3 Data Link Layer State Changed Events
"Software must allow 1 second after the Data Link Layer Link Active bit reads 1b
before it is permitted to determine that a hot plugged device which fails to return
a Successful Completion for a Valid Configuration Request is a broken device."

> > One thing that honestly doesn't make any sense to me is the ID list in the
> > quirk. If the link comes up after forcing to Gen1 then it would only restore
> > TLS if the device is the ASMedia switch, but also ignoring what device is
> > detected downstream. If we allow ASMedia to restore the speed for any downstream
> > device when we only saw the initial issue with the Pericom switch then why
> > do we exclude Intel Root Ports or AMD Root Ports or any other bridge from the
> > list which did not have any issues reported.
> 
> I think it's because the restore has been tested on that device 
> (whitelist).
> 
> Your reasoning is based on assumption that TLS quirk setting Link Speed 
> to 2.5GT/s is part of "normal" operation. My view is that those 
> triggerings are caused by not clearing stale LBMS in the right places. If 
> LBMS is not wrongly kept, the quirk is no-op on all but that ID listed 
> device.

I'm making a slightly different assumption which is "something is working
until proven otherwise". We only know that the restore works on the ASMedia
when the downstream device is the Pericom switch. In fact we only know
it works for very specific layout & configuration of these two devices.
It seems wrong in my mind to be more restrictive on devices that don't have
a reported issue from, but then be less restrictive on the devices that had an
out of spec interaction in the first place. Until reported we don't know
how many devices might see LBMS get set during the course of linking up, but
then still arrive at the maximum speed.