[PATCH v2 0/1] PCI: pcie_failed_link_retrain() return if dev is not ASM2824

On Fri, 4 Jul 2025, Ilpo Järvinen wrote:
> The other question still stands though, why is LBMS is not reset? Perhaps 
> DPC should clear LBMS in some places (that is, call pcie_reset_lbms()). 
> Have you consider that?

We first observed this when physically removing and reinserting devices on
a kernel that has the quirk but not the bandwidth controller driver. I think
the problem applies to any path where the link is expected to go down (DPC,
HPC, etc.) and LBMS is then carried forward into the next time the link comes
up. Shouldn't it matter how long ago LBMS was asserted before we invoke a TLS
modification? It also looks like card presence alone is enough for the kernel
to believe the link should train and to enter the quirk function, without
ever having seen LNKSTA_DLLLA or LNKSTA_LT. I wonder whether some kind of
actual link activity should be a prerequisite for entering the quirk.
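
To make the stale-LBMS concern concrete, here is a rough userspace model, not
kernel code. The bit values match include/uapi/linux/pci_regs.h, but
struct model_port, the saw_link_activity flag, and all function names are
hypothetical, and the entry check is only my assumption of the kind of test
the quirk performs (LBMS set while DLLLA is clear).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Link Status bits, values as in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKSTA_LT    0x0800 /* Link Training */
#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
#define PCI_EXP_LNKSTA_LBMS  0x4000 /* Link Bandwidth Management Status */

/* Hypothetical model of a downstream port; not a kernel structure. */
struct model_port {
	uint16_t lnksta;        /* simulated Link Status register */
	bool saw_link_activity; /* assumption: LT or DLLLA seen since last link-down */
};

/* Assumed entry check: LBMS set while the link is not active. */
bool quirk_would_force_gen1(const struct model_port *p)
{
	uint16_t mask = PCI_EXP_LNKSTA_LBMS | PCI_EXP_LNKSTA_DLLLA;

	return (p->lnksta & mask) == PCI_EXP_LNKSTA_LBMS;
}

/* Hypothetical stricter check: also require observed link activity. */
bool quirk_would_force_gen1_strict(const struct model_port *p)
{
	return p->saw_link_activity && quirk_would_force_gen1(p);
}

/* Model a link-down event (removal, DPC, ...), optionally clearing LBMS. */
void model_link_down(struct model_port *p, bool clear_lbms)
{
	p->lnksta &= (uint16_t)~(PCI_EXP_LNKSTA_DLLLA | PCI_EXP_LNKSTA_LT);
	if (clear_lbms)
		p->lnksta &= (uint16_t)~PCI_EXP_LNKSTA_LBMS;
	p->saw_link_activity = false;
}

/* Walk through the removal/reinsertion case described above. */
void stale_lbms_example(void)
{
	struct model_port p = {
		.lnksta = PCI_EXP_LNKSTA_DLLLA | PCI_EXP_LNKSTA_LBMS,
		.saw_link_activity = true,
	};

	/* Link goes down and nothing clears LBMS: the stale bit triggers the quirk. */
	model_link_down(&p, false);
	assert(quirk_would_force_gen1(&p));
	/* Requiring link activity first would keep the quirk from being entered. */
	assert(!quirk_would_force_gen1_strict(&p));

	/* Clearing LBMS on link-down also avoids it. */
	p.lnksta = PCI_EXP_LNKSTA_DLLLA | PCI_EXP_LNKSTA_LBMS;
	model_link_down(&p, true);
	assert(!quirk_would_force_gen1(&p));
}
```

Either resetting LBMS on link-down or gating on observed link activity keeps
the model from forcing 2.5GT/s after a reinsertion. Which of the two fits the
real code better is an open question.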

> (It sound to me you're having this occur in multiple scenarios and I've 
> some trouble on figuring those out from your long descriptions what those 
> exactly are so it's bit challenging for me to suggest where it should be 
> done but I the surprise down certainly seems like case where LBMS 
> information must have become stale so it should be reset which would 
> prevent quirk from setting 2.5GT/s)

Something interesting I found recently: when I power off a slot (triggering
DPC via SDES), LBMS becomes set on Intel Root Ports, but on another server
with a PCIe switch, LBMS does not become set on the switch DSP when I perform
the same action. I have no explanation for this difference other than
"vendor specific" behavior.

One thing that honestly doesn't make sense to me is the ID list in the quirk.
If the link comes up after being forced to Gen1, the quirk restores TLS only
if the device is the ASMedia switch, and it does so regardless of which device
is detected downstream. If we allow the ASMedia switch to restore the speed
for any downstream device when the initial issue was only seen with the
Pericom switch, then why do we exclude from the list Intel Root Ports, AMD
Root Ports, or any other bridge that had no issues reported?
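
The asymmetry in the ID list can be sketched with another small userspace
model (again, not kernel code). struct model_id and the function names are
hypothetical, the Intel and endpoint device IDs are placeholders, and the
single-entry list is my reading of the quirk as discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_ASMEDIA 0x1b21
#define PCI_VENDOR_ID_INTEL   0x8086

/* Hypothetical ID pair; the kernel uses struct pci_device_id instead. */
struct model_id {
	uint16_t vendor;
	uint16_t device;
};

/* Restore list modelled on the quirk: only the ASMedia ASM2824. */
static const struct model_id restore_ids[] = {
	{ PCI_VENDOR_ID_ASMEDIA, 0x2824 },
};

/* True if the port itself is on the restore list. */
bool port_on_restore_list(struct model_id port)
{
	for (size_t i = 0; i < sizeof(restore_ids) / sizeof(restore_ids[0]); i++) {
		if (restore_ids[i].vendor == port.vendor &&
		    restore_ids[i].device == port.device)
			return true;
	}
	return false;
}

/*
 * Model of the restore decision after the link came up at 2.5GT/s.
 * Note that the downstream device plays no part in it.
 */
bool speed_restored_after_gen1(struct model_id port, struct model_id downstream)
{
	(void)downstream; /* deliberately unused, mirroring the behavior in question */
	return port_on_restore_list(port);
}

/* Compare an ASM2824 port with a port that is not on the list. */
void id_list_example(void)
{
	struct model_id asm2824 = { PCI_VENDOR_ID_ASMEDIA, 0x2824 };
	struct model_id intel_rp = { PCI_VENDOR_ID_INTEL, 0x0001 }; /* placeholder device ID */
	struct model_id any_ep = { 0x1234, 0x5678 };                /* placeholder endpoint */

	assert(speed_restored_after_gen1(asm2824, any_ep));
	assert(!speed_restored_after_gen1(intel_rp, any_ep));
}
```

In this model a port that is not on the list stays at Gen1 once it has been
forced there, whatever sits below it, which is the behavior being questioned.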



