On Thu, 3 Jul 2025, Ilpo Järvinen wrote:
> Is this mainly related to some artificial test that rapidly fires event
> after another (which is known to confuse the quirk)? ...I mean, you say
> "extremely likely".

I wouldn't describe the test as rapidly firing events, because we leave
conservative delays between injections (waiting for DLLA and for being able
to perform I/O to the NVMe block device before potentially injecting again).
In any case, the test results are clearly worse when moving from a kernel
without the quirk to a kernel with it, which is a regression in my mind.

> I suppose when the problem occurs and the bridge remains at 2.5GT/s, is it
> possible to restore the higher speed using the pcie_cooling device
> associated with the bridge / bwctrl? You can find the correct cooling
> device with this:

Yes, the problem is when a device is forced to 2.5GT/s and it should not
have been. I did not test with the patches for CONFIG_PCIE_THERMAL because
our drives would not need thermal management by the kernel, but if I use
"setpci" to restore TLS and then write the link retrain bit, the link
arrives at the maximum speed (Gen3/Gen4/Gen5, as applicable).

I have other vendor drives as well, but we design and build our own drives
with our own firmware and are therefore able to determine from the drive's
firmware logging when the link was most likely guided to 2.5GT/s by TLS. We
are also able to see the 2.5GT/s value in the TLS register when it happens.
I have less visibility into LTSSM transitions on drives from other vendors
without hooking up an analyzer.
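
For reference, the manual recovery mentioned above (restore TLS with setpci, then set the retrain bit) could look roughly like the sketch below. It only builds and prints the setpci commands and does not run them. The register offsets and masks mirror the PCI_EXP_LNKCTL* definitions in the kernel's pci_regs.h; the BDF and the target speed are made-up examples, and it assumes the writes go to the downstream port above the drive, not to the exact commands used in our testing.

```python
# Sketch only: builds (does not run) setpci commands that restore Target Link
# Speed and then retrain the link. The BDF used below is a hypothetical example.

PCI_EXP_LNKCTL = 0x10         # Link Control register offset in the PCIe capability
PCI_EXP_LNKCTL_RL = 0x0020    # Retrain Link (bit 5)
PCI_EXP_LNKCTL2 = 0x30        # Link Control 2 register offset
PCI_EXP_LNKCTL2_TLS = 0x000F  # Target Link Speed field (bits 3:0)

# TLS field encodings for each link speed.
TLS_ENCODING = {"2.5GT/s": 1, "5GT/s": 2, "8GT/s": 3, "16GT/s": 4, "32GT/s": 5}


def set_tls(lnkctl2, speed):
    """Return a 16-bit Link Control 2 value with only the TLS field replaced."""
    return (lnkctl2 & ~PCI_EXP_LNKCTL2_TLS & 0xFFFF) | TLS_ENCODING[speed]


def setpci_commands(bdf, speed):
    """Build setpci value:mask writes; setpci treats all numbers as hex."""
    tls = TLS_ENCODING[speed]
    return [
        # Step 1: write the desired speed into the TLS field only.
        f"setpci -s {bdf} CAP_EXP+{PCI_EXP_LNKCTL2:x}.w="
        f"{tls:04x}:{PCI_EXP_LNKCTL2_TLS:04x}",
        # Step 2: set the Retrain Link bit, leaving other bits untouched.
        f"setpci -s {bdf} CAP_EXP+{PCI_EXP_LNKCTL:x}.w="
        f"{PCI_EXP_LNKCTL_RL:04x}:{PCI_EXP_LNKCTL_RL:04x}",
    ]


if __name__ == "__main__":
    # Hypothetical bridge address and a Gen4 target speed.
    for cmd in setpci_commands("0000:3a:00.0", "16GT/s"):
        print(cmd)
```

The value:mask form makes setpci do a read-modify-write, so only the TLS field and the retrain bit change. Reading CAP_EXP+30.w before the first write is also how the stuck 2.5GT/s TLS value can be observed.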