Re: [Bug 219984] New: [BISECTED] High power usage since 'PCI/ASPM: Correct LTR_L1.2_THRESHOLD computation'

On Tue, Apr 08, 2025 at 09:02:46PM +0100, Sergey Dolgov wrote:
> Dear Bjorn,
> 
> here are both dmesg from the kernels with your info patch.

Thanks again!  Here's the difference:

  - pre  7afeb84d14ea
  + post 7afeb84d14ea

   pci 0000:02:00.0: parent CMRT 0x28 child CMRT 0x00
   pci 0000:02:00.0: parent T_POWER_ON 0x2c usec (val 0x16 scale 0)
   pci 0000:02:00.0: child  T_POWER_ON 0x0a usec (val 0x5 scale 0)
   pci 0000:02:00.0: t_common_mode 0x28 t_power_on 0x2c l1_2_threshold 0x5a
  -pci 0000:02:00.0: encoded LTR_L1.2_THRESHOLD value 0x02 scale 3
  +pci 0000:02:00.0: encoded LTR_L1.2_THRESHOLD value 0x58 scale 2

We computed LTR_L1.2_THRESHOLD == 0x5a == 90 usec == 90000 nsec.
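
The 0x5a is consistent with summing the values in the log: 2 + 4 +
t_common_mode + t_power_on, all in usec.  A minimal Python sketch of
that arithmetic follows.  It assumes the T_POWER_ON scale encoding is
0 = 2 usec, 1 = 10 usec, 2 = 100 usec, and that the larger of the
parent and child value is used for each term; the function names are
illustrative, not kernel identifiers.

```python
# Illustrative sketch only; names are hypothetical, not kernel identifiers.
# Assumption: T_POWER_ON scale 0/1/2 means units of 2/10/100 usec.
T_POWER_ON_SCALE_USEC = {0: 2, 1: 10, 2: 100}


def t_power_on_usec(value, scale):
    """Decode a T_POWER_ON (value, scale) pair into microseconds."""
    return value * T_POWER_ON_SCALE_USEC[scale]


def l1_2_threshold_usec(parent_cmrt, child_cmrt, parent_pwr_on, child_pwr_on):
    """Assumed formula: 2 + 4 + max(CMRT) + max(T_POWER_ON), in usec."""
    t_common_mode = max(parent_cmrt, child_cmrt)
    t_power_on = max(t_power_on_usec(*parent_pwr_on),
                     t_power_on_usec(*child_pwr_on))
    return 2 + 4 + t_common_mode + t_power_on


# Values taken from the 0000:02:00.0 log lines above.
threshold = l1_2_threshold_usec(0x28, 0x00, (0x16, 0), (0x5, 0))
print(hex(threshold), threshold)
```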

Prior to 7afeb84d14ea, we computed *scale = 3, *value = (90000 >> 15)
== 0x2.  But per PCIe r6.0, sec 6.18, this is a latency value of only
0x2 * 32768 == 65536 ns, which is less than the 90000 ns we requested.

After 7afeb84d14ea, we computed *scale = 2, *value =
roundup(threshold_ns, 1024) / 1024 == 0x58, which is a latency value
of 90112 ns, which is almost exactly what we requested.

In essence, before 7afeb84d14ea we tell the Root Port that it can
enter L1.2 and get back to L0 in 65536 ns or less, and after
7afeb84d14ea, we tell it that it may take up to 90112 ns.
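
To make the difference concrete, here is a rough Python sketch of the
two encodings.  It is not the kernel code: the scale-selection rules
are assumptions reconstructed to match the numbers in this mail
(round down before, round up after), and VALUE_MAX assumes the value
field is 10 bits wide.

```python
# Illustrative sketch of the two encodings; not the kernel code itself.
SCALE_NS = [1, 32, 1024, 32768, 1048576, 33554432]  # ns per unit, scale 0..5
VALUE_MAX = 0x3FF  # assumption: the threshold value field is 10 bits wide


def encode_old(threshold_ns):
    """Assumed pre-7afeb84d14ea rule: largest scale unit below the
    threshold's magnitude, value rounded DOWN."""
    for scale, unit in enumerate(SCALE_NS):
        last = scale == len(SCALE_NS) - 1
        if last or threshold_ns < SCALE_NS[scale + 1]:
            return scale, threshold_ns // unit


def encode_new(threshold_ns):
    """Assumed post-7afeb84d14ea rule: smallest scale whose 10-bit value
    can hold the threshold, value rounded UP."""
    for scale, unit in enumerate(SCALE_NS):
        if threshold_ns <= unit * VALUE_MAX:
            return scale, -(-threshold_ns // unit)  # ceiling division
    return len(SCALE_NS) - 1, VALUE_MAX  # saturate at the maximum


def decode_ns(scale, value):
    """Latency in ns represented by an encoded (scale, value) pair."""
    return value * SCALE_NS[scale]


for usec in (90, 126, 3206):  # thresholds mentioned in this thread
    ns = usec * 1000
    print(usec, decode_ns(*encode_old(ns)), decode_ns(*encode_new(ns)))
```

The old form can undershoot the requested threshold by almost one
scale unit; the new form never encodes less than what was requested.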

It's possible that the calculation of LTR_L1.2_THRESHOLD itself in
aspm_calc_l12_info() is too conservative, and we don't actually need
90 usec, but I think the encoding done by 7afeb84d14ea itself is more
correct.  I don't have any information about how to improve the 90
usec estimate.  (If you happen to have Windows on that box, it would
be really interesting to see how it sets LTR_L1.2_THRESHOLD.)

If the device has sent LTR messages indicating a latency requirement
between 65536 ns and 90112 ns, the pre-7afeb84d14ea kernel would allow
L1.2 while the post-7afeb84d14ea kernel would not.  I don't think we
can actually see the LTR messages sent by the device, but my guess is
they must be in that range.  I don't know if that's enough to account
for the major difference in power consumption you're seeing.
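
As a toy illustration of that window, the Python sketch below
assumes the simplified rule that L1.2 is permitted only when the
device's reported LTR latency is at least the programmed threshold.
The LTR values are made up for illustration, since we can't observe
what the device actually sends, and l1_2_allowed() is a hypothetical
helper, not a kernel or hardware interface.

```python
# Hypothetical helper; the >= comparison is a simplifying assumption.
def l1_2_allowed(device_ltr_ns, programmed_threshold_ns):
    """True if the device tolerates at least the programmed threshold."""
    return device_ltr_ns >= programmed_threshold_ns


PRE_NS = 0x02 * 32768   # value 0x02, scale 3 -> 65536 ns
POST_NS = 0x58 * 1024   # value 0x58, scale 2 -> 90112 ns

# Made-up LTR latencies: below, inside, and above the 65536..90112 window.
for ltr_ns in (50_000, 70_000, 100_000):
    print(ltr_ns, l1_2_allowed(ltr_ns, PRE_NS), l1_2_allowed(ltr_ns, POST_NS))
```

Only the middle case changes behaviour between the two kernels.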

The AX200 at 6f:00.0 is in exactly the same situation as the
Thunderbolt bridge at 02:00.0 (LTR_L1.2_THRESHOLD 90 usec, RP set to
65536 ns before 7afeb84d14ea and 90112 ns after).

For the NVMe devices at 6d:00.0 and 6e:00.0, LTR_L1.2_THRESHOLD is
3206 usec (!), and we set the RP to 3145728 ns (slightly too small)
before, 3211264 ns after.

For the RTS525A at 70:00.0, LTR_L1.2_THRESHOLD is 126 usec, and we set
the RP to 98304 ns before, 126976 ns after.

Sorry, no real answers here yet, still puzzled.

Bjorn



