On Thu, Aug 21, 2025 at 08:54:52AM +0800, Ethan Zhao wrote: > On 8/21/2025 1:26 AM, Brian Norris wrote: > > From: Brian Norris <briannorris@xxxxxxxxxx> > > > > max_link_speed, max_link_width, current_link_speed, current_link_width, > > secondary_bus_number, and subordinate_bus_number all access config > > registers, but they don't check the runtime PM state. If the device is > > in D3cold, we may see -EINVAL or even bogus values. > My understanding, if your device is in D3cold, returning of -EINVAL is > the right behavior. That's not the guaranteed result though. Some hosts don't properly return PCIBIOS_DEVICE_NOT_FOUND, for one. But also, it's racy -- because we don't even try to hold a pm_runtime reference, the device could possibly enter D3cold while we're in the middle of reading from it. If you're lucky, that'll get you a completion timeout and an all-1's result, and we'll return a garbage result. So if we want to purposely not resume the device and retain "I can't give you what you asked for" behavior, we'd at least need a pm_runtime_get_noresume() or similar. > > Wrap these access in pci_config_pm_runtime_{get,put}() like most of the > > rest of the similar sysfs attributes. > > > > Fixes: 56c1af4606f0 ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc") > > Cc: stable@xxxxxxxxxxxxxxx > > Signed-off-by: Brian Norris <briannorris@xxxxxxxxxx> > > Signed-off-by: Brian Norris <briannorris@xxxxxxxxxxxx> > > --- > > > > drivers/pci/pci-sysfs.c | 32 +++++++++++++++++++++++++++++--- > > 1 file changed, 29 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > index 5eea14c1f7f5..160df897dc5e 100644 > > --- a/drivers/pci/pci-sysfs.c > > +++ b/drivers/pci/pci-sysfs.c > > @@ -191,9 +191,16 @@ static ssize_t max_link_speed_show(struct device *dev, > > struct device_attribute *attr, char *buf) > > { > > struct pci_dev *pdev = to_pci_dev(dev); > > + ssize_t ret; > > + > > + pci_config_pm_runtime_get(pdev); > This function would potentially change the power state of device, > that would be a complex process, beyond the meaning of > max_link_speed_show(), given the semantics of these functions ( > max_link_speed_show()/max_link_width_show()/current_link_speed_show()/ > ....), > this cannot be done ! What makes this different than the 'config' attribute (i.e., "read config register")? Why shouldn't that just return -EINVAL? I don't really buy your reasoning -- "it's a complex process" is not a reason not to do something. The user asked for the link speed; why not give it? If the user wanted to know if the device was powered, they could check the 'power_state' attribute instead. (Side note: these attributes don't show up anywhere in Documentation/, so it's also a bit hard to declare "best" semantics for them.) To flip this question around a bit: if I have a system that aggressively suspends devices when there's no recent activity, how am I supposed to check what the link speed is? Probabilistically hammer the file while hoping some other activity wakes the device, so I can find the small windows of time where it's RPM_ACTIVE? Disable runtime_pm for the device while I check? Brian