On 8/25/2025 2:54 PM, Alex Williamson wrote:
On Mon, 25 Aug 2025 10:12:19 -0700
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
If a device is in an error state, then any reads of device registers can
return an error value. Add additional checks to validate if a device is in an
error state before doing an FLR or PM reset.
I think the thing we see in practice for a device that's wedged and
returning -1 from config space is that the FLR will time out waiting for
a pending transaction. So this should fix that, but should we log
something?
I guess it makes sense to add a warn log.
I'm assuming AF FLR is not needed here because we don't cache the
offset and therefore won't find the capability when we search the chain
for it.
Yes, based on my understanding of how we search for the capability
offset, we would return 0 if the config space read returns -1
(https://elixir.bootlin.com/linux/v6.16.3/source/drivers/pci/pci.c#L441).
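To illustrate the point about the capability search, here is a minimal
userspace sketch of a capability-list walk, loosely modeled on the loop in
drivers/pci/pci.c. It is not the kernel code: struct fake_dev, cfg_read8(),
cfg_read16() and find_cap() are hypothetical names made up for this example,
and a NULL config space stands in for a wedged device whose reads all return
all ones.

```c
#include <stdint.h>

#define CFG_STATUS          0x06  /* mirrors PCI_STATUS */
#define CFG_STATUS_CAP_LIST 0x10  /* mirrors PCI_STATUS_CAP_LIST */
#define CFG_CAP_LIST_PTR    0x34  /* mirrors PCI_CAPABILITY_LIST */
#define CAP_ID_AF           0x13  /* mirrors PCI_CAP_ID_AF */
#define CAP_WALK_TTL        48    /* upper bound on list entries visited */

/* Hypothetical device model: cfg == NULL means the device is in error. */
struct fake_dev {
	const uint8_t *cfg;	/* 256-byte config space, or NULL */
};

/* A device in error answers every read with all ones. */
static uint8_t cfg_read8(const struct fake_dev *d, uint8_t off)
{
	return d->cfg ? d->cfg[off] : 0xff;
}

static uint16_t cfg_read16(const struct fake_dev *d, uint8_t off)
{
	if (!d->cfg)
		return 0xffff;
	return (uint16_t)(d->cfg[off] | (d->cfg[off + 1] << 8));
}

/* Walk the capability list; return the offset of 'cap', or 0 if absent. */
static uint8_t find_cap(const struct fake_dev *d, uint8_t cap)
{
	int ttl = CAP_WALK_TTL;
	uint8_t pos;

	if (!(cfg_read16(d, CFG_STATUS) & CFG_STATUS_CAP_LIST))
		return 0;

	pos = cfg_read8(d, CFG_CAP_LIST_PTR);
	while (ttl--) {
		uint16_t ent;
		uint8_t id;

		if (pos < 0x40)		/* pointer outside capability area */
			break;
		pos &= 0xfc;		/* capabilities are dword aligned */
		ent = cfg_read16(d, pos);
		id = ent & 0xff;
		if (id == 0xff)		/* all-ones ID: stop the walk */
			break;
		if (id == cap)
			return pos;
		pos = (uint8_t)(ent >> 8);	/* next pointer */
	}
	return 0;
}
```

With an all-ones device the status check passes, the first pointer reads as
0xff, and the first entry's ID reads as 0xff, so the walk stops and reports
the capability as not found.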
Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
---
drivers/pci/pci.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0dd95d782022..a07bdb287cf3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4560,12 +4560,17 @@ EXPORT_SYMBOL_GPL(pcie_flr);
*/
int pcie_reset_flr(struct pci_dev *dev, bool probe)
{
+ u32 reg;
+
if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
return -ENOTTY;
if (!(dev->devcap & PCI_EXP_DEVCAP_FLR))
return -ENOTTY;
+ if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &reg))
+ return -ENOTTY;
+
if (probe)
return 0;
@@ -4640,6 +4645,8 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
return -ENOTTY;
pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
+ if (PCI_POSSIBLE_ERROR(csr))
+ return -ENOTTY;
Doesn't this turn out to be redundant to the test below?
Yup, I guess I was being extra cautious. Will remove the check.
Thanks
Farhan
if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
return -ENOTTY;
Thanks,
Alex
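
The redundancy discussed above for pci_pm_reset() can be checked with a small
userspace sketch. It assumes the error check is a plain all-ones comparison
on the 16-bit value; possible_error16() and the two probe helpers are
hypothetical names for illustration, not kernel functions.

```c
#include <errno.h>
#include <stdint.h>

#define PM_CTRL_NO_SOFT_RESET 0x0008	/* mirrors PCI_PM_CTRL_NO_SOFT_RESET */

/* Assumed stand-in for the all-ones error test on a 16-bit read. */
static int possible_error16(uint16_t v)
{
	return v == (uint16_t)~0u;
}

/* Probe path with the explicit error check from the patch. */
static int pm_reset_probe_with_check(uint16_t csr)
{
	if (possible_error16(csr))
		return -ENOTTY;
	if (csr & PM_CTRL_NO_SOFT_RESET)
		return -ENOTTY;
	return 0;
}

/* Probe path with only the existing NO_SOFT_RESET test. */
static int pm_reset_probe_without_check(uint16_t csr)
{
	if (csr & PM_CTRL_NO_SOFT_RESET)
		return -ENOTTY;
	return 0;
}
```

An all-ones csr has the NO_SOFT_RESET bit set, so both helpers return the
same result for every possible 16-bit value.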