On 8/25/2025 2:35 PM, Alex Williamson wrote:
On Mon, 25 Aug 2025 10:12:18 -0700
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
The current reset process saves the device's config space state before
reset and restores it afterward. However, when a device is in an error
state before reset, config space reads may return error values instead of
valid data. This results in saving corrupted values that get written back
to the device during state restoration. Add validation to prevent writing
error values to the device when restoring the config space state after
reset.
Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
---
drivers/pci/pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b0f4d98036cd..0dd95d782022 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1825,6 +1825,9 @@ static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
if (!force && val == saved_val)
return;
+ if (PCI_POSSIBLE_ERROR(saved_val))
+ return;
+
for (;;) {
pci_dbg(pdev, "restore config %#04x: %#010x -> %#010x\n",
offset, val, saved_val);
The commit log makes this sound like more than it is. We're really
only error checking the first 64 bytes of config space before restore,
the capabilities are not checked. I suppose skipping the BARs and
whatnot is no worse than writing -1 to them, but this is only a
complete solution in the narrow case where we're relying on vfio-pci to
come in and restore the pre-open device state.
I had imagined that pci_save_state() might detect the error state of
the device, avoid setting state_saved, but we'd still perform the
restore callouts that only rely on internal kernel state, maybe adding a
fallback to restore the BARs from resource information.
I initially started with pci_save_state(), and avoid saving the state
altogether. But that would mean we don't go restore the msix state and
for s390 don't call arch_restore_msi_irqs(). Do you prefer to avoid
saving the state at all? This change was small and sufficient enough to
avoid breaking the device in my testing.
This implementation serves a purpose, but the commit log should
describe the specific, narrow scenario this solves, and probably also
add a comment in the code about why we're not consistently checking the
saved state for errors. Thanks,
Alex
Yes, I can re-word the commit message.
Thanks
Farhan