On Thu, Jul 03, 2025 at 08:05:05AM +0800, Hui Wang wrote:
> On 7/2/25 17:43, Hui Wang wrote:
> > On 7/2/25 07:23, Bjorn Helgaas wrote:
> > > On Tue, Jun 24, 2025 at 08:58:57AM +0800, Hui Wang wrote:
> > > > Sorry for late response, I was OOO the past week.
> > > >
> > > > This is the log after applied your patch:
> > > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111521/comments/61
> > > >
> > > > Looks like the "retry" makes the nvme work.
> > > Thank you! It seems like we get 0xffffffff (probably PCIe error) for
> > > a long time after we think the device should be able to respond with
> > > RRS.
> > >
> > > I always thought the spec required that after the delays, a device
> > > should respond with RRS if it's not ready, but now I guess I'm not
> > > 100% sure. Maybe it's allowed to just do nothing, which would lead to
> > > the Root Port timing out and logging an Unsupported Request error.
> > >
> > > Can I trouble you to try the patch below? I think we might have to
> > > start explicitly checking for that error. That probably would require
> > > some setup to enable the error, check for it, and clear it. I hacked
> > > in some of that here, but ultimately some of it should go elsewhere.
> >
> > OK, built a testing kernel, wait for bug reporter to test it and collect
> > the log.
> >
> This is the testing result and log.
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111521/comments/65

Thanks! This looks like an Intel S2600WFT, and I assume it has a BMC
that maintains a System Event Log. Any chance you check or keep that
log?

> > > @@ -1305,14 +1321,33 @@ static int pci_dev_wait(struct pci_dev *dev,
> > > char *reset_type, int timeout)
> > >  		if (root && root->config_rrs_sv) {
> > >  			pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> > > -			if (!pci_bus_rrs_vendor_id(id))
> > > -				break;
> > > +
> > > +			if (pci_bus_rrs_vendor_id(id)) {
> > > +				pci_info(dev, "%s: read %#06x (RRS)\n",
> > > +					 __func__, id);
> > > +				goto retry;
> > > +			}
> > > +
> > > +			if (PCI_POSSIBLE_ERROR(id)) {
> > > +				pcie_capability_read_word(root, PCI_EXP_DEVSTA,
> > > +							  &devsta);
> > > +				if (devsta & PCI_EXP_DEVSTA_URD)
> > > +					pcie_capability_write_word(root,
> > > +							    PCI_EXP_DEVSTA,
> > > +							    PCI_EXP_DEVSTA_URD);
> > > +				pci_info(root, "%s: read %#06x DEVSTA %#06x\n",
> > > +					 __func__, id, devsta);

We're waiting for 01:00.0, and we're seeing the poll message for about
375 ms:

  [ 10.334786] pci 10000:01:00.0: pci_dev_wait: VF- bus reset timeout 59900
  [ 10.334792] pci 10000:00:02.0: pci_dev_wait: read 0xffffffff DEVSTA 0x0000
  ...
  [ 10.708367] pci 10000:00:02.0: pci_dev_wait: read 0xffffffff DEVSTA 0x0000

The 00:02.0 Root Port has RRS enabled, but the config reads of the
01:00.0 Vendor ID did not return the RRS value (0x0001). Instead, they
returned 0xffffffff, which typically means an error on PCIe.

If an error occurred, I think it *should* set one of the Error
Detected bits in the Device Status register, but we always see 0
there.

I think the platform enabled firmware-first error handling and
declined to give Linux control of AER, so I'm wondering if BIOS is
capturing and clearing those errors before Linux would see them, hence
my question about the SEL.

  [ 6.565996] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
  [ 6.702329] acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug AER LTR DPC]
  [ 6.702463] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]

Even if this is the case and the SEL has error info, I don't know how
that would help us, other than maybe to understand why Linux doesn't
see the errors.

Bjorn
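
For reference, here is a rough user-space sketch of the three-way
decision the debug patch makes on each Vendor ID read, plus the Device
Status check discussed above. It is illustrative only, not the kernel
code: the enum and both function names are hypothetical, and the
constants are copied by hand from the kernel's PCI headers so the
snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the Linux PCI headers so this builds in user space. */
#define PCI_VENDOR_ID_PCI_SIG	0x0001u	/* Vendor ID seen on an RRS completion */
#define PCI_EXP_DEVSTA_CED	0x0001u	/* Correctable Error Detected */
#define PCI_EXP_DEVSTA_NFED	0x0002u	/* Non-Fatal Error Detected */
#define PCI_EXP_DEVSTA_FED	0x0004u	/* Fatal Error Detected */
#define PCI_EXP_DEVSTA_URD	0x0008u	/* Unsupported Request Detected */

/* Hypothetical names, used only for this illustration. */
enum vid_read_result { VID_READY, VID_RRS, VID_ERROR };

/* Classify the 32-bit value returned by a config read at PCI_VENDOR_ID. */
static enum vid_read_result classify_vendor_id(uint32_t id)
{
	/* With RRS Software Visibility, "not ready" shows up as Vendor ID 0x0001. */
	if ((id & 0xffffu) == PCI_VENDOR_ID_PCI_SIG)
		return VID_RRS;

	/* All ones is what the host fabricates when the read fails. */
	if (id == 0xffffffffu)
		return VID_ERROR;

	return VID_READY;
}

/* True if any of the Error Detected bits in Device Status is set. */
static bool devsta_error_detected(uint16_t devsta)
{
	return (devsta & (PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED |
			  PCI_EXP_DEVSTA_FED | PCI_EXP_DEVSTA_URD)) != 0;
}
```

With the values from the log, classify_vendor_id(0xffffffff) gives
VID_ERROR while devsta_error_detected(0x0000) is false, which is the
mismatch described above.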