On Fri, Jul 18, 2025 at 03:21:30PM +0800, Weili Qian wrote: > On 2025/7/15 3:19, Bjorn Helgaas wrote: > > On Sat, Jul 12, 2025 at 07:30:28PM +0800, Weili Qian wrote: > >> Prior to commit d591f6804e7e ("PCI: Wait for device readiness with > >> Configuration RRS"), pci_dev_wait() polls PCI_COMMAND register until > >> its value is not ~0(i.e., PCI_ERROR_RESPONSE). After d591f6804e7e, > >> if the Configuration Request Retry Status Software Visibility (RRS SV) > >> is enabled, pci_dev_wait() polls PCI_VENDOR_ID register until its value > >> is not the reserved Vendor ID value 0x0001. > >> > >> On Kunpeng accelerator devices, RRS SV is enabled. However, > >> when the virtual function's FLR (Function Level Reset) is not > >> ready, the pci_dev_wait() reads the PCI_VENDOR_ID register and gets > >> the value 0xffff instead of 0x0001. It then incorrectly assumes this > >> is a valid Vendor ID and concludes the device is ready, returning > >> successfully. In reality, the function may not be fully ready, leading > >> to the device becoming unavailable. > >> > >> A 100ms wait period is already implemented before calling pci_dev_wait(). > >> In most cases, FLR completes within 100ms. However, to eliminate the > >> risk of function being unavailable due to an incomplete FLR, a > >> device-specific reset is added. After pcie_flr(), the function continues > >> to poll PCI_COMMAND register until its value is no longer ~0. > > > > As far as I can tell, there's nothing specific to Kungpeng devices > > here. We've seen a similar issue with Intel NVMe devices [1], and I > > don't want a whole mess of quirks and device-specific reset methods. > > > > We need some sort of generic solution for this. My understanding was > > that if devices are not ready 100ms after a reset, they are required > > to respond with RRS. Maybe these devices are defective. Or maybe my > > understanding is incorrect. Either way, I think we should at least > > check for a PCIe error before assuming that 0xffff is a valid > > response. > > A generic solution for this would be even better. > > I tested the patch in [1], but it did not resolve the issue. > According to the PCIe specification(e.g., Section 7.5.1 > PCI-Compatible Configuration Registers), the Vendor ID and Device ID > for VFs are 0xffff. However, if VFs cannot respond using the RRS > mechanism, the pci_dev_wait() function reads the PCI_VENDOR_ID > register and gets the value 0xffffffff. This value is considered an > error on PCIe, causing pci_dev_wait() to continuously poll the > register until a timeout occurs. This is one of the problems: reading the VF Vendor ID gets ~0 in two cases: (1) the config read was successful (the VF Vendor ID is actually 0xffff), and (2) the config read failed and the RC synthesized ~0 return data. I hoped to be able to distinguish these with the debug patch at [2], but so far it doesn't work as I expected, so more debug is required. [1] is an example of another way of doing a device-specific quirk. It doesn't really matter whether that way works for the Kunpeng device, because I want to avoid any of these device-specific things if possible. > > [1] https://lore.kernel.org/linux-pci/20250611101442.387378-1-hui.wang@xxxxxxxxxxxxx/ [2] https://lore.kernel.org/r/20250701232341.GA1859056@bhelgaas > >> Fixes: d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS") > >> Signed-off-by: Weili Qian <qianweili@xxxxxxxxxx> > >> --- > >> drivers/pci/quirks.c | 36 ++++++++++++++++++++++++++++++++++++ > >> 1 file changed, 36 insertions(+) > >> > >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > >> index d7f4ee634263..1df1756257d2 100644 > >> --- a/drivers/pci/quirks.c > >> +++ b/drivers/pci/quirks.c > >> @@ -4205,6 +4205,36 @@ static int reset_hinic_vf_dev(struct pci_dev *pdev, bool probe) > >> return 0; > >> } > >> > >> +#define KUNPENG_OPERATION_WAIT_CNT 3000 > >> +#define KUNPENG_RESET_WAIT_TIME 20 > >> + > >> +/* Device-specific reset method for Kunpeng accelerator virtual functions */ > >> +static int reset_kunpeng_acc_vf_dev(struct pci_dev *pdev, bool probe) > >> +{ > >> + u32 wait_cnt = 0; > >> + u32 cmd; > >> + > >> + if (probe) > >> + return 0; > >> + > >> + pcie_flr(pdev); > >> + > >> + do { > >> + pci_read_config_dword(pdev, PCI_COMMAND, &cmd); > >> + if (!PCI_POSSIBLE_ERROR(cmd)) > >> + break; > >> + > >> + if (++wait_cnt > KUNPENG_OPERATION_WAIT_CNT) { > >> + pci_warn(pdev, "wait for FLR ready timeout; giving up\n"); > >> + return -ENOTTY; > >> + } > >> + > >> + msleep(KUNPENG_RESET_WAIT_TIME); > >> + } while (true); > >> + > >> + return 0; > >> +} > >> + > >> static const struct pci_dev_reset_methods pci_dev_reset_methods[] = { > >> { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF, > >> reset_intel_82599_sfp_virtfn }, > >> @@ -4220,6 +4250,12 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = { > >> reset_chelsio_generic_dev }, > >> { PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HINIC_VF, > >> reset_hinic_vf_dev }, > >> + { PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_ZIP_VF, > >> + reset_kunpeng_acc_vf_dev }, > >> + { PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_SEC_VF, > >> + reset_kunpeng_acc_vf_dev }, > >> + { PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_HPRE_VF, > >> + reset_kunpeng_acc_vf_dev }, > >> { 0 } > >> }; > >> > >> -- > >> 2.33.0 > >> > > . > >