On 6/16/25 19:55, Hui Wang wrote:
On 6/13/25 00:48, Bjorn Helgaas wrote:
[+cc VMD folks]
On Wed, Jun 11, 2025 at 06:14:42PM +0800, Hui Wang wrote:
Prior to commit d591f6804e7e ("PCI: Wait for device readiness with
Configuration RRS"), this Intel NVMe device [8086:0a54] worked well.
Since that patch was merged into the kernel, the NVMe device has
stopped working.
Through debugging, we found that the commit introduces RRS polling in
pci_dev_wait(). For this NVMe device, reading PCI_VENDOR_ID returns ~0
if config access is not ready yet, but the polling expects either
0x0001 or a valid Vendor ID, so the RRS polling doesn't work for this
device.
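Roughly, that polling behaves like the simplified sketch below (this is
not the exact mainline pci_dev_wait() code, and the helper name is made
up for illustration):

#include <linux/pci.h>

/*
 * Simplified sketch of the RRS readiness check described above, not
 * the exact mainline code.  With Configuration RRS Software
 * Visibility, a device that is still initializing makes the Vendor ID
 * read back as 0x0001.  This device (or the fabric above it) returns
 * ~0 instead, so the loop that calls this helper stops waiting too
 * early.
 */
static bool sketch_dev_still_initializing(struct pci_dev *dev)
{
        u32 id;

        pci_read_config_dword(dev, PCI_VENDOR_ID, &id);

        /* 0x0001 in the Vendor ID field means RRS: keep waiting */
        if ((id & 0xffff) == 0x0001)
                return true;

        /*
         * Anything else, including ~0 from a config read that was not
         * completed, is treated as "device ready", so the wait ends
         * immediately for this NVMe device.
         */
        return false;
}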
Sorry for breaking this, and thanks for all your work in debugging
this! Issues like this are really hard to track down.
I would think we would have heard about this earlier if the NVMe
device were broken on all systems. Maybe there's some connection with
VMD? From the non-working dmesg log in your bug report
(https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111521/+attachment/5879970/+files/dmesg-60.txt):
DMI: ASUSTeK COMPUTER INC. ESC8000 G4/Z11PG-D24 Series, BIOS 5501 04/17/2019
vmd 0000:d7:05.5: PCI host bridge to bus 10000:00
pci 10000:00:02.0: [8086:2032] type 01 class 0x060400 PCIe Root Port
pci 10000:00:02.0: PCI bridge to [bus 01]
pci 10000:00:02.0: bridge window [mem 0xf8000000-0xf81fffff]: assigned
pci 10000:01:00.0: [8086:0a54] type 00 class 0x010802 PCIe Endpoint
pci 10000:01:00.0: BAR 0 [mem 0x00000000-0x00003fff 64bit]
<I think vmd_enable_domain() calls pci_reset_bus() here>
Yes, and pci_dev_wait() is called here. With the RRS polling, we get
~0 from PCI_VENDOR_ID and then get 0xffffffff when configuring BAR 0
afterwards. With the original polling method, there is enough delay in
pci_dev_wait(), so the NVMe device works normally.
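For comparison, the original wait behaves roughly like the sketch below
(simplified, not the exact pre-d591f6804e7e pci_dev_wait() code; the
function name is made up): it polls PCI_COMMAND and keeps sleeping
while the read returns ~0, which is exactly what this device returns
until it is ready, so the device gets enough time.

#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/pci.h>

/* Simplified sketch of the pre-RRS wait, not the exact old code */
static int sketch_old_wait(struct pci_dev *dev, int timeout)
{
        int delay = 1;
        u32 val;

        for (;;) {
                pci_read_config_dword(dev, PCI_COMMAND, &val);
                if (val != ~0U)
                        return 0;       /* device answered, stop waiting */

                if (delay > timeout)
                        return -ENOTTY; /* gave up */

                msleep(delay);          /* back off and retry */
                delay *= 2;
        }
}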
The line "[ 10.193589] hhhhhhhhhhhhhhhhhhhhhhhhhhhh dev->device =
0a54 id = ffffffff" is output from pci_dev_wait(); please refer to
https://launchpadlibrarian.net/798708446/LP2111521-dmesg-test9.txt
pci 10000:01:00.0: BAR 0 [mem 0xf8010000-0xf8013fff 64bit]: assigned
pci 10000:01:00.0: BAR 0: error updating (high 0x00000000 != 0xffffffff)
pci 10000:01:00.0: BAR 0 [mem 0xf8010000-0xf8013fff 64bit]: assigned
pci 10000:01:00.0: BAR 0: error updating (0xf8010004 != 0xffffffff)
nvme nvme0: pci function 10000:01:00.0
nvme 10000:01:00.0: enabling device (0000 -> 0002)
Things I notice:
- The 10000:01:00.0 NVMe device is behind a VMD bridge
- We successfully read the Vendor & Device IDs (8086:0a54)
- The NVMe device is uninitialized. We successfully sized the BAR,
which included successful config reads and writes. The BAR
wasn't assigned by BIOS, which is normal since it's behind VMD.
- We allocated space for BAR 0 but the config writes to program the
BAR failed. The read back from the BAR was 0xffffffff; probably a
PCIe error, e.g., the NVMe device didn't respond.
- The device *did* respond when nvme_probe() enabled it: the
"enabling device (0000 -> 0002)" means pci_enable_resources() read
PCI_COMMAND and got 0x0000.
- The dmesg from the working config doesn't include the "enabling
device" line, which suggests that pci_enable_resources() saw
PCI_COMMAND_MEMORY (0x0002) already set and didn't bother setting
it again. I don't know why it would already be set.
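For reference, pci_enable_resources() does roughly the following
(simplified sketch, not the exact kernel code); the message is only
printed when the COMMAND register actually has to change, which is why
the working dmesg doesn't show it:

#include <linux/pci.h>

/*
 * Simplified sketch of the pci_enable_resources() behavior mentioned
 * above, not the exact kernel code.  Only the memory enable matters
 * for this device since it just has a 64-bit memory BAR.
 */
static int sketch_enable_resources(struct pci_dev *dev)
{
        u16 cmd, old_cmd;

        pci_read_config_word(dev, PCI_COMMAND, &cmd);
        old_cmd = cmd;
        cmd |= PCI_COMMAND_MEMORY;

        if (cmd != old_cmd) {
                /* "enabling device (0000 -> 0002)" comes from here */
                pci_info(dev, "enabling device (%04x -> %04x)\n",
                         old_cmd, cmd);
                pci_write_config_word(dev, PCI_COMMAND, cmd);
        }

        return 0;
}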
d591f6804e7e really only changes pci_dev_wait(), which is used after
device resets. I think vmd_enable_domain() resets the VMD Root Ports
after pci_scan_child_bus(), and maybe we're not waiting long enough
afterwards.
My guess is that we got the ~0 because we did a config read too soon
after reset and the device didn't respond. The Root Port would time
out, log an error, and synthesize ~0 data to complete the CPU read
(see PCIe r6.0, sec 2.3.2 implementation note).
It's *possible* that we waited long enough but the NVMe device is
broken and didn't respond when it should have, but my money is on a
software defect.
There are a few pci_dbg() calls about these delays; can you set
CONFIG_DYNAMIC_DEBUG=y and boot with dyndbg="file drivers/pci/* +p" to
collect that output? Please also collect the "sudo lspci -vv" output
from a working system.
I have already passed the testing request to the bug reporters and am
waiting for their feedback.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111521/comments/55
Thanks,
Hui.
This is the dmesg with dyndbg="file drivers/pci/* +p":
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111521/comments/56
And this is the lspci output:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111521/comments/57
Bjorn