Hello Hans, On Fri, Apr 25, 2025 at 06:56:53PM +0800, Hans Zhang wrote: > > But I discovered a problem: > > 0001:90:00.0 PCI bridge: Device 1f6c:0001 (prog-if 00 [Normal decode]) > ...... > Capabilities: [c0] Express (v2) Root Port (Slot-), MSI 00 > DevCap: MaxPayload 512 bytes, PhantFunc 0 > ExtTag- RBE+ > DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+ > RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ > MaxPayload 512 bytes, MaxReadReq 1024 bytes > > > > Should the DevCtl MaxPayload be 256B? > > But I tested that the file reading and writing were normal. Is the display > of 512B here what we expected? > > Root Port 0003:30:00.0 has the same problem. May I ask what your opinion is? > > > ...... > 0001:91:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd > NVMe SSD Controller PM9A1/PM9A3/980PRO (prog-if 02 [NVM Express]) > ...... > Capabilities: [70] Express (v2) Endpoint, MSI 00 > DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s > unlimited, L1 unlimited > ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ > SlotPowerLimit 0W > DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+ > RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ > FLReset- > MaxPayload 256 bytes, MaxReadReq 512 bytes > ...... Here we see that the bridge has a higher DevCtl.MPS than the DevCap.MPS of the endpoint. Let me quote Bjorn from the previous mail thread: """ - I don't think it's safe to set MPS higher in all cases. If we set the Root Port MPS=256, and an Endpoint only supports MPS=128, the Endpoint may do a 256-byte DMA read (assuming its MRRS>=256). In that case the RP may respond with a 256-byte payload the Endpoint can't handle. """ I think the problem with this patch is that pcie_write_mps() call in pci_host_probe() is done after the pci_scan_root_bus_bridge() call in pci_host_probe(). So pci_configure_mps() (called by pci_configure_device()), which does the limiting of the bus to what the endpoint supports, is actually called before the pcie_write_mps() call added by this patch (which increases DevCtl.MPS for the bridge). So I think the code added in this patch needs to be executed before pci_configure_device() is done for the EP. It appears that pci_configure_device() is called for each device during scan, first for the bridges and then for the EPs. So I think something like this should work (totally untested): --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -45,6 +45,8 @@ struct pci_domain_busn_res { int domain_nr; }; +static void pcie_write_mps(struct pci_dev *dev, int mps); + static struct resource *get_pci_domain_busn_res(int domain_nr) { struct pci_domain_busn_res *r; @@ -2178,6 +2180,11 @@ static void pci_configure_mps(struct pci_dev *dev) return; } + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT && + pcie_bus_config != PCIE_BUS_TUNE_OFF) { + pcie_write_mps(dev, 128 << dev->pcie_mpss); + } + if (!bridge || !pci_is_pcie(bridge)) return; But we would probably need to move some code to avoid the forward declaration. Kind regards, Niklas