Re: reset_slot() callback not respecting MPS config

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 22, 2025 at 06:19:56PM +0200, Niklas Cassel wrote:
> Hello there,
> 
> 
> As you know the reset_slot() callback patches were merged recently.
> 
> Wilfred and I (mostly Wilfred), have been debugging DMA issues after the
> reset_slot() callback has been invoked. The issue is reproduced when MPS
> configuration is set to performance, but might be applicable for other
> MPS configurations as well. The problem appears to be that reset_slot()
> feature does not respect/restore the MPS configuration.
> 
> 
> When having two Rock 5Bs, one in RC mode, one in EP mode:
> 
> # Boot with MPS bus performance
> pci=pcie_bus_perf on kernel command line
> 
> # lspci RC
> lspci -vvvs 0000:00:00.0 | grep -E MaxPayload
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0
>                 DevCtl: MaxPayload 256 bytes, MaxReadReq 256 bytes
> 
> # Run pcitests
> All tests pass
> 
> # Perform hot reset of EP
> echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
> 
> # lspci RC
> lspci -vvvs 0000:00:00.0 | grep -E MaxPayload
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0
>                 DevCtl: MaxPayload 128 bytes, MaxReadReq 512 bytes
> 
> # Run pcitests
> FAIL  pci_ep_data_transfer.dma.READ_TEST
> Compared to before, DMA read test now fails.
> 
> Wilfred has been able to track this down to being because MPS in the RC
> is different before/after a ->reset_slot(), and simply re-storing MPS using
> a dirty hack, the DMA read issues go away.
> 
> 
> 
> My first thinking was that pci_configure_device() (and thus pci_configure_mps())
> was not called,
> so I added some prints in pci_configure_mps().
> pci_configure_mps() was not called for RC nor EP during ->reset_slot()
> (I did not check if pcie_bus_configure_settings() was called.)
> 
> 
> I tried Mani's earlier patch instead of what is queued up:
> https://lore.kernel.org/linux-pci/20250221172309.120009-2-manivannan.sadhasivam@xxxxxxxxxx/
> 
> And in this version pci_configure_mps() does get called,
> for both the RC and EP during ->retrain_link()
> (I did not check if pcie_bus_configure_settings() was called.)
> 
> But... MaxPayload was still incorrectly set after retrain_link().
> 
> 
> I think the solution is to add a call to add a pcie_bus_configure_settings()
> call in pcie_do_recovery() / pci_host_recover_slots() / pci_host_reset_slot() /
> pcibios_reset_secondary_bus().
> 
> 
> Or possibly a:
>         list_for_each_entry(child, &bus->children, node)
>                 pcie_bus_configure_settings(child);
> 
> as done in pci_host_probe().
> 
> I'm not sure if we need to make sure that pci_configure_device() /
> pci_configure_mps() gets called as well. Since we did reset the endpoint,
> it seems that calling pci_configure_mps() does make sense.
> 
> 
> Normally, I would wait for Wilfred and myself to come up with a nice fix,
> and test it, but, considering that the ->reset_slot() patches are queued
> up already, and the merge window is opening soon, I'm sharing our findings
> on the list, as it might take some time to come up with a nice solution.
> 

Thanks Niklas for reporting and Wilfred for debugging!

I will submit the patches for fixing this issue along with couple of other
comments raised by Lukas.

- Mani

-- 
மணிவண்ணன் சதாசிவம்




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux