Hello Mani,

On Wed, Jul 02, 2025 at 08:17:44PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Jul 02, 2025 at 11:43:11AM GMT, Niklas Cassel wrote:
> > However, my main concern is not that qcom waits twice, it is those
> > drivers that do not have a 100 ms delay after PERST# has been
> > deasserted, because before commit 470f10f18b48 ("PCI: Reduce
> > PCIE_LINK_WAIT_SLEEP_MS"), those drivers might have been "saved" by
> > the ridiculously long PCIE_LINK_WAIT_SLEEP_MS.
> >
> > However, now when we sleep less in each iteration when polling for
> > link up, those drivers that do not have a 100 ms delay after PERST#
> > has been deasserted might actually see regressions, because (the now
> > reduced) PCIE_LINK_WAIT_SLEEP_MS time is no longer "saving" them.
>
> Why can't you just add the delay to those drivers now? They can be
> consolidated later.

Right now, I don't have the extra cycles needed to read all of these
drivers:

$ git grep -l gpio drivers/pci/controller/dwc/
drivers/pci/controller/dwc/pci-dra7xx.c
drivers/pci/controller/dwc/pci-imx6.c
drivers/pci/controller/dwc/pci-keystone.c
drivers/pci/controller/dwc/pci-meson.c
drivers/pci/controller/dwc/pcie-amd-mdb.c
drivers/pci/controller/dwc/pcie-bt1.c
drivers/pci/controller/dwc/pcie-designware-plat.c
drivers/pci/controller/dwc/pcie-designware.c
drivers/pci/controller/dwc/pcie-designware.h
drivers/pci/controller/dwc/pcie-dw-rockchip.c
drivers/pci/controller/dwc/pcie-fu740.c
drivers/pci/controller/dwc/pcie-histb.c
drivers/pci/controller/dwc/pcie-intel-gw.c
drivers/pci/controller/dwc/pcie-keembay.c
drivers/pci/controller/dwc/pcie-kirin.c
drivers/pci/controller/dwc/pcie-qcom-ep.c
drivers/pci/controller/dwc/pcie-qcom.c
drivers/pci/controller/dwc/pcie-rcar-gen4.c
drivers/pci/controller/dwc/pcie-tegra194.c
drivers/pci/controller/dwc/pcie-visconti.c

then to understand them properly enough to be confident in my changes,
wait for reviews from each glue driver maintainer, and then wait for
someone to pick the changes up. Also, all of the above has to be done
when consolidating the PERST# handling anyway.

This whole thing started because someone reported a regression on a
random Plextor NVMe drive. Since it was my commit ec9fd499b9c6
("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
that introduced the regression, I obviously helped debug the issue.
The regression was solved by adding a 100 ms delay after receiving the
link up IRQ, before enumerating the bus. This fix was sent May 5th:
https://lore.kernel.org/linux-pci/20250505092603.286623-7-cassel@xxxxxxxxxx/
(That series went up to v2, was then renamed and went up to v4, so
basically six revisions in total.)

While my responsibility was done two months ago, I still wanted to make
sure that no one else got bitten by the same bug, so I also improved the
generic dw_pcie_wait_for_link() (for those drivers not using a link up
IRQ):
https://lore.kernel.org/linux-pci/20250625102347.1205584-14-cassel@xxxxxxxxxx/

Sure, that will only help PCIe controllers that support Link speeds
greater than 5.0 GT/s and do not use a link up IRQ, but it is still
something. PCIe controllers that only support Link speeds <= 5.0 GT/s,
do not use a link up IRQ, and do not have a delay after deasserting
PERST#, can still send config requests too early.
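To make the mechanism concrete, here is a rough sketch of the generic
polling loop (simplified from dw_pcie_wait_for_link(); the SKETCH_*
constants and the function name are illustrative, not a verbatim copy
of what is in mainline):

#include <linux/delay.h>	/* msleep() */
#include <linux/errno.h>
#include "pcie-designware.h"	/* struct dw_pcie, dw_pcie_link_up() */

/* Illustrative values, not the exact mainline constants: */
#define SKETCH_LINK_WAIT_MAX_RETRIES	10
#define SKETCH_LINK_WAIT_SLEEP_MS	90	/* the old, long sleep */

static int sketch_wait_for_link(struct dw_pcie *pci)
{
	int retries;

	for (retries = 0; retries < SKETCH_LINK_WAIT_MAX_RETRIES; retries++) {
		if (dw_pcie_link_up(pci))
			return 0;
		/*
		 * With a ~90 ms sleep per iteration, any link that is
		 * not already up on the very first check cannot be
		 * reported up until ~90 ms after PERST# deassertion,
		 * which happens to roughly cover the 100 ms the spec
		 * requires before the first config request. With a much
		 * shorter per-iteration sleep, a fast-training link is
		 * reported up long before those 100 ms have elapsed,
		 * and a glue driver without its own delay then sends
		 * config requests too early.
		 */
		msleep(SKETCH_LINK_WAIT_SLEEP_MS);
	}

	return -ETIMEDOUT;
}

In other words, the long per-iteration sleep acted as an accidental
safety margin for glue drivers that never added the 100 ms delay
themselves; shrinking the sleep removes that margin.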
However, if we drop 470f10f18b48 ("PCI: Reduce PCIE_LINK_WAIT_SLEEP_MS"),
PCIe controllers that only support Link speeds <= 5.0 GT/s, do not use a
link up IRQ, and do not have a delay after deasserting PERST#, will be no
worse off than what is already in mainline. PCIe controllers that support
Link speeds greater than 5.0 GT/s and do not use a link up IRQ will still
be more robust than ever :)

(Rome wasn't built in a day.)


Kind regards,
Niklas