On Wed, Jun 04, 2025 at 10:40:09PM +0530, Manivannan Sadhasivam wrote: > On Wed, Jun 04, 2025 at 01:40:52PM +0200, Niklas Cassel wrote: > > On Tue, Jun 03, 2025 at 01:12:50PM -0500, Bjorn Helgaas wrote: > > > > > > Hmmm, sorry, I misinterpreted both 1/4 and 2/4. I read them as "add > > > this delay so the PLEXTOR device works", but in fact, I think in both > > > cases, the delay is actually to enforce the PCIe r6.0, sec 6.6.1, > > > requirement for software to wait 100ms before issuing a config > > > request, and the fact that it makes PLEXTOR work is a side effect of > > > that. > > > > Well, the Plextor NVMe drive used to work with previous kernels, > > but regressed. > > > > But yes, the delay was added to enforce "PCIe r6.0, sec 6.6.1" > > requirement for software to wait 100ms, which once again makes > > the Plextor NVMe drive work. > > > > > The beginning of that 100ms delay is "exit from Conventional Reset" > > > (ports that support <= 5.0 GT/s) or "link training completes" (ports > > > that support > 5.0 GT/s). > > > > > > I think we lack that 100ms delay in dwc drivers in general. The only > > > generic dwc delay is in dw_pcie_host_init() via the LINK_WAIT_SLEEP_MS > > > in dw_pcie_wait_for_link(), but that doesn't count because it's > > > *before* the link comes up. We have to wait 100ms *after* exiting > > > Conventional Reset or completing link training. > > > > In dw_pcie_wait_for_link(), in the first iteration of the loop, the link > > will never be up (because the link was just started), > > dw_pcie_wait_for_link() will then sleep for LINK_WAIT_SLEEP_MS (90 ms), > > before trying again. > > > > Most likely the link training took way less than 100 ms, so most of those > > 90 ms will probably be after link training has completed. > > > > That is most likely why Plextor worked on older kernels (which does not > > use the link up IRQ). Definitely seems plausible. > > If we add a 100 ms sleep after wait_for_link(), then I suggest that we > > also reduce LINK_WAIT_SLEEP_MS to something shorter. > > No. The 900ms sleep is to make sure that we wait 1s before erroring out > assuming that the device is not present. This is mandated by the spec. So > irrespective of the delay we add *after* link up, we should try to detect the > link up for ~1s. I think it would be sensible for dw_pcie_wait_for_link() to check for link up more frequently, i.e., reduce LINK_WAIT_SLEEP_MS and increase LINK_WAIT_MAX_RETRIES. If LINK_WAIT_SLEEP_MS * LINK_WAIT_MAX_RETRIES is for the 1.0s mentioned in sec 6.6.1, seems like maybe we should make a generic #define for it so we could include the spec reference and use it across all drivers. And resolve the question of 900ms vs 1000ms. Bjorn