----- Original Message -----
> From: "Bjorn Helgaas" <helgaas@xxxxxxxxxx>
> To: "Timothy Pearson" <tpearson@xxxxxxxxxxxxxxxxxxxxx>
> Cc: "linuxppc-dev" <linuxppc-dev@xxxxxxxxxxxxxxxx>, "linux-kernel" <linux-kernel@xxxxxxxxxxxxxxx>, "linux-pci"
> <linux-pci@xxxxxxxxxxxxxxx>, "Madhavan Srinivasan" <maddy@xxxxxxxxxxxxx>, "Michael Ellerman" <mpe@xxxxxxxxxxxxxx>,
> "christophe leroy" <christophe.leroy@xxxxxxxxxx>, "Naveen N Rao" <naveen@xxxxxxxxxx>, "Bjorn Helgaas"
> <bhelgaas@xxxxxxxxxx>, "Shawn Anastasio" <sanastasio@xxxxxxxxxxxxxxxxxxxxx>, "Lukas Wunner" <lukas@xxxxxxxxx>
> Sent: Wednesday, June 18, 2025 3:17:22 PM
> Subject: Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken

> On Wed, Jun 18, 2025 at 02:50:04PM -0500, Timothy Pearson wrote:
>> ----- Original Message -----
>> > From: "Bjorn Helgaas" <helgaas@xxxxxxxxxx>
>> > To: "Timothy Pearson" <tpearson@xxxxxxxxxxxxxxxxxxxxx>
>> > Cc: "linuxppc-dev" <linuxppc-dev@xxxxxxxxxxxxxxxx>, "linux-kernel"
>> > <linux-kernel@xxxxxxxxxxxxxxx>, "linux-pci"
>> > <linux-pci@xxxxxxxxxxxxxxx>, "Madhavan Srinivasan" <maddy@xxxxxxxxxxxxx>,
>> > "Michael Ellerman" <mpe@xxxxxxxxxxxxxx>,
>> > "christophe leroy" <christophe.leroy@xxxxxxxxxx>, "Naveen N Rao"
>> > <naveen@xxxxxxxxxx>, "Bjorn Helgaas"
>> > <bhelgaas@xxxxxxxxxx>, "Shawn Anastasio" <sanastasio@xxxxxxxxxxxxxxxxxxxxx>,
>> > "Lukas Wunner" <lukas@xxxxxxxxx>
>> > Sent: Wednesday, June 18, 2025 2:44:00 PM
>> > Subject: Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with
>> > broken
>>
>> > [+cc Lukas, pciehp expert]
>> >
>> > On Wed, Jun 18, 2025 at 11:56:54AM -0500, Timothy Pearson wrote:
>> >> presence detection
>> >
>> > (subject/commit wrapping seems to be on all of these patches)
>> >
>> >> The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
>> >> was observed to incorrectly assert the Presence Detect Set bit in its
>> >> capabilities when tested on a Raptor Computing Systems Blackbird system,
>> >> resulting in the hot insert path never attempting a rescan of the bus
>> >> and any downstream devices not being re-detected.
>> >
>> > Seems like this switch supports standard PCIe hotplug?  Quite a bit of
>> > this driver looks similar to things in pciehp.  Is there some reason
>> > we can't use pciehp directly?  Maybe pciehp could work if there were
>> > hooks for the PPC-specific bits?
>>
>> While that is a good long term goal that Raptor is willing to work
>> toward, it is non-trivial and will require buy-in from other
>> stakeholders (e.g. IBM).  If practical, I'd like to get this series
>> merged first, to fix the broken hotplug on our hardware that is
>> deployed worldwide, then in parallel see what can be done to merge
>> PowerNV support into pciehp.  Would that work?
>
> Yeah, it wouldn't make sense to switch horses at this stage.
>
> I guess I was triggered by this patch, which seems to be a workaround
> for a defect in a device that is probably also used on non-PPC
> systems, and pciehp would need a similar workaround.  But I guess you
> go on to say that pciehp already does something similar, so it guess
> it's already covered.

No problem, I completely understand.  To be perfectly frank, the
existing code quality in this driver (and the associated EEH driver) is
not the best, and it's been a frustrating experience trying to hack it
into semi-stable operation.  I would vastly prefer to rewrite it and
integrate it into the pciehp driver, and we have plans to do so, but
that will take an unacceptable amount of time compared with fixing up
the existing driver as a stopgap.

As you mentioned, pciehp already has this fix, so we just have to deal
with the duplicated code until we (Raptor) figure out how to merge
PowerNV support into pciehp.