Hi Lukas,

The problem is that when this race occurs, the second NPU (PCI device) remains uninitialized in the kernel driver. I don't think it is specific to the driver and device we are using, hence I am asking on this mailing list.

The driver keeps an internal global array of initialized devices and their count.

The working sequence is this:
- call pci_probe for the 1st NPU, store it at index 0 in the array, increment count
- call pci_probe for the 2nd NPU, store it at index 1, increment count

What happens in the erroneous case:
- call pci_probe, store it at index 0
- call pci_probe, store it at index 0 !!
- increment the counter in the first pci_probe

In this case, the datapath on top of these ASICs does not work, because it expects the driver to initialize both ASICs.

I know this can be fixed in the driver by proper locking, and we have contacted the vendor. However, I think this can happen in any machine with 2 identical PCI devices, because as far as I know, existing PCI drivers usually do not assume that the probe function can be called from multiple threads.

Thanks,
Jozef

-----Original Message-----
From: Lukas Wunner <lukas@xxxxxxxxx>
Sent: Thursday, June 26, 2025 2:09 PM
To: Jozef Matejcik (Nokia) <jozef.matejcik@xxxxxxxxx>
Cc: linux-pci@xxxxxxxxxxxxxxx
Subject: Re: pci_probe called concurrently in machine with 2 identical PCI devices causing race condition

On Thu, Jun 26, 2025 at 10:14:00AM +0000, Jozef Matejcik (Nokia) wrote:
> We have one specific problem related to Linux PCI subsystem.
>
> We have a device with 2 identical NPUs, so 2 identical PCI devices
> sharing the same 3rd party driver. Our problem is that _pci_probe of
> this driver is called concurrently from 2 kernel threads. It happens
> more frequently when kernel debug logs are enabled in GRUB, appr.
> every 20th or 30th reboot of the device.

So what exactly is the "problem"? Does something not work?
Do you get errors or warnings?

> So the fix is specifically related to devices with multiple VFs.
> But does this take into account the setup with 2 separate, but
> otherwise identical PCI devices? Is it possible this can occur in any
> machine with 2 identical PCI devices?

Not unless probing of one PF creates another PF.

Thanks,

Lukas
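
For illustration only, the "proper locking" fix mentioned in the top message of this thread could look roughly like the sketch below. It is a userspace analogy, not the vendor driver: pthreads stand in for the two kernel probe threads, and every name in it (npu_dev, npu_register, probe_two_npus, MAX_NPUS) is hypothetical. In a real kernel driver the analogous primitive would be a mutex from <linux/mutex.h> rather than a pthread mutex.

```c
/* Userspace analogy only; all identifiers are hypothetical, not from the real driver. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_NPUS 2

struct npu_dev { int id; };

static struct npu_dev *npu_devs[MAX_NPUS];   /* global table of probed devices */
static int npu_count;                         /* number of valid entries */
static pthread_mutex_t npu_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Register a device and return its slot index, or -1 if the table is full.
 * Reading the count, storing the pointer and incrementing the count all
 * happen under one lock, so two callers can never pick the same index.
 */
static int npu_register(struct npu_dev *dev)
{
    int idx = -1;

    pthread_mutex_lock(&npu_lock);
    if (npu_count < MAX_NPUS) {
        idx = npu_count;
        npu_devs[idx] = dev;
        npu_count = idx + 1;
    }
    pthread_mutex_unlock(&npu_lock);
    return idx;
}

/* Stand-in for one probe call running in its own thread. */
static void *probe_thread(void *arg)
{
    int idx = npu_register(arg);

    assert(idx >= 0);   /* the table has room for both devices */
    return NULL;
}

/* Run two "probes" concurrently and return the resulting device count. */
static int probe_two_npus(struct npu_dev *a, struct npu_dev *b)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, probe_thread, a);
    pthread_create(&t2, NULL, probe_thread, b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return npu_count;
}
```

Calling probe_two_npus() with two distinct devices should leave the count at 2 with each device in its own slot, whichever thread runs first. Without the lock, both threads could read the same count and both write index 0, which is the erroneous sequence described above.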