On Wed, 2025-08-13 at 16:56 -0600, Alex Williamson wrote: > On Wed, 13 Aug 2025 14:52:24 -0700 > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > On 8/13/2025 1:30 PM, Alex Williamson wrote: > > > On Wed, 13 Aug 2025 10:08:19 -0700 > > > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > > > > > For zPCI devices we should drive a platform specific function reset > > > > as part of VFIO_DEVICE_RESET. This reset is needed recover a zPCI device > > > > in error state. > > > > > > > > Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx> > > > > --- > > > > arch/s390/pci/pci.c | 1 + > > > > drivers/vfio/pci/vfio_pci_core.c | 4 ++++ > > > > drivers/vfio/pci/vfio_pci_priv.h | 5 ++++ > > > > drivers/vfio/pci/vfio_pci_zdev.c | 39 ++++++++++++++++++++++++++++++++ > > > > 4 files changed, 49 insertions(+) --- snip --- > > > > > > > > +int vfio_pci_zdev_reset(struct vfio_pci_core_device *vdev) > > > > +{ > > > > + struct zpci_dev *zdev = to_zpci(vdev->pdev); > > > > + int rc = -EIO; > > > > + > > > > + if (!zdev) > > > > + return -ENODEV; > > > > + > > > > + /* > > > > + * If we can't get the zdev->state_lock the device state is > > > > + * currently undergoing a transition and we bail out - just > > > > + * the same as if the device's state is not configured at all. > > > > + */ > > > > + if (!mutex_trylock(&zdev->state_lock)) > > > > + return rc; > > > > + > > > > + /* We can reset only if the function is configured */ > > > > + if (zdev->state != ZPCI_FN_STATE_CONFIGURED) > > > > + goto out; > > > > + > > > > + rc = zpci_hot_reset_device(zdev); > > > > + if (rc != 0) > > > > + goto out; > > > > + > > > > + if (!vdev->pci_saved_state) { > > > > + pci_err(vdev->pdev, "No saved available for the device"); > > > > + rc = -EIO; > > > > + goto out; > > > > + } > > > > + > > > > + pci_dev_lock(vdev->pdev); > > > > + pci_load_saved_state(vdev->pdev, vdev->pci_saved_state); > > > > + pci_restore_state(vdev->pdev); > > > > + pci_dev_unlock(vdev->pdev); > > > > +out: > > > > + mutex_unlock(&zdev->state_lock); > > > > + return rc; > > > > +} > > > This looks like it should be a device or arch specific reset > > > implemented in drivers/pci, not vfio. Thanks, > > > > > > Alex > > > > Are you suggesting to move this to an arch specific function? One thing > > we need to do after the zpci_hot_reset_device, is to correctly restore > > the config space of the device. And for vfio-pci bound devices we want > > to restore the state of the device to when it was initially opened. > > We generally rely on the abstraction of pci_reset_function() to select > the correct type of reset for a function scope reset. We've gone to > quite a bit of effort to implement all device specific resets and > quirks in the PCI core to be re-used across the kernel. > > Calling zpci_hot_reset_device() directly seems contradictory to those > efforts. Should pci_reset_function() call this universally on s390x > rather than providing access to FLR/PM/SBR reset? > I agree with you Alex. Still trying to figure out what's needed for this. We already do zpci_hot_reset_device() in reset_slot() from the s390_pci_hpc.c hotplug slot driver and that does get called via pci_reset_hotplug_slot() and pci_reset_function(). There are a few problems though that meant it didn't work for Farhan but I agree maybe we can fix them for the general case. For one pci_reset_function() via DEVICE_RESET first tries FLR but that won't work with the device in the error state and MMIO blocked. Sadly __pci_reset_function_locked() then concludes that other resets also won't work. So that's something we might want to improve in general, for example maybe we need something more like pci_dev_acpi_reset() with higher priority than FLR. Now for pci_reset_hotplug_slot() via VFIO_DEVICE_PCI_HOT_RESET I'm not sure why that won't work as is. @Farhan do you know? > Why is it > universally correct here given the ioctl previously made use of > standard reset mechanisms? > > The DEVICE_RESET ioctl is simply an in-place reset of the device, > without restoring the original device state. So we're also subtly > changing that behavior here, presumably because we're targeting the > specific error recovery case. Have you considered how this might > break non-error-recovery use cases? > > I wonder if we want a different reset mechanism for this use case > rather than these subtle semantic changes. I think an alternative to that, which Farhan actually had in the previous internal version, is to implement pci_error_handlers::reset_done() and do the pci_load_saved_state() there. That would only affect the error recovery case leaving other cases alone. Thanks, Niklas