On Mon, 2025-08-25 at 10:12 -0700, Farhan Ali wrote: > Commit c1e18c17bda6 ("s390/pci: add zpci_set_irq()/zpci_clear_irq()"), > introduced the zpci_set_irq() and zpci_clear_irq(), to be used while > resetting a zPCI device. > > Commit da995d538d3a ("s390/pci: implement reset_slot for hotplug slot"), > mentions zpci_clear_irq() being called in the path for zpci_hot_reset_device(). > But that is not the case anymore and these functions are not called > outside of this file. > > However after a CLP disable/enable reset (zpci_hot_reset_device),the airq Nit: missing space after "," > setup of the device will need to be restored. Since we are no longer > calling zpci_clear_airq() in the reset path, we should restore the airq for > device unconditionally. > > Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx> > --- > arch/s390/include/asm/pci.h | 1 - > arch/s390/pci/pci_irq.c | 9 +-------- > 2 files changed, 1 insertion(+), 9 deletions(-) > > diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h > index 41f900f693d9..aed19a1aa9d7 100644 > --- a/arch/s390/include/asm/pci.h > +++ b/arch/s390/include/asm/pci.h > @@ -145,7 +145,6 @@ struct zpci_dev { > u8 has_resources : 1; > u8 is_physfn : 1; > u8 util_str_avail : 1; > - u8 irqs_registered : 1; > u8 tid_avail : 1; > u8 rtr_avail : 1; /* Relaxed translation allowed */ > unsigned int devfn; /* DEVFN part of the RID*/ > diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c > index 84482a921332..e73be96ce5fe 100644 > --- a/arch/s390/pci/pci_irq.c > +++ b/arch/s390/pci/pci_irq.c > @@ -107,9 +107,6 @@ static int zpci_set_irq(struct zpci_dev *zdev) > else > rc = zpci_set_airq(zdev); > > - if (!rc) > - zdev->irqs_registered = 1; > - > return rc; > } > > @@ -123,9 +120,6 @@ static int zpci_clear_irq(struct zpci_dev *zdev) > else > rc = zpci_clear_airq(zdev); > > - if (!rc) > - zdev->irqs_registered = 0; > - > return rc; > } > > @@ -427,8 +421,7 @@ bool arch_restore_msi_irqs(struct pci_dev *pdev) > { > struct zpci_dev *zdev = to_zpci(pdev); > > - if (!zdev->irqs_registered) > - zpci_set_irq(zdev); > + zpci_set_irq(zdev); > return true; > } > I dug a bit to see why this isn't a problem for the existing non-vfio PCI recovery. It looks like the drivers end up calling arch_teardown_msi_irqs() and then arch_setup_msi_irqs() as part of their recovery handlers. For example nvme calls nvme_dev_disable() in error_detected() which calls pci_free_irq_vectors() and ultimately zpci_clear_irq(). Similarly zpci_set_irq() is ultimately called in pci_alloc_irq_vectors() in nvme_pci_enable() as part of nvme_reset_work(). Additionally zpci_clear_irq() returns success and ignores errors when the IRQs are already cleared allowing zpci_clear_irq() to set zdev- >irqs_registered = 0 even if the device is in the error or disabled state. On the other hand zpci_set_irq() would not ignore trying to register IRQs if they are already registered. So I think the commit description is somewhat confusing because the CLP disable case works if, like with the existing recovery, IRQs get torn down and setup anew after the reset and because the zpci_clear_irq() isn't needed in zpci_hot_reset_device() because clp_disable_fh() already does this. I believe the mention of that was because in an earlier, never merged, version I had an explicit zpci_clear_irq() but this was removed because it is redundant, except for flipping the flag of course. On the other hand I think the code change itself makes sense. The zdev- >irqs_registered flag hides when someone tries to register IRQs twice which I think we would want to know about. And more importantly the flag doesn't correctly mirror the actual state because CLP disable doesn't clear the flag but unregisters IRQs and then arch_restore_msi_irqs() doesn't actually re-regiser IRQs because it assumes the wrong state. And this is just hidden because none of the relevant drivers seem to solely rely on pci_restore_state() but do tear down / setup regardless. I think thus the commit description should focus on the possibly inconsistent state and arch_restore_msi_irqs() and then it all makes sense. Thanks, Niklas