From: Nam Cao <namcao@xxxxxxxxxxxxx> Sent: Thursday, July 3, 2025 9:33 PM
>
> On Fri, Jul 04, 2025 at 02:27:01AM +0000, Michael Kelley wrote:
> > I haven't resolved the conflict. As a shortcut for testing I just
> > removed the conflicting patch since it is for a Microsoft custom NIC
> > ("MANA") that's not in the configuration I'm testing with. I'll have to
> > look more closely to figure out the resolution.
> >
> > Separately, this patch (the switch to misc_create_parent_irq_domain)
> > isn't working for Linux VMs on Hyper-V on ARM64. The initial symptom
> > is that interrupts from the NVMe controller aren't getting handled
> > and everything hangs. Here's the dmesg output:
> >
> > [ 84.463419] hv_vmbus: registering driver hv_pci
> > [ 84.463875] hv_pci abee639e-0b9d-49b7-9a07-c54ba8cd5734: PCI VMBus probing: Using version 0x10004
> > [ 84.464518] hv_pci abee639e-0b9d-49b7-9a07-c54ba8cd5734: PCI host bridge to bus 0b9d:00
> > [ 84.464529] pci_bus 0b9d:00: root bus resource [mem 0xfc0000000-0xfc00fffff window]
> > [ 84.464531] pci_bus 0b9d:00: No busn resource found for root bus, will use [bus 00-ff]
> > [ 84.465211] pci 0b9d:00:00.0: [1414:b111] type 00 class 0x010802 PCIe Endpoint
> > [ 84.466657] pci 0b9d:00:00.0: BAR 0 [mem 0xfc0000000-0xfc00fffff 64bit]
> > [ 84.481923] pci_bus 0b9d:00: busn_res: [bus 00-ff] end is updated to 00
> > [ 84.481936] pci 0b9d:00:00.0: BAR 0 [mem 0xfc0000000-0xfc00fffff 64bit]: assigned
> > [ 84.482413] nvme nvme0: pci function 0b9d:00:00.0
> > [ 84.482513] nvme 0b9d:00:00.0: enabling device (0000 -> 0002)
> > [ 84.556871] irq 17, desc: 00000000e8529819, depth: 0, count: 0, unhandled: 0
> > [ 84.556883] ->handle_irq(): 0000000062fa78bc, handle_bad_irq+0x0/0x270
> > [ 84.556892] ->irq_data.chip(): 00000000ba07832f, 0xffff00011469dc30
> > [ 84.556895] ->action(): 0000000069f160b3
> > [ 84.556896] ->action->handler(): 00000000e15d8191, nvme_irq+0x0/0x3e8
> > [ 172.307920] watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [kworker/6:1H:195]
>
> Thanks for the report.
>
> On arm64, this driver relies on the parent irq domain to set the handler,
> so the driver must not overwrite it with NULL.
>
> This should cure it:
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 3a24fadddb83..f4a435b0456c 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -577,8 +577,6 @@ static void hv_pci_onchannelcallback(void *context);
>
>  #ifdef CONFIG_X86
>  #define DELIVERY_MODE APIC_DELIVERY_MODE_FIXED
> -#define FLOW_HANDLER handle_edge_irq
> -#define FLOW_NAME "edge"
>
>  static int hv_pci_irqchip_init(void)
>  {
> @@ -723,8 +721,6 @@ static void hv_arch_irq_unmask(struct irq_data *data)
>  #define HV_PCI_MSI_SPI_START 64
>  #define HV_PCI_MSI_SPI_NR (1020 - HV_PCI_MSI_SPI_START)
>  #define DELIVERY_MODE 0
> -#define FLOW_HANDLER NULL
> -#define FLOW_NAME NULL
>  #define hv_msi_prepare NULL
>
>  struct hv_pci_chip_data {
> @@ -2162,8 +2158,9 @@ static int hv_pcie_domain_alloc(struct irq_domain *d, unsigned int virq, unsigne
>  		return ret;
>
>  	for (int i = 0; i < nr_irqs; i++) {
> -		irq_domain_set_info(d, virq + i, 0, &hv_msi_irq_chip, NULL, FLOW_HANDLER, NULL,
> -				    FLOW_NAME);
> +		irq_domain_set_hwirq_and_chip(d, virq + i, 0, &hv_msi_irq_chip, NULL);
> +		if (IS_ENABLED(CONFIG_X86))
> +			__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
>  	}
>
>  	return 0;

Yes, that fixes the problem. Linux now boots with the PCI NIC VF and two NVMe controllers visible and operational.
Thanks for the fix! It would have taken me a while to figure it out. I want to do some additional testing tomorrow and look more closely at the code, but now I have something that works well enough to make further progress.

Michael
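
For reference, a minimal sketch of the per-IRQ setup in hv_pcie_domain_alloc() with the hunk above applied (identifiers are the ones from the diff; the surrounding allocation and error handling are elided). The comments restate Nam's explanation of why the NULL flow handler broke arm64:

	/*
	 * Per-IRQ setup after the fix.  On x86 this domain installs the edge
	 * flow handler itself.  On arm64 the flow handler is installed by the
	 * parent irq domain, so it must not be overwritten here: passing a
	 * NULL handler installs handle_bad_irq(), which is consistent with
	 * irq 17 sitting on handle_bad_irq() in the log above.
	 */
	for (int i = 0; i < nr_irqs; i++) {
		irq_domain_set_hwirq_and_chip(d, virq + i, 0, &hv_msi_irq_chip, NULL);
		if (IS_ENABLED(CONFIG_X86))
			__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
	}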