Re: [syzbot] [pci?] linux-next test error: general protection fault in msix_capability_init

I'm also able to reproduce this trace on every boot with a basic KVM guest on
an EPYC Milan system, using next-20250325 for both host and guest.

A bisect restricted to commits touching drivers/pci/msi seems to indicate that
the following commit is the source of the regression:

  commit d9f2164238d814d119e8c979a3579d1199e271bb
  Author: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  Date:   Wed Feb 19 10:20:57 2025 +0100
  
      PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag
      
      Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both
      MSI and MSI-X entries globally, regardless of the IRQ chip they are using.
      Only Xen sets the pci_msi_ignore_mask when routing physical interrupts over
      event channels, to prevent PCI code from attempting to toggle the maskbit,
      as it's Xen that controls the bit.
      
      However, the pci_msi_ignore_mask being global will affect devices that use
      MSI interrupts but are not routing those interrupts over event channels
      (not using the Xen pIRQ chip).  One example is devices behind a VMD PCI
      bridge.  In that scenario the VMD bridge configures MSI(-X) using the
      normal IRQ chip (the pIRQ one in the Xen case), and devices behind the
      bridge configure the MSI entries using indexes into the VMD bridge MSI
      table.  The VMD bridge then demultiplexes such interrupts and delivers to
      the destination device(s).  Having pci_msi_ignore_mask set in that scenario
      prevents (un)masking of MSI entries for devices behind the VMD bridge.
      
      Move the signaling of no entry masking into the MSI domain flags, as that
      allows setting it on a per-domain basis.  Set it for the Xen MSI domain
      that uses the pIRQ chip, while leaving it unset for the rest of the
      cases.
      
      Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and
      with Xen dropping usage the variable is unneeded.
      
      This fixes using devices behind a VMD bridge on Xen PV hardware domains.
      
      Albeit Devices behind a VMD bridge are not known to Xen, that doesn't mean
      Linux cannot use them.  By inhibiting the usage of
      VMD_FEAT_CAN_BYPASS_MSI_REMAP and the removal of the pci_msi_ignore_mask
      bodge devices behind a VMD bridge do work fine when use from a Linux Xen
      hardware domain.  That's the whole point of the series.
      
      Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
      Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
      Acked-by: Juergen Gross <jgross@xxxxxxxx>
      Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
      Message-ID: <20250219092059.90850-4-roger.pau@xxxxxxxxxx>
      Signed-off-by: Juergen Gross <jgross@xxxxxxxx>

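For anyone skimming, the shape of the change described in that commit message
(a global "ignore mask" switch replaced by a flag carried on each MSI domain)
can be sketched roughly as below. This is only a toy userspace model, not the
kernel code: every name in it (toy_msi_domain, toy_device,
TOY_DOMAIN_FLAG_NO_MASK, toy_may_touch_mask) is made up for illustration, and
treating a device with no domain as "no flag set" is this sketch's own
assumption, not something taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; the real kernel name and value may differ. */
#define TOY_DOMAIN_FLAG_NO_MASK (1u << 0)

/* Toy stand-in for an MSI domain: it only carries a flags word. */
struct toy_msi_domain {
	unsigned int flags;
};

/* Toy stand-in for a device; domain may be NULL in this model. */
struct toy_device {
	const struct toy_msi_domain *domain;
};

/*
 * Decide per device whether the mask bit may be toggled, by looking at
 * the flags of the domain the device is attached to instead of at a
 * single global variable. A device without a domain is treated as
 * having no flags set (an assumption of this sketch).
 */
bool toy_may_touch_mask(const struct toy_device *dev)
{
	if (dev == NULL || dev->domain == NULL)
		return true;

	return (dev->domain->flags & TOY_DOMAIN_FLAG_NO_MASK) == 0;
}

/* Small self-check covering the two cases the commit message describes. */
void toy_self_check(void)
{
	const struct toy_msi_domain xen_pirq = { .flags = TOY_DOMAIN_FLAG_NO_MASK };
	const struct toy_msi_domain behind_vmd = { .flags = 0 };
	const struct toy_device dev_pirq = { .domain = &xen_pirq };
	const struct toy_device dev_vmd = { .domain = &behind_vmd };

	assert(!toy_may_touch_mask(&dev_pirq)); /* domain opts out of masking */
	assert(toy_may_touch_mask(&dev_vmd));   /* other domains are unaffected */
}
```

With a single global, both devices in toy_self_check() would have been treated
the same way; with the flag on the domain, only the one attached to the
flagged domain skips mask toggling.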
Thanks,

Mike



