On Sat, May 10, 2025 at 11:27:55AM +0530, Manivannan Sadhasivam wrote: > On Wed, Apr 30, 2025 at 02:31:59PM +0200, Niklas Cassel wrote: > > While the parameter 'interrupts' to the functions pci_epc_set_msi() and > > pci_epc_set_msix() represent the actual number of interrupts, and > > pci_epc_get_msi() and pci_epc_get_msix() return the actual number of > > interrupts. > > > > These endpoint library functions just mentioned will however supply > > "interrupts - 1" to the EPC callback functions pci_epc_ops->set_msi() and > > pci_epc_ops->set_msix(), and likewise add 1 to return value from > > pci_epc_ops->get_msi() and pci_epc_ops->get_msix(), > > Only {get/set}_msix() callbacks were having this behavior, right? pci_epc_get_msix() adds 1 to the return of epc->ops->get_msix(). pci_epc_set_msix() subtracts 1 to the parameter sent to epc->ops->set_msix(). pci_epc_get_msi() does 1 << interrupt from the return of epc->ops->get_msi(). So a return of 0 from epc->ops->get_msi() will result in pci_epc_get_msi() returning 1. A return of 1 from epc->ops->get_msi() will result in pci_epc_get_msi() returning 2. Similar for pci_epc_set_msi(). It will call order_base_2() on the parameter before sending it to epc->ops->set_msi(). So pci_epc_get_msi() / pci_epc_set_msi() takes a interrupts parameter that actually represents the number of interrupts. epc->ops->set_msi() / epc->ops->get_msi() takes an interrupts parameter, but that value does NOT represent the actual number of interrupts. It is instead the value encoded per the Multiple Message Enable (MME) field in the "7.7.1.2 Message Control Register for MSI". So it is quite confusing that these the parameter for epc->ops->set_msi() / epc->ops->get_msi() is also called interrupts. A better parameter name would have been "mme". It is however called 'interrupts' in the pci-epc.h header: https://github.com/torvalds/linux/blob/v6.15-rc5/include/linux/pci-epc.h#L102-L103 and in the DWC driver and RCar driver: static int dw_pcie_ep_set_msi(struct pci_epc *epc, u8 func_no, u8 vfunc_no, u8 interrupts) static int rcar_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn, u8 interrupts) However, some drivers have seen this weirdness and actually named the parameter differently: static int cdns_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn, u8 mmc) static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn, u8 multi_msg_cap) TL;DR: all of these callbacks are a mess IMO, not only {get/set}_msix(). I did the smallest fix possible, because doing a cleanup of this will require changing all drivers that implement these callbacks, which seems different from a fix. > > > even though the > > parameter name for the callback function is also named 'interrupts'. > > > > While the set_msix() callback function in pcie-designware-ep writes the > > Table Size field correctly (N-1), the calculation of the PBA offset > > is wrong because it calculates space for (N-1) entries instead of N. > > > > This results in e.g. the following error when using QEMU with PCI > > passthrough on a device which relies on the PCI endpoint subsystem: > > failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align > > > > Fix the calculation of PBA offset in the MSI-X capability. > > > > Thanks for the fix! We should also fix the API discrepancy w.r.t interrupts as > it is causing much of a headache. One more example is the interrupt vector. API > expects the vectors to be 1-based, but in reality, vectors start from 0. So the > callers of raise_irq() has to increment the vector and the implementation has to > decrement it. > > If you want to fix it up too, let me know. Otherwise, I may do it at some point. As you know, I'm working on adding SRNS/SRIS support for dw-rockchip, I might send a cleanup for the Tegra driver too. I do not intend to clean up all the drivers. I am a bit worried about patches after the cleanup getting backported, which will need to be different before and after this cleanup. Perhaps renaming the callbacks at the same as the cleanup might be a good idea? It should be a simple cleanup though, just do the order_base_2() etc in the driver callbacks themselves. We really should rename the parameter of .set_msi() though, as it is totally misleading as of now. > > > Fixes: 83153d9f36e2 ("PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments") > > This doesn't seem like the correct fixes commit. It is correct. Commit 83153d9f36e2 ("PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments") added the PBA offset to dw_pcie_ep_set_msix(). dw_pcie_ep_set_msix() already wrote the interrupts value (zeroes based) to the QSIZE register. Commit 83153d9f36e2 added this code: + val = (offset + (interrupts * PCI_MSIX_ENTRY_SIZE)) | bir; So it used a zeroes based value to calculate the size, which is obviously wrong. Kind regards, Niklas