On Sat, May 10, 2025 at 01:43:39PM +0200, Niklas Cassel wrote: > On Sat, May 10, 2025 at 11:27:55AM +0530, Manivannan Sadhasivam wrote: > > On Wed, Apr 30, 2025 at 02:31:59PM +0200, Niklas Cassel wrote: > > > While the parameter 'interrupts' to the functions pci_epc_set_msi() and > > > pci_epc_set_msix() represent the actual number of interrupts, and > > > pci_epc_get_msi() and pci_epc_get_msix() return the actual number of > > > interrupts. > > > > > > These endpoint library functions just mentioned will however supply > > > "interrupts - 1" to the EPC callback functions pci_epc_ops->set_msi() and > > > pci_epc_ops->set_msix(), and likewise add 1 to return value from > > > pci_epc_ops->get_msi() and pci_epc_ops->get_msix(), > > > > Only {get/set}_msix() callbacks were having this behavior, right? > > pci_epc_get_msix() adds 1 to the return of epc->ops->get_msix(). > pci_epc_set_msix() subtracts 1 to the parameter sent to epc->ops->set_msix(). > > pci_epc_get_msi() does 1 << interrupt from the return of epc->ops->get_msi(). > So a return of 0 from epc->ops->get_msi() will result in pci_epc_get_msi() > returning 1. A return of 1 from epc->ops->get_msi() will result in > pci_epc_get_msi() returning 2. > > Similar for pci_epc_set_msi(). It will call order_base_2() on the parameter > before sending it to epc->ops->set_msi(). > Yeah. I was pointing out the fact that bitshifting != incrementing/decrementing 1. And you just mentioned the latter for both msi/msix callbacks. I can ammend it while applying. > So pci_epc_get_msi() / pci_epc_set_msi() takes a interrupts parameter > that actually represents the number of interrupts. > > epc->ops->set_msi() / epc->ops->get_msi() takes an interrupts parameter, > but that value does NOT represent the actual number of interrupts. > It is instead the value encoded per the Multiple Message Enable (MME) > field in the "7.7.1.2 Message Control Register for MSI". So it is quite > confusing that these the parameter for epc->ops->set_msi() / > epc->ops->get_msi() is also called interrupts. A better parameter name > would have been "mme". > Yeah. But I would try not to name it as 'mme' and rather keep the semantics for 'interrupt'. > It is however called 'interrupts' in the pci-epc.h header: > https://github.com/torvalds/linux/blob/v6.15-rc5/include/linux/pci-epc.h#L102-L103 > > and in the DWC driver and RCar driver: > static int dw_pcie_ep_set_msi(struct pci_epc *epc, u8 func_no, u8 vfunc_no, u8 interrupts) > static int rcar_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn, u8 interrupts) > > However, some drivers have seen this weirdness and actually named the parameter > differently: > static int cdns_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn, u8 mmc) > static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn, u8 multi_msg_cap) > > TL;DR: all of these callbacks are a mess IMO, not only {get/set}_msix(). > > I did the smallest fix possible, because doing a cleanup of this will > require changing all drivers that implement these callbacks, which seems > different from a fix. > Agree. > > > > > > even though the > > > parameter name for the callback function is also named 'interrupts'. > > > > > > While the set_msix() callback function in pcie-designware-ep writes the > > > Table Size field correctly (N-1), the calculation of the PBA offset > > > is wrong because it calculates space for (N-1) entries instead of N. > > > > > > This results in e.g. the following error when using QEMU with PCI > > > passthrough on a device which relies on the PCI endpoint subsystem: > > > failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align > > > > > > Fix the calculation of PBA offset in the MSI-X capability. > > > > > > > Thanks for the fix! We should also fix the API discrepancy w.r.t interrupts as > > it is causing much of a headache. One more example is the interrupt vector. API > > expects the vectors to be 1-based, but in reality, vectors start from 0. So the > > callers of raise_irq() has to increment the vector and the implementation has to > > decrement it. > > > > If you want to fix it up too, let me know. Otherwise, I may do it at some point. > > As you know, I'm working on adding SRNS/SRIS support for dw-rockchip, > I might send a cleanup for the Tegra driver too. > > I do not intend to clean up all the drivers. Okay, that's totally fine. > I am a bit worried about patches after the cleanup getting backported, which > will need to be different before and after this cleanup. We can add stable+noautosel to mark the patches to not backport. > Perhaps renaming the > callbacks at the same as the cleanup might be a good idea? > > It should be a simple cleanup though, just do the order_base_2() etc in the > driver callbacks themselves. > > We really should rename the parameter of .set_msi() though, as it is totally > misleading as of now. > As I said above, we should keep the semantics for 'interrupt' and make changes accordingly i.e., by doing order_base_2() inside the callbacks as you suggested. > > > > > > Fixes: 83153d9f36e2 ("PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments") > > > > This doesn't seem like the correct fixes commit. > > It is correct. > > Commit 83153d9f36e2 ("PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments") > added the PBA offset to dw_pcie_ep_set_msix(). > > dw_pcie_ep_set_msix() already wrote the interrupts value (zeroes based) to the > QSIZE register. Commit 83153d9f36e2 added this code: > + val = (offset + (interrupts * PCI_MSIX_ENTRY_SIZE)) | bir; > > So it used a zeroes based value to calculate the size, > which is obviously wrong. > Ah okay. I only looked into the 'interrupt' argument and missed the PBA change. Thanks for clarifying. - Mani -- மணிவண்ணன் சதாசிவம்