On Thu, 26 Jun 2025 18:04:21 +0000 Aaron Lewis <aaronlewis@xxxxxxxxxx> wrote: > This series is being sent as an RFC to help brainstorm the best way to > fix a latency issue it uncovers. > > The crux of the issue is that when initializing multiple VFs from the > same PF the devices are reset serially rather than in parallel > regardless if they are initialized from different threads. That happens > because a shared lock is acquired when vfio_df_ioctl_bind_iommufd() is > called, then a FLR (function level reset) is done which takes 100ms to > complete. That in combination with trying to initialize many devices at > the same time results in a lot of wasted time. > > While the PCI spec does specify that a FLR requires 100ms to ensure it > has time to complete, I don't see anything indicating that other VFs > can't be reset at the same time. > > A couple of ideas on how to approach a fix are: > > 1. See if the lock preventing the second thread from making forward > progress can be sharded to only include the VF it protects. I think we're talking about the dev_set mutex here, right? I think this is just an oversight. The original lock that dev_set replaced was devised to manage the set of devices affected by the same bus or slot reset. I believe we've held the same semantics though and VFs just happen to fall through to the default of a bus-based dev_set. Obviously we cannot do a bus or slot reset of a VF, we only have FLR, and it especially doesn't make sense that VFs on the same "bus" from different PFs share this mutex. Seems like we just need this: diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 6328c3a05bcd..261a6dc5a5fc 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -2149,7 +2149,7 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev) return -EBUSY; } - if (pci_is_root_bus(pdev->bus)) { + if (pci_is_root_bus(pdev->bus) || pdev->is_virtfn) { ret = vfio_assign_device_set(&vdev->vdev, vdev); } else if (!pci_probe_reset_slot(pdev->slot)) { ret = vfio_assign_device_set(&vdev->vdev, pdev->slot); Does that allow fully parallelized resets? Meanwhile this is a good use case of selftests and further impetus for me to get it working. Thanks, Alex > > 2. Do the FLR for the VF in probe() and close(device_fd) rather than in > vfio_df_ioctl_bind_iommufd(). > > To demonstrate the problem the run script had to be extended to bind > multiple devices to the vfio-driver, not just one. E.g. > > $ ./run.sh -d 0000:17:0c.1 -d 0000:17:0c.2 -d 0000:16:01.7 -s > > Also included is a selftest and BPF script. With those, the problem can > be reproduced with the output logging showing that one of the devices > takes >200ms to initialize despite running from different threads. > > $ VFIO_BDF_1=0000:17:0c.1 VFIO_BDF_2=0000:17:0c.2 ./vfio_flr_test > [0x7f61bb888700] '0000:17:0c.2' initialized in 108.6ms. > [0x7f61bc089700] '0000:17:0c.1' initialized in 212.3ms. > > And the BPF script indicating that the latency issues are coming from the > mutex in vfio_df_ioctl_bind_iommufd(). > > [pcie_flr] duration = 108ms > [vfio_df_ioctl_bind_iommufd] duration = 108ms > [pcie_flr] duration = 104ms > [vfio_df_ioctl_bind_iommufd] duration = 212ms > > [__mutex_lock] duration = 103ms > __mutex_lock+5 > vfio_df_ioctl_bind_iommufd+171 > __se_sys_ioctl+110 > do_syscall_64+109 > entry_SYSCALL_64_after_hwframe+120 > > This series can be applied on top of the VFIO selftests using the branch: > upstream/vfio/selftests/v1. > > https://github.com/dmatlack/linux/tree/vfio/selftests/v1 > > Aaron Lewis (3): > vfio: selftests: Allow run.sh to bind to more than one device > vfio: selftests: Introduce the selftest vfio_flr_test > vfio: selftests: Include a BPF script to pair with the selftest vfio_flr_test > > tools/testing/selftests/vfio/Makefile | 1 + > tools/testing/selftests/vfio/run.sh | 73 +++++++---- > tools/testing/selftests/vfio/vfio_flr_test.c | 120 ++++++++++++++++++ > .../testing/selftests/vfio/vfio_flr_trace.bt | 83 ++++++++++++ > 4 files changed, 251 insertions(+), 26 deletions(-) > create mode 100644 tools/testing/selftests/vfio/vfio_flr_test.c > create mode 100644 tools/testing/selftests/vfio/vfio_flr_trace.bt >