Re: [PATCH v8 4/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability

On Sat, Mar 22, 2025 at 01:37:39AM +0800, Yi Liu wrote:
> On 2025/3/21 12:27, Nicolin Chen wrote:
> > On Thu, Mar 20, 2025 at 04:48:33PM -0700, Nicolin Chen wrote:
> > Reading this further, I found that Yi did report VFIO device cap
> > for PASID via a VFIO ioctl in the early versions but switched to
> > using the IOMMU_GET_HW_INFO since v3 (nearly a year ago). So, I
> > see that's a made decision.
> > 
> > Given that our IOMMU_GET_HW_INFO defines this:
> >    * Query an iommu type specific hardware information data from an iommu behind
> >    * a given device that has been bound to iommufd. This hardware info data will
> >    * be used to sync capabilities between the virtual iommu and the physical
> >    * iommu, e.g. a nested translation setup needs to check the hardware info, so
> >    * a guest stage-1 page table can be compatible with the physical iommu.
> > 
> > max_pasid_log2 is something that fits well. But PCI device cap
> > still feels odd in that regard, as it repurposes the ioctl.
> 
> PASID cap is a bit special. It should not be reported to user unless
> both iommu and device enabled it. So adding it in this hw_info ioctl
> is fine. It can avoid duplicate ioctls across userspace driver frameworks
> as well.

Yea, I get the convenience.

> > So, perhaps we should update the uAPI documentation and ask user
> > space to run IOMMU_GET_HW_INFO for every iommufd_device, because
> > the output out_capabilities may be different per iommufd_device,
> > even if both devices are correctly assigned to the same vIOMMU.
> 
> Since this is a per-device ioctl, userspace should expect differences.
> Actually, userspace, e.g. vfio, may just invoke this ioctl to learn
> the PASID cap instead of asking the vIOMMU if we define it in the
> driver-specific part. This is much more convenient.

An IOMMU's PASID cap is reported by max_pasid_log2 alone,
isn't it? Only the PCI layer that holds the VFIO device cares
about these two PCI device PASID caps that will be reported in
its emulated PCI_PASID_CAP register.
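
As a rough illustration of where these two caps end up, here is a
minimal sketch of a PCI layer composing an emulated PASID Capability
register value. The EMU_* names and emu_pasid_cap() are hypothetical.
The bit layout (Execute Permission Supported in bit 1, Privileged Mode
Supported in bit 2, Max PASID Width in bits 12:8) is assumed from the
PCIe PASID capability definition, and taking the width straight from
max_pasid_log2 is an assumption of the sketch, not something the patch
states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the 16-bit PASID Capability register (hypothetical names). */
#define EMU_PASID_CAP_EXEC        0x0002u  /* Execute Permission Supported */
#define EMU_PASID_CAP_PRIV        0x0004u  /* Privileged Mode Supported */
#define EMU_PASID_CAP_WIDTH_SHIFT 8        /* Max PASID Width, bits 12:8 */
#define EMU_PASID_CAP_WIDTH_MASK  0x1f00u

/*
 * Build the value an emulated PASID Capability register could expose.
 * exec/priv stand for the two per-device PCI PASID caps discussed above;
 * max_pasid_log2 stands for the IOMMU-level PASID width.
 */
static uint16_t emu_pasid_cap(bool exec, bool priv, uint8_t max_pasid_log2)
{
    uint16_t reg = 0;

    if (exec)
        reg |= EMU_PASID_CAP_EXEC;
    if (priv)
        reg |= EMU_PASID_CAP_PRIV;

    /* Widths that do not fit the 5-bit field are truncated by the mask. */
    reg |= (uint16_t)(((unsigned int)max_pasid_log2 << EMU_PASID_CAP_WIDTH_SHIFT) &
                      EMU_PASID_CAP_WIDTH_MASK);
    return reg;
}
```

Only the exec/priv inputs describe the device itself; the width is the
one piece that comes from the IOMMU side.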

Yes, this is a per-device ioctl. But we defined it to use the
device only as a bridge to get access to its IOMMU and return
IOMMU's caps/infos. Now, we are reporting HW info about this
bridge itself. I think it repurposes the ioctl.

And honestly, "userspace should expect difference" isn't very
fair. A vIOMMU could have been initialized by the first given
iommufd_device, as it could have expected the IOMMU info from
either the first device or the second device to be consistent.
Yet now, how does a vIOMMU get finalized given "userspace should
expect difference"? Certainly, I don't see an issue with these
two PCI caps, since a vIOMMU is unlikely to integrate them in
its registers, so long as we note down clearly that these two
"IOMMU_HW" caps come from the bridging idev rather than the IOMMU HW.
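
A minimal sketch of how userspace could keep the two kinds of caps
apart when finalizing a vIOMMU, assuming it has already collected one
hw_info result per iommufd_device. Everything named sketch_* or
SKETCH_* is hypothetical: the struct only models the out_capabilities
and max_pasid_log2 outputs mentioned in this thread, and the bit
positions are made up for illustration rather than taken from the uAPI
header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define SKETCH_CAP_IOMMU_FEATURE  (1ULL << 0)  /* some IOMMU-level cap */
#define SKETCH_CAP_PCI_PASID_EXEC (1ULL << 2)  /* comes from the bridging idev */
#define SKETCH_CAP_PCI_PASID_PRIV (1ULL << 3)  /* comes from the bridging idev */

#define SKETCH_DEVICE_CAP_MASK \
    (SKETCH_CAP_PCI_PASID_EXEC | SKETCH_CAP_PCI_PASID_PRIV)

/* Models the per-device outputs discussed in this thread. */
struct sketch_hw_info {
    uint64_t out_capabilities;
    uint8_t  max_pasid_log2;
};

/* Caps that describe the IOMMU behind the device. */
static uint64_t sketch_iommu_caps(const struct sketch_hw_info *info)
{
    return info->out_capabilities & ~SKETCH_DEVICE_CAP_MASK;
}

/* Caps that describe the device itself and may differ per idev. */
static uint64_t sketch_device_caps(const struct sketch_hw_info *info)
{
    return info->out_capabilities & SKETCH_DEVICE_CAP_MASK;
}

/*
 * A vIOMMU initialized from the first device stays valid only if every
 * other device reports the same IOMMU-level info. The per-device PCI
 * PASID caps are deliberately left out of the comparison.
 */
static bool sketch_viommu_consistent(const struct sketch_hw_info *infos, size_t n)
{
    assert(infos != NULL || n == 0);

    for (size_t i = 1; i < n; i++) {
        if (sketch_iommu_caps(&infos[i]) != sketch_iommu_caps(&infos[0]))
            return false;
        if (infos[i].max_pasid_log2 != infos[0].max_pasid_log2)
            return false;
    }
    return true;
}
```

With this split, two devices behind the same IOMMU can differ in their
PCI PASID caps without invalidating a vIOMMU that was set up from the
first one.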

Thanks
Nicolin



