On 4/8/25 13:58, Aneesh Kumar K.V wrote:
Alexey Kardashevskiy <aik@xxxxxxx> writes:
On 28/7/25 19:47, Jason Gunthorpe wrote:
On Mon, Jul 28, 2025 at 07:21:47PM +0530, Aneesh Kumar K.V (Arm) wrote:
With passthrough devices, we need to make sure private memory is
allocated and assigned to the secure guest before we can issue the DMA.
For ARM RMM, we only need to map and the secure SMMU management is
internal to RMM. For shared IPA, vfio/iommufd DMA MAP/UNMAP interface
does the equivalent.
I'm not really sure what this is about? It is about getting KVM to pin
all the memory and commit it to the RMM so it can be used for DMA?
But it looks really strange to have an iommufd ioctl that just calls a
KVM function. Feeling this should be a KVM function, or a guestmemfd
behavior??
I ended up exporting guestmemfd's kvm_gmem_get_folio() for gfn->pfn and treating its fd a bit differently in iommufd ("no extra referencing"):
https://github.com/AMDESE/linux-kvm/commit/f1ebd358327f026f413f8d3d64d46decfd6ab7f6
It is a new iommufd->kvm dependency though.
Was the motivation for that design choice the fact that in case of AMD
VFIO/IOMMUFD manages both private memory allocation and updates to the
IOMMU page tables?
IOMMUFD maps pages for DMA in the IOMMU pagetable, let it do just that.
On the ARM side, the requirement is to ensure that pages are present in
the stage-2 page table, which is managed by the firmware (RMM). Because
of this, we need an interface that VFIO/IOMMUFD can use to trigger
stage-2 mappings within KVM.
Alternatively, we could introduce a dedicated KVM ioctl for this
purpose, avoiding the need to rely on IOMMUFD.
Right, if there is a requirement like this and QEMU can just do another ioctl(), then I'd just do that; it helps to untangle all these kernel module references. It is the firmware which makes sure that the page tables are in sync, so there is not much point in teaching KVM about it imho; DMA map requests cannot go past QEMU anyway. Thanks,
For reference, TDX uses a similar ioctl, `KVM_TDX_INIT_MEM_REGION`, to
initialize guest memory. However, that interface isn’t well-suited for
dynamic updates to stage-2 mappings during shared-to-private or
private-to-shared transitions.
I was kind of thinking it would be nice to have a guestmemfd mode that
was "pinned", meaning the memory is allocated and remains almost
always mapped into the TSM's page tables automatically. VFIO using
guests would set things this way.
Yeah, while doing the above I was wondering whether I want to pass the fd type when DMA-mapping from an fd, "detect" it as I do in the above commit, or have some iommufd_fdmap_ops in this fd saying "(no) pinning needed" (or make this a flag of IOMMU_IOAS_MAP_FILE).
The "detection" is (mapping_inaccessible(mapping) && mapping_unevictable(mapping)), which works for now.
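To make that check concrete, here is a minimal userspace sketch of the detection. Everything prefixed mock_/MOCK_ is an assumed stand-in for the kernel's struct address_space, its AS_* flag bits, mapping_inaccessible() and mapping_unevictable(); fd_mapping_skips_pinning() is a hypothetical name for illustration and is not taken from the commit above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-ins for the kernel's AS_UNEVICTABLE / AS_INACCESSIBLE bits. */
enum mock_mapping_flags {
	MOCK_AS_UNEVICTABLE  = 1u << 0,
	MOCK_AS_INACCESSIBLE = 1u << 1,
};

/* Assumed stand-in for struct address_space; only the flags matter here. */
struct mock_address_space {
	unsigned long flags;
};

static bool mock_mapping_unevictable(const struct mock_address_space *m)
{
	return m && (m->flags & MOCK_AS_UNEVICTABLE) != 0;
}

static bool mock_mapping_inaccessible(const struct mock_address_space *m)
{
	return m && (m->flags & MOCK_AS_INACCESSIBLE) != 0;
}

/*
 * Hypothetical helper: treat an fd's mapping as "no pinning needed" only
 * when it is both inaccessible to the host and unevictable, which is the
 * combination described above for guestmemfd-backed memory.
 */
bool fd_mapping_skips_pinning(const struct mock_address_space *m)
{
	return mock_mapping_inaccessible(m) && mock_mapping_unevictable(m);
}
```

A mapping with only one of the two flags set (for example, unevictable but still host-accessible) falls through to the regular pinning path in this sketch, which is why an explicit flag or per-fd ops could be a less fragile alternative to detection.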
btw in the AMD case it does not matter as much whether it is private or shared: I map everything and let RMP and the VM deal with the permissions. Thanks,
-aneesh
--
Alexey