Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls

On Thu, Jul 17, 2025 at 09:56:01AM -0700, Ackerley Tng wrote:
> Xu Yilun <yilun.xu@xxxxxxxxxxxxxxx> writes:
> 
> > On Wed, Jul 16, 2025 at 03:22:06PM -0700, Ackerley Tng wrote:
> >> Yan Zhao <yan.y.zhao@xxxxxxxxx> writes:
> >> 
> >> > On Tue, Jun 24, 2025 at 07:10:38AM -0700, Vishal Annapurve wrote:
> >> >> On Tue, Jun 24, 2025 at 6:08 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >> >> >
> >> >> > On Tue, Jun 24, 2025 at 06:23:54PM +1000, Alexey Kardashevskiy wrote:
> >> >> >
> >> >> > > Now, I am rebasing my RFC on top of this patchset and it fails in
> >> >> > > kvm_gmem_has_safe_refcount() as IOMMU holds references to all these
> >> >> > > folios in my RFC.
> >> >> > >
> >> >> > > So what is the expected sequence here? The userspace unmaps a DMA
> >> >> > > page and maps it back right away, all from the userspace? The end
> >> >> > > result will be the exactly same which seems useless. And IOMMU TLB
> >> >> 
> >> >>  As Jason described, ideally IOMMU just like KVM, should just:
> >> >> 1) Directly rely on guest_memfd for pinning -> no page refcounts taken
> >> >> by IOMMU stack
> >> > In TDX connect, TDX module and TDs do not trust VMM. So, it's the TDs to inform
> >> > TDX module about which pages are used by it for DMAs purposes.
> >> > So, if a page is regarded as pinned by TDs for DMA, the TDX module will fail the
> >> > unmap of the pages from S-EPT.
> >> >
> >> > If IOMMU side does not increase refcount, IMHO, some way to indicate that
> >> > certain PFNs are used by TDs for DMA is still required, so guest_memfd can
> >> > reject the request before attempting the actual unmap.
> >> > Otherwise, the unmap of TD-DMA-pinned pages will fail.
> >> >
> >> > Upon this kind of unmapping failure, it also doesn't help for host to retry
> >> > unmapping without unpinning from TD.
> >> >
> >> >
> >> 
> >> Yan, Yilun, would it work if, on conversion,
> >> 
> >> 1. guest_memfd notifies IOMMU that a conversion is about to happen for a
> >>    PFN range
> >
> > It is the Guest fw call to release the pinning.
> 
> I see, thanks for explaining.
> 
> > By the time VMM get the
> > conversion requirement, the page is already physically unpinned. So I
> > agree with Jason the pinning doesn't have to reach to iommu from SW POV.
> >
> 
> If by the time KVM gets the conversion request, the page is unpinned,
> then we're all good, right?

Yes, unless the guest mistakenly fails to unpin the page first. The
guest is expected to invoke the fw call tdg.mem.page.release to unpin
the page before issuing KVM_HC_MAP_GPA_RANGE.

> 
> When guest_memfd gets the conversion request, as part of conversion
> handling it will request to zap the page from stage-2 page tables. TDX
> module would see that the page is unpinned and the unmapping will
> proceed fine. Is that understanding correct?

Yes, again unless the guest doesn't unpin.
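
To make the ordering concrete, here is a toy model of the flow in plain
C. It is only a sketch: every name in it (toy_page, toy_guest_release,
toy_host_unmap, toy_convert_range) is made up for illustration and does
not exist in KVM, guest_memfd or the TDX module. It assumes the pin is
a per-page flag that only the guest can clear, and that a conversion
stops at the first page it fails to unmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in KVM or the TDX module. */
enum toy_result { TOY_OK = 0, TOY_ERR_PINNED = -1 };

struct toy_page {
	bool dma_pinned;	/* guest has pinned the page for DMA */
	bool mapped;		/* page is present in the modelled S-EPT */
};

/* Models the guest-side release call: only the guest drops the pin. */
static void toy_guest_release(struct toy_page *p)
{
	assert(p != NULL);
	p->dma_pinned = false;
}

/* Models the host-side zap: refused while the guest still holds the pin. */
static enum toy_result toy_host_unmap(struct toy_page *p)
{
	assert(p != NULL);
	if (p->dma_pinned)
		return TOY_ERR_PINNED;
	p->mapped = false;
	return TOY_OK;
}

/*
 * Models a private->shared conversion over a range of pages.
 * Assumption of this sketch: stop at the first failing page.
 */
static enum toy_result toy_convert_range(struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		enum toy_result r = toy_host_unmap(&pages[i]);

		if (r != TOY_OK)
			return r;
	}
	return TOY_OK;
}
```

In this model, calling toy_convert_range() on pages that are still
pinned returns TOY_ERR_PINNED and leaves them mapped, and retrying from
the host side changes nothing. Once toy_guest_release() has run on each
page, the same call succeeds.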

> 
> >> 2. IOMMU forwards the notification to TDX code in the kernel
> >> 3. TDX code in kernel tells TDX module to stop thinking of any PFNs in
> >>    the range as pinned for DMA?
> >
> > TDX host can't stop the pinning. Actually this mechanism is to prevent
> > host from unpin/unmap the DMA out of Guest expectation.
> >
> 
> On this note, I'd also like to check something else. Putting TDX connect
> and IOMMUs aside, if the host unmaps a guest private page today without
> the guest requesting it, the unmapping will work and the guest will be
> broken, right?

Correct. The unmapping will work, but the guest can't continue afterwards.

Thanks,
Yilun



