Xu Yilun wrote:
> On Thu, Jul 17, 2025 at 09:56:01AM -0700, Ackerley Tng wrote:
> > Xu Yilun <yilun.xu@xxxxxxxxxxxxxxx> writes:
> >
> > > On Wed, Jul 16, 2025 at 03:22:06PM -0700, Ackerley Tng wrote:
> > >> Yan Zhao <yan.y.zhao@xxxxxxxxx> writes:
> > >>
> > >> > On Tue, Jun 24, 2025 at 07:10:38AM -0700, Vishal Annapurve wrote:
> > >> >> On Tue, Jun 24, 2025 at 6:08 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > >> >> >
> > >> >> > On Tue, Jun 24, 2025 at 06:23:54PM +1000, Alexey Kardashevskiy wrote:
> > >> >> >
> > >> >> > > Now, I am rebasing my RFC on top of this patchset and it fails in
> > >> >> > > kvm_gmem_has_safe_refcount() as IOMMU holds references to all these
> > >> >> > > folios in my RFC.
> > >> >> > >
> > >> >> > > So what is the expected sequence here? The userspace unmaps a DMA
> > >> >> > > page and maps it back right away, all from the userspace? The end
> > >> >> > > result will be the exactly same which seems useless. And IOMMU TLB
> > >> >>
> > >> >> As Jason described, ideally IOMMU just like KVM, should just:
> > >> >> 1) Directly rely on guest_memfd for pinning -> no page refcounts taken
> > >> >> by IOMMU stack
> > >> > In TDX connect, TDX module and TDs do not trust VMM. So, it's the TDs to inform
> > >> > TDX module about which pages are used by it for DMAs purposes.
> > >> > So, if a page is regarded as pinned by TDs for DMA, the TDX module will fail the
> > >> > unmap of the pages from S-EPT.
> > >> >
> > >> > If IOMMU side does not increase refcount, IMHO, some way to indicate that
> > >> > certain PFNs are used by TDs for DMA is still required, so guest_memfd can
> > >> > reject the request before attempting the actual unmap.
> > >> > Otherwise, the unmap of TD-DMA-pinned pages will fail.
> > >> >
> > >> > Upon this kind of unmapping failure, it also doesn't help for host to retry
> > >> > unmapping without unpinning from TD.
> > >> >
> > >> >
> > >>
> > >> Yan, Yilun, would it work if, on conversion,
> > >>
> > >> 1. guest_memfd notifies IOMMU that a conversion is about to happen for a
> > >> PFN range
> > >
> > > It is the Guest fw call to release the pinning.
> >
> > I see, thanks for explaining.
> >
> > > By the time VMM get the
> > > conversion requirement, the page is already physically unpinned. So I
> > > agree with Jason the pinning doesn't have to reach to iommu from SW POV.
> > >
> >
> > If by the time KVM gets the conversion request, the page is unpinned,
> > then we're all good, right?
>
> Yes, unless guest doesn't unpin the page first by mistake.

Or maliciously?  :-(

My initial response to this was that this is a bug and we don't need to
be concerned with it.  However, couldn't this be a DoS, where one TD
crashes the system if the host uses the private page for something else
and the machine #MCs?

Ira

> Guest would
> invoke a fw call tdg.mem.page.release to unpin the page before
> KVM_HC_MAP_GPA_RANGE.
>
> >
> > When guest_memfd gets the conversion request, as part of conversion
> > handling it will request to zap the page from stage-2 page tables. TDX
> > module would see that the page is unpinned and the unmapping will
> > proceed fine. Is that understanding correct?
>
> Yes, again unless guest doesn't unpin.
>
> >
> > >> 2. IOMMU forwards the notification to TDX code in the kernel
> > >> 3. TDX code in kernel tells TDX module to stop thinking of any PFNs in
> > >> the range as pinned for DMA?
> > >
> > > TDX host can't stop the pinning. Actually this mechanism is to prevent
> > > host from unpin/unmap the DMA out of Guest expectation.
> > >
> >
> > On this note, I'd also like to check something else. Putting TDX connect
> > and IOMMUs aside, if the host unmaps a guest private page today without
> > the guest requesting it, the unmapping will work and the guest will be
> > broken, right?
>
> Correct. The unmapping will work, the guest can't continue anymore.
>
> Thanks,
> Yilun
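
For readers following the thread, the ordering discussed above (the guest
releases the DMA pin first, then the host unmaps during conversion) can be
sketched as a toy model. This is only an illustration under stated
assumptions: ToyTdxModule, guest_pin_for_dma, guest_release, host_unmap and
convert_to_shared are hypothetical names invented here, not kernel or TDX
module interfaces, and the only behaviour modelled is the one stated in the
thread, namely that an unmap of a page pinned by the TD for DMA fails.

```python
class ToyTdxModule:
    """Toy stand-in for the pin tracking described in the thread.

    All names are hypothetical; this is not the TDX module ABI.
    """

    def __init__(self):
        self.mapped = set()      # PFNs currently mapped private for the guest
        self.dma_pinned = set()  # PFNs the guest has declared in use for DMA

    def map_private(self, pfn):
        self.mapped.add(pfn)

    def guest_pin_for_dma(self, pfn):
        # Only the guest can pin; the host is not trusted to do so.
        if pfn not in self.mapped:
            raise ValueError("cannot pin an unmapped page")
        self.dma_pinned.add(pfn)

    def guest_release(self, pfn):
        # Models the guest-side release call made before requesting conversion.
        self.dma_pinned.discard(pfn)

    def host_unmap(self, pfn):
        # The host cannot override a guest pin: the unmap simply fails.
        if pfn in self.dma_pinned:
            return False
        self.mapped.discard(pfn)
        return True


def convert_to_shared(tdx, pfn):
    """Host-side conversion step: succeeds only if the unmap succeeds."""
    return tdx.host_unmap(pfn)


# Well-behaved guest: release the pin first, then conversion proceeds.
tdx = ToyTdxModule()
tdx.map_private(0x1000)
tdx.guest_pin_for_dma(0x1000)
tdx.guest_release(0x1000)
good = convert_to_shared(tdx, 0x1000)

# Guest that skips the release, by mistake or on purpose: the unmap fails,
# and retrying from the host does not help while the pin is still held.
tdx.map_private(0x2000)
tdx.guest_pin_for_dma(0x2000)
bad_first = convert_to_shared(tdx, 0x2000)
bad_retry = convert_to_shared(tdx, 0x2000)
```

In this model the host has no call that clears dma_pinned, which mirrors the
point that the TDX host cannot stop the pinning. What the host should do when
the unmap fails is the open question raised above and is deliberately not
modelled.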