Re: [RFC PATCH v2 00/51] 1G page support for guest_memfd

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 08, 2025, Vishal Annapurve wrote:
> On Tue, Jul 8, 2025 at 7:52 AM Edgecombe, Rick P
> <rick.p.edgecombe@xxxxxxxxx> wrote:
> >
> > On Tue, 2025-07-08 at 07:20 -0700, Sean Christopherson wrote:
> > > > For TDX if we don't zero on conversion from private->shared we will be
> > > > dependent
> > > > on behavior of the CPU when reading memory with keyid 0, which was
> > > > previously
> > > > encrypted and has some protection bits set. I don't *think* the behavior is
> > > > architectural. So it might be prudent to either make it so, or zero it in
> > > > the
> > > > kernel in order to not make non-architectual behavior into userspace ABI.
> > >
> > > Ya, by "vendor specific", I was also lumping in cases where the kernel would
> > > need to zero memory in order to not end up with effectively undefined
> > > behavior.
> >
> > Yea, more of an answer to Vishal's question about if CC VMs need zeroing. And
> > the answer is sort of yes, even though TDX doesn't require it. But we actually
> > don't want to zero memory when reclaiming memory. So TDX KVM code needs to know
> > that the operation is a to-shared conversion and not another type of private
> > zap. Like a callback from gmem, or maybe more simply a kernel internal flag to
> > set in gmem such that it knows it should zero it.
> 
> If the answer is that "always zero on private to shared conversions"
> for all CC VMs,

pKVM VMs *are* CoCo VMs.  Just because pKVM doesn't rely on third party firmware
to provide confidentiality and integrity doesn't make it any less of a CoCo VM.

> > >  : And maybe a new flag for KVM_GMEM_CONVERT_PRIVATE for user space to
> > >  : explicitly request that the page range is converted to private and the
> > >  : content needs to be retained. So that TDX can identify which case needs
> > >  : to call in-place TDH.PAGE.ADD.
> > >
> > > If so, I agree with that idea, e.g. add a PRESERVE flag or whatever.  That way
> > > userspace has explicit control over what happens to the data during
> > > conversion,
> > > and KVM can reject unsupported conversions, e.g. PRESERVE is only allowed for
> > > shared => private and only for select VM types.
> >
> > Ok, we should POC how it works with TDX.
> 
> I don't think we need a flag to preserve memory as I mentioned in [2]. IIUC,
> 1) Conversions are always content-preserving for pKVM.

No?  Perserving contents on private => shared is a security vulnerability waiting
to happen.

> 2) Shared to private conversions are always content-preserving for all
> VMs as far as guest_memfd is concerned.

There is no "as far as guest_memfd is concerned".  Userspace doesn't care whether
code lives in guest_memfd.c versus arch/xxx/kvm, the only thing that matters is
the behavior that userspace sees.  I don't want to end up with userspace ABI that
is vendor/VM specific.

> 3) Private to shared conversions are not content-preserving for CC VMs
> as far as guest_memfd is concerned, subject to more discussions.
> 
> [2] https://lore.kernel.org/lkml/CAGtprH-Kzn2kOGZ4JuNtUT53Hugw64M-_XMmhz_gCiDS6BAFtQ@xxxxxxxxxxxxxx/





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux