Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode


 



On Tue, Aug 19, 2025 at 01:28:46PM -0400, Yazen Ghannam wrote:
> On Tue, Aug 19, 2025 at 07:24:34PM +0300, Adrian Hunter wrote:
> > Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
> > check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
> > valid physical address bits within the machine check bank address register.
> > 
> > This is particularly needed in the case of errors in TDX/SEAM non-root mode
> > because the reported address contains the TDX KeyID.  Refer to TDX and
> > TME-MK documentation for more information about KeyIDs.
> > 
> > Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
> > non-root mode") uses the address to mark the affected page as poisoned, but
> > omits to use the aforementioned mask.
> > 
> > Investigation of user space expectations has concluded it would be more
> > correct for the address to contain only address bits in the first place.
> > Refer https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@xxxxxxxxx
> > 
> > Mask the address when it is read from the machine check bank address
> > register.  Do not use MCI_ADDR_PHYSADDR because that will be removed in a
> > later patch.
> > 
> > It is assumed __log_error() in arch/x86/kernel/cpu/mce/amd.c does not need
> > similar treatment.
> > 
> > Amend struct mce addr member description slightly to reflect that it is
> > not, and never has been, an exact copy of the bank's MCi_ADDR MSR.
> > 
> 
> I think it would be more accurate to say that the MCi_ADDR MSR is not,
> and never has been, guaranteed to be a system physical address.
> 
> We could introduce a new field that represents the system physical
> address, if one exists for the error type. This way we can operate on a
> value without assumption or additional checks. And we can keep the raw
> MCi_ADDR MSR value in case it is of value to debug folks or hardware
> designers. In my experience, they seem to appreciate having the full,
> unfiltered data. We don't give them that today, but we can work towards
> that goal.

Having an exact copy of MCi_ADDR might be useful. I recall some angst
about this code masking off low order bits:

		m->addr = mce_rdmsrq(mca_msr_reg(i, MCA_ADDR));

		/*
		 * Mask the reported address by the reported granularity.
		 */
		if (mca_cfg.ser && (m->status & MCI_STATUS_MISCV)) {
			u8 shift = MCI_MISC_ADDR_LSB(m->misc);
			m->addr >>= shift;
			m->addr <<= shift;
		}

This proposal masks some high order bits too.
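
For illustration only (a user-space sketch, not kernel code): the two
kinds of masking under discussion can be written as small helpers. The
names sketch_mask_low_bits() and sketch_mask_high_bits() are hypothetical,
and the phys_bits parameter assumes the high-order mask is derived from a
physical address width; neither the names nor that parameter come from
the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clear the low 'lsb' bits of an address, as the
 * shift pair in the snippet above does. 'lsb' is assumed to be in the
 * range 0..63; larger shifts would be undefined behaviour in C.
 */
static inline uint64_t sketch_mask_low_bits(uint64_t addr, unsigned int lsb)
{
	addr >>= lsb;
	addr <<= lsb;
	return addr;
}

/*
 * Hypothetical helper: keep only bits [phys_bits-1:0], dropping anything
 * above the assumed physical address width (for example KeyID bits).
 */
static inline uint64_t sketch_mask_high_bits(uint64_t addr, unsigned int phys_bits)
{
	if (phys_bits >= 64)
		return addr;	/* nothing above bit 63 to drop */
	return addr & ((UINT64_C(1) << phys_bits) - 1);
}
```

With lsb = 12, 0x12345678 becomes 0x12345000. With an assumed width of
46 bits, 0x0008000012345678 becomes 0x12345678. In both cases the
discarded bits cannot be recovered from the result.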

I second Yazen's suggestion of a new field. One for the raw value,
another for the massaged physical address derived from the MSR.
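
A rough user-space sketch of what a two-field layout could look like.
Everything here is hypothetical: struct sketch_mce_addr, its member
names, and sketch_fill_addr() are made up for illustration and are not
the existing struct mce layout. It also assumes a physical address is
always derivable, which, as Yazen notes, depends on the error type.

```c
#include <stdint.h>

/* Hypothetical record: keep the raw register value next to the derived one. */
struct sketch_mce_addr {
	uint64_t raw_addr;	/* unmodified MCi_ADDR value, kept for debug */
	uint64_t phys_addr;	/* masked value intended for use as an address */
};

/*
 * Hypothetical helper. 'lsb' is the reported granularity (assumed 0..63)
 * and 'phys_bits' is an assumed physical address width.
 */
static inline struct sketch_mce_addr
sketch_fill_addr(uint64_t msr_val, unsigned int lsb, unsigned int phys_bits)
{
	struct sketch_mce_addr a = {
		.raw_addr  = msr_val,
		.phys_addr = msr_val,
	};

	/* Drop high-order non-address bits (for example a KeyID). */
	if (phys_bits < 64)
		a.phys_addr &= (UINT64_C(1) << phys_bits) - 1;

	/* Drop low-order bits below the reported granularity. */
	a.phys_addr >>= lsb;
	a.phys_addr <<= lsb;

	return a;
}
```

Consumers that poison pages would read phys_addr, while raw_addr stays
available unfiltered for logging.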

-Tony



