Re: [PATCH] kvm: x86: fix infinite loop in kvm_guest_time_update when tsc is 0

Sean Christopherson <seanjc@xxxxxxxxxx> · Wed, 25 Jun 2025 07:45:59 -0700

On Wed, May 14, 2025, Yuntao Liu wrote:
> Call Trace:
>  <TASK>
>  kvm_get_time_scale arch/x86/kvm/x86.c:2458 [inline]
>  kvm_guest_time_update+0x926/0xb00 arch/x86/kvm/x86.c:3268
>  vcpu_enter_guest.constprop.0+0x1e70/0x3cf0 arch/x86/kvm/x86.c:10678
>  vcpu_run+0x129/0x8d0 arch/x86/kvm/x86.c:11126
>  kvm_arch_vcpu_ioctl_run+0x37a/0x13d0 arch/x86/kvm/x86.c:11352
>  kvm_vcpu_ioctl+0x56b/0xe60 virt/kvm/kvm_main.c:4188
>  vfs_ioctl fs/ioctl.c:51 [inline]
>  __do_sys_ioctl fs/ioctl.c:871 [inline]
>  __se_sys_ioctl+0x12d/0x190 fs/ioctl.c:857
>  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>  do_syscall_64+0x59/0x110 arch/x86/entry/common.c:81
>  entry_SYSCALL_64_after_hwframe+0x78/0xe2
> 
> ioctl$KVM_SET_TSC_KHZ(r2, 0xaea2, 0x1)
> user_tsc_khz = 0x1
> 	|
> kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz)
> 	|
> 	ioctl$KVM_RUN(r2, 0xae80, 0x0)
> 		|
> 		...
> 	kvm_guest_time_update(struct kvm_vcpu *v)
> 		|
> 		if (kvm_caps.has_tsc_control)
> 			tgt_tsc_khz = kvm_scale_tsc(tgt_tsc_khz,
> 					    v->arch.l1_tsc_scaling_ratio);
> 			|
> 			kvm_scale_tsc(u64 tsc, u64 ratio)
> 			|
> 			__scale_tsc(u64 ratio, u64 tsc)
> 			ratio=122380531, tsc=2299998, N=48
> 			ratio*tsc >> N = 0.999... -> 0
> 			|
> 		kvm_get_time_scale
> 
> __scale_tsc() computes the scaled TSC with fixed-point arithmetic, so
> some precision is lost: a true result of 0.999... is truncated to 0.
> kvm_get_time_scale() is then called with tps32 = tps64 = base_hz = 0,
> and its second while loop never terminates. When CONFIG_PREEMPT=n,
> this causes a soft lockup.
> 
> Fixes: 35181e86df97 ("KVM: x86: Add a common TSC scaling function")
> Signed-off-by: Yuntao Liu <liuyuntao12@xxxxxxxxxx>
> ---
>  arch/x86/kvm/x86.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1fa5d89f8d27..3e9d6f368eed 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2605,10 +2605,14 @@ static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu)
>   * point number (mult + frac * 2^(-N)).
>   *
>   * N equals to kvm_caps.tsc_scaling_ratio_frac_bits.
> + *
> + * return 1 if _tsc is 0.
>   */
>  static inline u64 __scale_tsc(u64 ratio, u64 tsc)
>  {
> -	return mul_u64_u64_shr(tsc, ratio, kvm_caps.tsc_scaling_ratio_frac_bits);
> +	u64 _tsc = mul_u64_u64_shr(tsc, ratio, kvm_caps.tsc_scaling_ratio_frac_bits);
> +
> +	return  !_tsc ? 1 : _tsc;

This can be

	return _tsc ? : 1;

However, I'm 99% certain this only affects kvm_guest_time_update(), because it's
the only code that scales a TSC *frequency*, versus scaling a TSC value.  Hmm,
kvm_x86_vendor_init() also scales a frequency, but the multiplier and shift are
KVM controlled, so that calculation can never be '0'.
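
For anyone wanting to check the numbers from the report, here's a rough
userspace sketch, not kernel code.  The sketch_* helpers are hypothetical
stand-ins: sketch_mul_u64_u64_shr() approximates mul_u64_u64_shr() using the
GCC/Clang __int128 extension, frac_bits is passed explicitly instead of being
read from kvm_caps, and the loop is a simplified, iteration-capped rendition
of the second loop in kvm_get_time_scale(), not a verbatim copy.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for mul_u64_u64_shr(): (a * b) >> shift, computed with
 * a 128-bit intermediate. Assumes the GCC/Clang __int128 extension.
 */
static uint64_t sketch_mul_u64_u64_shr(uint64_t a, uint64_t b, unsigned int shift)
{
	assert(shift < 128);	/* shifting by >= the type width is undefined */
	return (uint64_t)(((unsigned __int128)a * b) >> shift);
}

/* Same arithmetic as __scale_tsc(), with frac_bits passed in explicitly. */
static uint64_t sketch_scale_tsc(uint64_t ratio, uint64_t tsc, unsigned int frac_bits)
{
	return sketch_mul_u64_u64_shr(tsc, ratio, frac_bits);
}

/* Frequency-only variant: clamp a scaled frequency of 0 up to 1 kHz. */
static uint64_t sketch_scale_tsc_khz(uint64_t ratio, uint64_t khz, unsigned int frac_bits)
{
	uint64_t scaled = sketch_scale_tsc(ratio, khz, frac_bits);

	return scaled ? scaled : 1;
}

/*
 * Simplified rendition of the second loop in kvm_get_time_scale(), capped at
 * max_iters so that it returns instead of hanging. Returns 1 if the loop
 * condition becomes false within max_iters iterations, 0 otherwise.
 */
static int sketch_second_loop_terminates(uint64_t scaled_hz, uint32_t tps32,
					 unsigned int max_iters)
{
	uint64_t scaled64 = scaled_hz;
	unsigned int i;

	for (i = 0; i < max_iters; i++) {
		if (!(tps32 <= scaled64 || (scaled64 & 0xffffffff00000000ULL)))
			return 1;
		if ((scaled64 & 0xffffffff00000000ULL) || (tps32 & 0x80000000U))
			scaled64 >>= 1;
		else
			tps32 <<= 1;	/* 0 << 1 is still 0, so no progress */
	}
	return 0;
}
```

With ratio=122380531, tsc=2299998, N=48 from the report, sketch_scale_tsc()
yields 0 and sketch_scale_tsc_khz() yields 1.  Feeding the loop a base of 0
(sketch_second_loop_terminates(1000000000, 0, 64)) never exits, whereas the
clamped 1 kHz, i.e. a base of 1000 Hz, exits after 20 doublings.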

So I think just this for a fix?  Because in all other cases, a result of '0' is
totally fine, and arguably even more correct, e.g. when used in adjust_tsc_offset_host().

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b58a74c1722d..de51dbd85a58 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3258,9 +3258,11 @@ int kvm_guest_time_update(struct kvm_vcpu *v)
 
        /* With all the info we got, fill in the values */
 
-       if (kvm_caps.has_tsc_control)
+       if (kvm_caps.has_tsc_control) {
                tgt_tsc_khz = kvm_scale_tsc(tgt_tsc_khz,
                                            v->arch.l1_tsc_scaling_ratio);
+               tgt_tsc_khz = tgt_tsc_khz ? : 1;
+       }
 
        if (unlikely(vcpu->hw_tsc_khz != tgt_tsc_khz)) {
                kvm_get_time_scale(NSEC_PER_SEC, tgt_tsc_khz * 1000LL,