Re: [PATCH v2 4/5] KVM: selftests: Relax precise event count validation as overcount issue

On Fri, Jul 18, 2025, Dapeng Mi wrote:
> From: dongsheng <dongsheng.x.zhang@xxxxxxxxx>
> 
> For Intel Atom CPUs, the PMU events "Instruction Retired" or
> "Branch Instruction Retired" may be overcounted for certain
> instructions, like FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
> and complex SGX/SMX/CSTATE instructions/flows.
> 
> The detailed information can be found in the errata (section SRF7):
> https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
> 
> For Atom platforms up to and including Sierra Forest, both events,
> "Instruction Retired" and "Branch Instruction Retired", are
> overcounted on these instructions, but on Clearwater Forest only the
> "Instruction Retired" event is overcounted.
> 
> Because VM-Exit/VM-Entry trigger the overcount, there is no way to
> validate the precise count for these 2 events on the affected Atom
> platforms, so relax the precise event count check for these 2 events
> on these platforms.
> 
> Signed-off-by: dongsheng <dongsheng.x.zhang@xxxxxxxxx>
> Co-developed-by: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
> Signed-off-by: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
> Tested-by: Yi Lai <yi1.lai@xxxxxxxxx>
> ---

...

> diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> index 342a72420177..074cdf323406 100644
> --- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> @@ -52,6 +52,9 @@ struct kvm_intel_pmu_event {
>  	struct kvm_x86_pmu_feature fixed_event;
>  };
>  
> +
> +static uint8_t inst_overcount_flags;
> +
>  /*
>   * Wrap the array to appease the compiler, as the macros used to construct each
>   * kvm_x86_pmu_feature use syntax that's only valid in function scope, and the
> @@ -163,10 +166,18 @@ static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr
>  
>  	switch (idx) {
>  	case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
> -		GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> +		/* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
> +		if (inst_overcount_flags & INST_RETIRED_OVERCOUNT)
> +			GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
> +		else
> +			GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
>  		break;
>  	case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
> -		GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
> +		/* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
> +		if (inst_overcount_flags & BR_RETIRED_OVERCOUNT)
> +			GUEST_ASSERT(count >= NUM_BRANCH_INSNS_RETIRED);
> +		else
> +			GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
>  		break;
>  	case INTEL_ARCH_LLC_REFERENCES_INDEX:
>  	case INTEL_ARCH_LLC_MISSES_INDEX:
> @@ -335,6 +346,7 @@ static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
>  				length);
>  	vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
>  				unavailable_mask);
> +	sync_global_to_guest(vm, inst_overcount_flags);

Rather than force individual tests to sync_global_to_guest(), and to cache the
value, I think it makes sense to handle this automatically in kvm_arch_vm_post_create(),
similar to things like host_cpu_is_intel and host_cpu_is_amd.

And explicitly call these out as errata, so that it's super clear that we're
working around PMU/CPU flaws, not KVM bugs.  With some shenanigans, we can even
reuse the this_pmu_has()/this_cpu_has() terminology as this_pmu_has_errata(), and
hide the use of a bitmask too.
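
To illustrate the idea, here is a minimal standalone sketch of how the bitmask
could be hidden behind this_pmu_has_errata().  The enum names mirror the diff
below; pmu_errata_mask, set_pmu_errata() and pmu_count_matches() are
hypothetical names used only for this example, not existing selftests APIs.
In the real selftests the mask would presumably be computed once on the host
and synced to the guest from kvm_arch_vm_post_create().

```c
#include <stdbool.h>
#include <stdint.h>

/* Errata identifiers; each one is a bit position in the mask (assumed layout). */
enum pmu_errata {
	INSTRUCTIONS_RETIRED_OVERCOUNT,
	BRANCHES_RETIRED_OVERCOUNT,
};

/* Hypothetical global: filled in once on the host, then synced to the guest. */
static uint64_t pmu_errata_mask;

/* Hypothetical host-side helper: record that the CPU has the given erratum. */
static inline void set_pmu_errata(enum pmu_errata errata)
{
	pmu_errata_mask |= UINT64_C(1) << errata;
}

/* Callers only see the errata name; the bitmask stays an implementation detail. */
static inline bool this_pmu_has_errata(enum pmu_errata errata)
{
	return pmu_errata_mask & (UINT64_C(1) << errata);
}

/*
 * Hypothetical helper: require an exact count on unaffected CPUs, and accept
 * any count at or above the expected value when the erratum is present.
 */
static inline bool pmu_count_matches(enum pmu_errata errata, uint64_t count,
				     uint64_t expected)
{
	if (this_pmu_has_errata(errata))
		return count >= expected;

	return count == expected;
}
```

With something along these lines, the test code never touches the mask
directly, and each relaxed check names the erratum that justifies it.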

diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
index d4f90f5ec5b8..046d992c5940 100644
--- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
@@ -163,10 +163,18 @@ static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr
 
        switch (idx) {
        case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
-               GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
+               /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
+               if (this_pmu_has_errata(INSTRUCTIONS_RETIRED_OVERCOUNT))
+                       GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
+               else
+                       GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
                break;
        case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
-               GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
+               /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
+               if (this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT))
+                       GUEST_ASSERT(count >= NUM_BRANCH_INSNS_RETIRED);
+               else
+                       GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
                break;
        case INTEL_ARCH_LLC_REFERENCES_INDEX:
        case INTEL_ARCH_LLC_MISSES_INDEX:
diff --git a/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
index c15513cd74d1..1c5b7611db24 100644
--- a/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
@@ -214,8 +214,10 @@ static void remove_event(struct __kvm_pmu_event_filter *f, uint64_t event)
 do {                                                                                   \
        uint64_t br = pmc_results.branches_retired;                                     \
        uint64_t ir = pmc_results.instructions_retired;                                 \
+       bool br_matched = this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT) ?             \
+                         br >= NUM_BRANCHES : br == NUM_BRANCHES;                      \
                                                                                        \
-       if (br && br != NUM_BRANCHES)                                                   \
+       if (br && !br_matched)                                                          \
                pr_info("%s: Branch instructions retired = %lu (expected %u)\n",        \
                        __func__, br, NUM_BRANCHES);                                    \
        TEST_ASSERT(br, "%s: Branch instructions retired = %lu (expected > 0)",         \



