On Mon, Aug 11, 2025 at 01:45:46PM -0700, Sean Christopherson wrote:
> On Mon, Aug 11, 2025, Yury Norov wrote:
> > Testing cpumask for a CPU to be cleared just before setting the exact
> > same CPU is useless because the end result is always the same: CPU is
> > set.
>
> No, it is not useless. Blindly writing to the variable will unnecessarily bounce
> the cacheline, and this is a hot path.

How hot is that path? How bad is the cache contention? Is there any
evidence that the conditional cpumask_set_cpu() is worth the effort?
The original patch doesn't discuss that at all, and without a comment
the code just looks buggy.

> > While there, switch the CPU setter to a non-atomic version. Atomicity is
> > useless here
>
> No, atomicity isn't useless here either. Dropping atomicity could result in a
> CPU's bit being lost. I.e. the atomic accesses aren't for the benefit of
> smp_call_function_many_cond(); the writes are atomic so that multiple vCPUs can
> concurrently update the mask without needing additional protection.

OK, I see. Something heavy hit my head before I decided to drop
atomicity there.

> > because sev_writeback_caches() ends up with a plain
> > for_each_cpu() loop in smp_call_function_many_cond(), which is not
> > atomic by nature.
>
> That's fine. As noted in sev_writeback_caches(), if a vCPU could be running, then
> the caller is responsible for ensuring that all vCPUs flush caches before the
> memory being reclaimed is fully freed. Those guarantees are provided by KVM's
> MMU.
>
> sev_writeback_caches() => smp_call_function_many_cond() could hit false positives,
> i.e. trigger WBINVD on CPUs that couldn't possibly have accessed the memory being
> reclaimed, but such false positives are functionally benign, and are "intended"
> in the sense that we chose to prioritize simplicity over precision.

So, I don't object to dropping the patch, but it would be really nice
to have this

	if (!cpumask_test_cpu())
		cpumask_set_cpu()

pattern explained, and even better supported with performance numbers.

Thanks,
Yury
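
For reference, here is a minimal sketch of the test-before-set pattern
being discussed. The names (mark_cpu_ran, have_run_cpus) are
illustrative assumptions, not the actual KVM code. The point is that
cpumask_test_cpu() is a plain read, so the common "bit already set"
case keeps the cacheline in shared state; only the first call on a
given CPU performs the atomic set, which pulls the line exclusive:

	#include <linux/cpumask.h>
	#include <linux/smp.h>

	/*
	 * Illustrative sketch only -- not the actual KVM code.
	 * Record that the current CPU has run a vCPU.  The read-only
	 * test avoids dirtying the shared cacheline on every call;
	 * the set itself stays atomic (set_bit() under the hood) so
	 * that concurrent callers on different CPUs can't lose a bit.
	 */
	static void mark_cpu_ran(struct cpumask *have_run_cpus)
	{
		int cpu = get_cpu();	/* disable preemption, get CPU id */

		if (!cpumask_test_cpu(cpu, have_run_cpus))
			cpumask_set_cpu(cpu, have_run_cpus);

		put_cpu();
	}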