Okanovic, Haris <harisokn@xxxxxxxxxx> writes:

> On Fri, 2025-08-29 at 01:07 -0700, Ankur Arora wrote:
>> Hi,
>>
>> This series adds waited variants of the smp_cond_load() primitives:
>> smp_cond_load_relaxed_timewait(), and smp_cond_load_acquire_timewait().
>>
>> Why?: as the name suggests, the new interfaces are meant for contexts
>> where you want to wait on a condition variable for a finite duration.
>> This is easy enough to do with a loop around cpu_relax(). However,
>> some architectures (e.g. arm64) also allow waiting on a cacheline. So,
>> these interfaces handle a mixture of spin/wait with a smp_cond_load()
>> thrown in.
>>
>> There are two known users for these interfaces:
>>
>>  - poll_idle() [1]
>>  - resilient queued spinlocks [2]
>>
>> The interfaces are:
>>    smp_cond_load_relaxed_timewait(ptr, cond_expr, time_check_expr)
>>    smp_cond_load_acquire_timewait(ptr, cond_expr, time_check_expr)
>>
>> The added parameter, time_check_expr, determines the bail out condition.
>>
>> Changelog:
>>   v3 [3]:
>>     - further interface simplifications (suggested by Catalin Marinas)
>>
>>   v2 [4]:
>>     - simplified the interface (suggested by Catalin Marinas)
>>        - get rid of wait_policy, and a multitude of constants
>>        - adds a slack parameter
>>       This helped remove a fair amount of duplicated code and, in hindsight,
>>       unnecessary constants.
>>
>>   v1 [5]:
>>      - add wait_policy (coarse and fine)
>>      - derive spin-count etc at runtime instead of using arbitrary
>>        constants.
>>
>> Haris Okanovic had tested an earlier version of this series with
>> poll_idle()/haltpoll patches. [6]
>>
>> Any comments appreciated!
>>
>> Thanks!
>> Ankur
>>
>> [1] https://lore.kernel.org/lkml/20241107190818.522639-3-ankur.a.arora@xxxxxxxxxx/
>> [2] Uses the smp_cond_load_acquire_timewait() from v1
>>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/rqspinlock.h
>> [3] https://lore.kernel.org/lkml/20250627044805.945491-1-ankur.a.arora@xxxxxxxxxx/
>> [4] https://lore.kernel.org/lkml/20250502085223.1316925-1-ankur.a.arora@xxxxxxxxxx/
>> [5] https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@xxxxxxxxxx/
>> [6] https://lore.kernel.org/lkml/f2f5d09e79539754ced085ed89865787fa668695.camel@xxxxxxxxxx
>>
>> Cc: Arnd Bergmann <arnd@xxxxxxxx>
>> Cc: Will Deacon <will@xxxxxxxxxx>
>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Cc: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>
>> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
>> Cc: linux-arch@xxxxxxxxxxxxxxx
>>
>> Ankur Arora (5):
>>   asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
>>   arm64: barrier: Add smp_cond_load_relaxed_timewait()
>>   arm64: rqspinlock: Remove private copy of
>>     smp_cond_load_acquire_timewait
>>   asm-generic: barrier: Add smp_cond_load_acquire_timewait()
>>   rqspinlock: use smp_cond_load_acquire_timewait()
>>
>>  arch/arm64/include/asm/barrier.h    | 22 ++++++++
>>  arch/arm64/include/asm/rqspinlock.h | 84 +----------------------------
>>  include/asm-generic/barrier.h       | 57 ++++++++++++++++++++
>>  include/asm-generic/rqspinlock.h    |  4 ++
>>  kernel/bpf/rqspinlock.c             | 25 ++++-----
>>  5 files changed, 93 insertions(+), 99 deletions(-)
>>
>> --
>> 2.31.1
>>
>
> Tested on AWS Graviton 2, 3, and 4 (ARM64 Neoverse N1, V1, and V2) with
> your V10 haltpoll changes, atop 6.17.0-rc3 (commit 07d9df8008).
> Still seeing between 1.3x and 2.5x speedups in `perf bench sched pipe`
> and `seccomp-notify`; no change in `messaging`.

Great.

> Reviewed-by: Haris Okanovic <harisokn@xxxxxxxxxx>
> Tested-by: Haris Okanovic <harisokn@xxxxxxxxxx>

Thank you.

--
ankur
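
For readers who have not followed the earlier versions, the behaviour the cover letter describes for smp_cond_load_relaxed_timewait(ptr, cond_expr, time_check_expr) can be pictured with a rough userspace analogue. This is only a sketch under stated assumptions and not the code from the series: the function name, the function-pointer condition, the absolute deadline argument, and the SPIN_PER_TIME_CHECK constant are all hypothetical, and C11 atomics stand in for the kernel primitives. The kernel interfaces are macros that take expressions, and on arm64 they may wait on the cacheline instead of spinning.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical: number of spins between time checks (not from the series). */
#define SPIN_PER_TIME_CHECK 200

/* Wall-clock nanoseconds via C11 timespec_get(); a kernel would use its own clock. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Example condition: true once the loaded value is non-zero. */
static bool is_nonzero(unsigned int val)
{
	return val != 0;
}

/*
 * Userspace analogue of a timed conditional load (hypothetical name).
 * Reloads *ptr with relaxed ordering until cond(val) holds or the
 * deadline passes, and returns the last value loaded either way.
 */
static unsigned int cond_load_relaxed_timewait_sketch(_Atomic unsigned int *ptr,
						      bool (*cond)(unsigned int),
						      uint64_t deadline_ns)
{
	unsigned int val;
	unsigned int spin = 0;

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (cond(val))
			break;
		/* The kernel would cpu_relax() here, or wait on the cacheline. */
		if (++spin < SPIN_PER_TIME_CHECK)
			continue;
		spin = 0;
		/* Plays the role of time_check_expr: the bail-out condition. */
		if (now_ns() >= deadline_ns)
			break;
	}
	return val;
}
```

A caller would pass something like now_ns() + 1000000 as the deadline and then re-test the condition on the returned value, since a return only means that either the condition held or the time check fired. Checking the clock only every few hundred spins keeps the cost of the time check out of the fast path.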