On Fri, 2025-08-29 at 01:07 -0700, Ankur Arora wrote:
> Hi,
>
> This series adds waited variants of the smp_cond_load() primitives:
> smp_cond_load_relaxed_timewait(), and smp_cond_load_acquire_timewait().
>
> Why?: as the name suggests, the new interfaces are meant for contexts
> where you want to wait on a condition variable for a finite duration.
> This is easy enough to do with a loop around cpu_relax(). However,
> some architectures (ex. arm64) also allow waiting on a cacheline. So,
> these interfaces handle a mixture of spin/wait with a smp_cond_load()
> thrown in.
>
> There are two known users for these interfaces:
>
>  - poll_idle() [1]
>  - resilient queued spinlocks [2]
>
> The interfaces are:
>    smp_cond_load_relaxed_timewait(ptr, cond_expr, time_check_expr)
>    smp_cond_load_acquire_timewait(ptr, cond_expr, time_check_expr)
>
> The added parameter, time_check_expr, determines the bail out condition.
>
> Changelog:
>   v3 [3]:
>     - further interface simplifications (suggested by Catalin Marinas)
>
>   v2 [4]:
>     - simplified the interface (suggested by Catalin Marinas)
>        - get rid of wait_policy, and a multitude of constants
>        - adds a slack parameter
>       This helped remove a fair amount of duplicated code and in hindsight
>       unnecessary constants.
>
>   v1 [5]:
>      - add wait_policy (coarse and fine)
>      - derive spin-count etc at runtime instead of using arbitrary
>        constants.
>
> Haris Okanovic had tested an earlier version of this series with
> poll_idle()/haltpoll patches. [6]
>
> Any comments appreciated!
>
> Thanks!
> Ankur
>
> [1] https://lore.kernel.org/lkml/20241107190818.522639-3-ankur.a.arora@xxxxxxxxxx/
> [2] Uses the smp_cond_load_acquire_timewait() from v1
>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/rqspinlock.h
> [3] https://lore.kernel.org/lkml/20250627044805.945491-1-ankur.a.arora@xxxxxxxxxx/
> [4] https://lore.kernel.org/lkml/20250502085223.1316925-1-ankur.a.arora@xxxxxxxxxx/
> [5] https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@xxxxxxxxxx/
> [6] https://lore.kernel.org/lkml/f2f5d09e79539754ced085ed89865787fa668695.camel@xxxxxxxxxx
>
> Cc: Arnd Bergmann <arnd@xxxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>
> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> Cc: linux-arch@xxxxxxxxxxxxxxx
>
> Ankur Arora (5):
>   asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
>   arm64: barrier: Add smp_cond_load_relaxed_timewait()
>   arm64: rqspinlock: Remove private copy of
>     smp_cond_load_acquire_timewait
>   asm-generic: barrier: Add smp_cond_load_acquire_timewait()
>   rqspinlock: use smp_cond_load_acquire_timewait()
>
>  arch/arm64/include/asm/barrier.h    | 22 ++++++++
>  arch/arm64/include/asm/rqspinlock.h | 84 +----------------------------
>  include/asm-generic/barrier.h       | 57 ++++++++++++++++++++
>  include/asm-generic/rqspinlock.h    |  4 ++
>  kernel/bpf/rqspinlock.c             | 25 ++++-----
>  5 files changed, 93 insertions(+), 99 deletions(-)
>
> --
> 2.31.1
>

Tested on AWS Graviton 2, 3, and 4 (ARM64 Neoverse N1, V1, and V2) with
your V10 haltpoll changes, atop 6.17.0-rc3 (commit 07d9df8008).

Still seeing between 1.3x and 2.5x speedups in `perf bench sched pipe`
and `seccomp-notify`; no change in `messaging`.

Reviewed-by: Haris Okanovic <harisokn@xxxxxxxxxx>
Tested-by: Haris Okanovic <harisokn@xxxxxxxxxx>

Regards,
Haris Okanovic
AWS Graviton Software
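
P.S. For readers new to the thread, here is a rough userspace sketch of the
semantics described in the cover letter: spin on a relaxed load until
cond_expr holds or time_check_expr asks to bail out. This is not the code
from the series. The names COND_LOAD_RELAXED_TIMEWAIT_SKETCH,
cpu_relax_sketch(), now_ns() and wait_for_flag() are made up for
illustration, the macro is restricted to _Atomic uint32_t, it relies on
GCC/Clang statement expressions, and it checks the time on every iteration
rather than waiting on a cacheline the way arm64 can.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for cpu_relax(): only a compiler barrier here. */
static inline void cpu_relax_sketch(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Current time in nanoseconds. timespec_get() is plain C11 and reads the
 * wall clock; real code would want a monotonic clock instead.
 */
static uint64_t now_ns(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Illustrative only: load *ptr (relaxed) until cond_expr, which may refer
 * to VAL, becomes true, or until time_check_expr becomes true. Evaluates
 * to the last value loaded, so the caller must re-check the condition to
 * tell success from timeout.
 */
#define COND_LOAD_RELAXED_TIMEWAIT_SKETCH(ptr, cond_expr, time_check_expr) \
({									\
	uint32_t VAL;							\
									\
	for (;;) {							\
		VAL = atomic_load_explicit((ptr), memory_order_relaxed); \
		if (cond_expr)						\
			break;						\
		cpu_relax_sketch();					\
		if (time_check_expr)					\
			break;						\
	}								\
	VAL;								\
})

/* Wait up to timeout_ns for *flag to become non-zero. */
static uint32_t wait_for_flag(_Atomic uint32_t *flag, uint64_t timeout_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;

	return COND_LOAD_RELAXED_TIMEWAIT_SKETCH(flag, VAL != 0,
						 now_ns() >= deadline);
}
```

A caller would treat a zero return from wait_for_flag() as a timeout and a
non-zero return as the condition having been met. An acquire variant would
differ only in the ordering applied once the condition is satisfied.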