This series adds waited variants of the smp_cond_load() primitives:
smp_cond_load_relaxed_timeout() and smp_cond_load_acquire_timeout().

As the name suggests, the new interfaces are meant for contexts where
you want to wait on a condition variable for a finite duration. This
is easy enough to do with a loop around cpu_relax() and a periodic
timeout check (pretty much what we do in poll_idle()). However, some
architectures (ex. arm64) also allow waiting on a cacheline. So,

  smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)
  smp_cond_load_acquire_timeout(ptr, cond_expr, time_check_expr)

do a mixture of spin/wait with an smp_cond_load() thrown in. The added
parameter, time_check_expr, determines the bail out condition.

There are two current users of these interfaces. poll_idle() with
the change:

  poll_idle()
  {
	...
	time_end = local_clock_noinstr() + cpuidle_poll_time(drv, dev);

	raw_local_irq_enable();
	if (!current_set_polling_and_test())
		flags = smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
						      (VAL & _TIF_NEED_RESCHED),
						      ((local_clock_noinstr() >= time_end)));
	dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);
	raw_local_irq_disable();
	...
  }

where smp_cond_load_relaxed_timeout() replaces the inner loop in
poll_idle() (on x86 the generated code for both is similar):

  poll_idle()
  {
	...
	raw_local_irq_enable();
	if (!current_set_polling_and_test()) {
		unsigned int loop_count = 0;
		u64 limit;

		limit = cpuidle_poll_time(drv, dev);

		while (!need_resched()) {
			cpu_relax();
			if (loop_count++ < POLL_IDLE_RELAX_COUNT)
				continue;

			loop_count = 0;
			if (local_clock_noinstr() - time_start > limit) {
				dev->poll_time_limit = true;
				break;
			}
		}
	}
	raw_local_irq_disable();
	...
  }

And resilient queued spinlocks:

  resilient_queued_spin_lock_slowpath()
  {
	...
	if (val & _Q_LOCKED_MASK) {
		RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT);
		smp_cond_load_acquire_timeout(&lock->locked, !VAL,
					      (ret = check_timeout(lock, _Q_LOCKED_MASK, &ts)));
	}
	...
  }

Changelog:

  v4 [1]:
   - naming change 's/timewait/timeout/'
   - resilient spinlocks: get rid of res_smp_cond_load_acquire_waiting()
     and fixup use of RES_CHECK_TIMEOUT().
     (Both suggested by Catalin Marinas)

  v3 [2]:
   - further interface simplifications (suggested by Catalin Marinas)

  v2 [3]:
   - simplified the interface (suggested by Catalin Marinas)
      - get rid of wait_policy, and a multitude of constants
      - adds a slack parameter
     This helped remove a fair amount of code duplication and, in
     hindsight, unnecessary constants.

  v1 [4]:
   - add wait_policy (coarse and fine)
   - derive spin-count etc. at runtime instead of using arbitrary
     constants.

Haris Okanovic tested v4 of this series with the poll_idle()/haltpoll
patches [5].

Any comments appreciated!
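
To make the semantics concrete, here is a minimal sketch of what a
generic (spin-only) fallback for smp_cond_load_relaxed_timeout() could
look like. This is illustrative only, not the code in the series: the
actual fallback may amortize the time check over several spins, and the
arm64 variant can wait on the cacheline instead of spinning.

  /*
   * Sketch of a generic fallback: spin with cpu_relax(), reloading
   * *ptr until cond_expr holds or time_check_expr says we are done.
   * Returns the last value loaded, like smp_cond_load_relaxed().
   */
  #define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
  ({										\
	typeof(ptr) __PTR = (ptr);						\
	__unqual_scalar_typeof(*ptr) VAL;					\
										\
	for (;;) {								\
		VAL = READ_ONCE(*__PTR);					\
		if (cond_expr)							\
			break;							\
		cpu_relax();							\
		if (time_check_expr)						\
			break;							\
	}									\
	(typeof(*ptr))VAL;							\
  })

The acquire variant would presumably layer on top of this the same way
smp_cond_load_acquire() layers on smp_cond_load_relaxed(), i.e. with an
smp_acquire__after_ctrl_dep() once the wait completes.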
Thanks
Ankur

[1] https://lore.kernel.org/lkml/20250829080735.3598416-1-ankur.a.arora@xxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20250627044805.945491-1-ankur.a.arora@xxxxxxxxxx/
[3] https://lore.kernel.org/lkml/20250502085223.1316925-1-ankur.a.arora@xxxxxxxxxx/
[4] https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@xxxxxxxxxx/
[5] https://lore.kernel.org/lkml/2cecbf7fb23ee83a4ce027e1be3f46f97efd585c.camel@xxxxxxxxxx/

Cc: Arnd Bergmann <arnd@xxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>
Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: linux-arch@xxxxxxxxxxxxxxx

Ankur Arora (5):
  asm-generic: barrier: Add smp_cond_load_relaxed_timeout()
  arm64: barrier: Add smp_cond_load_relaxed_timeout()
  arm64: rqspinlock: Remove private copy of
    smp_cond_load_acquire_timewait
  asm-generic: barrier: Add smp_cond_load_acquire_timeout()
  rqspinlock: use smp_cond_load_acquire_timeout()

 arch/arm64/include/asm/barrier.h    | 23 ++++++++
 arch/arm64/include/asm/rqspinlock.h | 85 -----------------------------
 include/asm-generic/barrier.h       | 57 +++++++++++++++++++
 kernel/bpf/rqspinlock.c             | 23 +++-----
 4 files changed, 87 insertions(+), 101 deletions(-)

--
2.43.5