On Thu, Jun 26, 2025 at 09:48:01PM -0700, Ankur Arora wrote:
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index d4f581c1e21d..d33c2701c9ee 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -273,6 +273,101 @@ do { \
>  })
>  #endif
>
> +#ifndef SMP_TIMEWAIT_SPIN_BASE
> +#define SMP_TIMEWAIT_SPIN_BASE 16
> +#endif
> +
> +/*
> + * Policy handler that adjusts the number of times we spin or
> + * wait for cacheline to change before evaluating the time-expr.
> + *
> + * The generic version only supports spinning.
> + */
> +static inline u64 ___smp_cond_spinwait(u64 now, u64 prev, u64 end,
> +                                       u32 *spin, bool *wait, u64 slack)
> +{
> +        if (now >= end)
> +                return 0;
> +
> +        *spin = SMP_TIMEWAIT_SPIN_BASE;
> +        *wait = false;
> +        return now;
> +}
> +
> +#ifndef __smp_cond_policy
> +#define __smp_cond_policy ___smp_cond_spinwait
> +#endif
> +
> +/*
> + * Non-spin primitive that allows waiting for stores to an address,
> + * with support for a timeout. This works in conjunction with an
> + * architecturally defined policy.
> + */
> +#ifndef __smp_timewait_store
> +#define __smp_timewait_store(ptr, val) do { } while (0)
> +#endif
> +
> +#ifndef __smp_cond_load_relaxed_timewait
> +#define __smp_cond_load_relaxed_timewait(ptr, cond_expr, policy, \
> +                                         time_expr, time_end, \
> +                                         slack) ({ \
> +        typeof(ptr) __PTR = (ptr); \
> +        __unqual_scalar_typeof(*ptr) VAL; \
> +        u32 __n = 0, __spin = SMP_TIMEWAIT_SPIN_BASE; \
> +        u64 __prev = 0, __end = (time_end); \
> +        u64 __slack = slack; \
> +        bool __wait = false; \
> + \
> +        for (;;) { \
> +                VAL = READ_ONCE(*__PTR); \
> +                if (cond_expr) \
> +                        break; \
> +                cpu_relax(); \
> +                if (++__n < __spin) \
> +                        continue; \
> +                if (!(__prev = policy((time_expr), __prev, __end, \
> +                                      &__spin, &__wait, __slack))) \
> +                        break; \
> +                if (__wait) \
> +                        __smp_timewait_store(__PTR, VAL); \
> +                __n = 0; \
> +        } \
> +        (typeof(*ptr))VAL; \
> +})
> +#endif

TBH, this still looks over-engineered to me, especially with the second
patch trying to reduce the spin loops based on the remaining time. Do
any of the current users of this interface need it to be more precise?

Also, I feel the spinning added to poll_idle() is more of an
architecture choice, as some CPUs could not cope with local_clock()
being called too frequently. The above generic implementation takes a
spin into consideration even if an arch implementation doesn't need it
(e.g. WFET or WFE). Yes, the arch policy could set a spin of 0, but it
feels overly complicated for the generic implementation.

Can we instead have the generic implementation without any spinning?
Just polling a variable with cpu_relax(), like
smp_cond_load_acquire/relaxed(), with the additional check for time. We
can then redefine it in the arch code.

> +#define __check_time_types(type, a, b) \
> +        (__same_type(typeof(a), type) && \
> +         __same_type(typeof(b), type))
> +
> +/**
> + * smp_cond_load_relaxed_timewait() - (Spin) wait for cond with no ordering
> + * guarantees until a timeout expires.
> + * @ptr: pointer to the variable to wait on
> + * @cond: boolean expression to wait for
> + * @time_expr: monotonic expression that evaluates to the current time
> + * @time_end: end time, compared against time_expr
> + * @slack: how much timer overshoot can the caller tolerate?
> + * Useful for when we go into wait states. A value of 0 indicates a high
> + * tolerance.
> + *
> + * Note that all times (time_expr, time_end, and slack) are in microseconds,
> + * with no mandated precision.
> + *
> + * Equivalent to using READ_ONCE() on the condition variable.
> + */
> +#define smp_cond_load_relaxed_timewait(ptr, cond_expr, time_expr, \
> +                                       time_end, slack) ({ \
> +        __unqual_scalar_typeof(*ptr) _val; \
> +        BUILD_BUG_ON_MSG(!__check_time_types(u64, time_expr, time_end), \
> +                         "incompatible time units"); \
> +        _val = __smp_cond_load_relaxed_timewait(ptr, cond_expr, \
> +                                                __smp_cond_policy, \
> +                                                time_expr, time_end, \
> +                                                slack); \
> +        (typeof(*ptr))_val; \
> +})

Looking at the current user of the acquire variant, rqspinlock, it does
not even bother with a time_expr but rather adds the time condition to
cond_expr. I don't think it has any "slack" requirements, only that
there's no deadlock eventually.

As for poll_idle(), is there any slack requirement, or can we get away
without one?

I think we have two ways forward (well, at least):

1. Clearly define what time_end is and we won't need a time_expr at all.
   This may work for poll_idle(), not sure about rqspinlock. The
   advantage is that we can drop the 'slack' argument since none of the
   current users seem to need it. The downside is that we need to know
   exactly what this time_end is to convert it to timer cycles for a
   WFET implementation on arm64.

2. Drop time_end and only leave time_expr as a bool (we don't care
   whether it uses ns, jiffies or whatever underneath, it's just a
   bool). In this case, we could use a 'slack' argument mostly to make a
   decision on whether we use WFET, WFE or just polling with
   cpu_relax().
   For WFET, the wait time would be based on the slack value rather than
   some absolute end time, which we won't have.

I'd go with (2), it looks simpler. Maybe even drop the 'slack' argument
for the time being until we have a clear user. On arm64, the fallback
would go from wfe (if event streaming is available), to wfet with the
same period as the event stream (in the absence of a slack argument),
to cpu_relax().

-- 
Catalin
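
For reference, option (2) with no 'slack' argument could be sketched
roughly as below. This is a userspace approximation, not the proposed
kernel code: READ_ONCE() and cpu_relax() are stand-ins defined locally,
the macro name smp_cond_load_relaxed_timewait_sketch and its
time_check_expr parameter are hypothetical, and it relies on GCC/Clang
statement expressions. An arch such as arm64 would be expected to
override the generic loop with a WFE/WFET based one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel helpers, so the sketch builds alone. */
#define READ_ONCE(x)    (*(const volatile __typeof__(x) *)&(x))
#define cpu_relax()     __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical generic fallback for option (2): no spin count, no policy
 * callback and no slack. time_check_expr is a plain boolean expression
 * that becomes true once the caller's deadline has passed; the macro does
 * not care which clock or unit sits behind it.
 *
 * The kernel version would use __unqual_scalar_typeof() for VAL; plain
 * __typeof__ is used here and assumes a non-const, non-volatile target.
 */
#define smp_cond_load_relaxed_timewait_sketch(ptr, cond_expr,           \
                                              time_check_expr) ({       \
        __typeof__(ptr) __PTR = (ptr);                                  \
        __typeof__(*(ptr)) VAL;                                         \
                                                                        \
        for (;;) {                                                      \
                VAL = READ_ONCE(*__PTR);                                \
                if (cond_expr)                                          \
                        break;                                          \
                cpu_relax();                                            \
                if (time_check_expr)                                    \
                        break;                                          \
        }                                                               \
        VAL;                                                            \
})
```

A caller would pass its own boolean deadline check, for example
smp_cond_load_relaxed_timewait_sketch(&flag, VAL != 0, local_clock() >= deadline),
where flag and deadline are the caller's variables. On timeout the macro
returns the last value read, so the caller re-checks the condition
itself.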