Hi,

Thanks for the patchset. Some logistics:

1. Please prefix future patches properly with "bpf" or "bpf-next", for
   example, [PATCH v2 bpf-next 1/2].
2. Please be specific with the patch title, e.g. "selftests/bpf: Add
   selftests" should be something like "selftests/bpf: Add selftests for
   cpu-idle ext".

On Fri, Aug 29, 2025 at 3:11 AM Lin Yikai <yikai.lin@xxxxxxxx> wrote:
>
> Summary
> ----------
> Hi, everyone,
> This patch set introduces an extensible cpuidle governor framework
> using BPF struct_ops, enabling dynamic implementation of idle-state selection policies
> via BPF programs.
>
> Motivation
> ----------
> As is well-known, CPUs support multiple idle states (e.g., C0, C1, C2, ...),
> where deeper states reduce power consumption, but result in longer wakeup latency,
> potentially affecting performance.
> Existing generic cpuidle governors operate effectively in common scenarios
> but exhibit suboptimal behavior in specific Android phone use cases.
>
> Our testing reveals that during low-utilization scenarios
> (e.g., screen-off background tasks like music playback with CPU utilization <10%),
> the C0 state occupies ~50% of idle time, causing significant energy inefficiency.
> Reducing C0 to ≤20% could yield ≥5% power savings on mobile phones.
>
> To address this, we expect:
> 1. Dynamic governor switching to power-saving policies for low CPU utilization scenarios (e.g., screen-off mode)
> 2. Dynamic switching to alternate governors for high-performance scenarios (e.g., gaming)
>
> Overview
> ----------
> The BPF cpuidle ext governor registers at postcore_initcall()
> but remains disabled by default due to its low priority "rating" with value "1".
> Activation requires raising its "rating" above that of the other governors from within BPF.
>
> Core Components:
> 1. **struct cpuidle_gov_ext_ops** – BPF-overridable operations:
> - ops.enable()/ops.disable(): enable or disable callback
> - ops.select(): CPU idle-state selection logic
> - ops.set_stop_tick(): scheduler tick management after state selection
> - ops.reflect(): feedback info about the previous idle state
> - ops.init()/ops.deinit(): initialization or cleanup
>
> 2. **Critical kfuncs for kernel state access**:
> - bpf_cpuidle_ext_gov_update_rating():
>   activate the ext governor by raising its rating; must be called from "ops.init()"
> - bpf_cpuidle_ext_gov_latency_req(): get idle-state latency constraints
> - bpf_tick_nohz_get_sleep_length(): get CPU sleep duration in tickless mode
>
> Future work
> ----------
> 1. Scenario detection: Identifying low-utilization states (e.g., screen-off + background music)
> 2. Policy optimization: Optimizing state-selection algorithms for specific scenarios

I am not an expert on cpuidle, so pardon me if the following are
rookie questions. But I guess some more detail will help other folks
too.

1. It is not clear to me why a BPF-based solution is needed here. Can
   we achieve similar benefits with a knob and some userspace daemon?
2. Is it possible to extend sched_ext to cover cpuidle logic?

Thanks,
Song