Hi,
Thanks for the patchset.
Some logistics:
1. Please prefix future patches properly with "bpf" or "bpf-next", for example, [PATCH v2 bpf-next 1/2].
2. Please be specific with the patch title, e.g. "selftests/bpf: Add selftests" should be something like "selftests/bpf: Add selftests for cpu-idle ext".
On Fri, Aug 29, 2025 at 3:11 AM Lin Yikai <yikai.lin@vivo.com> wrote:
Summary
Hi everyone, this patch set introduces an extensible cpuidle governor framework using BPF struct_ops, enabling dynamic implementation of idle-state selection policies via BPF programs.
Motivation
As is well known, CPUs support multiple idle states (e.g., C0, C1, C2, ...). Deeper states reduce power consumption but result in longer wakeup latency, potentially affecting performance. Existing generic cpuidle governors work well in common scenarios but behave suboptimally in some Android phone use cases.
Our testing reveals that during low-utilization scenarios (e.g., screen-off background tasks like music playback with CPU utilization <10%), the C0 state occupies ~50% of idle time, causing significant energy inefficiency. Reducing C0 to ≤20% could yield ≥5% power savings on mobile phones.
To address this, we expect:
1. Dynamic governor switching to power-saving policies for low CPU utilization scenarios (e.g., screen-off mode)
2. Dynamic switching to alternate governors for high-performance scenarios (e.g., gaming)
Overview
The BPF cpuidle ext governor registers at postcore_initcall() but remains disabled by default because of its low-priority "rating" of 1. Activating it requires raising its "rating" above that of the other governors from within a BPF program.
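The rating mechanism described above can be pictured with a small userspace model. This is only a sketch: `struct gov_model` and `pick_governor()` are hypothetical names invented for illustration, the rating values are illustrative, and the code does not reproduce the kernel's governor registration path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's struct cpuidle_governor. */
struct gov_model {
	const char *name;
	unsigned int rating;
};

/* Return the governor with the highest rating; earlier entries win ties. */
static const struct gov_model *pick_governor(const struct gov_model *govs,
					     size_t n)
{
	assert(govs != NULL && n > 0);

	const struct gov_model *best = &govs[0];

	for (size_t i = 1; i < n; i++) {
		if (govs[i].rating > best->rating)
			best = &govs[i];
	}
	return best;
}
```

With a rating of 1 the ext entry loses to any other governor in this model; once its rating is raised above the others, it is the one picked.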
Core Components:
1. **struct cpuidle_gov_ext_ops** – BPF-overridable operations:
- ops.enable()/ops.disable(): enable or disable callback
- ops.select(): CPU idle-state selection logic
- ops.set_stop_tick(): Scheduler tick management after state selection
- ops.reflect(): feedback info about previous idle state.
- ops.init()/ops.deinit(): Initialization or cleanup.
2. **Critical kfuncs for kernel state access**:
- bpf_cpuidle_ext_gov_update_rating(): Activate the ext governor by raising its rating; must be called from "ops.init()"
- bpf_cpuidle_ext_gov_latency_req(): get idle-state latency constraints
- bpf_tick_nohz_get_sleep_length(): get CPU sleep duration in tickless mode
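To make the roles of ops.select() and the two query kfuncs more concrete, here is a minimal userspace sketch of one possible selection policy. It is not the patch set's implementation: `struct idle_state_model` and `select_state()` are hypothetical, the `latency_req_ns` and `sleep_length_ns` parameters merely stand in for the values a BPF program would obtain from bpf_cpuidle_ext_gov_latency_req() and bpf_tick_nohz_get_sleep_length(), and the states are assumed to be ordered from shallow to deep with non-decreasing exit latency and target residency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified idle-state description (not struct cpuidle_state). */
struct idle_state_model {
	uint64_t exit_latency_ns;
	uint64_t target_residency_ns;
};

/*
 * Pick the deepest state whose exit latency fits the latency constraint
 * and whose target residency fits the expected sleep length.
 * Assumes states[] is sorted shallow to deep; falls back to state 0.
 */
static int select_state(const struct idle_state_model *states, int n,
			uint64_t latency_req_ns, uint64_t sleep_length_ns)
{
	assert(states != NULL && n > 0);

	int chosen = 0;

	for (int i = 1; i < n; i++) {
		if (states[i].exit_latency_ns > latency_req_ns)
			break;	/* waking up would take too long */
		if (states[i].target_residency_ns > sleep_length_ns)
			break;	/* sleep too short to pay off */
		chosen = i;
	}
	return chosen;
}
```

A power-saving policy for the screen-off case could, under the same assumptions, relax the latency constraint so that deeper states are chosen more often, while a gaming policy could tighten it.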
Future work
- Scenario detection: Identifying low-utilization states (e.g., screen-off + background music)
- Policy optimization: Optimizing state-selection algorithms for specific scenarios
I am not an expert on cpuidle, so pardon me if the following are rookie questions. But I guess some more detail will help other folks too.
1. It is not clear to me why a BPF based solution is needed here. Can we achieve similar benefits with a knob and some userspace daemon?
2. Is it possible to extend sched_ext to cover cpuidle logic?
Thanks, Song