Expose resctrl monitoring data via a lightweight perf PMU.
Background: The kernel's initial cache-monitoring interface shipped via perf (commit 4afbb24ce5e7, 2015). That approach tied monitoring to tasks and cgroups. Later, cache control was designed around the resctrl filesystem to better match hardware semantics, and the incompatible perf CQM code was removed (commit c39a0e2c8850, 2017). This series implements a thin, generic perf PMU that _is_ compatible with resctrl.
Motivation: perf support enables measuring cache occupancy and memory bandwidth metrics on hrtimer (high resolution timer) interrupts via eBPF. Compared with polling from userspace, hrtimer-based reads remove scheduling jitter and context switch overhead. Further, PMU reads can be parallel, since the PMU read path need not lock resctrl's rdtgroup_mutex. Parallelization and reduced jitter enable more accurate snapshots of cache occupancy and memory bandwidth. [1] has more details on the motivation and design.
Design: The "resctrl" PMU is a small adapter on top of resctrl's monitoring path:
- Event selection uses `attr.config` to pass an open `mon_data` fd (e.g. `mon_L3_00/llc_occupancy`).
- Events must be CPU-bound within the file's domain. Perf is responsible for ensuring the read executes on the bound CPU.
- Event init resolves and pins the rdtgroup, prepares struct rmid_read via mon_event_setup_read(), and validates that the bound CPU is in the file's domain CPU mask.
- Sampling is not supported; reads match the `mon_data` file contents.
- If the rdtgroup is deleted, reads return 0.
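Because event init rejects a CPU outside the file's domain, userspace may want to check its choice of CPU before opening the event. The following is a minimal sketch of such a check, not code from this series: `cpu_in_cpulist()` is a hypothetical helper that tests membership in a kernel-style CPU list string such as "0-3,8-11". Where the list comes from is also an assumption; one candidate is the `shared_cpu_list` file of the matching cache under /sys/devices/system/cpu/cpuN/cache/.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of this series): return true if `cpu`
 * appears in a kernel-style CPU list such as "0-3,8-11\n".
 * Malformed input is treated as "not a member".
 */
static bool cpu_in_cpulist(const char *list, long cpu)
{
	const char *p = list;

	assert(list != NULL);

	while (*p) {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p)
			return false;	/* no digits where a CPU number was expected */

		if (*end == '-') {	/* range "lo-hi" */
			const char *q = end + 1;

			hi = strtol(q, &end, 10);
			if (end == q)
				return false;
		}

		if (cpu >= lo && cpu <= hi)
			return true;

		/* Skip list separators and the trailing newline sysfs appends. */
		while (*end == ',' || *end == ' ' || *end == '\n')
			end++;
		p = end;
	}

	return false;
}
```

A caller would read the domain's CPU list into a buffer, pick a candidate CPU, and only bind the event to it if `cpu_in_cpulist(buf, cpu)` returns true. The kernel-side validation described above remains authoritative.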
Includes a new selftest (tools/testing/selftests/resctrl/pmu_test.c) to validate the PMU event init path, and adds PMU testing to existing CMT tests.
Example usage (see Documentation/filesystems/resctrl.rst): Open a monitoring file and pass its fd in `perf_event_attr.config`, with `attr.type` set to the `resctrl` PMU type.
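The usage described above can be sketched in userspace C roughly as follows. This is an illustration under stated assumptions, not the documented example from resctrl.rst: the helper names (`resctrl_attr_init()`, `read_pmu_type()`, `resctrl_event_open()`) are hypothetical, and the sysfs location of the PMU type is assumed to follow the usual perf convention of /sys/bus/event_source/devices/<pmu>/type with <pmu> being "resctrl".

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: fill attr so that config carries the open mon_data fd. */
static void resctrl_attr_init(struct perf_event_attr *attr, uint32_t pmu_type,
			      int mon_fd)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = pmu_type;
	attr->config = (uint64_t)mon_fd;
	/* sample_period stays 0: the PMU supports counting reads only. */
}

/* Hypothetical helper: read a dynamic PMU type id from sysfs, -1 on failure. */
static int read_pmu_type(const char *path)
{
	FILE *f = fopen(path, "r");
	int type = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}

/*
 * Hypothetical helper: open a CPU-bound resctrl event for an already opened
 * mon_data file. Returns the perf event fd, or -1 on failure. The caller
 * keeps ownership of mon_fd; whether it may be closed once the event exists
 * is not stated in the cover letter, so this sketch does not close it.
 */
static int resctrl_event_open(int mon_fd, int cpu)
{
	struct perf_event_attr attr;
	int type;

	if (mon_fd < 0)
		return -1;

	/* Assumed path, following the standard perf sysfs layout. */
	type = read_pmu_type("/sys/bus/event_source/devices/resctrl/type");
	if (type < 0)
		return -1;

	resctrl_attr_init(&attr, (uint32_t)type, mon_fd);

	/* pid = -1 with cpu >= 0 requests a CPU-bound event; no group, no flags. */
	return (int)syscall(SYS_perf_event_open, &attr, -1, cpu, -1, 0UL);
}
```

A caller would open a monitoring file such as /sys/fs/resctrl/mon_data/mon_L3_00/llc_occupancy with open(2), pass the fd and a CPU from that file's domain to `resctrl_event_open()`, and then read(2) a 64-bit counter value from the returned event fd.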
The patches are based on top of v6.18-rc1 (commit 3a8660878839).
[1] https://www.youtube.com/watch?v=4BGhAMJdZTc
Jonathan Perry (8):
  resctrl: Pin rdtgroup for mon_data file lifetime
  resctrl/mon: Split RMID read init from execution
  resctrl/mon: Select cpumask before invoking mon_event_read()
  resctrl/mon: Create mon_event_setup_read() helper
  resctrl: Propagate CPU mask validation error via rr->err
  resctrl/pmu: Introduce skeleton PMU and selftests
  resctrl/pmu: Use mon_event_setup_read() and validate CPU
  resctrl/pmu: Implement .read via direct RMID read; add LLC selftest
 Documentation/filesystems/resctrl.rst         |  64 ++++
 fs/resctrl/Makefile                           |   2 +-
 fs/resctrl/ctrlmondata.c                      | 118 ++++---
 fs/resctrl/internal.h                         |  24 +-
 fs/resctrl/monitor.c                          |   8 +-
 fs/resctrl/pmu.c                              | 217 +++++++++++++
 fs/resctrl/rdtgroup.c                         | 131 +++++++-
 tools/testing/selftests/resctrl/cache.c       |  94 +++++-
 tools/testing/selftests/resctrl/cmt_test.c    |  17 +-
 tools/testing/selftests/resctrl/pmu_test.c    | 292 ++++++++++++++++++
 tools/testing/selftests/resctrl/pmu_utils.c   |  32 ++
 tools/testing/selftests/resctrl/resctrl.h     |   4 +
 .../testing/selftests/resctrl/resctrl_tests.c |   1 +
 13 files changed, 948 insertions(+), 56 deletions(-)
 create mode 100644 fs/resctrl/pmu.c
 create mode 100644 tools/testing/selftests/resctrl/pmu_test.c
 create mode 100644 tools/testing/selftests/resctrl/pmu_utils.c