On Thu, Aug 07, 2025 at 11:06:21PM +0900, Akihiko Odaki wrote:
The only cross-PMU events we will support are the fixed counters. My strong preference is that we do not reverse-map architectural events to generic perf events for all counters.
I wonder if there is any benefit to special-casing PERF_COUNT_HW_CPU_CYCLES then; the current logic of kvm_map_pmu_event() looks sufficient to me.
I'd rather we just use the generic perf events and let the driver remap things on our behalf. These are fixed counters, so using constant events feels like the right way to go about that.
kvm_map_pmu_event() is trying to solve a slightly different problem where we need to map programmable PMUv3 events into a non-PMUv3 event space, like on the M1 PMU.
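To make the "constant events" idea concrete, here is a rough sketch of picking a generic perf event per fixed counter and leaving the remapping to the host PMU driver. The enum and fixed_counter_attr() are hypothetical names, and pairing the instruction counter with PERF_COUNT_HW_INSTRUCTIONS is my assumption rather than something taken from the series.

```c
#include <linux/perf_event.h>
#include <string.h>

/* Hypothetical indices for the two fixed counters discussed here. */
enum fixed_counter {
	FIXED_COUNTER_CYCLES,
	FIXED_COUNTER_INSTRUCTIONS,
};

/*
 * Fill a perf_event_attr with the generic hardware event for a fixed
 * counter. No PMU-specific encoding is used; the host driver is expected
 * to translate the generic event. Returns 0 on success, -1 otherwise.
 */
static int fixed_counter_attr(enum fixed_counter ctr,
			      struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;

	switch (ctr) {
	case FIXED_COUNTER_CYCLES:
		attr->config = PERF_COUNT_HW_CPU_CYCLES;
		return 0;
	case FIXED_COUNTER_INSTRUCTIONS:
		/* Assumption: the instruction counter uses this event. */
		attr->config = PERF_COUNT_HW_INSTRUCTIONS;
		return 0;
	}
	return -1;
}
```

The point of the sketch is only that the event choice is a compile-time constant per counter, so no reverse mapping from architectural event numbers is involved.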
This isn't what I meant. What I mean is that userspace can use either the SET_PMU ioctl or the COMPOSITION ioctl. Once one of them has been used, the other ioctl returns an error.
We're really bad at getting ioctl ordering / interleaving right and syzkaller has a habit of finding these mistakes. There's zero practical value in using both of these ioctls on the same VM, let's prevent it.
The corresponding RFC series for QEMU uses KVM_ARM_VCPU_PMU_V3_SET_PMU to probe host PMUs, and falls back to KVM_ARM_VCPU_PMU_V3_COMPOSITION if none covers all CPUs. Switching between SET_PMU and COMPOSITION is useful during such probing.
COMPOSITION is designed to behave like just another host PMU that is set with SET_PMU. SET_PMU allows setting a different host PMU even if SET_PMU has already been invoked, so it is likewise allowed to set a host PMU after COMPOSITION has been invoked. This keeps the behavior consistent with non-composed PMUs.
You can find the QEMU patch at: https://lore.kernel.org/qemu-devel/20250806-kvmq-v1-1-d1d50b7058cd@rsg.ci.i....
(look up KVM_ARM_VCPU_PMU_V3_SET_PMU for the probing code)
Having both of these attributes return success when probed with KVM_HAS_DEVICE_ATTR is fine; what I mean is that once KVM_SET_DEVICE_ATTR has been called on an attribute the other fails.
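Roughly, the exclusion could look like the sketch below. All names here (pmu_config_mode, vm_pmu_state, pmu_claim_mode()) are hypothetical, the -EBUSY return value is an assumption, and real code would have to take the appropriate VM lock around the check.

```c
#include <errno.h>

/* Hypothetical per-VM record of which attribute userspace committed to. */
enum pmu_config_mode {
	PMU_CONFIG_NONE,	/* neither attribute has been set */
	PMU_CONFIG_SET_PMU,	/* SET_PMU was used */
	PMU_CONFIG_COMPOSITION,	/* COMPOSITION was used */
};

struct vm_pmu_state {
	enum pmu_config_mode mode;
};

/*
 * Claim a configuration mode for the VM. The first attribute set wins;
 * setting the same attribute again is allowed, switching to the other
 * one is rejected. Probing with KVM_HAS_DEVICE_ATTR would not call this.
 */
static int pmu_claim_mode(struct vm_pmu_state *st, enum pmu_config_mode want)
{
	if (st->mode != PMU_CONFIG_NONE && st->mode != want)
		return -EBUSY;	/* assumed error code */

	st->mode = want;
	return 0;
}
```

Each KVM_SET_DEVICE_ATTR handler would call this before touching any other state, so the two paths can never interleave on one VM.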
On a system that has FEAT_PMUv3_ICNTR, userspace can still use this ioctl and explicitly de-feature ICNTR by writing to the ID register after initialization.
Now I understand better.
Currently, KVM_ARM_VCPU_PMU_V3_COMPOSITION sets supported_cpus to ones that have cycle counters compatible with PMU emulation.
If FEAT_PMUv3_ICNTR is set in the ID register, I guess KVM_ARM_VCPU_PMU_V3_COMPOSITION will set supported_cpus to ones that have compatible cycle and instruction counters. If so, the naming KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY indeed makes sense.
Perfect. Ideally SoC vendors do the sensible thing and ensure that FEAT_PMUv3_ICNTR is consistent on all implementations in a machine. We will hide the feature in KVM if it is not.
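As a rough illustration of the "hide it if inconsistent" rule, the check amounts to requiring the feature on every CPU. The struct and function names below are hypothetical, and how the per-CPU value is actually collected is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU summary; has_icntr mirrors FEAT_PMUv3_ICNTR. */
struct cpu_pmu_caps {
	bool has_icntr;
};

/*
 * Expose the instruction counter to the guest only if every CPU
 * implements it. A single CPU without the feature hides it everywhere;
 * an empty set exposes nothing.
 */
static bool icntr_visible_to_guest(const struct cpu_pmu_caps *cpus, size_t n)
{
	if (n == 0)
		return false;

	for (size_t i = 0; i < n; i++) {
		if (!cpus[i].has_icntr)
			return false;
	}
	return true;
}
```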
Thanks, Oliver