Extend pmu_counters_test to AMD CPUs.
As the AMD PMU is quite different from Intel's, with different events and feature sets, this series introduces a new code path to test it, specifically focusing on the core counters, including the PerfCtrExtCore and PerfMonV2 features. Northbridge counters and cache counters exist, but are less important and can be deferred to a later series.
The first patch is a bug fix that could be submitted separately.
The series has been tested on both Intel and AMD machines, but I have not found an AMD machine old enough to lack PerfCtrExtCore. I have taken care to ensure no part of the code depends on its presence.
I am aware of similar work in this direction done by Jinrong Liang [1]. He told me he is not currently working on it, so I am not intruding by making my own submission.
[1] https://lore.kernel.org/kvm/20231121115457.76269-1-cloudliang@tencent.com/
v2:
- Test all combinations of VM setup rather than only the maximum
  allowed by hardware
- Add fixes tag to bug fix in patch 1
- Refine some names
v1: https://lore.kernel.org/kvm/20240813164244.751597-1-coltonlewis@google.com/
Colton Lewis (6):
  KVM: x86: selftests: Fix typos in macro variable use
  KVM: x86: selftests: Define AMD PMU CPUID leaves
  KVM: x86: selftests: Set up AMD VM in pmu_counters_test
  KVM: x86: selftests: Test read/write core counters
  KVM: x86: selftests: Test core events
  KVM: x86: selftests: Test PerfMonV2
 .../selftests/kvm/include/x86_64/processor.h |   7 +
 .../selftests/kvm/x86_64/pmu_counters_test.c | 304 ++++++++++++++++--
 2 files changed, 277 insertions(+), 34 deletions(-)
base-commit: da3ea35007d0af457a0afc87e84fddaebc4e0b63
--
2.46.0.662.g92d0881bb0-goog
Without the leading underscore, these macro parameters referenced a variable in the calling scope instead. This only worked before by accident, because every calling scope happened to have a variable with the right name.
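To illustrate the hazard in isolation, here is a minimal standalone sketch (hypothetical names, not code from this patch) of how an unhygienic macro body silently binds to a caller-scope variable:

#include <stdio.h>

/* Buggy: the body uses "msr", not the parameter "_msr", so it silently
 * binds to whatever variable named "msr" exists where the macro is
 * expanded. */
#define WRITE_ZERO_BUGGY(_msr)	do { printf("write 0 to %u\n", msr); } while (0)

/* Fixed: the body refers to the parameter itself. */
#define WRITE_ZERO(_msr)	do { printf("write 0 to %u\n", _msr); } while (0)

int main(void)
{
	unsigned int msr = 0xc1;	/* caller-scope variable with the "right" name */
	unsigned int counter = 0xc2;

	WRITE_ZERO_BUGGY(counter);	/* compiles, but targets 0xc1, not 0xc2 */
	WRITE_ZERO(counter);		/* targets 0xc2 as intended */
	return 0;
}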
Fixes: cd34fd8c758e ("KVM: selftests: Test PMC virtualization with forced emulation")
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 tools/testing/selftests/kvm/x86_64/pmu_counters_test.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 698cb36989db..0e305e43a93b 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -174,7 +174,7 @@ do {									\
 #define GUEST_TEST_EVENT(_idx, _event, _pmc, _pmc_msr, _ctrl_msr, _value, FEP)	\
 do {										\
-	wrmsr(pmc_msr, 0);							\
+	wrmsr(_pmc_msr, 0);							\
 										\
 	if (this_cpu_has(X86_FEATURE_CLFLUSHOPT))				\
 		GUEST_MEASURE_EVENT(_ctrl_msr, _value, "clflushopt .", FEP);	\
@@ -331,9 +331,9 @@ __GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector,	\
 	       expect_gp ? "#GP" : "no fault", msr, vector)		\
 
 #define GUEST_ASSERT_PMC_VALUE(insn, msr, val, expected)		\
-	__GUEST_ASSERT(val == expected_val,				\
+	__GUEST_ASSERT(val == expected,					\
 		       "Expected " #insn "(0x%x) to yield 0x%lx, got 0x%lx",	\
-		       msr, expected_val, val);
+		       msr, expected, val);
 
 static void guest_test_rdpmc(uint32_t rdpmc_idx, bool expect_success,
 			     uint64_t expected_val)
This defines the CPUID leaves used to determine what extensions and properties are available. The AMD reference manual names are listed below.
* PerfCtrExtCore (six core counters instead of four)
* PerfCtrExtNB (four counters for northbridge events)
* PerfCtrExtL2I (four counters for L2 cache events)
* PerfMonV2 (support for registers to control multiple counters with
  a single register write)
* LbrAndPmcFreeze (support for freezing last branch recorded stack on
  performance counter overflow)
* NumPerfCtrCore (number of core counters)
* NumPerfCtrNB (number of northbridge counters)
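For illustration only (not part of this patch), a sketch of how a host-side test might consume these definitions, using the kvm_cpu_has()/kvm_cpu_property() selftest helpers that the rest of this series relies on:

#include "kvm_util.h"
#include "processor.h"

/* Sketch: report which AMD PMU extensions KVM exposes, via the
 * feature and property leaves defined above. */
static void report_amd_pmu_caps(void)
{
	pr_info("PerfCtrExtCore: %u\n",
		kvm_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE));
	pr_info("PerfMonV2:      %u\n",
		kvm_cpu_has(X86_FEATURE_PERFMON_V2));
	pr_info("NumPerfCtrCore: %u\n",
		kvm_cpu_property(X86_PROPERTY_NUM_PERF_CTR_CORE));
}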
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 tools/testing/selftests/kvm/include/x86_64/processor.h | 7 +++++++
 1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index a0c1440017bb..44ddfc4c1673 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -183,6 +183,9 @@ struct kvm_x86_cpu_feature {
 #define X86_FEATURE_GBPAGES		KVM_X86_CPU_FEATURE(0x80000001, 0, EDX, 26)
 #define X86_FEATURE_RDTSCP		KVM_X86_CPU_FEATURE(0x80000001, 0, EDX, 27)
 #define X86_FEATURE_LM			KVM_X86_CPU_FEATURE(0x80000001, 0, EDX, 29)
+#define X86_FEATURE_PERF_CTR_EXT_CORE	KVM_X86_CPU_FEATURE(0x80000001, 0, ECX, 23)
+#define X86_FEATURE_PERF_CTR_EXT_NB	KVM_X86_CPU_FEATURE(0x80000001, 0, ECX, 24)
+#define X86_FEATURE_PERF_CTR_EXT_L2I	KVM_X86_CPU_FEATURE(0x80000001, 0, ECX, 28)
 #define X86_FEATURE_INVTSC		KVM_X86_CPU_FEATURE(0x80000007, 0, EDX, 8)
 #define X86_FEATURE_RDPRU		KVM_X86_CPU_FEATURE(0x80000008, 0, EBX, 4)
 #define X86_FEATURE_AMD_IBPB		KVM_X86_CPU_FEATURE(0x80000008, 0, EBX, 12)
@@ -195,6 +198,8 @@ struct kvm_x86_cpu_feature {
 #define X86_FEATURE_VGIF		KVM_X86_CPU_FEATURE(0x8000000A, 0, EDX, 16)
 #define X86_FEATURE_SEV			KVM_X86_CPU_FEATURE(0x8000001F, 0, EAX, 1)
 #define X86_FEATURE_SEV_ES		KVM_X86_CPU_FEATURE(0x8000001F, 0, EAX, 3)
+#define X86_FEATURE_PERFMON_V2		KVM_X86_CPU_FEATURE(0x80000022, 0, EAX, 0)
+#define X86_FEATURE_PERF_LBR_PMC_FREEZE	KVM_X86_CPU_FEATURE(0x80000022, 0, EAX, 2)
 
 /*
  * KVM defined paravirt features.
@@ -281,6 +286,8 @@ struct kvm_x86_cpu_property {
 #define X86_PROPERTY_GUEST_MAX_PHY_ADDR		KVM_X86_CPU_PROPERTY(0x80000008, 0, EAX, 16, 23)
 #define X86_PROPERTY_SEV_C_BIT			KVM_X86_CPU_PROPERTY(0x8000001F, 0, EBX, 0, 5)
 #define X86_PROPERTY_PHYS_ADDR_REDUCTION	KVM_X86_CPU_PROPERTY(0x8000001F, 0, EBX, 6, 11)
+#define X86_PROPERTY_NUM_PERF_CTR_CORE		KVM_X86_CPU_PROPERTY(0x80000022, 0, EBX, 0, 3)
+#define X86_PROPERTY_NUM_PERF_CTR_NB		KVM_X86_CPU_PROPERTY(0x80000022, 0, EBX, 10, 15)
 
 #define X86_PROPERTY_MAX_CENTAUR_LEAF		KVM_X86_CPU_PROPERTY(0xC0000000, 0, EAX, 0, 31)
Branch in main() depending on whether the CPU is Intel or AMD. The two are subject to vastly different requirements because the AMD PMU lacks many properties defined by the Intel PMU, including the entire CPUID 0xa leaf where Intel stores all the PMU properties. AMD likewise has no consistent notion of a PMU version the way Intel does: every feature is a separate flag, and the features aren't the same as Intel's.
Set up a VM for testing core AMD counters and ensure proper CPUID features are set.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 .../selftests/kvm/x86_64/pmu_counters_test.c | 104 ++++++++++++++----
 1 file changed, 83 insertions(+), 21 deletions(-)
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 0e305e43a93b..5b240585edc5 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -30,10 +30,21 @@
 #define NUM_INSNS_RETIRED	(NUM_LOOPS * NUM_INSNS_PER_LOOP + NUM_EXTRA_INSNS)
 
+/*
+ * Limit testing to MSRs that are actually defined by Intel (in the SDM). MSRs
+ * that aren't defined counter MSRs *probably* don't exist, but there's no
+ * guarantee that currently undefined MSR indices won't be used for something
+ * other than PMCs in the future.
+ */
+#define MAX_NR_GP_COUNTERS 8
+#define MAX_NR_FIXED_COUNTERS 3
+#define AMD_NR_CORE_COUNTERS 4
+#define AMD_NR_CORE_EXT_COUNTERS 6
+
 static uint8_t kvm_pmu_version;
 static bool kvm_has_perf_caps;
 
-static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
+static struct kvm_vm *intel_pmu_vm_create(struct kvm_vcpu **vcpu,
 						  void *guest_code,
 						  uint8_t pmu_version,
 						  uint64_t perf_capabilities)
@@ -303,7 +314,7 @@ static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
 	if (!pmu_version)
 		return;
 
-	vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_arch_events,
+	vm = intel_pmu_vm_create(&vcpu, guest_test_arch_events,
 					 pmu_version, perf_capabilities);
 
 	vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH,
@@ -316,15 +327,6 @@ static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
 	kvm_vm_free(vm);
 }
 
-/*
- * Limit testing to MSRs that are actually defined by Intel (in the SDM). MSRs
- * that aren't defined counter MSRs *probably* don't exist, but there's no
- * guarantee that currently undefined MSR indices won't be used for something
- * other than PMCs in the future.
- */
-#define MAX_NR_GP_COUNTERS 8
-#define MAX_NR_FIXED_COUNTERS 3
-
 #define GUEST_ASSERT_PMC_MSR_ACCESS(insn, msr, expect_gp, vector)	\
 __GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector,		\
 	       "Expected %s on " #insn "(0x%x), got vector %u",	\
@@ -463,7 +465,7 @@ static void test_gp_counters(uint8_t pmu_version, uint64_t perf_capabilities,
 	struct kvm_vcpu *vcpu;
 	struct kvm_vm *vm;
 
-	vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_gp_counters,
+	vm = intel_pmu_vm_create(&vcpu, guest_test_gp_counters,
 					 pmu_version, perf_capabilities);
 
 	vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_GP_COUNTERS,
@@ -530,7 +532,7 @@ static void test_fixed_counters(uint8_t pmu_version, uint64_t perf_capabilities,
 	struct kvm_vcpu *vcpu;
 	struct kvm_vm *vm;
 
-	vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_fixed_counters,
+	vm = intel_pmu_vm_create(&vcpu, guest_test_fixed_counters,
 					 pmu_version, perf_capabilities);
 
 	vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK,
@@ -627,18 +629,78 @@ static void test_intel_counters(void)
 	}
 }
 
-int main(int argc, char *argv[])
+static uint8_t nr_core_counters(void)
 {
-	TEST_REQUIRE(kvm_is_pmu_enabled());
+	uint8_t nr_counters = kvm_cpu_property(X86_PROPERTY_NUM_PERF_CTR_CORE);
+	bool core_ext = kvm_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
+
+	if (nr_counters != 0)
+		return nr_counters;
+
+	if (core_ext)
+		return AMD_NR_CORE_EXT_COUNTERS;
 
-	TEST_REQUIRE(host_cpu_is_intel);
-	TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
-	TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
+	return AMD_NR_CORE_COUNTERS;
 
-	kvm_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
-	kvm_has_perf_caps = kvm_cpu_has(X86_FEATURE_PDCM);
+}
+
+static void guest_test_core_counters(void)
+{
+	GUEST_DONE();
+}
+
+static void test_core_counters(void)
+{
+	uint8_t nr_counters = nr_core_counters();
+	bool core_ext = kvm_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
+	bool perfmon_v2 = kvm_cpu_has(X86_FEATURE_PERFMON_V2);
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+
+	for (uint8_t ce = 0; ce <= core_ext; ce++) {
+		for (uint8_t pm = 0; pm <= perfmon_v2; pm++) {
+			for (uint8_t nc = 0; nc <= nr_counters; nc++) {
+				vm = vm_create_with_one_vcpu(&vcpu, guest_test_core_counters);
+
+				if (nc)
+					vcpu_set_cpuid_property(
+						vcpu, X86_PROPERTY_NUM_PERF_CTR_CORE, nc);
+				if (ce)
+					vcpu_set_cpuid_feature(
+						vcpu, X86_FEATURE_PERF_CTR_EXT_CORE);
+				if (pm)
+					vcpu_set_cpuid_feature(
+						vcpu, X86_FEATURE_PERFMON_V2);
+
+				pr_info("Testing core counters: CoreExt = %u, PerfMonV2 = %u, NumCounters = %u\n",
+					ce, pm, nc);
+				run_vcpu(vcpu);
+
+				kvm_vm_free(vm);
+			}
+		}
+	}
+}
+
+static void test_amd_counters(void)
+{
+	test_core_counters();
+}
 
-	test_intel_counters();
+int main(int argc, char *argv[])
+{
+	TEST_REQUIRE(kvm_is_pmu_enabled());
+
+	if (host_cpu_is_intel) {
+		TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
+		TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
+		kvm_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
+		kvm_has_perf_caps = kvm_cpu_has(X86_FEATURE_PDCM);
+		test_intel_counters();
+	} else if (host_cpu_is_amd) {
+		/* AMD CPUs don't have the same properties to look at. */
+		test_amd_counters();
+	}
 
 	return 0;
 }
Run a basic test to ensure we can write an arbitrary value to the core counters and read it back.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 .../selftests/kvm/x86_64/pmu_counters_test.c | 54 +++++++++++++++++++
 1 file changed, 54 insertions(+)
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 5b240585edc5..79ca7d608e00 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -641,11 +641,65 @@ static uint8_t nr_core_counters(void)
 		return AMD_NR_CORE_EXT_COUNTERS;
 
 	return AMD_NR_CORE_COUNTERS;
+}
+
+static uint8_t guest_nr_core_counters(void)
+{
+	uint8_t nr_counters = this_cpu_property(X86_PROPERTY_NUM_PERF_CTR_CORE);
+	bool core_ext = this_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
+
+	if (nr_counters != 0)
+		return nr_counters;
+
+	if (core_ext)
+		return AMD_NR_CORE_EXT_COUNTERS;
+
+	return AMD_NR_CORE_COUNTERS;
+
+}
+
+static void guest_test_rdwr_core_counters(void)
+{
+	bool core_ext = this_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
+	uint8_t nr_counters = guest_nr_core_counters();
+	uint8_t i;
+	uint32_t esel_msr_base = core_ext ? MSR_F15H_PERF_CTL : MSR_K7_EVNTSEL0;
+	uint32_t cnt_msr_base = core_ext ? MSR_F15H_PERF_CTR : MSR_K7_PERFCTR0;
+	uint32_t msr_step = core_ext ? 2 : 1;
+
+	for (i = 0; i < AMD_NR_CORE_EXT_COUNTERS; i++) {
+		uint64_t test_val = 0xffff;
+		uint32_t esel_msr = esel_msr_base + msr_step * i;
+		uint32_t cnt_msr = cnt_msr_base + msr_step * i;
+		bool expect_gp = !(i < nr_counters);
+		uint8_t vector;
+		uint64_t val;
+
+		/* Test event selection register. */
+		vector = wrmsr_safe(esel_msr, test_val);
+		GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, esel_msr, expect_gp, vector);
+
+		vector = rdmsr_safe(esel_msr, &val);
+		GUEST_ASSERT_PMC_MSR_ACCESS(RDMSR, esel_msr, expect_gp, vector);
+
+		if (!expect_gp)
+			GUEST_ASSERT_PMC_VALUE(RDMSR, esel_msr, val, test_val);
+
+		/* Test counter register. */
+		vector = wrmsr_safe(cnt_msr, test_val);
+		GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, cnt_msr, expect_gp, vector);
+
+		vector = rdmsr_safe(cnt_msr, &val);
+		GUEST_ASSERT_PMC_MSR_ACCESS(RDMSR, cnt_msr, expect_gp, vector);
+
+		if (!expect_gp)
+			GUEST_ASSERT_PMC_VALUE(RDMSR, cnt_msr, val, test_val);
+	}
 }
 
 static void guest_test_core_counters(void)
 {
+	guest_test_rdwr_core_counters();
 	GUEST_DONE();
 }
Test events on core counters by iterating through every combination of events in amd_pmu_zen_events with every core counter.
For each combination, calculate the appropriate register addresses for the event selection/control register and the counter register. The base addresses and layout schemes change depending on whether we have the CoreExt feature.
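For reference, a standalone sketch of that address calculation (MSR indices as defined in the kernel's msr-index.h; the helper name here is hypothetical, mirroring the logic used in __guest_test_core_event() in the patch below):

#include <stdint.h>

/* MSR indices from the AMD manuals, as defined in msr-index.h. */
#define MSR_K7_EVNTSEL0		0xc0010000	/* legacy: EVNTSEL0..3, then PERFCTR0..3 */
#define MSR_K7_PERFCTR0		0xc0010004
#define MSR_F15H_PERF_CTL	0xc0010200	/* CoreExt: CTL/CTR interleaved in pairs */
#define MSR_F15H_PERF_CTR	0xc0010201

/* Compute the select/counter MSR pair for core counter i. */
static void core_counter_msrs(int i, int core_ext, uint32_t *sel, uint32_t *ctr)
{
	uint32_t sel_base = core_ext ? MSR_F15H_PERF_CTL : MSR_K7_EVNTSEL0;
	uint32_t ctr_base = core_ext ? MSR_F15H_PERF_CTR : MSR_K7_PERFCTR0;
	uint32_t step = core_ext ? 2 : 1;	/* pairs interleave with CoreExt */

	*sel = sel_base + step * i;
	*ctr = ctr_base + step * i;
}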
To do the testing, reuse GUEST_TEST_EVENT to run a standard known workload. Decouple it from guest_assert_event_count (now guest_assert_intel_event_count) to generalize to AMD.
Then assert the most specific detail that can reasonably be known about the counter result. The exact count is defined and known for some events; for others, the count is merely asserted to be nonzero.
Note on exact counts: AMD counts one more branch than Intel for the same workload. Though I can't confirm a reason, the only thing it could be is the boundary of the loop instruction being counted differently. Presumably, when the counter reaches 0 and execution continues to the next instruction, AMD counts this as a branch and Intel doesn't.
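For concreteness, a hypothetical illustration assuming NUM_LOOPS = 10 (numbers for illustration only, not taken from the patch):

/*
 * The measured section executes the "loop ." instruction ten times.
 * Intel retires ten branch instructions, while AMD reports eleven,
 * presumably also counting the final iteration where the loop falls
 * through.
 */
#define NUM_LOOPS			10
#define NUM_BRANCH_INSNS_RETIRED	NUM_LOOPS	/* Intel */
#define NUM_BRANCH_INSNS_RETIRED_AMD	(NUM_LOOPS + 1)	/* AMD */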
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 .../selftests/kvm/x86_64/pmu_counters_test.c | 87 ++++++++++++++++---
 1 file changed, 77 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 79ca7d608e00..cf2941cc7c4c 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -29,6 +29,9 @@
 /* Total number of instructions retired within the measured section. */
 #define NUM_INSNS_RETIRED	(NUM_LOOPS * NUM_INSNS_PER_LOOP + NUM_EXTRA_INSNS)
 
+/* AMD counting one extra branch. Probably at loop boundary condition. */
+#define NUM_BRANCH_INSNS_RETIRED_AMD	(NUM_LOOPS + 1)
+#define NUM_INSNS_RETIRED_AMD		(NUM_INSNS_RETIRED + 1)
 
 /*
  * Limit testing to MSRs that are actually defined by Intel (in the SDM). MSRs
@@ -109,7 +112,7 @@ static uint8_t guest_get_pmu_version(void)
  * Sanity check that in all cases, the event doesn't count when it's disabled,
  * and that KVM correctly emulates the write of an arbitrary value.
  */
-static void guest_assert_event_count(uint8_t idx,
+static void guest_assert_intel_event_count(uint8_t idx,
 				     struct kvm_x86_pmu_feature event,
 				     uint32_t pmc, uint32_t pmc_msr)
 {
@@ -151,6 +154,33 @@ static void guest_assert_event_count(uint8_t idx,
 	GUEST_ASSERT_EQ(_rdpmc(pmc), 0xdead);
 }
 
+static void guest_assert_amd_event_count(uint8_t evt_idx, uint8_t cnt_idx, uint32_t pmc_msr)
+{
+	uint64_t count;
+	uint64_t count_pmc;
+
+	count = rdmsr(pmc_msr);
+	count_pmc = _rdpmc(cnt_idx);
+	GUEST_ASSERT_EQ(count, count_pmc);
+
+	switch (evt_idx) {
+	case AMD_ZEN_CORE_CYCLES_INDEX:
+		GUEST_ASSERT_NE(count, 0);
+		break;
+	case AMD_ZEN_INSTRUCTIONS_INDEX:
+		GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED_AMD);
+		break;
+	case AMD_ZEN_BRANCHES_INDEX:
+		GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED_AMD);
+		break;
+	case AMD_ZEN_BRANCH_MISSES_INDEX:
+		GUEST_ASSERT_NE(count, 0);
+		break;
+	default:
+		break;
+	}
+
+}
 /*
  * Enable and disable the PMC in a monolithic asm blob to ensure that the
  * compiler can't insert _any_ code into the measured sequence.  Note, ECX
@@ -183,28 +213,29 @@ do {									\
 	);								\
 } while (0)
 
-#define GUEST_TEST_EVENT(_idx, _event, _pmc, _pmc_msr, _ctrl_msr, _value, FEP)	\
+#define GUEST_TEST_EVENT(_pmc_msr, _ctrl_msr, _ctrl_value, FEP)		\
 do {										\
 	wrmsr(_pmc_msr, 0);							\
 										\
 	if (this_cpu_has(X86_FEATURE_CLFLUSHOPT))				\
-		GUEST_MEASURE_EVENT(_ctrl_msr, _value, "clflushopt .", FEP);	\
+		GUEST_MEASURE_EVENT(_ctrl_msr, _ctrl_value, "clflushopt .", FEP); \
 	else if (this_cpu_has(X86_FEATURE_CLFLUSH))				\
-		GUEST_MEASURE_EVENT(_ctrl_msr, _value, "clflush .", FEP);	\
+		GUEST_MEASURE_EVENT(_ctrl_msr, _ctrl_value, "clflush .", FEP);	\
 	else									\
-		GUEST_MEASURE_EVENT(_ctrl_msr, _value, "nop", FEP);		\
-										\
-	guest_assert_event_count(_idx, _event, _pmc, _pmc_msr);		\
+		GUEST_MEASURE_EVENT(_ctrl_msr, _ctrl_value, "nop", FEP);	\
 } while (0)
 
 static void __guest_test_arch_event(uint8_t idx, struct kvm_x86_pmu_feature event,
 				    uint32_t pmc, uint32_t pmc_msr,
 				    uint32_t ctrl_msr, uint64_t ctrl_msr_value)
 {
-	GUEST_TEST_EVENT(idx, event, pmc, pmc_msr, ctrl_msr, ctrl_msr_value, "");
+	GUEST_TEST_EVENT(pmc_msr, ctrl_msr, ctrl_msr_value, "");
+	guest_assert_intel_event_count(idx, event, pmc, pmc_msr);
 
-	if (is_forced_emulation_enabled)
-		GUEST_TEST_EVENT(idx, event, pmc, pmc_msr, ctrl_msr, ctrl_msr_value, KVM_FEP);
+	if (is_forced_emulation_enabled) {
+		GUEST_TEST_EVENT(pmc_msr, ctrl_msr, ctrl_msr_value, KVM_FEP);
+		guest_assert_intel_event_count(idx, event, pmc, pmc_msr);
+	}
 }
 
 #define X86_PMU_FEATURE_NULL						\
@@ -697,9 +728,45 @@ static void guest_test_rdwr_core_counters(void)
 	}
 }
 
+static void __guest_test_core_event(uint8_t event_idx, uint8_t counter_idx)
+{
+	/* One fortunate area of actual compatibility! This register
+	 * layout is the same for both AMD and Intel.
+	 */
+	uint64_t eventsel = ARCH_PERFMON_EVENTSEL_OS |
+		ARCH_PERFMON_EVENTSEL_ENABLE |
+		amd_pmu_zen_events[event_idx];
+	bool core_ext = this_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
+	uint64_t esel_msr_base = core_ext ? MSR_F15H_PERF_CTL : MSR_K7_EVNTSEL0;
+	uint64_t cnt_msr_base = core_ext ? MSR_F15H_PERF_CTR : MSR_K7_PERFCTR0;
+	uint64_t msr_step = core_ext ? 2 : 1;
+	uint64_t esel_msr = esel_msr_base + msr_step * counter_idx;
+	uint64_t cnt_msr = cnt_msr_base + msr_step * counter_idx;
+
+	GUEST_TEST_EVENT(cnt_msr, esel_msr, eventsel, "");
+	guest_assert_amd_event_count(event_idx, counter_idx, cnt_msr);
+
+	if (is_forced_emulation_enabled) {
+		GUEST_TEST_EVENT(cnt_msr, esel_msr, eventsel, KVM_FEP);
+		guest_assert_amd_event_count(event_idx, counter_idx, cnt_msr);
+	}
+
+}
+
+static void guest_test_core_events(void)
+{
+	uint8_t nr_counters = guest_nr_core_counters();
+
+	for (uint8_t i = 0; i < NR_AMD_ZEN_EVENTS; i++) {
+		for (uint8_t j = 0; j < nr_counters; j++)
+			__guest_test_core_event(i, j);
+	}
+}
+
 static void guest_test_core_counters(void)
 {
 	guest_test_rdwr_core_counters();
+	guest_test_core_events();
 	GUEST_DONE();
 }
Test PerfMonV2, which defines global registers to enable multiple performance counters with a single MSR write, in its own function.
If the feature is available, ensure the global control register can start and stop the performance counters, and that the global status register correctly flags an overflow by the associated counter.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 .../selftests/kvm/x86_64/pmu_counters_test.c | 53 +++++++++++++++++++
 1 file changed, 53 insertions(+)
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index cf2941cc7c4c..a90df8b67a19 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -763,10 +763,63 @@ static void guest_test_core_events(void)
 	}
 }
 
+static void guest_test_perfmon_v2(void)
+{
+	uint64_t i;
+	uint64_t eventsel = ARCH_PERFMON_EVENTSEL_OS |
+		ARCH_PERFMON_EVENTSEL_ENABLE |
+		AMD_ZEN_CORE_CYCLES;
+	bool core_ext = this_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
+	uint64_t sel_msr_base = core_ext ? MSR_F15H_PERF_CTL : MSR_K7_EVNTSEL0;
+	uint64_t cnt_msr_base = core_ext ? MSR_F15H_PERF_CTR : MSR_K7_PERFCTR0;
+	uint64_t msr_step = core_ext ? 2 : 1;
+	uint8_t nr_counters = guest_nr_core_counters();
+	bool perfmon_v2 = this_cpu_has(X86_FEATURE_PERFMON_V2);
+	uint64_t sel_msr;
+	uint64_t cnt_msr;
+
+	if (!perfmon_v2)
+		return;
+
+	for (i = 0; i < nr_counters; i++) {
+		sel_msr = sel_msr_base + msr_step * i;
+		cnt_msr = cnt_msr_base + msr_step * i;
+
+		/* Ensure count stays 0 when global register disables counter. */
+		wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);
+		wrmsr(sel_msr, eventsel);
+		wrmsr(cnt_msr, 0);
+		__asm__ __volatile__("loop ." : "+c"((int){NUM_LOOPS}));
+		GUEST_ASSERT(!_rdpmc(i));
+
+		/* Ensure counter is >0 when global register enables counter. */
+		wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, BIT_ULL(i));
+		__asm__ __volatile__("loop ." : "+c"((int){NUM_LOOPS}));
+		wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);
+		GUEST_ASSERT(_rdpmc(i));
+
+		/* Ensure global status register flags a counter overflow. */
+		wrmsr(cnt_msr, -1);
+		wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, 0xff);
+		wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, BIT_ULL(i));
+		__asm__ __volatile__("loop ." : "+c"((int){NUM_LOOPS}));
+		wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);
+		GUEST_ASSERT(rdmsr(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS) &
+			     BIT_ULL(i));
+
+		/* Ensure global status register flag is cleared correctly. */
+		wrmsr(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, BIT_ULL(i));
+		GUEST_ASSERT(!(rdmsr(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS) &
+			       BIT_ULL(i)));
+	}
+}
+
+
 static void guest_test_core_counters(void)
 {
 	guest_test_rdwr_core_counters();
 	guest_test_core_events();
+	guest_test_perfmon_v2();
 	GUEST_DONE();
 }