On Tue, Mar 25, 2025, James Houghton wrote:
On Mon, Mar 24, 2025 at 6:57 PM Maxim Levitsky mlevitsk@redhat.com wrote:
Add an option to skip sanity check of number of still idle pages, and set it by default to skip, in case hypervisor or NUMA balancing is detected.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com
Thanks Maxim! I'm still working on a respin of this test with MGLRU integration, like [1]. Sorry it's taking me so long. I'll apply my changes on top of yours.
.../selftests/kvm/access_tracking_perf_test.c | 33 ++++++++++++++++--- .../testing/selftests/kvm/include/test_util.h | 1 + tools/testing/selftests/kvm/lib/test_util.c | 7 ++++ 3 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c index 3c7defd34f56..6d50c829f00c 100644 --- a/tools/testing/selftests/kvm/access_tracking_perf_test.c +++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c @@ -65,6 +65,8 @@ static int vcpu_last_completed_iteration[KVM_MAX_VCPUS]; /* Whether to overlap the regions of memory vCPUs access. */ static bool overlap_memory_access;
+static int warn_on_too_many_idle_pages = -1;
struct test_params { /* The backing source for the region of memory. */ enum vm_mem_backing_src_type backing_src; @@ -184,11 +186,10 @@ static void mark_vcpu_memory_idle(struct kvm_vm *vm, * are cached and the guest won't see the "idle" bit cleared. */ if (still_idle >= pages / 10) { -#ifdef __x86_64__
TEST_ASSERT(this_cpu_has(X86_FEATURE_HYPERVISOR),
TEST_ASSERT(warn_on_too_many_idle_pages,
I think this assertion is flipped (or how warn_on_too_many_idle_pages is being set is flipped, see below).
"vCPU%d: Too many pages still idle (%lu out of %lu)", vcpu_idx, still_idle, pages);
-#endif
printf("WARNING: vCPU%d: Too many pages still idle (%lu out of %lu), " "this will affect performance results.\n", vcpu_idx, still_idle, pages);
@@ -342,6 +343,8 @@ static void help(char *name) printf(" -v: specify the number of vCPUs to run.\n"); printf(" -o: Overlap guest memory accesses instead of partitioning\n" " them into a separate region of memory for each vCPU.\n");
printf(" -w: Skip or force enable the check that after dirtying the guest memory, most (90%%) of \n"
"it is reported as dirty again (0/1)"); backing_src_help("-s"); puts(""); exit(0);
@@ -359,7 +362,7 @@ int main(int argc, char *argv[])
guest_modes_append_default();
while ((opt = getopt(argc, argv, "hm:b:v:os:")) != -1) {
while ((opt = getopt(argc, argv, "hm:b:v:os:w:")) != -1) { switch (opt) { case 'm': guest_modes_cmdline(optarg);
@@ -376,6 +379,11 @@ int main(int argc, char *argv[]) case 's': params.backing_src = parse_backing_src_type(optarg); break;
case 'w':
warn_on_too_many_idle_pages =
atoi_non_negative("1 - enable warning, 0 - disable",
optarg);
We still get a "warning" either way, right? Maybe this should be called "fail_on_too_many_idle_pages" (in which case the above assertion is indeed flipped). Or "warn_on_too_many_idle_pages" should mean *only* warn, i.e., *don't* fail, in which case, below we need to flip how we set it below.
Agreed. I like the "warn" terminology, Maybe this?
printf(" -w: Control whether the test warns or fails if more than 10%\n" " of pages are still seen as idle/old after accessing guest\n" " memory. >0 == warn only, 0 == fail, <0 == auto. For auto\n" " mode, the test fails by default, but switches to warn only\n" " if NUMA balancing is enabled or the test detects it's running\n" " in a VM.");
And let the user explicitly select auto:
case 'w': warn_only = atoi_paranoid(optarg); break;
Then the auto resolving works as below, and as James pointed out, the assert becomes
TEST_ASSERT(!warn_only, ....);
break; case 'h': default: help(argv[0]);
@@ -386,6 +394,23 @@ int main(int argc, char *argv[]) page_idle_fd = open("/sys/kernel/mm/page_idle/bitmap", O_RDWR); __TEST_REQUIRE(page_idle_fd >= 0, "CONFIG_IDLE_PAGE_TRACKING is not enabled");
if (warn_on_too_many_idle_pages == -1) {
+#ifdef __x86_64__
if (this_cpu_has(X86_FEATURE_HYPERVISOR)) {
printf("Skipping idle page count sanity check, because the test is run nested\n");
warn_on_too_many_idle_pages = 0;
} else
+#endif
if (is_numa_balancing_enabled()) {
printf("Skipping idle page count sanity check, because NUMA balance is enabled\n");
warn_on_too_many_idle_pages = 0;
} else {
warn_on_too_many_idle_pages = 1;
}
} else if (!warn_on_too_many_idle_pages) {
printf("Skipping idle page count sanity check, because this was requested by the user\n");
Eh, I vote to omit this. The sanity check is still there, it's just degraded to a warn. I'm not totally against it, just seems superfluous and potentially confusing.