This series contains a few cleanups that didn't make it into previous series, including some cosmetic changes and small bug fixes. The series also lays the groundwork for a memslot modification test which stresses the memslot update and page fault code paths in an attempt to expose races.
Tested: dirty_log_perf_test, memslot_modification_stress_test, and demand_paging_test were run, with all the patches in this series applied, on an Intel Skylake machine.
echo Y > /sys/module/kvm/parameters/tdp_mmu; \
./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
./dirty_log_perf_test -v 64 -b 1G; \
./dirty_log_perf_test -v 64 -b 64M -o; \
./demand_paging_test -v 64 -b 1G; \
./demand_paging_test -v 64 -b 64M -o; \
echo N > /sys/module/kvm/parameters/tdp_mmu; \
./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
./dirty_log_perf_test -v 64 -b 1G; \
./dirty_log_perf_test -v 64 -b 64M -o; \
./demand_paging_test -v 64 -b 1G; \
./demand_paging_test -v 64 -b 64M -o
The tests behaved as expected, and the problem of the population stage being skipped in dirty_log_perf_test is fixed. This can be seen in the output: the population stage now takes about as long as dirty pass 1 used to, and dirty pass 1 falls closer to the times of the other passes.
Note that when running these tests, the -o option causes each test to take much longer, as the work each vCPU must do increases in proportion to the number of vCPUs: every vCPU accesses the full region, so e.g. with -v 64 -b 64M each vCPU touches all 4G rather than its own 64M slice.
You can view this series in Gerrit at: https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/72...
Ben Gardon (6):
  KVM: selftests: Rename timespec_diff_now to timespec_elapsed
  KVM: selftests: Avoid flooding debug log while populating memory
  KVM: selftests: Convert iterations to int in dirty_log_perf_test
  KVM: selftests: Fix population stage in dirty_log_perf_test
  KVM: selftests: Add option to overlap vCPU memory access
  KVM: selftests: Add memslot modification stress test
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/demand_paging_test.c        |  40 +++-
 .../selftests/kvm/dirty_log_perf_test.c       |  72 +++---
 .../selftests/kvm/include/perf_test_util.h    |   4 +-
 .../testing/selftests/kvm/include/test_util.h |   2 +-
 .../selftests/kvm/lib/perf_test_util.c        |  25 ++-
 tools/testing/selftests/kvm/lib/test_util.c   |   2 +-
 .../kvm/memslot_modification_stress_test.c    | 211 ++++++++++++++++++
 9 files changed, 307 insertions(+), 51 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/memslot_modification_stress_test.c
In response to some earlier comments from Peter Xu, rename timespec_diff_now to the much more sensible timespec_elapsed.
No functional change intended.
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Makarand Sonare <makarandsonare@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/demand_paging_test.c  |  8 ++++----
 tools/testing/selftests/kvm/dirty_log_perf_test.c | 14 +++++++-------
 tools/testing/selftests/kvm/include/test_util.h   |  2 +-
 tools/testing/selftests/kvm/lib/test_util.c       |  2 +-
 4 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index cdad1eca72f7..a1cd234e6f5e 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -64,7 +64,7 @@ static void *vcpu_worker(void *data)
 			    exit_reason_str(run->exit_reason));
 	}
 
-	ts_diff = timespec_diff_now(start);
+	ts_diff = timespec_elapsed(start);
 	PER_VCPU_DEBUG("vCPU %d execution time: %ld.%.9lds\n", vcpu_id,
 		       ts_diff.tv_sec, ts_diff.tv_nsec);
 
@@ -95,7 +95,7 @@ static int handle_uffd_page_request(int uffd, uint64_t addr)
 		return r;
 	}
 
-	ts_diff = timespec_diff_now(start);
+	ts_diff = timespec_elapsed(start);
 
 	PER_PAGE_DEBUG("UFFDIO_COPY %d \t%ld ns\n", tid,
 		       timespec_to_ns(ts_diff));
@@ -190,7 +190,7 @@ static void *uffd_handler_thread_fn(void *arg)
 		pages++;
 	}
 
-	ts_diff = timespec_diff_now(start);
+	ts_diff = timespec_elapsed(start);
 	PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n",
 		       pages, ts_diff.tv_sec, ts_diff.tv_nsec,
 		       pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
@@ -339,7 +339,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		PER_VCPU_DEBUG("Joined thread for vCPU %d\n", vcpu_id);
 	}
 
-	ts_diff = timespec_diff_now(start);
+	ts_diff = timespec_elapsed(start);
 
 	pr_info("All vCPU threads joined\n");
diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 2283a0ec74a9..16efe6589b43 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -52,7 +52,7 @@ static void *vcpu_worker(void *data)
 
 		clock_gettime(CLOCK_MONOTONIC, &start);
 		ret = _vcpu_run(vm, vcpu_id);
-		ts_diff = timespec_diff_now(start);
+		ts_diff = timespec_elapsed(start);
 
 		TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret);
 		TEST_ASSERT(get_ucall(vm, vcpu_id, NULL) == UCALL_SYNC,
@@ -149,7 +149,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		pr_debug("Waiting for vcpu_last_completed_iteration == %lu\n",
 			 iteration);
 
-	ts_diff = timespec_diff_now(start);
+	ts_diff = timespec_elapsed(start);
 	pr_info("Populate memory time: %ld.%.9lds\n",
 		ts_diff.tv_sec, ts_diff.tv_nsec);
 
@@ -157,7 +157,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	clock_gettime(CLOCK_MONOTONIC, &start);
 	vm_mem_region_set_flags(vm, PERF_TEST_MEM_SLOT_INDEX,
 				KVM_MEM_LOG_DIRTY_PAGES);
-	ts_diff = timespec_diff_now(start);
+	ts_diff = timespec_elapsed(start);
 	pr_info("Enabling dirty logging time: %ld.%.9lds\n\n",
 		ts_diff.tv_sec, ts_diff.tv_nsec);
 
@@ -176,7 +176,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 				 vcpu_id, iteration);
 		}
 
-		ts_diff = timespec_diff_now(start);
+		ts_diff = timespec_elapsed(start);
 		vcpu_dirty_total = timespec_add(vcpu_dirty_total, ts_diff);
 		pr_info("Iteration %lu dirty memory time: %ld.%.9lds\n",
 			iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
@@ -184,7 +184,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		clock_gettime(CLOCK_MONOTONIC, &start);
 		kvm_vm_get_dirty_log(vm, PERF_TEST_MEM_SLOT_INDEX, bmap);
 
-		ts_diff = timespec_diff_now(start);
+		ts_diff = timespec_elapsed(start);
 		get_dirty_log_total = timespec_add(get_dirty_log_total,
 						   ts_diff);
 		pr_info("Iteration %lu get dirty log time: %ld.%.9lds\n",
@@ -195,7 +195,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			kvm_vm_clear_dirty_log(vm, PERF_TEST_MEM_SLOT_INDEX, bmap, 0,
 					       host_num_pages);
 
-			ts_diff = timespec_diff_now(start);
+			ts_diff = timespec_elapsed(start);
 			clear_dirty_log_total = timespec_add(clear_dirty_log_total,
 							     ts_diff);
 			pr_info("Iteration %lu clear dirty log time: %ld.%.9lds\n",
@@ -211,7 +211,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	/* Disable dirty logging */
 	clock_gettime(CLOCK_MONOTONIC, &start);
 	vm_mem_region_set_flags(vm, PERF_TEST_MEM_SLOT_INDEX, 0);
-	ts_diff = timespec_diff_now(start);
+	ts_diff = timespec_elapsed(start);
 	pr_info("Disabling dirty logging time: %ld.%.9lds\n",
 		ts_diff.tv_sec, ts_diff.tv_nsec);
diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h
index ffffa560436b..b86090ef82da 100644
--- a/tools/testing/selftests/kvm/include/test_util.h
+++ b/tools/testing/selftests/kvm/include/test_util.h
@@ -64,7 +64,7 @@ int64_t timespec_to_ns(struct timespec ts);
 struct timespec timespec_add_ns(struct timespec ts, int64_t ns);
 struct timespec timespec_add(struct timespec ts1, struct timespec ts2);
 struct timespec timespec_sub(struct timespec ts1, struct timespec ts2);
-struct timespec timespec_diff_now(struct timespec start);
+struct timespec timespec_elapsed(struct timespec start);
 struct timespec timespec_div(struct timespec ts, int divisor);
 
 #endif /* SELFTEST_KVM_TEST_UTIL_H */
diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c
index 8e04c0b1608e..5f87ed32caf5 100644
--- a/tools/testing/selftests/kvm/lib/test_util.c
+++ b/tools/testing/selftests/kvm/lib/test_util.c
@@ -84,7 +84,7 @@ struct timespec timespec_sub(struct timespec ts1, struct timespec ts2)
 	return timespec_add_ns((struct timespec){0}, ns1 - ns2);
 }
 
-struct timespec timespec_diff_now(struct timespec start)
+struct timespec timespec_elapsed(struct timespec start)
 {
 	struct timespec end;
Peter Xu pointed out that a log message printed while waiting for the memory population phase of the dirty_log_perf_test will flood the debug logs as there is no delay after printing the message. Since the message does not provide much value anyway, remove it.
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/dirty_log_perf_test.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 16efe6589b43..15a9c45bdb5f 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -146,8 +146,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	/* Allow the vCPU to populate memory */
 	pr_debug("Starting iteration %lu - Populating\n", iteration);
 	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
-		pr_debug("Waiting for vcpu_last_completed_iteration == %lu\n",
-			 iteration);
+		;
 
 	ts_diff = timespec_elapsed(start);
 	pr_info("Populate memory time: %ld.%.9lds\n",
@@ -171,9 +170,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 
 	pr_debug("Starting iteration %lu\n", iteration);
 	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
-		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
-			pr_debug("Waiting for vCPU %d vcpu_last_completed_iteration == %lu\n",
-				 vcpu_id, iteration);
+		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id])
+		       != iteration)
+			;
 	}
 
 	ts_diff = timespec_elapsed(start);
On 12/01/2021 22.42, Ben Gardon wrote:
> Peter Xu pointed out that a log message printed while waiting for the
> memory population phase of the dirty_log_perf_test will flood the debug
> logs as there is no delay after printing the message. Since the message
> does not provide much value anyway, remove it.
> 
> Reviewed-by: Jacob Xu <jacobhxu@google.com>
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  tools/testing/selftests/kvm/dirty_log_perf_test.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 16efe6589b43..15a9c45bdb5f 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -146,8 +146,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>  	/* Allow the vCPU to populate memory */
>  	pr_debug("Starting iteration %lu - Populating\n", iteration);
>  	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
> -		pr_debug("Waiting for vcpu_last_completed_iteration == %lu\n",
> -			 iteration);
> +		;
> 
>  	ts_diff = timespec_elapsed(start);
>  	pr_info("Populate memory time: %ld.%.9lds\n",
> @@ -171,9 +170,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> 
>  	pr_debug("Starting iteration %lu\n", iteration);
>  	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> -		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
> -			pr_debug("Waiting for vCPU %d vcpu_last_completed_iteration == %lu\n",
> -				 vcpu_id, iteration);
> +		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id])
> +		       != iteration)
> +			;
>  	}
> 
>  	ts_diff = timespec_elapsed(start);
Reviewed-by: Thomas Huth <thuth@redhat.com>
On Tue, Jan 12, 2021, Ben Gardon wrote:
> Peter Xu pointed out that a log message printed while waiting for the
> memory population phase of the dirty_log_perf_test will flood the debug
> logs as there is no delay after printing the message. Since the message
> does not provide much value anyway, remove it.
Does it provide value if something goes wrong? E.g. if a vCPU doesn't finish, how would one go about debugging? Would it make sense to make the print ratelimited instead of removing it altogether?
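Something along these lines (an untested sketch, reusing the timespec helpers renamed in patch 1; the local `last_print` and the one-second threshold are made up for illustration) would keep the message without the flood:

	struct timespec last_print;

	clock_gettime(CLOCK_MONOTONIC, &last_print);
	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration) {
		/* Print at most once per second instead of on every poll. */
		if (timespec_to_ns(timespec_elapsed(last_print)) > 1000000000LL) {
			pr_debug("Waiting for vcpu_last_completed_iteration == %lu\n",
				 iteration);
			clock_gettime(CLOCK_MONOTONIC, &last_print);
		}
	}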
> Reviewed-by: Jacob Xu <jacobhxu@google.com>
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  tools/testing/selftests/kvm/dirty_log_perf_test.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 16efe6589b43..15a9c45bdb5f 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -146,8 +146,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>  	/* Allow the vCPU to populate memory */
>  	pr_debug("Starting iteration %lu - Populating\n", iteration);
>  	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
> -		pr_debug("Waiting for vcpu_last_completed_iteration == %lu\n",
> -			 iteration);
> +		;
> 
>  	ts_diff = timespec_elapsed(start);
>  	pr_info("Populate memory time: %ld.%.9lds\n",
> @@ -171,9 +170,9 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> 
>  	pr_debug("Starting iteration %lu\n", iteration);
>  	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> -		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
> -			pr_debug("Waiting for vCPU %d vcpu_last_completed_iteration == %lu\n",
> -				 vcpu_id, iteration);
> +		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id])
> +		       != iteration)
I like the original better. Poking out past 80 chars isn't the end of the world.
> +			;
>  	}
> 
>  	ts_diff = timespec_elapsed(start);
> --
> 2.30.0.284.gd98b1dd5eaa7-goog
In order to add an iteration -1 to indicate that the memory population phase has not yet completed, convert the iteration counters to ints.
No functional change intended.
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/dirty_log_perf_test.c       | 26 +++++++++----------
 1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 15a9c45bdb5f..3875f22d7283 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -28,8 +28,8 @@ static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
 /* Host variables */
 static u64 dirty_log_manual_caps;
 static bool host_quit;
-static uint64_t iteration;
-static uint64_t vcpu_last_completed_iteration[KVM_MAX_VCPUS];
+static int iteration;
+static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];
 
 static void *vcpu_worker(void *data)
 {
@@ -48,7 +48,7 @@ static void *vcpu_worker(void *data)
 	run = vcpu_state(vm, vcpu_id);
 
 	while (!READ_ONCE(host_quit)) {
-		uint64_t current_iteration = READ_ONCE(iteration);
+		int current_iteration = READ_ONCE(iteration);
 
 		clock_gettime(CLOCK_MONOTONIC, &start);
 		ret = _vcpu_run(vm, vcpu_id);
@@ -61,17 +61,17 @@ static void *vcpu_worker(void *data)
 
 		pr_debug("Got sync event from vCPU %d\n", vcpu_id);
 		vcpu_last_completed_iteration[vcpu_id] = current_iteration;
-		pr_debug("vCPU %d updated last completed iteration to %lu\n",
+		pr_debug("vCPU %d updated last completed iteration to %d\n",
 			 vcpu_id, vcpu_last_completed_iteration[vcpu_id]);
 
 		if (current_iteration) {
 			pages_count += vcpu_args->pages;
 			total = timespec_add(total, ts_diff);
-			pr_debug("vCPU %d iteration %lu dirty memory time: %ld.%.9lds\n",
+			pr_debug("vCPU %d iteration %d dirty memory time: %ld.%.9lds\n",
 				 vcpu_id, current_iteration, ts_diff.tv_sec,
 				 ts_diff.tv_nsec);
 		} else {
-			pr_debug("vCPU %d iteration %lu populate memory time: %ld.%.9lds\n",
+			pr_debug("vCPU %d iteration %d populate memory time: %ld.%.9lds\n",
 				 vcpu_id, current_iteration, ts_diff.tv_sec,
 				 ts_diff.tv_nsec);
 		}
@@ -81,7 +81,7 @@ static void *vcpu_worker(void *data)
 	}
 
 	avg = timespec_div(total, vcpu_last_completed_iteration[vcpu_id]);
-	pr_debug("\nvCPU %d dirtied 0x%lx pages over %lu iterations in %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n",
+	pr_debug("\nvCPU %d dirtied 0x%lx pages over %d iterations in %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n",
 		 vcpu_id, pages_count, vcpu_last_completed_iteration[vcpu_id],
 		 total.tv_sec, total.tv_nsec, avg.tv_sec, avg.tv_nsec);
 
@@ -144,7 +144,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	}
 
 	/* Allow the vCPU to populate memory */
-	pr_debug("Starting iteration %lu - Populating\n", iteration);
+	pr_debug("Starting iteration %d - Populating\n", iteration);
 	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
 		;
 
@@ -168,7 +168,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		clock_gettime(CLOCK_MONOTONIC, &start);
 		iteration++;
 
-		pr_debug("Starting iteration %lu\n", iteration);
+		pr_debug("Starting iteration %d\n", iteration);
 		for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
 			while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id])
 			       != iteration)
@@ -177,7 +177,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 
 		ts_diff = timespec_elapsed(start);
 		vcpu_dirty_total = timespec_add(vcpu_dirty_total, ts_diff);
-		pr_info("Iteration %lu dirty memory time: %ld.%.9lds\n",
+		pr_info("Iteration %d dirty memory time: %ld.%.9lds\n",
 			iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
 
 		clock_gettime(CLOCK_MONOTONIC, &start);
@@ -186,7 +186,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		ts_diff = timespec_elapsed(start);
 		get_dirty_log_total = timespec_add(get_dirty_log_total,
 						   ts_diff);
-		pr_info("Iteration %lu get dirty log time: %ld.%.9lds\n",
+		pr_info("Iteration %d get dirty log time: %ld.%.9lds\n",
 			iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
 
 		if (dirty_log_manual_caps) {
@@ -197,7 +197,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			ts_diff = timespec_elapsed(start);
 			clear_dirty_log_total = timespec_add(clear_dirty_log_total,
 							     ts_diff);
-			pr_info("Iteration %lu clear dirty log time: %ld.%.9lds\n",
+			pr_info("Iteration %d clear dirty log time: %ld.%.9lds\n",
 				iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
 		}
 	}
@@ -273,7 +273,7 @@ int main(int argc, char *argv[])
 	while ((opt = getopt(argc, argv, "hi:p:m:b:f:v:")) != -1) {
 		switch (opt) {
 		case 'i':
-			p.iterations = strtol(optarg, NULL, 10);
+			p.iterations = atoi(optarg);
 			break;
 		case 'p':
 			p.phys_offset = strtoull(optarg, NULL, 0);
On 12/01/2021 22.42, Ben Gardon wrote:
> In order to add an iteration -1 to indicate that the memory population
> phase has not yet completed, convert the iteration counters to ints.
> 
> No functional change intended.
> 
> Reviewed-by: Jacob Xu <jacobhxu@google.com>
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  .../selftests/kvm/dirty_log_perf_test.c       | 26 +++++++++----------
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 15a9c45bdb5f..3875f22d7283 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -28,8 +28,8 @@ static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
>  /* Host variables */
>  static u64 dirty_log_manual_caps;
>  static bool host_quit;
> -static uint64_t iteration;
> -static uint64_t vcpu_last_completed_iteration[KVM_MAX_VCPUS];
> +static int iteration;
> +static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];
Wouldn't it be better to use signed 64-bit variables instead? I.e. "int64_t" ?
Thomas
Currently the population stage in the dirty_log_perf_test does nothing as the per-vCPU iteration counters are not initialized and the loop does not wait for each vCPU. Remedy those errors.
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Makarand Sonare <makarandsonare@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/dirty_log_perf_test.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 3875f22d7283..fb6eb7fa0b45 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -139,14 +139,19 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 
 	clock_gettime(CLOCK_MONOTONIC, &start);
 	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
+		vcpu_last_completed_iteration[vcpu_id] = -1;
+
 		pthread_create(&vcpu_threads[vcpu_id], NULL, vcpu_worker,
 			       &perf_test_args.vcpu_args[vcpu_id]);
 	}
 
-	/* Allow the vCPU to populate memory */
+	/* Allow the vCPUs to populate memory */
 	pr_debug("Starting iteration %d - Populating\n", iteration);
-	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
-		;
+	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
+		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) !=
+		       iteration)
+			;
+	}
 
 	ts_diff = timespec_elapsed(start);
 	pr_info("Populate memory time: %ld.%.9lds\n",
On Tue, Jan 12, 2021, Ben Gardon wrote:
> Currently the population stage in the dirty_log_perf_test does nothing
> as the per-vCPU iteration counters are not initialized and the loop does
> not wait for each vCPU. Remedy those errors.
> 
> Reviewed-by: Jacob Xu <jacobhxu@google.com>
> Reviewed-by: Makarand Sonare <makarandsonare@google.com>
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  tools/testing/selftests/kvm/dirty_log_perf_test.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 3875f22d7283..fb6eb7fa0b45 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -139,14 +139,19 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> 
>  	clock_gettime(CLOCK_MONOTONIC, &start);
>  	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> +		vcpu_last_completed_iteration[vcpu_id] = -1;
> +
>  		pthread_create(&vcpu_threads[vcpu_id], NULL, vcpu_worker,
>  			       &perf_test_args.vcpu_args[vcpu_id]);
>  	}
> 
> -	/* Allow the vCPU to populate memory */
> +	/* Allow the vCPUs to populate memory */
>  	pr_debug("Starting iteration %d - Populating\n", iteration);
> -	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
> -		;
> +	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
> +		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) !=
> +		       iteration)
Same comment as earlier. I vote to let this poke out, or shorten the variables so that the lines aren't so long.
> +			;
> +	}
> 
>  	ts_diff = timespec_elapsed(start);
>  	pr_info("Populate memory time: %ld.%.9lds\n",
> --
> 2.30.0.284.gd98b1dd5eaa7-goog
Add an option to overlap the ranges of memory each vCPU accesses instead of partitioning them. This option will increase the probability of multiple vCPUs faulting on the same page at the same time, exposing interesting races if there are bugs in the page fault handler or elsewhere in the kernel.
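As an illustration, the invocation below (taken from the testing in the cover letter) makes all 64 vCPUs fault on one shared 4G region (64 x 64M) instead of 64 private 64M regions:

  ./demand_paging_test -v 64 -b 64M -o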
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Makarand Sonare <makarandsonare@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 .../selftests/kvm/demand_paging_test.c        | 32 +++++++++++++++----
 .../selftests/kvm/dirty_log_perf_test.c       | 14 ++++++--
 .../selftests/kvm/include/perf_test_util.h    |  4 ++-
 .../selftests/kvm/lib/perf_test_util.c        | 25 +++++++++++----
 4 files changed, 57 insertions(+), 18 deletions(-)
diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index a1cd234e6f5e..e8fda95f8389 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -250,6 +250,7 @@ static int setup_demand_paging(struct kvm_vm *vm,
 struct test_params {
 	bool use_uffd;
 	useconds_t uffd_delay;
+	bool partition_vcpu_memory_access;
 };
 
 static void run_test(enum vm_guest_mode mode, void *arg)
@@ -277,7 +278,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	vcpu_threads = malloc(nr_vcpus * sizeof(*vcpu_threads));
 	TEST_ASSERT(vcpu_threads, "Memory allocation failed");
 
-	perf_test_setup_vcpus(vm, nr_vcpus, guest_percpu_mem_size);
+	perf_test_setup_vcpus(vm, nr_vcpus, guest_percpu_mem_size,
+			      p->partition_vcpu_memory_access);
 
 	if (p->use_uffd) {
 		uffd_handler_threads =
@@ -293,10 +295,19 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
 			vm_paddr_t vcpu_gpa;
 			void *vcpu_hva;
+			uint64_t vcpu_mem_size;
 
-			vcpu_gpa = guest_test_phys_mem + (vcpu_id * guest_percpu_mem_size);
+
+			if (p->partition_vcpu_memory_access) {
+				vcpu_gpa = guest_test_phys_mem +
+					   (vcpu_id * guest_percpu_mem_size);
+				vcpu_mem_size = guest_percpu_mem_size;
+			} else {
+				vcpu_gpa = guest_test_phys_mem;
+				vcpu_mem_size = guest_percpu_mem_size * nr_vcpus;
+			}
 			PER_VCPU_DEBUG("Added VCPU %d with test mem gpa [%lx, %lx)\n",
-				       vcpu_id, vcpu_gpa, vcpu_gpa + guest_percpu_mem_size);
+				       vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_mem_size);
 
 			/* Cache the HVA pointer of the region */
 			vcpu_hva = addr_gpa2hva(vm, vcpu_gpa);
@@ -313,7 +324,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 						&uffd_handler_threads[vcpu_id],
 						pipefds[vcpu_id * 2],
 						p->uffd_delay, &uffd_args[vcpu_id],
-						vcpu_hva, guest_percpu_mem_size);
+						vcpu_hva, vcpu_mem_size);
 			if (r < 0)
 				exit(-r);
 		}
@@ -376,7 +387,7 @@ static void help(char *name)
 {
 	puts("");
 	printf("usage: %s [-h] [-m mode] [-u] [-d uffd_delay_usec]\n"
-	       "          [-b memory] [-v vcpus]\n", name);
+	       "          [-b memory] [-v vcpus] [-o]\n", name);
 	guest_modes_help();
 	printf(" -u: use User Fault FD to handle vCPU page\n"
 	       "     faults.\n");
@@ -387,6 +398,8 @@ static void help(char *name)
 	       "     demand paged by each vCPU. e.g. 10M or 3G.\n"
 	       "     Default: 1G\n");
 	printf(" -v: specify the number of vCPUs to run.\n");
+	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
+	       "     them into a separate region of memory for each vCPU.\n");
 	puts("");
 	exit(0);
 }
@@ -394,12 +407,14 @@ static void help(char *name)
 int main(int argc, char *argv[])
 {
 	int max_vcpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
-	struct test_params p = {};
+	struct test_params p = {
+		.partition_vcpu_memory_access = true,
+	};
 	int opt;
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "hm:ud:b:v:")) != -1) {
+	while ((opt = getopt(argc, argv, "hm:ud:b:v:o")) != -1) {
 		switch (opt) {
 		case 'm':
 			guest_modes_cmdline(optarg);
@@ -419,6 +434,9 @@ int main(int argc, char *argv[])
 			TEST_ASSERT(nr_vcpus > 0 && nr_vcpus <= max_vcpus,
 				    "Invalid number of vcpus, must be between 1 and %d", max_vcpus);
 			break;
+		case 'o':
+			p.partition_vcpu_memory_access = false;
+			break;
 		case 'h':
 		default:
 			help(argv[0]);
diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index fb6eb7fa0b45..a0231be3984d 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -92,6 +92,7 @@ struct test_params {
 	unsigned long iterations;
 	uint64_t phys_offset;
 	int wr_fract;
+	bool partition_vcpu_memory_access;
 };
 
 static void run_test(enum vm_guest_mode mode, void *arg)
@@ -129,7 +130,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	vcpu_threads = malloc(nr_vcpus * sizeof(*vcpu_threads));
 	TEST_ASSERT(vcpu_threads, "Memory allocation failed");
 
-	perf_test_setup_vcpus(vm, nr_vcpus, guest_percpu_mem_size);
+	perf_test_setup_vcpus(vm, nr_vcpus, guest_percpu_mem_size,
+			      p->partition_vcpu_memory_access);
 
 	sync_global_to_guest(vm, perf_test_args);
 
@@ -240,7 +242,7 @@ static void help(char *name)
 {
 	puts("");
 	printf("usage: %s [-h] [-i iterations] [-p offset] "
-	       "[-m mode] [-b vcpu bytes] [-v vcpus]\n", name);
+	       "[-m mode] [-b vcpu bytes] [-v vcpus] [-o]\n", name);
 	puts("");
 	printf(" -i: specify iteration counts (default: %"PRIu64")\n",
 	       TEST_HOST_LOOP_N);
@@ -255,6 +257,8 @@ static void help(char *name)
 	       "     1/<fraction of pages to write>.\n"
 	       "     (default: 1 i.e. all pages are written to.)\n");
 	printf(" -v: specify the number of vCPUs to run.\n");
+	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
+	       "     them into a separate region of memory for each vCPU.\n");
 	puts("");
 	exit(0);
 }
@@ -265,6 +269,7 @@ int main(int argc, char *argv[])
 	struct test_params p = {
 		.iterations = TEST_HOST_LOOP_N,
 		.wr_fract = 1,
+		.partition_vcpu_memory_access = true,
 	};
 	int opt;
 
@@ -275,7 +280,7 @@ int main(int argc, char *argv[])
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "hi:p:m:b:f:v:")) != -1) {
+	while ((opt = getopt(argc, argv, "hi:p:m:b:f:v:o")) != -1) {
 		switch (opt) {
 		case 'i':
 			p.iterations = atoi(optarg);
@@ -299,6 +304,9 @@ int main(int argc, char *argv[])
 			TEST_ASSERT(nr_vcpus > 0 && nr_vcpus <= max_vcpus,
 				    "Invalid number of vcpus, must be between 1 and %d", max_vcpus);
 			break;
+		case 'o':
+			p.partition_vcpu_memory_access = false;
+			break;
 		case 'h':
 		default:
 			help(argv[0]);
diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h
index b1188823c31b..f406534f0487 100644
--- a/tools/testing/selftests/kvm/include/perf_test_util.h
+++ b/tools/testing/selftests/kvm/include/perf_test_util.h
@@ -46,6 +46,8 @@ extern uint64_t guest_test_phys_mem;
 struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int vcpus,
 				   uint64_t vcpu_memory_bytes);
 void perf_test_destroy_vm(struct kvm_vm *vm);
-void perf_test_setup_vcpus(struct kvm_vm *vm, int vcpus, uint64_t vcpu_memory_bytes);
+void perf_test_setup_vcpus(struct kvm_vm *vm, int vcpus,
+			   uint64_t vcpu_memory_bytes,
+			   bool partition_vcpu_memory_access);
 
 #endif /* SELFTEST_KVM_PERF_TEST_UTIL_H */
diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
index 9be1944c2d1c..f5fed2fbe964 100644
--- a/tools/testing/selftests/kvm/lib/perf_test_util.c
+++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
@@ -112,7 +112,9 @@ void perf_test_destroy_vm(struct kvm_vm *vm)
 	kvm_vm_free(vm);
 }
 
-void perf_test_setup_vcpus(struct kvm_vm *vm, int vcpus, uint64_t vcpu_memory_bytes)
+void perf_test_setup_vcpus(struct kvm_vm *vm, int vcpus,
+			   uint64_t vcpu_memory_bytes,
+			   bool partition_vcpu_memory_access)
 {
 	vm_paddr_t vcpu_gpa;
 	struct perf_test_vcpu_args *vcpu_args;
@@ -122,13 +124,22 @@ void perf_test_setup_vcpus(struct kvm_vm *vm, int vcpus, uint64_t vcpu_memory_by
 		vcpu_args = &perf_test_args.vcpu_args[vcpu_id];
 
 		vcpu_args->vcpu_id = vcpu_id;
-		vcpu_args->gva = guest_test_virt_mem +
-				 (vcpu_id * vcpu_memory_bytes);
-		vcpu_args->pages = vcpu_memory_bytes /
-				   perf_test_args.guest_page_size;
+		if (partition_vcpu_memory_access) {
+			vcpu_args->gva = guest_test_virt_mem +
+					 (vcpu_id * vcpu_memory_bytes);
+			vcpu_args->pages = vcpu_memory_bytes /
+					   perf_test_args.guest_page_size;
+			vcpu_gpa = guest_test_phys_mem +
+				   (vcpu_id * vcpu_memory_bytes);
+		} else {
+			vcpu_args->gva = guest_test_virt_mem;
+			vcpu_args->pages = (vcpus * vcpu_memory_bytes) /
+					   perf_test_args.guest_page_size;
+			vcpu_gpa = guest_test_phys_mem;
+		}
 
-		vcpu_gpa = guest_test_phys_mem + (vcpu_id * vcpu_memory_bytes);
 		pr_debug("Added VCPU %d with test mem gpa [%lx, %lx)\n",
-			 vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_memory_bytes);
+			 vcpu_id, vcpu_gpa, vcpu_gpa +
+			 (vcpu_args->pages * perf_test_args.guest_page_size));
 	}
 }
Add a memslot modification stress test in which a memslot is repeatedly created and removed while vCPUs access memory in another memslot. Most userspaces do not create or remove memslots on running VMs, which makes it hard to test races in adding and removing memslots without a dedicated test. Adding and removing a memslot also has the effect of tearing down the entire paging structure, which leads to more page faults and pressure on the page fault handling path than a one-and-done memory population test.
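For example, the invocation used for the testing in the cover letter performs 1000 add/remove cycles while 64 vCPUs each access 1G:

  ./memslot_modification_stress_test -i 1000 -v 64 -b 1G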
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/memslot_modification_stress_test.c    | 211 ++++++++++++++++++
 3 files changed, 213 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/memslot_modification_stress_test.c
diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index ce8f4ad39684..5a9aebfd5e01 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -29,5 +29,6 @@
 /dirty_log_test
 /dirty_log_perf_test
 /kvm_create_max_vcpus
+/memslot_modification_stress_test
 /set_memory_region_test
 /steal_time
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index fe41c6a0fa67..df208dc4f2ed 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -63,6 +63,7 @@ TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
 TEST_GEN_PROGS_x86_64 += dirty_log_perf_test
 TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus
+TEST_GEN_PROGS_x86_64 += memslot_modification_stress_test
 TEST_GEN_PROGS_x86_64 += set_memory_region_test
 TEST_GEN_PROGS_x86_64 += steal_time
diff --git a/tools/testing/selftests/kvm/memslot_modification_stress_test.c b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
new file mode 100644
index 000000000000..cae1b90cb63f
--- /dev/null
+++ b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KVM memslot modification stress test
+ * Adapted from demand_paging_test.c
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2020, Google, Inc.
+ */
+
+#define _GNU_SOURCE /* for program_invocation_name */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+#include <asm/unistd.h>
+#include <time.h>
+#include <poll.h>
+#include <pthread.h>
+#include <linux/bitmap.h>
+#include <linux/bitops.h>
+#include <linux/userfaultfd.h>
+
+#include "perf_test_util.h"
+#include "processor.h"
+#include "test_util.h"
+#include "guest_modes.h"
+
+#define DUMMY_MEMSLOT_INDEX 7
+
+#define DEFAULT_MEMSLOT_MODIFICATION_ITERATIONS 10
+
+
+static int nr_vcpus = 1;
+static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
+
+static bool run_vcpus = true;
+
+static void *vcpu_worker(void *data)
+{
+	int ret;
+	struct perf_test_vcpu_args *vcpu_args =
+		(struct perf_test_vcpu_args *)data;
+	int vcpu_id = vcpu_args->vcpu_id;
+	struct kvm_vm *vm = perf_test_args.vm;
+	struct kvm_run *run;
+
+	vcpu_args_set(vm, vcpu_id, 1, vcpu_id);
+	run = vcpu_state(vm, vcpu_id);
+
+	/* Let the guest access its memory until a stop signal is received */
+	while (READ_ONCE(run_vcpus)) {
+		ret = _vcpu_run(vm, vcpu_id);
+		TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret);
+
+		if (get_ucall(vm, vcpu_id, NULL) == UCALL_SYNC)
+			continue;
+
+		TEST_ASSERT(false,
+			    "Invalid guest sync status: exit_reason=%s\n",
+			    exit_reason_str(run->exit_reason));
+	}
+
+	return NULL;
+}
+
+struct memslot_antagonist_args {
+	struct kvm_vm *vm;
+	useconds_t delay;
+	uint64_t nr_modifications;
+};
+
+static void add_remove_memslot(struct kvm_vm *vm, useconds_t delay,
+			       uint64_t nr_modifications, uint64_t gpa)
+{
+	int i;
+
+	for (i = 0; i < nr_modifications; i++) {
+		usleep(delay);
+		vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, gpa,
+					    DUMMY_MEMSLOT_INDEX, 1, 0);
+
+		vm_mem_region_delete(vm, DUMMY_MEMSLOT_INDEX);
+	}
+}
+
+struct test_params {
+	useconds_t memslot_modification_delay;
+	uint64_t nr_memslot_modifications;
+	bool partition_vcpu_memory_access;
+};
+
+static void run_test(enum vm_guest_mode mode, void *arg)
+{
+	struct test_params *p = arg;
+	pthread_t *vcpu_threads;
+	struct kvm_vm *vm;
+	int vcpu_id;
+
+	vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size);
+
+	perf_test_args.wr_fract = 1;
+
+	vcpu_threads = malloc(nr_vcpus * sizeof(*vcpu_threads));
+	TEST_ASSERT(vcpu_threads, "Memory allocation failed");
+
+	perf_test_setup_vcpus(vm, nr_vcpus, guest_percpu_mem_size,
+			      p->partition_vcpu_memory_access);
+
+	/* Export the shared variables to the guest */
+	sync_global_to_guest(vm, perf_test_args);
+
+	pr_info("Finished creating vCPUs\n");
+
+	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
+		pthread_create(&vcpu_threads[vcpu_id], NULL, vcpu_worker,
+			       &perf_test_args.vcpu_args[vcpu_id]);
+
+	pr_info("Started all vCPUs\n");
+
+	add_remove_memslot(vm, p->memslot_modification_delay,
+			   p->nr_memslot_modifications,
+			   guest_test_phys_mem +
+			   (guest_percpu_mem_size * nr_vcpus) +
+			   perf_test_args.host_page_size +
+			   perf_test_args.guest_page_size);
+
+	run_vcpus = false;
+
+	/* Wait for the vcpu threads to quit */
+	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
+		pthread_join(vcpu_threads[vcpu_id], NULL);
+
+	pr_info("All vCPU threads joined\n");
+
+	ucall_uninit(vm);
+	kvm_vm_free(vm);
+
+	free(vcpu_threads);
+}
+
+static void help(char *name)
+{
+	puts("");
+	printf("usage: %s [-h] [-m mode] [-d delay_usec]\n"
+	       "          [-b memory] [-v vcpus] [-o] [-i iterations]\n", name);
+	guest_modes_help();
+	printf(" -d: add a delay between each iteration of adding and\n"
+	       "     deleting a memslot in usec.\n");
+	printf(" -b: specify the size of the memory region which should be\n"
+	       "     accessed by each vCPU. e.g. 10M or 3G.\n"
+	       "     Default: 1G\n");
+	printf(" -v: specify the number of vCPUs to run.\n");
+	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
+	       "     them into a separate region of memory for each vCPU.\n");
+	printf(" -i: specify the number of iterations of adding and removing\n"
+	       "     a memslot.\n"
+	       "     Default: %d\n", DEFAULT_MEMSLOT_MODIFICATION_ITERATIONS);
+	puts("");
+	exit(0);
+}
+
+int main(int argc, char *argv[])
+{
+	int max_vcpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
+	int opt;
+	struct test_params p = {
+		.memslot_modification_delay = 0,
+		.nr_memslot_modifications =
+			DEFAULT_MEMSLOT_MODIFICATION_ITERATIONS,
+		.partition_vcpu_memory_access = true
+	};
+
+	guest_modes_append_default();
+
+	while ((opt = getopt(argc, argv, "hm:d:b:v:oi:")) != -1) {
+		switch (opt) {
+		case 'm':
+			guest_modes_cmdline(optarg);
+			break;
+		case 'd':
+			p.memslot_modification_delay = strtoul(optarg, NULL, 0);
+			TEST_ASSERT(p.memslot_modification_delay >= 0,
+				    "A negative delay is not supported.");
+			break;
+		case 'b':
+			guest_percpu_mem_size = parse_size(optarg);
+			break;
+		case 'v':
+			nr_vcpus = atoi(optarg);
+			TEST_ASSERT(nr_vcpus > 0 && nr_vcpus <= max_vcpus,
+				    "Invalid number of vcpus, must be between 1 and %d",
+				    max_vcpus);
+			break;
+		case 'o':
+			p.partition_vcpu_memory_access = false;
+			break;
+		case 'i':
+			p.nr_memslot_modifications = atoi(optarg);
+			break;
+		case 'h':
+		default:
+			help(argv[0]);
+			break;
+		}
+	}
+
+	for_each_guest_mode(run_test, &p);
+
+	return 0;
+}
On Tue, Jan 12, 2021, Ben Gardon wrote:
> Add a memslot modification stress test in which a memslot is repeatedly
> created and removed while vCPUs access memory in another memslot. Most
> userspaces do not create or remove memslots on running VMs, which makes
> it hard to test races in adding and removing memslots without a
> dedicated test. Adding and removing a memslot also has the effect of
> tearing down the entire paging structure, which leads to more page
> faults and pressure on the page fault handling path than a one-and-done
> memory population test.
Would it make sense to integrate this with set_memory_region_test? At a high level, they are doing very similar things. Not sure how much code can be shared, but I assume there's some amount of overlap.
On 12/01/21 22:42, Ben Gardon wrote:
> This series contains a few cleanups that didn't make it into previous
> series, including some cosmetic changes and small bug fixes. The series
> also lays the groundwork for a memslot modification test which stresses
> the memslot update and page fault code paths in an attempt to expose
> races.
> 
> Tested: dirty_log_perf_test, memslot_modification_stress_test, and
> demand_paging_test were run, with all the patches in this series
> applied, on an Intel Skylake machine.
> 
> echo Y > /sys/module/kvm/parameters/tdp_mmu; \
> ./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
> ./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
> ./dirty_log_perf_test -v 64 -b 1G; \
> ./dirty_log_perf_test -v 64 -b 64M -o; \
> ./demand_paging_test -v 64 -b 1G; \
> ./demand_paging_test -v 64 -b 64M -o; \
> echo N > /sys/module/kvm/parameters/tdp_mmu; \
> ./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
> ./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
> ./dirty_log_perf_test -v 64 -b 1G; \
> ./dirty_log_perf_test -v 64 -b 64M -o; \
> ./demand_paging_test -v 64 -b 1G; \
> ./demand_paging_test -v 64 -b 64M -o
> 
> The tests behaved as expected, and the problem of the population stage
> being skipped in dirty_log_perf_test is fixed. This can be seen in the
> output: the population stage now takes about as long as dirty pass 1
> used to, and dirty pass 1 falls closer to the times of the other passes.
> 
> Note that when running these tests, the -o option causes each test to
> take much longer, as the work each vCPU must do increases in proportion
> to the number of vCPUs.
> 
> You can view this series in Gerrit at:
> https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/72...
> 
> Ben Gardon (6):
>   KVM: selftests: Rename timespec_diff_now to timespec_elapsed
>   KVM: selftests: Avoid flooding debug log while populating memory
>   KVM: selftests: Convert iterations to int in dirty_log_perf_test
>   KVM: selftests: Fix population stage in dirty_log_perf_test
>   KVM: selftests: Add option to overlap vCPU memory access
>   KVM: selftests: Add memslot modification stress test
> 
>  tools/testing/selftests/kvm/.gitignore        |   1 +
>  tools/testing/selftests/kvm/Makefile          |   1 +
>  .../selftests/kvm/demand_paging_test.c        |  40 +++-
>  .../selftests/kvm/dirty_log_perf_test.c       |  72 +++---
>  .../selftests/kvm/include/perf_test_util.h    |   4 +-
>  .../testing/selftests/kvm/include/test_util.h |   2 +-
>  .../selftests/kvm/lib/perf_test_util.c        |  25 ++-
>  tools/testing/selftests/kvm/lib/test_util.c   |   2 +-
>  .../kvm/memslot_modification_stress_test.c    | 211 ++++++++++++++++++
>  9 files changed, 307 insertions(+), 51 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/memslot_modification_stress_test.c
Queued, thanks.
Paolo