[PATCH v2 0/5] Add a dirty logging performance test

List overview All Threads
Download

newer

older

[PATCH 00/13] Introduce seqnum_ops

RE: [PATCH v3 1/2] dma-mapping:...

Ben Gardon

3 Nov 2020 3 Nov '20

11:49 p.m.

Currently KVM lacks a simple, userspace agnostic, performance benchmark for dirty logging. Such a benchmark will be beneficial for ensuring that dirty logging performance does not regress, and to give a common baseline for validating performance improvements. The dirty log perf test introduced in this series builds on aspects of the existing demand paging perf test and provides time-based performance metrics for enabling and disabling dirty logging, getting the dirty log, and dirtying memory.

While the test currently only has a build target for x86, I expect it will work on, or be easily modified to support other architectures.

This series was tested by running the following invocations on an Intel Skylake machine after apply all commits in the series: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected.

v1 -> v2 changes: (in response to comments from Peter Xu) - Removed pr_debugs from main test thread while waiting on vCPUs to reduce log spam - Fixed a bug in iteration counting that caused the population stage to be counted as part of the first dirty logging pass - Fixed a bug in which the test failed to wait for the population stage for all but the first vCPU. - Refactored the common code in perf_test_util.c/h - Moved testing description to cover letter - Renamed timespec_diff_now to timespec_elapsed

Ben Gardon (5): KVM: selftests: Remove address rounding in guest code KVM: selftests: Factor code out of demand_paging_test KVM: selftests: Simplify demand_paging_test with timespec_diff_now KVM: selftests: Add wrfract to common guest code KVM: selftests: Introduce the dirty log perf test

tools/testing/selftests/kvm/.gitignore | 1 + tools/testing/selftests/kvm/Makefile | 7 +- .../selftests/kvm/demand_paging_test.c | 231 ++--------- .../selftests/kvm/dirty_log_perf_test.c | 381 ++++++++++++++++++ .../selftests/kvm/include/perf_test_util.h | 51 +++ .../testing/selftests/kvm/include/test_util.h | 2 + .../selftests/kvm/lib/perf_test_util.c | 166 ++++++++ tools/testing/selftests/kvm/lib/test_util.c | 22 +- 8 files changed, 661 insertions(+), 200 deletions(-) create mode 100644 tools/testing/selftests/kvm/dirty_log_perf_test.c create mode 100644 tools/testing/selftests/kvm/include/perf_test_util.h create mode 100644 tools/testing/selftests/kvm/lib/perf_test_util.c

-- 2.29.1.341.ge80a0c044ae-goog

Show replies by date

Ben Gardon

3 Nov 3 Nov

11:49 p.m.

New subject: [PATCH v2 1/5] KVM: selftests: Remove address rounding in guest code

Rounding the address the guest writes to a host page boundary will only have an effect if the host page size is larger than the guest page size, but in that case the guest write would still go to the same host page. There's no reason to round the address down, so remove the rounding to simplify the demand paging test.

Signed-off-by: Ben Gardon bgardon@google.com --- tools/testing/selftests/kvm/demand_paging_test.c | 1 - 1 file changed, 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 360cd3ea4cd67..32a42eafc6b5c 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -103,7 +103,6 @@ static void guest_code(uint32_t vcpu_id) for (i = 0; i < pages; i++) { uint64_t addr = gva + (i * guest_page_size);

- addr &= ~(host_page_size - 1); *(uint64_t *)addr = 0x0123456789ABCDEF; }

-- 2.29.1.341.ge80a0c044ae-goog

Andrew Jones

4 Nov 4 Nov

10:54 a.m.

New subject: [PATCH v2 1/5] KVM: selftests: Remove address rounding in guest code

On Tue, Nov 03, 2020 at 03:49:48PM -0800, Ben Gardon wrote:

...

Rounding the address the guest writes to a host page boundary will only have an effect if the host page size is larger than the guest page size, but in that case the guest write would still go to the same host page. There's no reason to round the address down, so remove the rounding to simplify the demand paging test.

Signed-off-by: Ben Gardon bgardon@google.com

tools/testing/selftests/kvm/demand_paging_test.c | 1 - 1 file changed, 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 360cd3ea4cd67..32a42eafc6b5c 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -103,7 +103,6 @@ static void guest_code(uint32_t vcpu_id) for (i = 0; i < pages; i++) { uint64_t addr = gva + (i * guest_page_size);
addr &= ~(host_page_size - 1);
*(uint64_t *)addr = 0x0123456789ABCDEF; }
2.29.1.341.ge80a0c044ae-goog

Reviewed-by: Andrew Jones drjones@redhat.com

Ben Gardon

3 Nov 3 Nov

11:49 p.m.

New subject: [PATCH v2 2/5] KVM: selftests: Factor code out of demand_paging_test

Much of the code in demand_paging_test can be reused by other, similar multi-vCPU-memory-touching-perfromance-tests. Factor that common code out for reuse.

No functional change expected.

Signed-off-by: Ben Gardon bgardon@google.com --- tools/testing/selftests/kvm/Makefile | 6 +- .../selftests/kvm/demand_paging_test.c | 202 ++---------------- .../selftests/kvm/include/perf_test_util.h | 50 +++++ .../selftests/kvm/lib/perf_test_util.c | 161 ++++++++++++++ 4 files changed, 235 insertions(+), 184 deletions(-) create mode 100644 tools/testing/selftests/kvm/include/perf_test_util.h create mode 100644 tools/testing/selftests/kvm/lib/perf_test_util.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 30afbad36cd55..9b2bebb64175b 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -33,8 +33,10 @@ ifeq ($(ARCH),s390) UNAME_M := s390x endif

-LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c lib/test_util.c -LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c +LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c \ + lib/test_util.c lib/perf_test_util.c +LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c \ + lib/x86_64/ucall.c LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c

diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 32a42eafc6b5c..682805dd8c2ac 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -21,18 +21,12 @@ #include <linux/bitops.h> #include <linux/userfaultfd.h>

-#include "test_util.h" -#include "kvm_util.h" +#include "perf_test_util.h" #include "processor.h" +#include "test_util.h"

#ifdef __NR_userfaultfd

-/* The memory slot index demand page */ -#define TEST_MEM_SLOT_INDEX 1 - -/* Default guest test virtual memory offset */ -#define DEFAULT_GUEST_TEST_MEM 0xc0000000 - #define DEFAULT_GUEST_TEST_MEM_SIZE (1 << 30) /* 1G */

#ifdef PRINT_PER_PAGE_UPDATES @@ -47,74 +41,14 @@ #define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__) #endif

-#define MAX_VCPUS 512 - -/* - * Guest/Host shared variables. Ensure addr_gva2hva() and/or - * sync_global_to/from_guest() are used when accessing from - * the host. READ/WRITE_ONCE() should also be used with anything - * that may change. - */ -static uint64_t host_page_size; -static uint64_t guest_page_size; - static char *guest_data_prototype;

-/* - * Guest physical memory offset of the testing memory slot. - * This will be set to the topmost valid physical address minus - * the test memory size. - */ -static uint64_t guest_test_phys_mem; - -/* - * Guest virtual memory offset of the testing memory slot. - * Must not conflict with identity mapped test code. - */ -static uint64_t guest_test_virt_mem = DEFAULT_GUEST_TEST_MEM; - -struct vcpu_args { - uint64_t gva; - uint64_t pages; - - /* Only used by the host userspace part of the vCPU thread */ - int vcpu_id; - struct kvm_vm *vm; -}; - -static struct vcpu_args vcpu_args[MAX_VCPUS]; - -/* - * Continuously write to the first 8 bytes of each page in the demand paging - * memory region. - */ -static void guest_code(uint32_t vcpu_id) -{ - uint64_t gva; - uint64_t pages; - int i; - - /* Make sure vCPU args data structure is not corrupt. */ - GUEST_ASSERT(vcpu_args[vcpu_id].vcpu_id == vcpu_id); - - gva = vcpu_args[vcpu_id].gva; - pages = vcpu_args[vcpu_id].pages; - - for (i = 0; i < pages; i++) { - uint64_t addr = gva + (i * guest_page_size); - - *(uint64_t *)addr = 0x0123456789ABCDEF; - } - - GUEST_SYNC(1); -} - static void *vcpu_worker(void *data) { int ret; - struct vcpu_args *args = (struct vcpu_args *)data; - struct kvm_vm *vm = args->vm; - int vcpu_id = args->vcpu_id; + struct vcpu_args *vcpu_args = (struct vcpu_args *)data; + int vcpu_id = vcpu_args->vcpu_id; + struct kvm_vm *vm = perf_test_args.vm; struct kvm_run *run; struct timespec start, end, ts_diff;

@@ -140,39 +74,6 @@ static void *vcpu_worker(void *data) return NULL; }

-#define PAGE_SHIFT_4K 12 -#define PTES_PER_4K_PT 512 - -static struct kvm_vm *create_vm(enum vm_guest_mode mode, int vcpus, - uint64_t vcpu_memory_bytes) -{ - struct kvm_vm *vm; - uint64_t pages = DEFAULT_GUEST_PHY_PAGES; - - /* Account for a few pages per-vCPU for stacks */ - pages += DEFAULT_STACK_PGS * vcpus; - - /* - * Reserve twice the ammount of memory needed to map the test region and - * the page table / stacks region, at 4k, for page tables. Do the - * calculation with 4K page size: the smallest of all archs. (e.g., 64K - * page size guest will need even less memory for page tables). - */ - pages += (2 * pages) / PTES_PER_4K_PT; - pages += ((2 * vcpus * vcpu_memory_bytes) >> PAGE_SHIFT_4K) / - PTES_PER_4K_PT; - pages = vm_adjust_num_guest_pages(mode, pages); - - pr_info("Testing guest mode: %s\n", vm_guest_mode_string(mode)); - - vm = _vm_create(mode, pages, O_RDWR); - kvm_vm_elf_load(vm, program_invocation_name, 0, 0); -#ifdef __x86_64__ - vm_create_irqchip(vm); -#endif - return vm; -} - static int handle_uffd_page_request(int uffd, uint64_t addr) { pid_t tid; @@ -185,7 +86,7 @@ static int handle_uffd_page_request(int uffd, uint64_t addr)

copy.src = (uint64_t)guest_data_prototype; copy.dst = addr; - copy.len = host_page_size; + copy.len = perf_test_args.host_page_size; copy.mode = 0;

clock_gettime(CLOCK_MONOTONIC, &start); @@ -202,7 +103,7 @@ static int handle_uffd_page_request(int uffd, uint64_t addr) PER_PAGE_DEBUG("UFFDIO_COPY %d \t%ld ns\n", tid, timespec_to_ns(timespec_sub(end, start))); PER_PAGE_DEBUG("Paged in %ld bytes at 0x%lx from thread %d\n", - host_page_size, addr, tid); + perf_test_args.host_page_size, addr, tid);

return 0; } @@ -359,60 +260,15 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, struct timespec start, end, ts_diff; int *pipefds = NULL; struct kvm_vm *vm; - uint64_t guest_num_pages; int vcpu_id; int r;

vm = create_vm(mode, vcpus, vcpu_memory_bytes);

- guest_page_size = vm_get_page_size(vm); - - TEST_ASSERT(vcpu_memory_bytes % guest_page_size == 0, - "Guest memory size is not guest page size aligned."); - - guest_num_pages = (vcpus * vcpu_memory_bytes) / guest_page_size; - guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages); - - /* - * If there should be more memory in the guest test region than there - * can be pages in the guest, it will definitely cause problems. - */ - TEST_ASSERT(guest_num_pages < vm_get_max_gfn(vm), - "Requested more guest memory than address space allows.\n" - " guest pages: %lx max gfn: %x vcpus: %d wss: %lx]\n", - guest_num_pages, vm_get_max_gfn(vm), vcpus, - vcpu_memory_bytes); - - host_page_size = getpagesize(); - TEST_ASSERT(vcpu_memory_bytes % host_page_size == 0, - "Guest memory size is not host page size aligned."); - - guest_test_phys_mem = (vm_get_max_gfn(vm) - guest_num_pages) * - guest_page_size; - guest_test_phys_mem &= ~(host_page_size - 1); - -#ifdef __s390x__ - /* Align to 1M (segment size) */ - guest_test_phys_mem &= ~((1 << 20) - 1); -#endif - - pr_info("guest physical test memory offset: 0x%lx\n", guest_test_phys_mem); - - /* Add an extra memory slot for testing demand paging */ - vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, - guest_test_phys_mem, - TEST_MEM_SLOT_INDEX, - guest_num_pages, 0); - - /* Do mapping for the demand paging memory slot */ - virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, guest_num_pages, 0); - - ucall_init(vm, NULL); - - guest_data_prototype = malloc(host_page_size); + guest_data_prototype = malloc(perf_test_args.host_page_size); TEST_ASSERT(guest_data_prototype, "Failed to allocate buffer for guest data pattern"); - memset(guest_data_prototype, 0xAB, host_page_size); + memset(guest_data_prototype, 0xAB, perf_test_args.host_page_size);

vcpu_threads = malloc(vcpus * sizeof(*vcpu_threads)); TEST_ASSERT(vcpu_threads, "Memory allocation failed"); @@ -427,22 +283,18 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd,

pipefds = malloc(sizeof(int) * vcpus * 2); TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd"); - }

- for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) { - vm_paddr_t vcpu_gpa; - void *vcpu_hva; - - vm_vcpu_add_default(vm, vcpu_id, guest_code); + for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) { + vm_paddr_t vcpu_gpa; + void *vcpu_hva;

- vcpu_gpa = guest_test_phys_mem + (vcpu_id * vcpu_memory_bytes); - PER_VCPU_DEBUG("Added VCPU %d with test mem gpa [%lx, %lx)\n", - vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_memory_bytes); + vcpu_gpa = guest_test_phys_mem + (vcpu_id * vcpu_memory_bytes); + PER_VCPU_DEBUG("Added VCPU %d with test mem gpa [%lx, %lx)\n", + vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_memory_bytes);

- /* Cache the HVA pointer of the region */ - vcpu_hva = addr_gpa2hva(vm, vcpu_gpa); + /* Cache the HVA pointer of the region */ + vcpu_hva = addr_gpa2hva(vm, vcpu_gpa);

- if (use_uffd) { /* * Set up user fault fd to handle demand paging * requests. @@ -459,30 +311,15 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, if (r < 0) exit(-r); } - -#ifdef __x86_64__ - vcpu_set_cpuid(vm, vcpu_id, kvm_get_supported_cpuid()); -#endif - - vcpu_args[vcpu_id].vm = vm; - vcpu_args[vcpu_id].vcpu_id = vcpu_id; - vcpu_args[vcpu_id].gva = guest_test_virt_mem + - (vcpu_id * vcpu_memory_bytes); - vcpu_args[vcpu_id].pages = vcpu_memory_bytes / guest_page_size; }

- /* Export the shared variables to the guest */ - sync_global_to_guest(vm, host_page_size); - sync_global_to_guest(vm, guest_page_size); - sync_global_to_guest(vm, vcpu_args); - pr_info("Finished creating vCPUs and starting uffd threads\n");

clock_gettime(CLOCK_MONOTONIC, &start);

for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) { pthread_create(&vcpu_threads[vcpu_id], NULL, vcpu_worker, - &vcpu_args[vcpu_id]); + &perf_test_args.vcpu_args[vcpu_id]); }

pr_info("Started all vCPUs\n"); @@ -513,7 +350,8 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, pr_info("Total guest execution time: %ld.%.9lds\n", ts_diff.tv_sec, ts_diff.tv_nsec); pr_info("Overall demand paging rate: %f pgs/sec\n", - guest_num_pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0)); + perf_test_args.vcpu_args[0].pages * vcpus / + ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));

ucall_uninit(vm); kvm_vm_free(vm); diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h new file mode 100644 index 0000000000000..4d52b9ee13c42 --- /dev/null +++ b/tools/testing/selftests/kvm/include/perf_test_util.h @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * tools/testing/selftests/kvm/include/perf_test_util.h + * + * Copyright (C) 2020, Google LLC. + */ + +#ifndef SELFTEST_KVM_PERF_TEST_UTIL_H +#define SELFTEST_KVM_PERF_TEST_UTIL_H + +#include "kvm_util.h" +#include "processor.h" + +#define MAX_VCPUS 512 + +#define PAGE_SHIFT_4K 12 +#define PTES_PER_4K_PT 512 + +#define TEST_MEM_SLOT_INDEX 1 + +/* Default guest test virtual memory offset */ +#define DEFAULT_GUEST_TEST_MEM 0xc0000000 + +extern uint64_t guest_test_phys_mem; +extern uint64_t guest_test_virt_mem; + +struct vcpu_args { + uint64_t gva; + uint64_t pages; + + /* Only used by the host userspace part of the vCPU thread */ + int vcpu_id; +}; + +struct perf_test_args { + struct kvm_vm *vm; + uint64_t host_page_size; + uint64_t guest_page_size; + + struct vcpu_args vcpu_args[MAX_VCPUS]; +}; + +extern struct perf_test_args perf_test_args; + +void guest_code(uint32_t vcpu_id); + +struct kvm_vm *create_vm(enum vm_guest_mode mode, int vcpus, + uint64_t vcpu_memory_bytes); + +#endif /* SELFTEST_KVM_PERF_TEST_UTIL_H */ diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c new file mode 100644 index 0000000000000..fa7efbac9ef8a --- /dev/null +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c @@ -0,0 +1,161 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * tools/testing/selftests/kvm/lib/perf_test_util.c + * + * Copyright (C) 2020, Google LLC. + */ + +#define _GNU_SOURCE /* for program_invocation_name */ + +#include "perf_test_util.h" + +/* + * Guest physical memory offset of the testing memory slot. + * This will be set to the topmost valid physical address minus + * the test memory size. + */ +uint64_t guest_test_phys_mem; + +/* + * Guest virtual memory offset of the testing memory slot. + * Must not conflict with identity mapped test code. + */ +uint64_t guest_test_virt_mem = DEFAULT_GUEST_TEST_MEM; + +struct perf_test_args perf_test_args; + +/* + * Continuously write to the first 8 bytes of each page in the + * specified region. + */ +void guest_code(uint32_t vcpu_id) +{ + struct vcpu_args *vcpu_args = &perf_test_args.vcpu_args[vcpu_id]; + uint64_t gva; + uint64_t pages; + int i; + + /* Make sure vCPU args data structure is not corrupt. */ + GUEST_ASSERT(vcpu_args->vcpu_id == vcpu_id); + + gva = vcpu_args->gva; + pages = vcpu_args->pages; + + for (i = 0; i < pages; i++) { + uint64_t addr = gva + (i * perf_test_args.guest_page_size); + + *(uint64_t *)addr = 0x0123456789ABCDEF; + } + + GUEST_SYNC(1); +} + +static void add_vcpus(struct kvm_vm *vm, int vcpus, uint64_t vcpu_memory_bytes) +{ + vm_paddr_t vcpu_gpa; + struct vcpu_args *vcpu_args; + int vcpu_id; + + for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) { + vcpu_args = &perf_test_args.vcpu_args[vcpu_id]; + + vm_vcpu_add_default(vm, vcpu_id, guest_code); + +#ifdef __x86_64__ + vcpu_set_cpuid(vm, vcpu_id, kvm_get_supported_cpuid()); +#endif + + vcpu_args->vcpu_id = vcpu_id; + vcpu_args->gva = guest_test_virt_mem + + (vcpu_id * vcpu_memory_bytes); + vcpu_args->pages = vcpu_memory_bytes / + perf_test_args.guest_page_size; + + vcpu_gpa = guest_test_phys_mem + (vcpu_id * vcpu_memory_bytes); + pr_debug("Added VCPU %d with test mem gpa [%lx, %lx)\n", + vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_memory_bytes); + } +} + +struct kvm_vm *create_vm(enum vm_guest_mode mode, int vcpus, + uint64_t vcpu_memory_bytes) +{ + struct kvm_vm *vm; + uint64_t pages = DEFAULT_GUEST_PHY_PAGES; + uint64_t guest_num_pages; + + /* Account for a few pages per-vCPU for stacks */ + pages += DEFAULT_STACK_PGS * vcpus; + + /* + * Reserve twice the ammount of memory needed to map the test region and + * the page table / stacks region, at 4k, for page tables. Do the + * calculation with 4K page size: the smallest of all archs. (e.g., 64K + * page size guest will need even less memory for page tables). + */ + pages += (2 * pages) / PTES_PER_4K_PT; + pages += ((2 * vcpus * vcpu_memory_bytes) >> PAGE_SHIFT_4K) / + PTES_PER_4K_PT; + pages = vm_adjust_num_guest_pages(mode, pages); + + pr_info("Testing guest mode: %s\n", vm_guest_mode_string(mode)); + + vm = _vm_create(mode, pages, O_RDWR); + kvm_vm_elf_load(vm, program_invocation_name, 0, 0); +#ifdef __x86_64__ + vm_create_irqchip(vm); +#endif + + perf_test_args.vm = vm; + perf_test_args.guest_page_size = vm_get_page_size(vm); + perf_test_args.host_page_size = getpagesize(); + + TEST_ASSERT(vcpu_memory_bytes % perf_test_args.guest_page_size == 0, + "Guest memory size is not guest page size aligned."); + + guest_num_pages = (vcpus * vcpu_memory_bytes) / + perf_test_args.guest_page_size; + guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages); + + /* + * If there should be more memory in the guest test region than there + * can be pages in the guest, it will definitely cause problems. + */ + TEST_ASSERT(guest_num_pages < vm_get_max_gfn(vm), + "Requested more guest memory than address space allows.\n" + " guest pages: %lx max gfn: %x vcpus: %d wss: %lx]\n", + guest_num_pages, vm_get_max_gfn(vm), vcpus, + vcpu_memory_bytes); + + TEST_ASSERT(vcpu_memory_bytes % perf_test_args.host_page_size == 0, + "Guest memory size is not host page size aligned."); + + guest_test_phys_mem = (vm_get_max_gfn(vm) - guest_num_pages) * + perf_test_args.guest_page_size; + guest_test_phys_mem &= ~(perf_test_args.host_page_size - 1); + +#ifdef __s390x__ + /* Align to 1M (segment size) */ + guest_test_phys_mem &= ~((1 << 20) - 1); +#endif + + pr_info("guest physical test memory offset: 0x%lx\n", guest_test_phys_mem); + + /* Add an extra memory slot for testing */ + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, + guest_test_phys_mem, + TEST_MEM_SLOT_INDEX, + guest_num_pages, 0); + + /* Do mapping for the demand paging memory slot */ + virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, guest_num_pages, 0); + + ucall_init(vm, NULL); + + add_vcpus(vm, vcpus, vcpu_memory_bytes); + + sync_global_to_guest(vm, perf_test_args); + + return vm; +} +

-- 2.29.1.341.ge80a0c044ae-goog

Andrew Jones

4 Nov 4 Nov

12:16 p.m.

New subject: [PATCH v2 2/5] KVM: selftests: Factor code out of demand_paging_test

On Tue, Nov 03, 2020 at 03:49:49PM -0800, Ben Gardon wrote:

...

Much of the code in demand_paging_test can be reused by other, similar multi-vCPU-memory-touching-perfromance-tests. Factor that common code

performance

...

out for reuse.

No functional change expected.

Signed-off-by: Ben Gardon bgardon@google.com

tools/testing/selftests/kvm/Makefile | 6 +- .../selftests/kvm/demand_paging_test.c | 202 ++---------------- .../selftests/kvm/include/perf_test_util.h | 50 +++++ .../selftests/kvm/lib/perf_test_util.c | 161 ++++++++++++++ 4 files changed, 235 insertions(+), 184 deletions(-) create mode 100644 tools/testing/selftests/kvm/include/perf_test_util.h create mode 100644 tools/testing/selftests/kvm/lib/perf_test_util.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 30afbad36cd55..9b2bebb64175b 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -33,8 +33,10 @@ ifeq ($(ARCH),s390) UNAME_M := s390x endif -LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c lib/test_util.c -LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c +LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c \
lib/test_util.c lib/perf_test_util.c
+LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c \
lib/x86_64/ucall.c
LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 32a42eafc6b5c..682805dd8c2ac 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -21,18 +21,12 @@ #include <linux/bitops.h> #include <linux/userfaultfd.h> -#include "test_util.h" -#include "kvm_util.h" +#include "perf_test_util.h" #include "processor.h" +#include "test_util.h"

I don't think we moved out everything used from kvm_util.h, so we shouldn't remove that include.

Is there a reason to include test_util.h last?

...

#ifdef __NR_userfaultfd -/* The memory slot index demand page */ -#define TEST_MEM_SLOT_INDEX 1

-/* Default guest test virtual memory offset */ -#define DEFAULT_GUEST_TEST_MEM 0xc0000000

#define DEFAULT_GUEST_TEST_MEM_SIZE (1 << 30) /* 1G */ #ifdef PRINT_PER_PAGE_UPDATES @@ -47,74 +41,14 @@ #define PER_VCPU_DEBUG(...) _no_printf(__VA_ARGS__) #endif -#define MAX_VCPUS 512

-/*

Guest/Host shared variables. Ensure addr_gva2hva() and/or

sync_global_to/from_guest() are used when accessing from

the host. READ/WRITE_ONCE() should also be used with anything

that may change.

*/

-static uint64_t host_page_size; -static uint64_t guest_page_size;

static char *guest_data_prototype; -/*

Guest physical memory offset of the testing memory slot.

This will be set to the topmost valid physical address minus

the test memory size.

*/

-static uint64_t guest_test_phys_mem;

-/*

Guest virtual memory offset of the testing memory slot.

Must not conflict with identity mapped test code.

*/

-static uint64_t guest_test_virt_mem = DEFAULT_GUEST_TEST_MEM;

-struct vcpu_args {

uint64_t gva;

uint64_t pages;

/* Only used by the host userspace part of the vCPU thread */

int vcpu_id;

struct kvm_vm *vm;

-};

-static struct vcpu_args vcpu_args[MAX_VCPUS];

-/*

Continuously write to the first 8 bytes of each page in the demand paging

memory region.

*/

-static void guest_code(uint32_t vcpu_id) -{
uint64_t gva;

uint64_t pages;

int i;

/* Make sure vCPU args data structure is not corrupt. */

GUEST_ASSERT(vcpu_args[vcpu_id].vcpu_id == vcpu_id);

gva = vcpu_args[vcpu_id].gva;

pages = vcpu_args[vcpu_id].pages;

for (i = 0; i < pages; i++) {
uint64_t addr = gva + (i * guest_page_size);
*(uint64_t *)addr = 0x0123456789ABCDEF;
}

GUEST_SYNC(1);
-}

static void *vcpu_worker(void *data) { int ret;

struct vcpu_args *args = (struct vcpu_args *)data;

struct kvm_vm *vm = args->vm;

int vcpu_id = args->vcpu_id;

struct vcpu_args *vcpu_args = (struct vcpu_args *)data;

int vcpu_id = vcpu_args->vcpu_id;

struct kvm_vm *vm = perf_test_args.vm; struct kvm_run *run; struct timespec start, end, ts_diff;

@@ -140,39 +74,6 @@ static void *vcpu_worker(void *data) return NULL; } -#define PAGE_SHIFT_4K 12 -#define PTES_PER_4K_PT 512

-static struct kvm_vm *create_vm(enum vm_guest_mode mode, int vcpus,
		uint64_t vcpu_memory_bytes)
-{
struct kvm_vm *vm;

uint64_t pages = DEFAULT_GUEST_PHY_PAGES;

/* Account for a few pages per-vCPU for stacks */

pages += DEFAULT_STACK_PGS * vcpus;

/*
* Reserve twice the ammount of memory needed to map the test region and
* the page table / stacks region, at 4k, for page tables. Do the
* calculation with 4K page size: the smallest of all archs. (e.g., 64K
* page size guest will need even less memory for page tables).
*/
pages += (2 * pages) / PTES_PER_4K_PT;

pages += ((2 * vcpus * vcpu_memory_bytes) >> PAGE_SHIFT_4K) /
 PTES_PER_4K_PT;
pages = vm_adjust_num_guest_pages(mode, pages);

pr_info("Testing guest mode: %s\n", vm_guest_mode_string(mode));

vm = _vm_create(mode, pages, O_RDWR);

kvm_vm_elf_load(vm, program_invocation_name, 0, 0);
-#ifdef __x86_64__

vm_create_irqchip(vm);

-#endif

return vm;

-}

static int handle_uffd_page_request(int uffd, uint64_t addr) { pid_t tid; @@ -185,7 +86,7 @@ static int handle_uffd_page_request(int uffd, uint64_t addr) copy.src = (uint64_t)guest_data_prototype; copy.dst = addr;

copy.len = host_page_size;

copy.len = perf_test_args.host_page_size; copy.mode = 0;

clock_gettime(CLOCK_MONOTONIC, &start); @@ -202,7 +103,7 @@ static int handle_uffd_page_request(int uffd, uint64_t addr) PER_PAGE_DEBUG("UFFDIO_COPY %d \t%ld ns\n", tid, timespec_to_ns(timespec_sub(end, start))); PER_PAGE_DEBUG("Paged in %ld bytes at 0x%lx from thread %d\n",
       host_page_size, addr, tid);
       perf_test_args.host_page_size, addr, tid);
return 0; } @@ -359,60 +260,15 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, struct timespec start, end, ts_diff; int *pipefds = NULL; struct kvm_vm *vm;

uint64_t guest_num_pages; int vcpu_id; int r;

vm = create_vm(mode, vcpus, vcpu_memory_bytes);
guest_page_size = vm_get_page_size(vm);

TEST_ASSERT(vcpu_memory_bytes % guest_page_size == 0,
    "Guest memory size is not guest page size aligned.");
guest_num_pages = (vcpus * vcpu_memory_bytes) / guest_page_size;

guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages);

/*
* If there should be more memory in the guest test region than there
* can be pages in the guest, it will definitely cause problems.
*/
TEST_ASSERT(guest_num_pages < vm_get_max_gfn(vm),
    "Requested more guest memory than address space allows.\n"
    "    guest pages: %lx max gfn: %x vcpus: %d wss: %lx]\n",
    guest_num_pages, vm_get_max_gfn(vm), vcpus,
    vcpu_memory_bytes);
host_page_size = getpagesize();

TEST_ASSERT(vcpu_memory_bytes % host_page_size == 0,
    "Guest memory size is not host page size aligned.");
guest_test_phys_mem = (vm_get_max_gfn(vm) - guest_num_pages) *
	      guest_page_size;
guest_test_phys_mem &= ~(host_page_size - 1);
-#ifdef __s390x__

/* Align to 1M (segment size) */

guest_test_phys_mem &= ~((1 << 20) - 1);

-#endif
pr_info("guest physical test memory offset: 0x%lx\n", guest_test_phys_mem);

/* Add an extra memory slot for testing demand paging */

vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
		    guest_test_phys_mem,
		    TEST_MEM_SLOT_INDEX,
		    guest_num_pages, 0);
/* Do mapping for the demand paging memory slot */

virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, guest_num_pages, 0);

ucall_init(vm, NULL);

guest_data_prototype = malloc(host_page_size);
guest_data_prototype = malloc(perf_test_args.host_page_size); TEST_ASSERT(guest_data_prototype, "Failed to allocate buffer for guest data pattern");

memset(guest_data_prototype, 0xAB, host_page_size);

memset(guest_data_prototype, 0xAB, perf_test_args.host_page_size);

vcpu_threads = malloc(vcpus * sizeof(*vcpu_threads)); TEST_ASSERT(vcpu_threads, "Memory allocation failed"); @@ -427,22 +283,18 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, pipefds = malloc(sizeof(int) * vcpus * 2); TEST_ASSERT(pipefds, "Unable to allocate memory for pipefd");

}
for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) {
vm_paddr_t vcpu_gpa;
void *vcpu_hva;
vm_vcpu_add_default(vm, vcpu_id, guest_code);
for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) {
	vm_paddr_t vcpu_gpa;
	void *vcpu_hva;
vcpu_gpa = guest_test_phys_mem + (vcpu_id * vcpu_memory_bytes);
PER_VCPU_DEBUG("Added VCPU %d with test mem gpa [%lx, %lx)\n",
	       vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_memory_bytes);
	vcpu_gpa = guest_test_phys_mem + (vcpu_id * vcpu_memory_bytes);
	PER_VCPU_DEBUG("Added VCPU %d with test mem gpa [%lx, %lx)\n",
		       vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_memory_bytes);
/* Cache the HVA pointer of the region */
vcpu_hva = addr_gpa2hva(vm, vcpu_gpa);
	/* Cache the HVA pointer of the region */
	vcpu_hva = addr_gpa2hva(vm, vcpu_gpa);
if (use_uffd) {
/*
 * Set up user fault fd to handle demand paging
 * requests.
@@ -459,30 +311,15 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, if (r < 0) exit(-r); }

-#ifdef __x86_64__
vcpu_set_cpuid(vm, vcpu_id, kvm_get_supported_cpuid());
-#endif
vcpu_args[vcpu_id].vm = vm;
vcpu_args[vcpu_id].vcpu_id = vcpu_id;
vcpu_args[vcpu_id].gva = guest_test_virt_mem +
			 (vcpu_id * vcpu_memory_bytes);
vcpu_args[vcpu_id].pages = vcpu_memory_bytes / guest_page_size;
}
/* Export the shared variables to the guest */

sync_global_to_guest(vm, host_page_size);

sync_global_to_guest(vm, guest_page_size);

sync_global_to_guest(vm, vcpu_args);

pr_info("Finished creating vCPUs and starting uffd threads\n");

clock_gettime(CLOCK_MONOTONIC, &start); for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) { pthread_create(&vcpu_threads[vcpu_id], NULL, vcpu_worker,
	       &vcpu_args[vcpu_id]);
	       &perf_test_args.vcpu_args[vcpu_id]);
}
pr_info("Started all vCPUs\n"); @@ -513,7 +350,8 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, pr_info("Total guest execution time: %ld.%.9lds\n", ts_diff.tv_sec, ts_diff.tv_nsec); pr_info("Overall demand paging rate: %f pgs/sec\n",
guest_num_pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
perf_test_args.vcpu_args[0].pages * vcpus /
((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
ucall_uninit(vm); kvm_vm_free(vm); diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h new file mode 100644 index 0000000000000..4d52b9ee13c42 --- /dev/null +++ b/tools/testing/selftests/kvm/include/perf_test_util.h @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +/*

tools/testing/selftests/kvm/include/perf_test_util.h

This new file doesn't yet have anything to do with performance testing, and since it's going to be common between performance tests and functional tests then 'perf' doesn't belong in the name. Actually, I'm not even sure we need a new file at this point. Why not just use test_util.h?

...

Copyright (C) 2020, Google LLC.

*/

+#ifndef SELFTEST_KVM_PERF_TEST_UTIL_H +#define SELFTEST_KVM_PERF_TEST_UTIL_H

+#include "kvm_util.h" +#include "processor.h"

+#define MAX_VCPUS 512

It's best if this patch is only code movement, but I suggest we apply a patch before this one that gets rid of this define, replacing it with kvm_check_cap(KVM_CAP_MAX_VCPUS)

...

+#define PAGE_SHIFT_4K 12 +#define PTES_PER_4K_PT 512

This assumes an 8 byte page table descriptor, which is probably fine, but it's another candidate for a generalization patch before this one. I think we need something that adds arch-specific API enabling us to ask how many descriptors are in a minimally-sized page, e.g.

min_page_size = get_min_page_size(); ptrs_per_pte = get_per_pte_ptrs(min_page_size);

...

+#define TEST_MEM_SLOT_INDEX 1

This isn't the type of define that belongs in a common header.

...

+/* Default guest test virtual memory offset */ +#define DEFAULT_GUEST_TEST_MEM 0xc0000000

+extern uint64_t guest_test_phys_mem; +extern uint64_t guest_test_virt_mem;

+struct vcpu_args {

The structure name is too generic for its purpose. I think we need to do a handful of renames in a patch or two prior to moving the code out. The names should be as general as possible, but "vcpu_args" goes to far, considering the members are specific for the context of these types of tests.

...

uint64_t gva;

uint64_t pages;

/* Only used by the host userspace part of the vCPU thread */

int vcpu_id;

+};

+struct perf_test_args {

This structure isn't 'perf' specific and probably won't be.

I'm starting to think that these structures should be private to the lib/some-name.c file that uses them. Accessors can be exported to get/set specific values from the unit tests.

...

struct kvm_vm *vm;

uint64_t host_page_size;

uint64_t guest_page_size;

struct vcpu_args vcpu_args[MAX_VCPUS];

+};

+extern struct perf_test_args perf_test_args;

+void guest_code(uint32_t vcpu_id);

I don't think this function needs to be exported from lib/some-name.c.

...

+struct kvm_vm *create_vm(enum vm_guest_mode mode, int vcpus,
	 uint64_t vcpu_memory_bytes);

I'd rather not export this half-generic vm_create() variant. I'd rather we create an actually generic vm_create_default_vcpus() instead and then use it.

...

+#endif /* SELFTEST_KVM_PERF_TEST_UTIL_H */ diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c new file mode 100644 index 0000000000000..fa7efbac9ef8a --- /dev/null +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c @@ -0,0 +1,161 @@ +// SPDX-License-Identifier: GPL-2.0-only +/*

tools/testing/selftests/kvm/lib/perf_test_util.c

Copyright (C) 2020, Google LLC.

*/

+#define _GNU_SOURCE /* for program_invocation_name */

+#include "perf_test_util.h"

+/*

Guest physical memory offset of the testing memory slot.

This will be set to the topmost valid physical address minus

the test memory size.

*/

+uint64_t guest_test_phys_mem;

+/*

Guest virtual memory offset of the testing memory slot.

Must not conflict with identity mapped test code.

*/

+uint64_t guest_test_virt_mem = DEFAULT_GUEST_TEST_MEM;

+struct perf_test_args perf_test_args;

+/*

Continuously write to the first 8 bytes of each page in the

specified region.

*/

+void guest_code(uint32_t vcpu_id) +{
struct vcpu_args *vcpu_args = &perf_test_args.vcpu_args[vcpu_id];

uint64_t gva;

uint64_t pages;

int i;

/* Make sure vCPU args data structure is not corrupt. */

GUEST_ASSERT(vcpu_args->vcpu_id == vcpu_id);

gva = vcpu_args->gva;

pages = vcpu_args->pages;

for (i = 0; i < pages; i++) {
uint64_t addr = gva + (i * perf_test_args.guest_page_size);
*(uint64_t *)addr = 0x0123456789ABCDEF;
}

GUEST_SYNC(1);
+}

+static void add_vcpus(struct kvm_vm *vm, int vcpus, uint64_t vcpu_memory_bytes) +{
vm_paddr_t vcpu_gpa;

struct vcpu_args *vcpu_args;

int vcpu_id;

for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) {
vcpu_args = &perf_test_args.vcpu_args[vcpu_id];
vm_vcpu_add_default(vm, vcpu_id, guest_code);
+#ifdef __x86_64__
vcpu_set_cpuid(vm, vcpu_id, kvm_get_supported_cpuid());
+#endif
vcpu_args->vcpu_id = vcpu_id;
vcpu_args->gva = guest_test_virt_mem +
		 (vcpu_id * vcpu_memory_bytes);
vcpu_args->pages = vcpu_memory_bytes /
		   perf_test_args.guest_page_size;
vcpu_gpa = guest_test_phys_mem + (vcpu_id * vcpu_memory_bytes);
pr_debug("Added VCPU %d with test mem gpa [%lx, %lx)\n",
	 vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_memory_bytes);
}
+}

+struct kvm_vm *create_vm(enum vm_guest_mode mode, int vcpus,
	 uint64_t vcpu_memory_bytes)
+{
struct kvm_vm *vm;

uint64_t pages = DEFAULT_GUEST_PHY_PAGES;

uint64_t guest_num_pages;

/* Account for a few pages per-vCPU for stacks */

pages += DEFAULT_STACK_PGS * vcpus;

/*
* Reserve twice the ammount of memory needed to map the test region and
* the page table / stacks region, at 4k, for page tables. Do the
* calculation with 4K page size: the smallest of all archs. (e.g., 64K
* page size guest will need even less memory for page tables).
*/
pages += (2 * pages) / PTES_PER_4K_PT;

pages += ((2 * vcpus * vcpu_memory_bytes) >> PAGE_SHIFT_4K) /
 PTES_PER_4K_PT;
pages = vm_adjust_num_guest_pages(mode, pages);

pr_info("Testing guest mode: %s\n", vm_guest_mode_string(mode));

vm = _vm_create(mode, pages, O_RDWR);

kvm_vm_elf_load(vm, program_invocation_name, 0, 0);
+#ifdef __x86_64__

vm_create_irqchip(vm);

+#endif
perf_test_args.vm = vm;

perf_test_args.guest_page_size = vm_get_page_size(vm);

perf_test_args.host_page_size = getpagesize();

TEST_ASSERT(vcpu_memory_bytes % perf_test_args.guest_page_size == 0,
    "Guest memory size is not guest page size aligned.");
guest_num_pages = (vcpus * vcpu_memory_bytes) /
	  perf_test_args.guest_page_size;
guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages);

/*
* If there should be more memory in the guest test region than there
* can be pages in the guest, it will definitely cause problems.
*/
TEST_ASSERT(guest_num_pages < vm_get_max_gfn(vm),
    "Requested more guest memory than address space allows.\n"
    "    guest pages: %lx max gfn: %x vcpus: %d wss: %lx]\n",
    guest_num_pages, vm_get_max_gfn(vm), vcpus,
    vcpu_memory_bytes);
TEST_ASSERT(vcpu_memory_bytes % perf_test_args.host_page_size == 0,
    "Guest memory size is not host page size aligned.");
guest_test_phys_mem = (vm_get_max_gfn(vm) - guest_num_pages) *
	      perf_test_args.guest_page_size;
guest_test_phys_mem &= ~(perf_test_args.host_page_size - 1);
+#ifdef __s390x__

/* Align to 1M (segment size) */

guest_test_phys_mem &= ~((1 << 20) - 1);

+#endif
pr_info("guest physical test memory offset: 0x%lx\n", guest_test_phys_mem);

/* Add an extra memory slot for testing */

vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
		    guest_test_phys_mem,
		    TEST_MEM_SLOT_INDEX,
		    guest_num_pages, 0);
/* Do mapping for the demand paging memory slot */

virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, guest_num_pages, 0);

ucall_init(vm, NULL);

add_vcpus(vm, vcpus, vcpu_memory_bytes);

sync_global_to_guest(vm, perf_test_args);

return vm;
+}

A code movement patch should be movement and nothing else. Refactoring should be done first within the same file (probably with more than one patch) and then the moving of the factored out functions to a new file in another patch. Otherwise reviewing it is nearly impossible.

If you don't mind I'd like to try and cleanup / generalize / refactor demand_paging_test.c and dirty_log_test.c with a few patches first for you to base this work on. I can probably get something posted today or tomorrow.

Thanks, drew

Peter Xu

3 p.m.

New subject: [PATCH v2 2/5] KVM: selftests: Factor code out of demand_paging_test

On Wed, Nov 04, 2020 at 01:16:31PM +0100, Andrew Jones wrote:

...

If you don't mind I'd like to try and cleanup / generalize / refactor demand_paging_test.c and dirty_log_test.c with a few patches first for you to base this work on. I can probably get something posted today or tomorrow.

Drew,

Would you consider picking up the two patches below in the dirty ring series if you plan to rework the dirty log tests? I got your r-b so I am making bold to think I'm ok to ask this; I just want to avoid another potential conflict within the series.

Thanks!

[1] https://lore.kernel.org/kvm/20201023183358.50607-11-peterx@redhat.com/ [2] https://lore.kernel.org/kvm/20201023183358.50607-12-peterx@redhat.com/

-- Peter Xu

Andrew Jones

3:28 p.m.

New subject: [PATCH v2 2/5] KVM: selftests: Factor code out of demand_paging_test

On Wed, Nov 04, 2020 at 10:00:17AM -0500, Peter Xu wrote:

...

On Wed, Nov 04, 2020 at 01:16:31PM +0100, Andrew Jones wrote:

...
If you don't mind I'd like to try and cleanup / generalize / refactor demand_paging_test.c and dirty_log_test.c with a few patches first for you to base this work on. I can probably get something posted today or tomorrow.

Drew,

Would you consider picking up the two patches below in the dirty ring series if you plan to rework the dirty log tests? I got your r-b so I am making bold to think I'm ok to ask this; I just want to avoid another potential conflict within the series.

Sure, no problem.

I'll go ahead and get that cleanup / refactor series out.

Thanks, drew

...

Thanks!

[1] https://lore.kernel.org/kvm/20201023183358.50607-11-peterx@redhat.com/ [2] https://lore.kernel.org/kvm/20201023183358.50607-12-peterx@redhat.com/

-- Peter Xu

Ben Gardon

5:01 p.m.

New subject: [PATCH v2 2/5] KVM: selftests: Factor code out of demand_paging_test

On Wed, Nov 4, 2020 at 7:28 AM Andrew Jones drjones@redhat.com wrote:

...

On Wed, Nov 04, 2020 at 10:00:17AM -0500, Peter Xu wrote:

...
On Wed, Nov 04, 2020 at 01:16:31PM +0100, Andrew Jones wrote:

...
If you don't mind I'd like to try and cleanup / generalize / refactor demand_paging_test.c and dirty_log_test.c with a few patches first for you to base this work on. I can probably get something posted today or tomorrow.

Drew,

Would you consider picking up the two patches below in the dirty ring series if you plan to rework the dirty log tests? I got your r-b so I am making bold to think I'm ok to ask this; I just want to avoid another potential conflict within the series.

Sure, no problem.

I'll go ahead and get that cleanup / refactor series out.

Thanks Drew! I agree this will all be a lot cleaner when refactored for multi-vcpu tests generally.

...

Thanks, drew

...
Thanks!

[1] https://lore.kernel.org/kvm/20201023183358.50607-11-peterx@redhat.com/ [2] https://lore.kernel.org/kvm/20201023183358.50607-12-peterx@redhat.com/

-- Peter Xu

Ben Gardon

3 Nov 3 Nov

11:49 p.m.

New subject: [PATCH v2 3/5] KVM: selftests: Simplify demand_paging_test with timespec_diff_now

Add a helper function to get the current time and return the time since a given start time. Use that function to simplify the timekeeping in the demand paging test.

Signed-off-by: Ben Gardon bgardon@google.com --- .../selftests/kvm/demand_paging_test.c | 26 +++++++++---------- .../testing/selftests/kvm/include/test_util.h | 1 + tools/testing/selftests/kvm/lib/test_util.c | 15 +++++++++-- 3 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 682805dd8c2ac..63ea7c06e1141 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -50,7 +50,8 @@ static void *vcpu_worker(void *data) int vcpu_id = vcpu_args->vcpu_id; struct kvm_vm *vm = perf_test_args.vm; struct kvm_run *run; - struct timespec start, end, ts_diff; + struct timespec start; + struct timespec ts_diff;

vcpu_args_set(vm, vcpu_id, 1, vcpu_id); run = vcpu_state(vm, vcpu_id); @@ -66,8 +67,7 @@ static void *vcpu_worker(void *data) exit_reason_str(run->exit_reason)); }

- clock_gettime(CLOCK_MONOTONIC, &end); - ts_diff = timespec_sub(end, start); + ts_diff = timespec_elapsed(start); PER_VCPU_DEBUG("vCPU %d execution time: %ld.%.9lds\n", vcpu_id, ts_diff.tv_sec, ts_diff.tv_nsec);

@@ -78,7 +78,7 @@ static int handle_uffd_page_request(int uffd, uint64_t addr) { pid_t tid; struct timespec start; - struct timespec end; + struct timespec ts_diff; struct uffdio_copy copy; int r;

@@ -98,10 +98,10 @@ static int handle_uffd_page_request(int uffd, uint64_t addr) return r; }

- clock_gettime(CLOCK_MONOTONIC, &end); + ts_diff = timespec_elapsed(start);

PER_PAGE_DEBUG("UFFDIO_COPY %d \t%ld ns\n", tid, - timespec_to_ns(timespec_sub(end, start))); + timespec_to_ns(ts_diff)); PER_PAGE_DEBUG("Paged in %ld bytes at 0x%lx from thread %d\n", perf_test_args.host_page_size, addr, tid);

@@ -123,7 +123,8 @@ static void *uffd_handler_thread_fn(void *arg) int pipefd = uffd_args->pipefd; useconds_t delay = uffd_args->delay; int64_t pages = 0; - struct timespec start, end, ts_diff; + struct timespec start; + struct timespec ts_diff;

clock_gettime(CLOCK_MONOTONIC, &start); while (!quit_uffd_thread) { @@ -192,8 +193,7 @@ static void *uffd_handler_thread_fn(void *arg) pages++; }

- clock_gettime(CLOCK_MONOTONIC, &end); - ts_diff = timespec_sub(end, start); + ts_diff = timespec_elapsed(start); PER_VCPU_DEBUG("userfaulted %ld pages over %ld.%.9lds. (%f/sec)\n", pages, ts_diff.tv_sec, ts_diff.tv_nsec, pages / ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0)); @@ -257,7 +257,8 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, pthread_t *vcpu_threads; pthread_t *uffd_handler_threads = NULL; struct uffd_handler_args *uffd_args = NULL; - struct timespec start, end, ts_diff; + struct timespec start; + struct timespec ts_diff; int *pipefds = NULL; struct kvm_vm *vm; int vcpu_id; @@ -330,9 +331,9 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, PER_VCPU_DEBUG("Joined thread for vCPU %d\n", vcpu_id); }

- pr_info("All vCPU threads joined\n"); + ts_diff = timespec_elapsed(start);

- clock_gettime(CLOCK_MONOTONIC, &end); + pr_info("All vCPU threads joined\n");

if (use_uffd) { char c; @@ -346,7 +347,6 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, } }

- ts_diff = timespec_sub(end, start); pr_info("Total guest execution time: %ld.%.9lds\n", ts_diff.tv_sec, ts_diff.tv_nsec); pr_info("Overall demand paging rate: %f pgs/sec\n", diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index 5eb01bf51b86f..a1564f98223d9 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -64,5 +64,6 @@ int64_t timespec_to_ns(struct timespec ts); struct timespec timespec_add_ns(struct timespec ts, int64_t ns); struct timespec timespec_add(struct timespec ts1, struct timespec ts2); struct timespec timespec_sub(struct timespec ts1, struct timespec ts2); +struct timespec timespec_elapsed(struct timespec start);

#endif /* SELFTEST_KVM_TEST_UTIL_H */ diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c index 689e97c27ee24..c2cee1ea20a31 100644 --- a/tools/testing/selftests/kvm/lib/test_util.c +++ b/tools/testing/selftests/kvm/lib/test_util.c @@ -4,10 +4,13 @@ * * Copyright (C) 2020, Google LLC. */ -#include <stdlib.h> + +#include <assert.h> #include <ctype.h> #include <limits.h> -#include <assert.h> +#include <stdlib.h> +#include <time.h> + #include "test_util.h"

/* @@ -81,6 +84,14 @@ struct timespec timespec_sub(struct timespec ts1, struct timespec ts2) return timespec_add_ns((struct timespec){0}, ns1 - ns2); }

+struct timespec timespec_elapsed(struct timespec start) +{ + struct timespec end; + + clock_gettime(CLOCK_MONOTONIC, &end); + return timespec_sub(end, start); +} + void print_skip(const char *fmt, ...) { va_list ap;

-- 2.29.1.341.ge80a0c044ae-goog

Ben Gardon

11:49 p.m.

New subject: [PATCH v2 4/5] KVM: selftests: Add wrfract to common guest code

Wrfract will be used by the dirty logging perf test introduced later in this series to dirty memory sparsely.

Signed-off-by: Ben Gardon bgardon@google.com --- tools/testing/selftests/kvm/demand_paging_test.c | 2 ++ tools/testing/selftests/kvm/include/perf_test_util.h | 1 + tools/testing/selftests/kvm/lib/perf_test_util.c | 5 ++++- 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 63ea7c06e1141..72b8890c0dc3b 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -264,6 +264,8 @@ static void run_test(enum vm_guest_mode mode, bool use_uffd, int vcpu_id; int r;

+ perf_test_args.wr_fract = 1; + vm = create_vm(mode, vcpus, vcpu_memory_bytes);

guest_data_prototype = malloc(perf_test_args.host_page_size); diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h index 4d52b9ee13c42..645b942ae0229 100644 --- a/tools/testing/selftests/kvm/include/perf_test_util.h +++ b/tools/testing/selftests/kvm/include/perf_test_util.h @@ -36,6 +36,7 @@ struct perf_test_args { struct kvm_vm *vm; uint64_t host_page_size; uint64_t guest_page_size; + int wr_fract;

struct vcpu_args vcpu_args[MAX_VCPUS]; }; diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c index fa7efbac9ef8a..1abb6b1321c3c 100644 --- a/tools/testing/selftests/kvm/lib/perf_test_util.c +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c @@ -44,7 +44,10 @@ void guest_code(uint32_t vcpu_id) for (i = 0; i < pages; i++) { uint64_t addr = gva + (i * perf_test_args.guest_page_size);

- *(uint64_t *)addr = 0x0123456789ABCDEF; + if (i % perf_test_args.wr_fract == 0) + *(uint64_t *)addr = 0x0123456789ABCDEF; + else + READ_ONCE(*(uint64_t *)addr); }

GUEST_SYNC(1);

-- 2.29.1.341.ge80a0c044ae-goog

Ben Gardon

11:49 p.m.

New subject: [PATCH v2 5/5] KVM: selftests: Introduce the dirty log perf test

The dirty log perf test will time verious dirty logging operations (enabling dirty logging, dirtying memory, getting the dirty log, clearing the dirty log, and disabling dirty logging) in order to quantify dirty logging performance. This test can be used to inform future performance improvements to KVM's dirty logging infrastructure.

Signed-off-by: Ben Gardon bgardon@google.com --- tools/testing/selftests/kvm/.gitignore | 1 + tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/dirty_log_perf_test.c | 381 ++++++++++++++++++ .../testing/selftests/kvm/include/test_util.h | 1 + .../selftests/kvm/lib/perf_test_util.c | 18 +- tools/testing/selftests/kvm/lib/test_util.c | 7 + 6 files changed, 401 insertions(+), 8 deletions(-) create mode 100644 tools/testing/selftests/kvm/dirty_log_perf_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore index d2c2d6205008c..c8adc4f6e8f6c 100644 --- a/tools/testing/selftests/kvm/.gitignore +++ b/tools/testing/selftests/kvm/.gitignore @@ -24,6 +24,7 @@ /clear_dirty_log_test /demand_paging_test /dirty_log_test +/dirty_log_perf_test /kvm_create_max_vcpus /set_memory_region_test /steal_time diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 9b2bebb64175b..518ca3edafa29 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -63,6 +63,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/user_msr_test TEST_GEN_PROGS_x86_64 += clear_dirty_log_test TEST_GEN_PROGS_x86_64 += demand_paging_test TEST_GEN_PROGS_x86_64 += dirty_log_test +TEST_GEN_PROGS_x86_64 += dirty_log_perf_test TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus TEST_GEN_PROGS_x86_64 += set_memory_region_test TEST_GEN_PROGS_x86_64 += steal_time diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c new file mode 100644 index 0000000000000..bfbfec2313e22 --- /dev/null +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c @@ -0,0 +1,381 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KVM dirty page logging performance test + * + * Based on dirty_log_test.c + * + * Copyright (C) 2018, Red Hat, Inc. + * Copyright (C) 2020, Google, Inc. + */ + +#define _GNU_SOURCE /* for program_invocation_name */ + +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <time.h> +#include <pthread.h> +#include <linux/bitmap.h> +#include <linux/bitops.h> + +#include "kvm_util.h" +#include "perf_test_util.h" +#include "processor.h" +#include "test_util.h" + +/* How many host loops to run by default (one KVM_GET_DIRTY_LOG for each loop)*/ +#define TEST_HOST_LOOP_N 2UL + +#define DEFAULT_VCPU_MEMORY_BYTES (1UL << 30) /* 1G */ + +/* Host variables */ +static bool host_quit; +static int iteration; +static int vcpu_last_completed_iteration[MAX_VCPUS]; + +static void *vcpu_worker(void *data) +{ + int ret; + struct kvm_vm *vm = perf_test_args.vm; + uint64_t pages_count = 0; + struct kvm_run *run; + struct timespec start; + struct timespec ts_diff; + struct timespec total = (struct timespec){0}; + struct timespec avg; + struct vcpu_args *vcpu_args = (struct vcpu_args *)data; + int vcpu_id = vcpu_args->vcpu_id; + + vcpu_args_set(vm, vcpu_id, 1, vcpu_id); + run = vcpu_state(vm, vcpu_id); + + while (!READ_ONCE(host_quit)) { + int current_iteration = READ_ONCE(iteration); + + clock_gettime(CLOCK_MONOTONIC, &start); + ret = _vcpu_run(vm, vcpu_id); + ts_diff = timespec_elapsed(start); + + TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret); + TEST_ASSERT(get_ucall(vm, vcpu_id, NULL) == UCALL_SYNC, + "Invalid guest sync status: exit_reason=%s\n", + exit_reason_str(run->exit_reason)); + + pr_debug("Got sync event from vCPU %d\n", vcpu_id); + vcpu_last_completed_iteration[vcpu_id] = current_iteration; + pr_debug("vCPU %d updated last completed iteration to %d\n", + vcpu_id, vcpu_last_completed_iteration[vcpu_id]); + + if (current_iteration) { + pages_count += vcpu_args->pages; + total = timespec_add(total, ts_diff); + pr_debug("vCPU %d iteration %d dirty memory time: %ld.%.9lds\n", + vcpu_id, current_iteration, ts_diff.tv_sec, + ts_diff.tv_nsec); + } else { + pr_debug("vCPU %d iteration %d populate memory time: %ld.%.9lds\n", + vcpu_id, current_iteration, ts_diff.tv_sec, + ts_diff.tv_nsec); + } + + while (current_iteration == READ_ONCE(iteration) && + !READ_ONCE(host_quit)) {} + } + + avg = timespec_div(total, vcpu_last_completed_iteration[vcpu_id]); + pr_debug("\nvCPU %d dirtied 0x%lx pages over %d iterations in %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n", + vcpu_id, pages_count, vcpu_last_completed_iteration[vcpu_id], + total.tv_sec, total.tv_nsec, avg.tv_sec, avg.tv_nsec); + + return NULL; +} + +#ifdef USE_CLEAR_DIRTY_LOG +static u64 dirty_log_manual_caps; +#endif + +static void run_test(enum vm_guest_mode mode, unsigned long iterations, + uint64_t phys_offset, int vcpus, + uint64_t vcpu_memory_bytes, int wr_fract) +{ + pthread_t *vcpu_threads; + struct kvm_vm *vm; + unsigned long *bmap; + uint64_t guest_num_pages; + uint64_t host_num_pages; + int vcpu_id; + struct timespec start; + struct timespec ts_diff; + struct timespec get_dirty_log_total = (struct timespec){0}; + struct timespec vcpu_dirty_total = (struct timespec){0}; + struct timespec avg; +#ifdef USE_CLEAR_DIRTY_LOG + struct kvm_enable_cap cap = {}; + struct timespec clear_dirty_log_total = (struct timespec){0}; +#endif + + perf_test_args.wr_fract = wr_fract; + + vm = create_vm(mode, vcpus, vcpu_memory_bytes); + + guest_num_pages = (vcpus * vcpu_memory_bytes) >> vm_get_page_shift(vm); + guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages); + host_num_pages = vm_num_host_pages(mode, guest_num_pages); + bmap = bitmap_alloc(host_num_pages); + +#ifdef USE_CLEAR_DIRTY_LOG + cap.cap = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2; + cap.args[0] = dirty_log_manual_caps; + vm_enable_cap(vm, &cap); +#endif + + vcpu_threads = malloc(vcpus * sizeof(*vcpu_threads)); + TEST_ASSERT(vcpu_threads, "Memory allocation failed"); + + + /* Start the iterations */ + iteration = 0; + host_quit = false; + + clock_gettime(CLOCK_MONOTONIC, &start); + for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) { + vcpu_last_completed_iteration[vcpu_id] = -1; + + pthread_create(&vcpu_threads[vcpu_id], NULL, vcpu_worker, + &perf_test_args.vcpu_args[vcpu_id]); + } + + /* Allow the vCPUs to populate memory */ + pr_debug("Starting iteration %d - Populating\n", iteration); + for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) { + while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != + iteration) {} + } + + ts_diff = timespec_elapsed(start); + pr_info("Populate memory time: %ld.%.9lds\n", + ts_diff.tv_sec, ts_diff.tv_nsec); + + /* Enable dirty logging */ + clock_gettime(CLOCK_MONOTONIC, &start); + vm_mem_region_set_flags(vm, TEST_MEM_SLOT_INDEX, + KVM_MEM_LOG_DIRTY_PAGES); + ts_diff = timespec_elapsed(start); + pr_info("Enabling dirty logging time: %ld.%.9lds\n\n", + ts_diff.tv_sec, ts_diff.tv_nsec); + + while (iteration < iterations) { + /* + * Incrementing the iteration number will start the vCPUs + * dirtying memory again. + */ + clock_gettime(CLOCK_MONOTONIC, &start); + iteration++; + + pr_debug("Starting iteration %d\n", iteration); + for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) { + while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != + iteration) {} + } + + ts_diff = timespec_elapsed(start); + vcpu_dirty_total = timespec_add(vcpu_dirty_total, ts_diff); + pr_info("Iteration %d dirty memory time: %ld.%.9lds\n", + iteration, ts_diff.tv_sec, ts_diff.tv_nsec); + + clock_gettime(CLOCK_MONOTONIC, &start); + kvm_vm_get_dirty_log(vm, TEST_MEM_SLOT_INDEX, bmap); + + ts_diff = timespec_elapsed(start); + get_dirty_log_total = timespec_add(get_dirty_log_total, + ts_diff); + pr_info("Iteration %d get dirty log time: %ld.%.9lds\n", + iteration, ts_diff.tv_sec, ts_diff.tv_nsec); + +#ifdef USE_CLEAR_DIRTY_LOG + clock_gettime(CLOCK_MONOTONIC, &start); + kvm_vm_clear_dirty_log(vm, TEST_MEM_SLOT_INDEX, bmap, 0, + host_num_pages); + + ts_diff = timespec_elapsed(start); + clear_dirty_log_total = timespec_add(clear_dirty_log_total, + ts_diff); + pr_info("Iteration %d clear dirty log time: %ld.%.9lds\n", + iteration, ts_diff.tv_sec, ts_diff.tv_nsec); +#endif + } + + /* Tell the vcpu thread to quit */ + host_quit = true; + for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) + pthread_join(vcpu_threads[vcpu_id], NULL); + + /* Disable dirty logging */ + clock_gettime(CLOCK_MONOTONIC, &start); + vm_mem_region_set_flags(vm, TEST_MEM_SLOT_INDEX, 0); + ts_diff = timespec_elapsed(start); + pr_info("Disabling dirty logging time: %ld.%.9lds\n", + ts_diff.tv_sec, ts_diff.tv_nsec); + + avg = timespec_div(get_dirty_log_total, iterations); + pr_info("Get dirty log over %lu iterations took %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n", + iterations, get_dirty_log_total.tv_sec, + get_dirty_log_total.tv_nsec, avg.tv_sec, avg.tv_nsec); + +#ifdef USE_CLEAR_DIRTY_LOG + avg = timespec_div(clear_dirty_log_total, iterations); + pr_info("Clear dirty log over %lu iterations took %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n", + iterations, clear_dirty_log_total.tv_sec, + clear_dirty_log_total.tv_nsec, avg.tv_sec, avg.tv_nsec); +#endif + + free(bmap); + free(vcpu_threads); + ucall_uninit(vm); + kvm_vm_free(vm); +} + +struct guest_mode { + bool supported; + bool enabled; +}; +static struct guest_mode guest_modes[NUM_VM_MODES]; + +#define guest_mode_init(mode, supported, enabled) ({ \ + guest_modes[mode] = (struct guest_mode){ supported, enabled }; \ +}) + +static void help(char *name) +{ + int i; + + puts(""); + printf("usage: %s [-h] [-i iterations] [-p offset] " + "[-m mode] [-b vcpu bytes] [-v vcpus]\n", name); + puts(""); + printf(" -i: specify iteration counts (default: %"PRIu64")\n", + TEST_HOST_LOOP_N); + printf(" -p: specify guest physical test memory offset\n" + " Warning: a low offset can conflict with the loaded test code.\n"); + printf(" -m: specify the guest mode ID to test " + "(default: test all supported modes)\n" + " This option may be used multiple times.\n" + " Guest mode IDs:\n"); + for (i = 0; i < NUM_VM_MODES; ++i) { + printf(" %d: %s%s\n", i, vm_guest_mode_string(i), + guest_modes[i].supported ? " (supported)" : ""); + } + printf(" -b: specify the size of the memory region which should be\n" + " dirtied by each vCPU. e.g. 10M or 3G.\n" + " (default: 1G)\n"); + printf(" -f: specify the fraction of pages which should be written to\n" + " as opposed to simply read, in the form\n" + " 1/<fraction of pages to write>.\n" + " (default: 1 i.e. all pages are written to.)\n"); + printf(" -v: specify the number of vCPUs to run.\n"); + puts(""); + exit(0); +} + +int main(int argc, char *argv[]) +{ + unsigned long iterations = TEST_HOST_LOOP_N; + uint64_t vcpu_memory_bytes = DEFAULT_VCPU_MEMORY_BYTES; + bool mode_selected = false; + uint64_t phys_offset = 0; + unsigned int mode; + int opt, i; + int wr_fract = 1; + int vcpus = 1; + +#ifdef USE_CLEAR_DIRTY_LOG + dirty_log_manual_caps = + kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2); + if (!dirty_log_manual_caps) { + print_skip("KVM_CLEAR_DIRTY_LOG not available"); + exit(KSFT_SKIP); + } + dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | + KVM_DIRTY_LOG_INITIALLY_SET); +#endif + +#ifdef __x86_64__ + guest_mode_init(VM_MODE_PXXV48_4K, true, true); +#endif +#ifdef __aarch64__ + guest_mode_init(VM_MODE_P40V48_4K, true, true); + guest_mode_init(VM_MODE_P40V48_64K, true, true); + + { + unsigned int limit = kvm_check_cap(KVM_CAP_ARM_VM_IPA_SIZE); + + if (limit >= 52) + guest_mode_init(VM_MODE_P52V48_64K, true, true); + if (limit >= 48) { + guest_mode_init(VM_MODE_P48V48_4K, true, true); + guest_mode_init(VM_MODE_P48V48_64K, true, true); + } + } +#endif +#ifdef __s390x__ + guest_mode_init(VM_MODE_P40V48_4K, true, true); +#endif + + while ((opt = getopt(argc, argv, "hi:p:m:b:f:v:")) != -1) { + switch (opt) { + case 'i': + iterations = strtol(optarg, NULL, 10); + break; + case 'p': + phys_offset = strtoull(optarg, NULL, 0); + break; + case 'm': + if (!mode_selected) { + for (i = 0; i < NUM_VM_MODES; ++i) + guest_modes[i].enabled = false; + mode_selected = true; + } + mode = strtoul(optarg, NULL, 10); + TEST_ASSERT(mode < NUM_VM_MODES, + "Guest mode ID %d too big", mode); + guest_modes[mode].enabled = true; + break; + case 'b': + vcpu_memory_bytes = parse_size(optarg); + break; + case 'f': + wr_fract = atoi(optarg); + TEST_ASSERT(wr_fract >= 1, + "Write fraction cannot be less than one"); + break; + case 'v': + vcpus = atoi(optarg); + TEST_ASSERT(vcpus > 0, + "Must have a positive number of vCPUs"); + TEST_ASSERT(vcpus <= MAX_VCPUS, + "This test does not currently support\n" + "more than %d vCPUs.", MAX_VCPUS); + break; + case 'h': + default: + help(argv[0]); + break; + } + } + + TEST_ASSERT(iterations > 2, "Iterations must be greater than two"); + + pr_info("Test iterations: %"PRIu64"\n", iterations); + + for (i = 0; i < NUM_VM_MODES; ++i) { + if (!guest_modes[i].enabled) + continue; + TEST_ASSERT(guest_modes[i].supported, + "Guest mode ID %d (%s) not supported.", + i, vm_guest_mode_string(i)); + run_test(i, iterations, phys_offset, vcpus, vcpu_memory_bytes, + wr_fract); + } + + return 0; +} diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index a1564f98223d9..b86090ef82dac 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -65,5 +65,6 @@ struct timespec timespec_add_ns(struct timespec ts, int64_t ns); struct timespec timespec_add(struct timespec ts1, struct timespec ts2); struct timespec timespec_sub(struct timespec ts1, struct timespec ts2); struct timespec timespec_elapsed(struct timespec start); +struct timespec timespec_div(struct timespec ts, int divisor);

#endif /* SELFTEST_KVM_TEST_UTIL_H */ diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c index 1abb6b1321c3c..ccc5d67f38cc4 100644 --- a/tools/testing/selftests/kvm/lib/perf_test_util.c +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c @@ -41,16 +41,18 @@ void guest_code(uint32_t vcpu_id) gva = vcpu_args->gva; pages = vcpu_args->pages;

- for (i = 0; i < pages; i++) { - uint64_t addr = gva + (i * perf_test_args.guest_page_size); + while (true) { + for (i = 0; i < pages; i++) { + uint64_t addr = gva + (i * perf_test_args.guest_page_size);

- if (i % perf_test_args.wr_fract == 0) - *(uint64_t *)addr = 0x0123456789ABCDEF; - else - READ_ONCE(*(uint64_t *)addr); - } + if (i % perf_test_args.wr_fract == 0) + *(uint64_t *)addr = 0x0123456789ABCDEF; + else + READ_ONCE(*(uint64_t *)addr); + }

- GUEST_SYNC(1); + GUEST_SYNC(1); + } }

static void add_vcpus(struct kvm_vm *vm, int vcpus, uint64_t vcpu_memory_bytes) diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c index c2cee1ea20a31..5f87ed32caf56 100644 --- a/tools/testing/selftests/kvm/lib/test_util.c +++ b/tools/testing/selftests/kvm/lib/test_util.c @@ -92,6 +92,13 @@ struct timespec timespec_elapsed(struct timespec start) return timespec_sub(end, start); }

+struct timespec timespec_div(struct timespec ts, int divisor) +{ + int64_t ns = timespec_to_ns(ts) / divisor; + + return timespec_add_ns((struct timespec){0}, ns); +} + void print_skip(const char *fmt, ...) { va_list ap;

-- 2.29.1.341.ge80a0c044ae-goog

Andrew Jones

11 Nov 11 Nov

12:34 p.m.

On Tue, Nov 03, 2020 at 03:49:47PM -0800, Ben Gardon wrote:

...

Currently KVM lacks a simple, userspace agnostic, performance benchmark for dirty logging. Such a benchmark will be beneficial for ensuring that dirty logging performance does not regress, and to give a common baseline for validating performance improvements. The dirty log perf test introduced in this series builds on aspects of the existing demand paging perf test and provides time-based performance metrics for enabling and disabling dirty logging, getting the dirty log, and dirtying memory.

While the test currently only has a build target for x86, I expect it will work on, or be easily modified to support other architectures.

This series was tested by running the following invocations on an Intel Skylake machine after apply all commits in the series: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected.

v1 -> v2 changes:

Considering v1 got applied,

...

(in response to comments from Peter Xu)

Removed pr_debugs from main test thread while waiting on vCPUs to reduce log spam

I didn't look at this. Maybe you and Peter can decide if a pr_debug cleanup patch needs to be sent.

...

Fixed a bug in iteration counting that caused the population stage to be counted as part of the first dirty logging pass

Fixed a bug in which the test failed to wait for the population stage for all but the first vCPU.

I didn't try to confirm that these fixes were in. I do see that Paolo added 6d6a18fdde8b ("KVM: selftests: allow two iterations of dirty_log_perf_test"), which sounds like it might be fixing the same thing as your first "Fixed" bullet above. Anyway, you may want to check the current code to see if any additional fixes should be sent.

...

Refactored the common code in perf_test_util.c/h

I did this part in my "Cleanups, take 2" series

...

Moved testing description to cover letter

Renamed timespec_diff_now to timespec_elapsed

These two would have been nice, but oh well...

Thanks, drew

1798

days inactive

1806

days old

linux-kselftest-mirror@lists.linaro.org

11 comments

participants

tags (0)

participants (3)

Andrew Jones
Ben Gardon
Peter Xu