This series implements selftests targeting the feature floated by Chao via: https://lore.kernel.org/linux-mm/20220310140911.50924-1-chao.p.peng@linux.in...
Below changes aim to test the fd based approach for guest private memory in context of normal (non-confidential) VMs executing on non-confidential platforms.
Confidential platforms along with the confidentiality aware software stack support a notion of private/shared accesses from the confidential VMs. Generally, a bit in the GPA conveys the shared/private-ness of the access. Non-confidential platforms don't have a notion of private or shared accesses from the guest VMs. To support this notion, KVM_HC_MAP_GPA_RANGE is modified to allow marking an access from a VM within a GPA range as always shared or private. Any suggestions regarding implementing this ioctl alternatively/cleanly are appreciated.
priv_memfd_test.c file adds a suite of two basic selftests to access private memory from the guest via private/shared access and checking if the contents can be leaked to/accessed by vmm via shared memory view.
Test results:
1) PMPAT - PrivateMemoryPrivateAccess test passes.
2) PMSAT - PrivateMemorySharedAccess test currently fails and needs more
   analysis to understand the reason of failure.
Important - Below patch is needed to ensure host kernel crash is avoided while running these tests: https://github.com/vishals4gh/linux/commit/b9adedf777ad84af39042e9c19899600a...
Github link for the patches posted as part of this series: https://github.com/vishals4gh/linux/commits/priv_memfd_selftests_v1 Note that this series is dependent on Chao's v5 patches mentioned above applied on top of 5.17.
Vishal Annapurve (5):
  x86: kvm: HACK: Allow testing of priv memfd approach
  selftests: kvm: Fix inline assembly for hypercall
  selftests: kvm: Add a basic selftest to test priv memfd
  selftests: kvm: priv_memfd_test: Add support for memory conversion
  selftests: kvm: priv_memfd_test: Add shared access test
 arch/x86/include/uapi/asm/kvm_para.h          |   1 +
 arch/x86/kvm/mmu/mmu.c                        |   9 +-
 arch/x86/kvm/x86.c                            |  16 +-
 include/linux/kvm_host.h                      |   3 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/lib/x86_64/processor.c      |   2 +-
 tools/testing/selftests/kvm/priv_memfd_test.c | 410 ++++++++++++++++++
 virt/kvm/kvm_main.c                           |   2 +-
 8 files changed, 436 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/priv_memfd_test.c
Add plumbing in KVM logic to allow private memfd series: https://lore.kernel.org/linux-mm/20220310140911.50924-1-chao.p.peng@linux.in... to be tested with non-confidential VMs.
1) Existing hypercall KVM_HC_MAP_GPA_RANGE is modified to support marking pages of the guest memory as privately accessed or accessed in a shared fashion.
2) kvm_vcpu_is_private_gfn is defined to allow guest accesses to be categorized as shared or private based on the values set by KVM_HC_MAP_GPA_RANGE hypercall.
3) KVM_MEM_PRIVATE flag for memslots is marked as always supported.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
---
 arch/x86/include/uapi/asm/kvm_para.h |  1 +
 arch/x86/kvm/mmu/mmu.c               |  9 +++++----
 arch/x86/kvm/x86.c                   | 16 ++++++++++++++--
 include/linux/kvm_host.h             |  3 +++
 virt/kvm/kvm_main.c                  |  2 +-
 5 files changed, 24 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 6e64b27b2c1e..3bc9add4095d 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -102,6 +102,7 @@ struct kvm_clock_pairing {
 #define KVM_MAP_GPA_RANGE_PAGE_SZ_2M	(1 << 0)
 #define KVM_MAP_GPA_RANGE_PAGE_SZ_1G	(1 << 1)
 #define KVM_MAP_GPA_RANGE_ENC_STAT(n)	(n << 4)
+#define KVM_MARK_GPA_RANGE_ENC_ACCESS	(1 << 8)
 #define KVM_MAP_GPA_RANGE_ENCRYPTED	KVM_MAP_GPA_RANGE_ENC_STAT(1)
 #define KVM_MAP_GPA_RANGE_DECRYPTED	KVM_MAP_GPA_RANGE_ENC_STAT(0)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b1a30a751db0..ee9bc36011de 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3895,10 +3895,11 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,

 static bool kvm_vcpu_is_private_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
-	/*
-	 * At this time private gfn has not been supported yet. Other patch
-	 * that enables it should change this.
-	 */
+	gpa_t priv_gfn_end = vcpu->priv_gfn + vcpu->priv_pages;
+
+	if ((gfn >= vcpu->priv_gfn) && (gfn < priv_gfn_end))
+		return true;
+
 	return false;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 11a949928a85..3b17fa7f2192 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9186,8 +9186,20 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		if (!(vcpu->kvm->arch.hypercall_exit_enabled & (1 << KVM_HC_MAP_GPA_RANGE)))
 			break;

-		if (!PAGE_ALIGNED(gpa) || !npages ||
-		    gpa_to_gfn(gpa) + npages <= gpa_to_gfn(gpa)) {
+		if (!PAGE_ALIGNED(gpa) ||
+		    gpa_to_gfn(gpa) + npages < gpa_to_gfn(gpa)) {
+			ret = -KVM_EINVAL;
+			break;
+		}
+
+		if (attrs & KVM_MARK_GPA_RANGE_ENC_ACCESS) {
+			vcpu->priv_gfn = gpa_to_gfn(gpa);
+			vcpu->priv_pages = npages;
+			ret = 0;
+			break;
+		}
+
+		if (!npages) {
 			ret = -KVM_EINVAL;
 			break;
 		}
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0150e952a131..7c12a0bdb495 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -311,6 +311,9 @@ struct kvm_vcpu {
 	u64 requests;
 	unsigned long guest_debug;

+	uint64_t priv_gfn;
+	uint64_t priv_pages;
+
 	struct mutex mutex;
 	struct kvm_run *run;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index df5311755a40..a31a58aa1b79 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1487,7 +1487,7 @@ static void kvm_replace_memslot(struct kvm *kvm,

 bool __weak kvm_arch_private_memory_supported(struct kvm *kvm)
 {
-	return false;
+	return true;
 }

 static int check_memory_region_flags(struct kvm *kvm,
Fix the inline assembly for the hypercall to explicitly load eax with the hypercall number, so the implementation works even in cases where the compiler inlines the function.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
---
 tools/testing/selftests/kvm/lib/x86_64/processor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 9f000dfb5594..4d88e1a553bf 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -1461,7 +1461,7 @@ uint64_t kvm_hypercall(uint64_t nr, uint64_t a0, uint64_t a1, uint64_t a2,

 	asm volatile("vmcall"
 		     : "=a"(r)
-		     : "b"(a0), "c"(a1), "d"(a2), "S"(a3));
+		     : "a"(nr), "b"(a0), "c"(a1), "d"(a2), "S"(a3));
 	return r;
 }
Add a KVM selftest that accesses private memory privately from the guest, to verify that memory updates from the guest and the userspace VMM don't affect each other.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 tools/testing/selftests/kvm/priv_memfd_test.c | 257 ++++++++++++++++++
 2 files changed, 258 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/priv_memfd_test.c
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 21c2dbd21a81..f2f9a8546c66 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -97,6 +97,7 @@ TEST_GEN_PROGS_x86_64 += max_guest_memory_test
 TEST_GEN_PROGS_x86_64 += memslot_modification_stress_test
 TEST_GEN_PROGS_x86_64 += memslot_perf_test
 TEST_GEN_PROGS_x86_64 += rseq_test
+TEST_GEN_PROGS_x86_64 += priv_memfd_test
 TEST_GEN_PROGS_x86_64 += set_memory_region_test
 TEST_GEN_PROGS_x86_64 += steal_time
 TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
diff --git a/tools/testing/selftests/kvm/priv_memfd_test.c b/tools/testing/selftests/kvm/priv_memfd_test.c
new file mode 100644
index 000000000000..11ccdb853a84
--- /dev/null
+++ b/tools/testing/selftests/kvm/priv_memfd_test.c
@@ -0,0 +1,257 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <fcntl.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include <linux/compiler.h>
+#include <linux/kernel.h>
+#include <linux/kvm_para.h>
+#include <linux/memfd.h>
+
+#include <test_util.h>
+#include <kvm_util.h>
+#include <processor.h>
+
+#define TEST_MEM_GPA		0xb0000000
+#define TEST_MEM_SIZE		0x2000
+#define TEST_MEM_END		(TEST_MEM_GPA + TEST_MEM_SIZE)
+#define SHARED_MEM_DATA_BYTE	0x66
+#define PRIV_MEM_DATA_BYTE	0x99
+
+#define TEST_MEM_SLOT		10
+
+#define VCPU_ID			0
+
+#define VM_STAGE_PROCESSED(x)	pr_info("Processed stage %s\n", #x)
+
+typedef bool (*vm_stage_handler_fn)(struct kvm_vm *,
+				void *, uint64_t);
+typedef void (*guest_code_fn)(void);
+struct test_run_helper {
+	char *test_desc;
+	vm_stage_handler_fn vmst_handler;
+	guest_code_fn guest_fn;
+	void *shared_mem;
+	int priv_memfd;
+};
+
+static bool verify_byte_pattern(void *mem, uint8_t byte, uint32_t size)
+{
+	uint8_t *buf = (uint8_t *)mem;
+
+	for (uint32_t i = 0; i < size; i++) {
+		if (buf[i] != byte)
+			return false;
+	}
+
+	return true;
+}
+
+/* Test to verify guest private accesses on private memory with following steps:
+ * 1) Upon entry, guest signals VMM that it has started.
+ * 2) VMM populates the shared memory with known pattern and continues guest
+ *    execution.
+ * 3) Guest writes a different pattern on the private memory and signals VMM
+ *    that it has updated private memory.
+ * 4) VMM verifies its shared memory contents to be same as the data populated
+ *    in step 2 and continues guest execution.
+ * 5) Guest verifies its private memory contents to be same as the data
+ *    populated in step 3 and marks the end of the guest execution.
+ */
+#define PMPAT_ID	0
+#define PMPAT_DESC	"PrivateMemoryPrivateAccessTest"
+
+/* Guest code execution stages for private mem access test */
+#define PMPAT_GUEST_STARTED		0ULL
+#define PMPAT_GUEST_PRIV_MEM_UPDATED	1ULL
+
+static bool pmpat_handle_vm_stage(struct kvm_vm *vm,
+			void *test_info,
+			uint64_t stage)
+{
+	void *shared_mem = ((struct test_run_helper *)test_info)->shared_mem;
+
+	switch (stage) {
+	case PMPAT_GUEST_STARTED: {
+		/* Initialize the contents of shared memory */
+		memset(shared_mem, SHARED_MEM_DATA_BYTE, TEST_MEM_SIZE);
+		VM_STAGE_PROCESSED(PMPAT_GUEST_STARTED);
+		break;
+	}
+	case PMPAT_GUEST_PRIV_MEM_UPDATED: {
+		/* verify host updated data is still intact */
+		TEST_ASSERT(verify_byte_pattern(shared_mem,
+			SHARED_MEM_DATA_BYTE, TEST_MEM_SIZE),
+			"Shared memory view mismatch");
+		VM_STAGE_PROCESSED(PMPAT_GUEST_PRIV_MEM_UPDATED);
+		break;
+	}
+	default:
+		printf("Unhandled VM stage %ld\n", stage);
+		return false;
+	}
+
+	return true;
+}
+
+static void pmpat_guest_code(void)
+{
+	void *priv_mem = (void *)TEST_MEM_GPA;
+	int ret;
+
+	GUEST_SYNC(PMPAT_GUEST_STARTED);
+
+	/* Mark the GPA range to be treated as always accessed privately */
+	ret = kvm_hypercall(KVM_HC_MAP_GPA_RANGE, TEST_MEM_GPA,
+		TEST_MEM_SIZE >> MIN_PAGE_SHIFT,
+		KVM_MARK_GPA_RANGE_ENC_ACCESS, 0);
+	GUEST_ASSERT_1(ret == 0, ret);
+
+	memset(priv_mem, PRIV_MEM_DATA_BYTE, TEST_MEM_SIZE);
+	GUEST_SYNC(PMPAT_GUEST_PRIV_MEM_UPDATED);
+
+	GUEST_ASSERT(verify_byte_pattern(priv_mem,
+		PRIV_MEM_DATA_BYTE, TEST_MEM_SIZE));
+
+	GUEST_DONE();
+}
+
+static struct test_run_helper priv_memfd_testsuite[] = {
+	[PMPAT_ID] = {
+		.test_desc = PMPAT_DESC,
+		.vmst_handler = pmpat_handle_vm_stage,
+		.guest_fn = pmpat_guest_code,
+	},
+};
+
+static void vcpu_work(struct kvm_vm *vm, uint32_t test_id)
+{
+	struct kvm_run *run;
+	struct ucall uc;
+	uint64_t cmd;
+
+	/*
+	 * Loop until the guest is done.
+	 */
+	run = vcpu_state(vm, VCPU_ID);
+
+	while (true) {
+		vcpu_run(vm, VCPU_ID);
+
+		if (run->exit_reason == KVM_EXIT_IO) {
+			cmd = get_ucall(vm, VCPU_ID, &uc);
+			if (cmd != UCALL_SYNC)
+				break;
+
+			if (!priv_memfd_testsuite[test_id].vmst_handler(
+				vm, &priv_memfd_testsuite[test_id], uc.args[1]))
+				break;
+
+			continue;
+		}
+
+		TEST_FAIL("Unhandled VCPU exit reason %d\n", run->exit_reason);
+		break;
+	}
+
+	if (run->exit_reason == KVM_EXIT_IO && cmd == UCALL_ABORT)
+		TEST_FAIL("%s at %s:%ld, val = %lu", (const char *)uc.args[0],
+			__FILE__, uc.args[1], uc.args[2]);
+}
+
+static void priv_memory_region_add(struct kvm_vm *vm, void *mem, uint32_t slot,
+				uint32_t size, uint64_t guest_addr,
+				uint32_t priv_fd, uint64_t priv_offset)
+{
+	struct kvm_userspace_memory_region_ext region_ext;
+	int ret;
+
+	region_ext.region.slot = slot;
+	region_ext.region.flags = KVM_MEM_PRIVATE;
+	region_ext.region.guest_phys_addr = guest_addr;
+	region_ext.region.memory_size = size;
+	region_ext.region.userspace_addr = (uintptr_t) mem;
+	region_ext.private_fd = priv_fd;
+	region_ext.private_offset = priv_offset;
+	ret = ioctl(vm_get_fd(vm), KVM_SET_USER_MEMORY_REGION, &region_ext);
+	TEST_ASSERT(ret == 0, "Failed to register user region for gpa 0x%lx\n",
+		guest_addr);
+}
+
+/* Do private access to the guest's private memory */
+static void setup_and_execute_test(uint32_t test_id)
+{
+	struct kvm_vm *vm;
+	int priv_memfd;
+	int ret;
+	void *shared_mem;
+	struct kvm_enable_cap cap;
+
+	vm = vm_create_default(VCPU_ID, 0,
+		priv_memfd_testsuite[test_id].guest_fn);
+
+	/* Allocate shared memory */
+	shared_mem = mmap(NULL, TEST_MEM_SIZE,
+		PROT_READ | PROT_WRITE,
+		MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
+	TEST_ASSERT(shared_mem != MAP_FAILED, "Failed to mmap() host");
+
+	/* Allocate private memory */
+	priv_memfd = memfd_create("vm_private_mem", MFD_INACCESSIBLE);
+	TEST_ASSERT(priv_memfd != -1, "Failed to create priv_memfd");
+	ret = fallocate(priv_memfd, 0, 0, TEST_MEM_SIZE);
+	TEST_ASSERT(ret != -1, "fallocate failed");
+
+	priv_memory_region_add(vm, shared_mem,
+		TEST_MEM_SLOT, TEST_MEM_SIZE,
+		TEST_MEM_GPA, priv_memfd, 0);
+
+	pr_info("Mapping test memory pages 0x%x page_size 0x%x\n",
+		TEST_MEM_SIZE/vm_get_page_size(vm),
+		vm_get_page_size(vm));
+	virt_map(vm, TEST_MEM_GPA, TEST_MEM_GPA,
+		(TEST_MEM_SIZE/vm_get_page_size(vm)));
+
+	/* Enable exit on KVM_HC_MAP_GPA_RANGE */
+	pr_info("Enabling exit on map_gpa_range hypercall\n");
+	ret = ioctl(vm_get_fd(vm), KVM_CHECK_EXTENSION, KVM_CAP_EXIT_HYPERCALL);
+	TEST_ASSERT(ret & (1 << KVM_HC_MAP_GPA_RANGE),
+		"VM exit on MAP_GPA_RANGE HC not supported");
+	cap.cap = KVM_CAP_EXIT_HYPERCALL;
+	cap.flags = 0;
+	cap.args[0] = (1 << KVM_HC_MAP_GPA_RANGE);
+	ret = ioctl(vm_get_fd(vm), KVM_ENABLE_CAP, &cap);
+	TEST_ASSERT(ret == 0,
+		"Failed to enable exit on MAP_GPA_RANGE hypercall\n");
+
+	priv_memfd_testsuite[test_id].shared_mem = shared_mem;
+	priv_memfd_testsuite[test_id].priv_memfd = priv_memfd;
+	vcpu_work(vm, test_id);
+
+	munmap(shared_mem, TEST_MEM_SIZE);
+	priv_memfd_testsuite[test_id].shared_mem = NULL;
+	close(priv_memfd);
+	priv_memfd_testsuite[test_id].priv_memfd = -1;
+	kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+	/* Tell stdout not to buffer its content */
+	setbuf(stdout, NULL);
+
+	for (uint32_t i = 0; i < ARRAY_SIZE(priv_memfd_testsuite); i++) {
+		pr_info("=== Starting test %s... ===\n",
+			priv_memfd_testsuite[i].test_desc);
+		setup_and_execute_test(i);
+		pr_info("--- completed test %s ---\n\n",
+			priv_memfd_testsuite[i].test_desc);
+	}
+
+	return 0;
+}
Add handling of explicit private/shared memory conversion using KVM_HC_MAP_GPA_RANGE and implicit memory conversion by handling KVM_EXIT_MEMORY_ERROR.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
---
 tools/testing/selftests/kvm/priv_memfd_test.c | 87 +++++++++++++++++++
 1 file changed, 87 insertions(+)
diff --git a/tools/testing/selftests/kvm/priv_memfd_test.c b/tools/testing/selftests/kvm/priv_memfd_test.c
index 11ccdb853a84..0e6c19501f27 100644
--- a/tools/testing/selftests/kvm/priv_memfd_test.c
+++ b/tools/testing/selftests/kvm/priv_memfd_test.c
@@ -129,6 +129,83 @@ static struct test_run_helper priv_memfd_testsuite[] = {
 	},
 };

+static void handle_vm_exit_hypercall(struct kvm_run *run,
+			uint32_t test_id)
+{
+	uint64_t gpa, npages, attrs;
+	int priv_memfd =
+		priv_memfd_testsuite[test_id].priv_memfd;
+	int ret;
+	int fallocate_mode;
+
+	if (run->hypercall.nr != KVM_HC_MAP_GPA_RANGE) {
+		TEST_FAIL("Unhandled Hypercall %lld\n",
+			run->hypercall.nr);
+	}
+
+	gpa = run->hypercall.args[0];
+	npages = run->hypercall.args[1];
+	attrs = run->hypercall.args[2];
+
+	if ((gpa < TEST_MEM_GPA) || ((gpa +
+		(npages << MIN_PAGE_SHIFT)) > TEST_MEM_END)) {
+		TEST_FAIL("Unhandled gpa 0x%lx npages %ld\n",
+			gpa, npages);
+	}
+
+	if (attrs & KVM_MAP_GPA_RANGE_ENCRYPTED)
+		fallocate_mode = 0;
+	else {
+		fallocate_mode = (FALLOC_FL_PUNCH_HOLE |
+			FALLOC_FL_KEEP_SIZE);
+	}
+	pr_info("Converting off 0x%lx pages 0x%lx to %s\n",
+		(gpa - TEST_MEM_GPA), npages,
+		fallocate_mode ?
+			"shared" : "private");
+	ret = fallocate(priv_memfd, fallocate_mode,
+		(gpa - TEST_MEM_GPA),
+		npages << MIN_PAGE_SHIFT);
+	TEST_ASSERT(ret != -1,
+		"fallocate failed in hc handling");
+	run->hypercall.ret = 0;
+}
+
+static void handle_vm_exit_memory_error(struct kvm_run *run,
+			uint32_t test_id)
+{
+	uint64_t gpa, size, flags;
+	int ret;
+	int priv_memfd =
+		priv_memfd_testsuite[test_id].priv_memfd;
+	int fallocate_mode;
+
+	gpa = run->memory.gpa;
+	size = run->memory.size;
+	flags = run->memory.flags;
+
+	if ((gpa < TEST_MEM_GPA) || ((gpa + size)
+		> TEST_MEM_END)) {
+		TEST_FAIL("Unhandled gpa 0x%lx size 0x%lx\n",
+			gpa, size);
+	}
+
+	if (flags & KVM_MEMORY_EXIT_FLAG_PRIVATE)
+		fallocate_mode = 0;
+	else {
+		fallocate_mode = (FALLOC_FL_PUNCH_HOLE |
+			FALLOC_FL_KEEP_SIZE);
+	}
+	pr_info("Converting off 0x%lx size 0x%lx to %s\n",
+		(gpa - TEST_MEM_GPA), size,
+		fallocate_mode ?
+			"shared" : "private");
+	ret = fallocate(priv_memfd, fallocate_mode,
+		(gpa - TEST_MEM_GPA), size);
+	TEST_ASSERT(ret != -1,
+		"fallocate failed in memory error handling");
+}
+
 static void vcpu_work(struct kvm_vm *vm, uint32_t test_id)
 {
 	struct kvm_run *run;
@@ -155,6 +232,16 @@ static void vcpu_work(struct kvm_vm *vm, uint32_t test_id)
 			continue;
 		}

+		if (run->exit_reason == KVM_EXIT_HYPERCALL) {
+			handle_vm_exit_hypercall(run, test_id);
+			continue;
+		}
+
+		if (run->exit_reason == KVM_EXIT_MEMORY_ERROR) {
+			handle_vm_exit_memory_error(run, test_id);
+			continue;
+		}
+
 		TEST_FAIL("Unhandled VCPU exit reason %d\n", run->exit_reason);
 		break;
 	}
Add a test that accesses private memory in a shared fashion, which should exercise the implicit memory conversion path using KVM_EXIT_MEMORY_ERROR.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
---
 tools/testing/selftests/kvm/priv_memfd_test.c | 66 +++++++++++++++++++
 1 file changed, 66 insertions(+)
diff --git a/tools/testing/selftests/kvm/priv_memfd_test.c b/tools/testing/selftests/kvm/priv_memfd_test.c
index 0e6c19501f27..607fdc149c7d 100644
--- a/tools/testing/selftests/kvm/priv_memfd_test.c
+++ b/tools/testing/selftests/kvm/priv_memfd_test.c
@@ -121,12 +121,78 @@ static void pmpat_guest_code(void)
 	GUEST_DONE();
 }

+/* Test to verify guest shared accesses on private memory with following steps:
+ * 1) Upon entry, guest signals VMM that it has started.
+ * 2) VMM populates the shared memory with known pattern and continues guest
+ *    execution.
+ * 3) Guest reads private gpa range in a shared fashion and verifies that it
+ *    reads what VMM has written in step 2.
+ * 4) Guest writes a different pattern on the shared memory and signals VMM
+ *    that it has updated the shared memory.
+ * 5) VMM verifies shared memory contents to be same as the data populated
+ *    in step 4 and continues guest execution.
+ */
+#define PMSAT_ID	1
+#define PMSAT_DESC	"PrivateMemorySharedAccessTest"
+
+/* Guest code execution stages for shared mem access test */
+#define PMSAT_GUEST_STARTED		0ULL
+#define PMSAT_GUEST_TEST_MEM_UPDATED	1ULL
+
+static bool pmsat_handle_vm_stage(struct kvm_vm *vm,
+			void *test_info,
+			uint64_t stage)
+{
+	void *shared_mem = ((struct test_run_helper *)test_info)->shared_mem;
+
+	switch (stage) {
+	case PMSAT_GUEST_STARTED: {
+		/* Initialize the contents of shared memory */
+		memset(shared_mem, SHARED_MEM_DATA_BYTE, TEST_MEM_SIZE);
+		VM_STAGE_PROCESSED(PMSAT_GUEST_STARTED);
+		break;
+	}
+	case PMSAT_GUEST_TEST_MEM_UPDATED: {
+		/* verify data to be same as what guest wrote */
+		TEST_ASSERT(verify_byte_pattern(shared_mem,
+			PRIV_MEM_DATA_BYTE, TEST_MEM_SIZE),
+			"Shared memory view mismatch");
+		VM_STAGE_PROCESSED(PMSAT_GUEST_TEST_MEM_UPDATED);
+		break;
+	}
+	default:
+		printf("Unhandled VM stage %ld\n", stage);
+		return false;
+	}
+
+	return true;
+}
+
+static void pmsat_guest_code(void)
+{
+	void *shared_mem = (void *)TEST_MEM_GPA;
+
+	GUEST_SYNC(PMSAT_GUEST_STARTED);
+	GUEST_ASSERT(verify_byte_pattern(shared_mem,
+		SHARED_MEM_DATA_BYTE, TEST_MEM_SIZE));
+
+	memset(shared_mem, PRIV_MEM_DATA_BYTE, TEST_MEM_SIZE);
+	GUEST_SYNC(PMSAT_GUEST_TEST_MEM_UPDATED);
+
+	GUEST_DONE();
+}
+
 static struct test_run_helper priv_memfd_testsuite[] = {
 	[PMPAT_ID] = {
 		.test_desc = PMPAT_DESC,
 		.vmst_handler = pmpat_handle_vm_stage,
 		.guest_fn = pmpat_guest_code,
 	},
+	[PMSAT_ID] = {
+		.test_desc = PMSAT_DESC,
+		.vmst_handler = pmsat_handle_vm_stage,
+		.guest_fn = pmsat_guest_code,
+	},
 };
static void handle_vm_exit_hypercall(struct kvm_run *run,
On 4/9/2022 2:35 AM, Vishal Annapurve wrote:
This series implements selftests targeting the feature floated by Chao via: https://lore.kernel.org/linux-mm/20220310140911.50924-1-chao.p.peng@linux.in...
Thanks for working on this.
Below changes aim to test the fd based approach for guest private memory in context of normal (non-confidential) VMs executing on non-confidential platforms.
Confidential platforms along with the confidentiality aware software stack support a notion of private/shared accesses from the confidential VMs. Generally, a bit in the GPA conveys the shared/private-ness of the access. Non-confidential platforms don't have a notion of private or shared accesses from the guest VMs. To support this notion, KVM_HC_MAP_GPA_RANGE is modified to allow marking an access from a VM within a GPA range as always shared or private. Any suggestions regarding implementing this ioctl alternatively/cleanly are appreciated.
priv_memfd_test.c file adds a suite of two basic selftests to access private memory from the guest via private/shared access and checking if the contents can be leaked to/accessed by vmm via shared memory view.
Test results:
- PMPAT - PrivateMemoryPrivateAccess test passes
- PMSAT - PrivateMemorySharedAccess test fails currently and needs more
analysis to understand the reason of failure.
That could be because of the return code (*r = -1) from the KVM_EXIT_MEMORY_ERROR. This gets interpreted as -EPERM in the VMM when the vcpu_run exits.
+		vcpu->run->exit_reason = KVM_EXIT_MEMORY_ERROR;
+		vcpu->run->memory.flags = flags;
+		vcpu->run->memory.padding = 0;
+		vcpu->run->memory.gpa = fault->gfn << PAGE_SHIFT;
+		vcpu->run->memory.size = PAGE_SIZE;
+		fault->pfn = -1;
+		*r = -1;
+		return true;
Regards Nikunj
[1] https://lore.kernel.org/all/20220310140911.50924-10-chao.p.peng@linux.intel....
On Mon, Apr 11, 2022 at 05:31:09PM +0530, Nikunj A. Dadhania wrote:
On 4/9/2022 2:35 AM, Vishal Annapurve wrote:
This series implements selftests targeting the feature floated by Chao via: https://lore.kernel.org/linux-mm/20220310140911.50924-1-chao.p.peng@linux.in...
Thanks for working on this.
Below changes aim to test the fd based approach for guest private memory in context of normal (non-confidential) VMs executing on non-confidential platforms.
Confidential platforms along with the confidentiality aware software stack support a notion of private/shared accesses from the confidential VMs. Generally, a bit in the GPA conveys the shared/private-ness of the access. Non-confidential platforms don't have a notion of private or shared accesses from the guest VMs. To support this notion, KVM_HC_MAP_GPA_RANGE is modified to allow marking an access from a VM within a GPA range as always shared or private. Any suggestions regarding implementing this ioctl alternatively/cleanly are appreciated.
priv_memfd_test.c file adds a suite of two basic selftests to access private memory from the guest via private/shared access and checking if the contents can be leaked to/accessed by vmm via shared memory view.
Test results:
- PMPAT - PrivateMemoryPrivateAccess test passes
- PMSAT - PrivateMemorySharedAccess test fails currently and needs more
analysis to understand the reason of failure.
That could be because of the return code (*r = -1) from the KVM_EXIT_MEMORY_ERROR. This gets interpreted as -EPERM in the VMM when the vcpu_run exits.
+		vcpu->run->exit_reason = KVM_EXIT_MEMORY_ERROR;
+		vcpu->run->memory.flags = flags;
+		vcpu->run->memory.padding = 0;
+		vcpu->run->memory.gpa = fault->gfn << PAGE_SHIFT;
+		vcpu->run->memory.size = PAGE_SIZE;
+		fault->pfn = -1;
+		*r = -1;
+		return true;
That's true. The current private mem patch treats KVM_EXIT_MEMORY_ERROR as an error for KVM_RUN. That behavior needs to be discussed, but right now (v5) it hits the ASSERT in tools/testing/selftests/kvm/lib/kvm_util.c before you have a chance to handle KVM_EXIT_MEMORY_ERROR in this patch series.
void vcpu_run(struct kvm_vm *vm, uint32_t vcpuid)
{
	int ret = _vcpu_run(vm, vcpuid);

	TEST_ASSERT(ret == 0, "KVM_RUN IOCTL failed, "
		"rc: %i errno: %i", ret, errno);
}
Thanks, Chao
Regards Nikunj
[1] https://lore.kernel.org/all/20220310140911.50924-10-chao.p.peng@linux.intel....
On Fri, Apr 8, 2022, at 2:05 PM, Vishal Annapurve wrote:
This series implements selftests targeting the feature floated by Chao via: https://lore.kernel.org/linux-mm/20220310140911.50924-1-chao.p.peng@linux.in...
Below changes aim to test the fd based approach for guest private memory in context of normal (non-confidential) VMs executing on non-confidential platforms.
Confidential platforms along with the confidentiality aware software stack support a notion of private/shared accesses from the confidential VMs. Generally, a bit in the GPA conveys the shared/private-ness of the access. Non-confidential platforms don't have a notion of private or shared accesses from the guest VMs. To support this notion, KVM_HC_MAP_GPA_RANGE is modified to allow marking an access from a VM within a GPA range as always shared or private. Any suggestions regarding implementing this ioctl alternatively/cleanly are appreciated.
This is fantastic. I do think we need to decide how this should work in general. We have a few platforms with somewhat different properties:
TDX: The guest decides, per memory access (using a GPA bit), whether an access is private or shared. In principle, the same address could be *both* and be distinguished by only that bit, and the two addresses would refer to different pages.
SEV: The guest decides, per memory access (using a GPA bit), whether an access is private or shared. At any given time, a physical address (with that bit masked off) can be private, shared, or invalid, but it can't be valid as private and shared at the same time.
pKVM (currently, as I understand it): the guest decides by hypercall, in advance of an access, which addresses are private and which are shared.
This series, if I understood it correctly, is like TDX except with no hardware security.
Sean or Chao, do you have a clear sense of whether the current fd-based private memory proposal can cleanly support SEV and pKVM? What, if anything, needs to be done on the API side to get that working well? I don't think we need to support SEV or pKVM right away to get this merged, but I do think we should understand how the API can map to them.
On Tue, Apr 12, 2022 at 05:16:22PM -0700, Andy Lutomirski wrote:
On Fri, Apr 8, 2022, at 2:05 PM, Vishal Annapurve wrote:
This series implements selftests targeting the feature floated by Chao via: https://lore.kernel.org/linux-mm/20220310140911.50924-1-chao.p.peng@linux.in...
Below changes aim to test the fd based approach for guest private memory in context of normal (non-confidential) VMs executing on non-confidential platforms.
Confidential platforms along with the confidentiality aware software stack support a notion of private/shared accesses from the confidential VMs. Generally, a bit in the GPA conveys the shared/private-ness of the access. Non-confidential platforms don't have a notion of private or shared accesses from the guest VMs. To support this notion, KVM_HC_MAP_GPA_RANGE is modified to allow marking an access from a VM within a GPA range as always shared or private. Any suggestions regarding implementing this ioctl alternatively/cleanly are appreciated.
This is fantastic. I do think we need to decide how this should work in general. We have a few platforms with somewhat different properties:
TDX: The guest decides, per memory access (using a GPA bit), whether an access is private or shared. In principle, the same address could be *both* and be distinguished by only that bit, and the two addresses would refer to different pages.
SEV: The guest decides, per memory access (using a GPA bit), whether an access is private or shared. At any given time, a physical address (with that bit masked off) can be private, shared, or invalid, but it can't be valid as private and shared at the same time.
pKVM (currently, as I understand it): the guest decides by hypercall, in advance of an access, which addresses are private and which are shared.
This series, if I understood it correctly, is like TDX except with no hardware security.
Sean or Chao, do you have a clear sense of whether the current fd-based private memory proposal can cleanly support SEV and pKVM? What, if anything, needs to be done on the API side to get that working well? I don't think we need to support SEV or pKVM right away to get this merged, but I do think we should understand how the API can map to them.
I've been looking at porting the SEV-SNP hypervisor patches over to using memfd, and I hit an issue that I think is generally applicable to SEV/SEV-ES as well. Namely at guest init time we have something like the following flow:
VMM:
- allocate shared memory to back the guest and map it into guest address space
- initialize shared memory with the initial memory contents (namely the BIOS)
- ask KVM to encrypt these pages in-place and measure them to generate the
  initial measured payload for attestation, via KVM_SEV_LAUNCH_UPDATE with the
  GPA for each range of memory to encrypt.

KVM:
- issue the SEV_LAUNCH_UPDATE firmware command, which takes an HPA as input
  and does an in-place encryption/measure of the page.
With current v5 of the memfd/UPM series, I think the expected flow is that we would fallocate() these ranges from the private fd backend in advance of calling KVM_SEV_LAUNCH_UPDATE (if the VMM does it after, we'd destroy the initial guest payload, since the pages would be replaced by newly-allocated ones). But if the VMM does it before, the VMM has no way to initialize the guest memory contents, since mmap()/pwrite() are disallowed due to MFD_INACCESSIBLE.
I think something similar to your proposal[1] here of making pread()/pwrite() possible for private-fd-backed memory that's been flagged as "shareable" would work for this case. Although here the "shareable" flag could be removed immediately upon successful completion of the SEV_LAUNCH_UPDATE firmware command.
I think with TDX this isn't an issue because their analogous TDH.MEM.PAGE.ADD seamcall takes a pair of source/dest HPAs as input params, so the VMM wouldn't need write access to the dest HPA at any point, just the source HPA.
[1] https://lwn.net/ml/linux-kernel/eefc3c74-acca-419c-8947-726ce2458446@www.fas...
On Wed, Apr 13, 2022 at 08:42:00AM -0500, Michael Roth wrote:
On Tue, Apr 12, 2022 at 05:16:22PM -0700, Andy Lutomirski wrote:
On Fri, Apr 8, 2022, at 2:05 PM, Vishal Annapurve wrote:
This series implements selftests targeting the feature floated by Chao via: https://lore.kernel.org/linux-mm/20220310140911.50924-1-chao.p.peng@linux.in...
Below changes aim to test the fd based approach for guest private memory in context of normal (non-confidential) VMs executing on non-confidential platforms.
Confidential platforms along with the confidentiality aware software stack support a notion of private/shared accesses from the confidential VMs. Generally, a bit in the GPA conveys the shared/private-ness of the access. Non-confidential platforms don't have a notion of private or shared accesses from the guest VMs. To support this notion, KVM_HC_MAP_GPA_RANGE is modified to allow marking an access from a VM within a GPA range as always shared or private. Any suggestions regarding implementing this ioctl alternatively/cleanly are appreciated.
This is fantastic. I do think we need to decide how this should work in general. We have a few platforms with somewhat different properties:
TDX: The guest decides, per memory access (using a GPA bit), whether an access is private or shared. In principle, the same address could be *both* and be distinguished by only that bit, and the two addresses would refer to different pages.
SEV: The guest decides, per memory access (using a GPA bit), whether an access is private or shared. At any given time, a physical address (with that bit masked off) can be private, shared, or invalid, but it can't be valid as private and shared at the same time.
pKVM (currently, as I understand it): the guest decides by hypercall, in advance of an access, which addresses are private and which are shared.
This series, if I understood it correctly, is like TDX except with no hardware security.
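The shared/private GPA-bit scheme described above can be sketched in a few lines of C. The bit position used here (bit 51) is purely illustrative -- the real position is platform-dependent (e.g. TDX's shared bit depends on GPAW, and SEV uses the C-bit position reported by CPUID), and the helper names are made up for this sketch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: real platforms place this bit elsewhere. */
#define GPA_SHARED_BIT (1ULL << 51)

/* Does this GPA encode a shared access? */
static bool gpa_is_shared(uint64_t gpa)
{
	return gpa & GPA_SHARED_BIT;
}

/* Mask off the shared bit to get the address KVM actually resolves. */
static uint64_t gpa_strip_shared_bit(uint64_t gpa)
{
	return gpa & ~GPA_SHARED_BIT;
}
```

Under the TDX-like model, `0x1000` and `0x1000 | GPA_SHARED_BIT` are two distinct aliases of the same masked address, distinguished only by that bit; under the SEV-like model only one of the two aliases is valid at any given time.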
Sean or Chao, do you have a clear sense of whether the current fd-based private memory proposal can cleanly support SEV and pKVM? What, if anything, needs to be done on the API side to get that working well? I don't think we need to support SEV or pKVM right away to get this merged, but I do think we should understand how the API can map to them.
I've been looking at porting the SEV-SNP hypervisor patches over to using memfd, and I hit an issue that I think is generally applicable to SEV/SEV-ES as well. Namely at guest init time we have something like the following flow:
VMM:
  - allocate shared memory to back the guest and map it into the guest address space
  - initialize the shared memory with the initial memory contents (namely the BIOS)
  - ask KVM to encrypt these pages in-place and measure them to generate the initial measured payload for attestation, via KVM_SEV_LAUNCH_UPDATE with the GPA for each range of memory to encrypt
KVM:
  - issue the SEV_LAUNCH_UPDATE firmware command, which takes an HPA as input and does an in-place encryption/measure of the page.
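The VMM-side step of that flow can be sketched as below. The struct layouts mirror the Linux uapi SEV launch-update command (arch/x86/include/uapi/asm/kvm.h) but are redefined here so the sketch is self-contained; the command id value and the ioctl name are shown only in a comment, since actually issuing it needs /dev/kvm and SEV hardware, and this is a sketch rather than a working VMM:

```c
#include <stdint.h>
#include <string.h>

/* Mirrors struct kvm_sev_launch_update_data from the uapi header. */
struct kvm_sev_launch_update_data {
	uint64_t uaddr;	/* userspace address of the pages to encrypt */
	uint32_t len;	/* length of the region, in bytes */
};

/* Mirrors struct kvm_sev_cmd from the uapi header. */
struct kvm_sev_cmd {
	uint32_t id;		/* which SEV command to run */
	uint64_t data;		/* pointer to the command payload */
	uint32_t error;		/* firmware error code, filled in by KVM */
	uint32_t sev_fd;	/* fd of /dev/sev */
};

#define KVM_SEV_LAUNCH_UPDATE_DATA 3	/* enum sev_cmd_id value in the uapi */

/* Populate the launch-update command for one guest memory range. */
static void sev_build_launch_update(struct kvm_sev_cmd *cmd,
				    struct kvm_sev_launch_update_data *region,
				    void *guest_mem, uint32_t len, int sev_fd)
{
	region->uaddr = (uint64_t)(uintptr_t)guest_mem;
	region->len = len;

	memset(cmd, 0, sizeof(*cmd));
	cmd->id = KVM_SEV_LAUNCH_UPDATE_DATA;
	cmd->data = (uint64_t)(uintptr_t)region;
	cmd->sev_fd = sev_fd;

	/*
	 * In a real VMM, after copying the BIOS into guest_mem, this
	 * would be submitted via:
	 *   ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, cmd);
	 * which encrypts and measures the pages in place -- the step
	 * that needs write access to the backing memory beforehand.
	 */
}
```

The point of the sketch is the ordering constraint: `guest_mem` must be writable by the VMM right up until the firmware command runs, which is exactly what MFD_INACCESSIBLE forbids.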
With current v5 of the memfd/UPM series, I think the expected flow is that we would fallocate() these ranges from the private fd backend in advance of calling KVM_SEV_LAUNCH_UPDATE (if VMM does it after we'd destroy the initial guest payload, since they'd be replaced by newly-allocated pages). But if VMM does it before, VMM has no way to initialize the guest memory contents, since mmap()/pwrite() are disallowed due to MFD_INACCESSIBLE.
OK, so for SEV, basically the VMM puts the vBIOS directly into guest memory and then does in-place measurement.
TDX has no problem because TDX temporarily uses a VMM buffer (vs. guest memory) to hold the vBIOS and then asks SEAM-MODULE to measure and copy that to guest memory.
Maybe something like SHM_LOCK should be used instead of the aggressive MFD_INACCESSIBLE. Before the VMM calls SHM_LOCK on the memfd, the content can be changed, but after that it is no longer visible to the userspace VMM. This gives userspace a chance to modify the data in the private pages.
Chao
I think something similar to your proposal[1] here of making pread()/pwrite() possible for private-fd-backed memory that's been flagged as "shareable" would work for this case. Although here the "shareable" flag could be removed immediately upon successful completion of the SEV_LAUNCH_UPDATE firmware command.
I think with TDX this isn't an issue because their analogous TDH.MEM.PAGE.ADD seamcall takes a pair of source/dest HPAs as input params, so the VMM wouldn't need write access to the dest HPA at any point, just the source HPA.
[1] https://lwn.net/ml/linux-kernel/eefc3c74-acca-419c-8947-726ce2458446@www.fas...