These patches are also available at:
https://github.com/mdroth/linux/commits/sev-selftests-rfc1
They are based on top of v5 of Brijesh's SEV-SNP hypervisor patches[1] to allow for SEV-SNP testing and provide some context for the overall design, but the SEV/SEV-ES patches can be carved out into a separate series as needed.
== OVERVIEW ==
This series introduces a set of memory encryption-related parameters/hooks in the core kselftest library, then uses the hooks to implement a small library for creating/managing SEV, SEV-ES, SEV-SNP guests. This library is then used to implement a basic boot/memory test that's run for all variants of SEV/SEV-ES/SEV-SNP guest types, as well as a set of SEV-SNP tests that cover various permutations of pvalidate/page-state changes.
- Patches 1-7 implement SEV boot tests and should run against existing kernels
- Patch 8 is a KVM change that's required to allow SEV-ES/SEV-SNP guests to boot with an externally generated page table, and is a host kernel prerequisite for the remaining patches in the series.
- Patches 9-12 extend the boot tests to cover SEV-ES
- Patches 13-16 extend the boot tests to cover SEV-SNP, and introduce an additional test for page-state changes.
Any review/comments are greatly appreciated!
[1] https://lore.kernel.org/linux-mm/20210820155918.7518-1-brijesh.singh@amd.com...
----------------------------------------------------------------
Michael Roth (16):
      KVM: selftests: move vm_phy_pages_alloc() earlier in file
      KVM: selftests: add hooks for managing encrypted guest memory
      KVM: selftests: handle encryption bits in page tables
      KVM: selftests: set CPUID before setting sregs in vcpu creation
      KVM: selftests: add support for encrypted vm_vaddr_* allocations
      KVM: selftests: add library for creating/interacting with SEV guests
      KVM: selftests: add SEV boot tests
      KVM: SVM: include CR3 in initial VMSA state for SEV-ES guests
      KVM: selftests: account for error code in #VC exception frame
      KVM: selftests: add support for creating SEV-ES guests
      KVM: selftests: add library for handling SEV-ES-related exits
      KVM: selftests: add SEV-ES boot tests
      KVM: selftests: add support for creating SEV-SNP guests
      KVM: selftests: add helpers for SEV-SNP-related instructions/exits
      KVM: selftests: add SEV-SNP boot tests
      KVM: selftests: add SEV-SNP tests for page-state changes
 arch/x86/include/asm/kvm-x86-ops.h                 |   1 +
 arch/x86/include/asm/kvm_host.h                    |   1 +
 arch/x86/kvm/svm/svm.c                             |  22 ++
 arch/x86/kvm/vmx/vmx.c                             |   8 +
 arch/x86/kvm/x86.c                                 |   3 +-
 tools/testing/selftests/kvm/.gitignore             |   2 +
 tools/testing/selftests/kvm/Makefile               |   3 +
 tools/testing/selftests/kvm/include/kvm_util.h     |   8 +
 tools/testing/selftests/kvm/include/x86_64/sev.h   |  70 ++++
 .../selftests/kvm/include/x86_64/sev_exitlib.h     |  20 ++
 tools/testing/selftests/kvm/include/x86_64/svm.h   |  35 ++
 .../selftests/kvm/include/x86_64/svm_util.h        |   2 +
 tools/testing/selftests/kvm/lib/kvm_util.c         | 249 +++++++++-----
 .../testing/selftests/kvm/lib/kvm_util_internal.h  |  10 +
 tools/testing/selftests/kvm/lib/x86_64/handlers.S  |   4 +-
 tools/testing/selftests/kvm/lib/x86_64/processor.c |  30 +-
 tools/testing/selftests/kvm/lib/x86_64/sev.c       | 381 +++++++++++++++++++++
 .../testing/selftests/kvm/lib/x86_64/sev_exitlib.c | 326 ++++++++++++++++++
 .../selftests/kvm/x86_64/sev_all_boot_test.c       | 367 ++++++++++++++++++++
 .../selftests/kvm/x86_64/sev_snp_psc_test.c        | 378 ++++++++++++++++++++
 20 files changed, 1820 insertions(+), 100 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/sev.h
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/sev.c
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c
 create mode 100644 tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c
 create mode 100644 tools/testing/selftests/kvm/x86_64/sev_snp_psc_test.c
Subsequent patches will break some of this code out into file-local helper functions, which will be used by functions like vm_vaddr_alloc(), which currently are defined earlier in the file, so a forward declaration would be needed.
Instead, move it earlier in the file, just above vm_vaddr_alloc() and friends, which are the main users.
Signed-off-by: Michael Roth michael.roth@amd.com --- tools/testing/selftests/kvm/lib/kvm_util.c | 146 ++++++++++----------- 1 file changed, 73 insertions(+), 73 deletions(-)
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 10a8ed691c66..92f59adddebe 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1145,6 +1145,79 @@ void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid) list_add(&vcpu->list, &vm->vcpus); }
+/* + * Physical Contiguous Page Allocator + * + * Input Args: + * vm - Virtual Machine + * num - number of pages + * paddr_min - Physical address minimum + * memslot - Memory region to allocate page from + * + * Output Args: None + * + * Return: + * Starting physical address + * + * Within the VM specified by vm, locates a range of available physical + * pages at or above paddr_min. If found, the pages are marked as in use + * and their base address is returned. A TEST_ASSERT failure occurs if + * not enough pages are available at or above paddr_min. + */ +vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, + vm_paddr_t paddr_min, uint32_t memslot) +{ + struct userspace_mem_region *region; + sparsebit_idx_t pg, base; + + TEST_ASSERT(num > 0, "Must allocate at least one page"); + + TEST_ASSERT((paddr_min % vm->page_size) == 0, "Min physical address " + "not divisible by page size.\n" + " paddr_min: 0x%lx page_size: 0x%x", + paddr_min, vm->page_size); + + region = memslot2region(vm, memslot); + base = pg = paddr_min >> vm->page_shift; + + do { + for (; pg < base + num; ++pg) { + if (!sparsebit_is_set(region->unused_phy_pages, pg)) { + base = pg = sparsebit_next_set(region->unused_phy_pages, pg); + break; + } + } + } while (pg && pg != base + num); + + if (pg == 0) { + fprintf(stderr, "No guest physical page available, " + "paddr_min: 0x%lx page_size: 0x%x memslot: %u\n", + paddr_min, vm->page_size, memslot); + fputs("---- vm dump ----\n", stderr); + vm_dump(stderr, vm, 2); + abort(); + } + + for (pg = base; pg < base + num; ++pg) + sparsebit_clear(region->unused_phy_pages, pg); + + return base * vm->page_size; +} + +vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min, + uint32_t memslot) +{ + return vm_phy_pages_alloc(vm, 1, paddr_min, memslot); +} + +/* Arbitrary minimum physical address used for virtual translation tables. 
*/ +#define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000 + +vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm) +{ + return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR, 0); +} + /* * VM Virtual Address Unused Gap * @@ -2149,79 +2222,6 @@ const char *exit_reason_str(unsigned int exit_reason) return "Unknown"; }
-/* - * Physical Contiguous Page Allocator - * - * Input Args: - * vm - Virtual Machine - * num - number of pages - * paddr_min - Physical address minimum - * memslot - Memory region to allocate page from - * - * Output Args: None - * - * Return: - * Starting physical address - * - * Within the VM specified by vm, locates a range of available physical - * pages at or above paddr_min. If found, the pages are marked as in use - * and their base address is returned. A TEST_ASSERT failure occurs if - * not enough pages are available at or above paddr_min. - */ -vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, - vm_paddr_t paddr_min, uint32_t memslot) -{ - struct userspace_mem_region *region; - sparsebit_idx_t pg, base; - - TEST_ASSERT(num > 0, "Must allocate at least one page"); - - TEST_ASSERT((paddr_min % vm->page_size) == 0, "Min physical address " - "not divisible by page size.\n" - " paddr_min: 0x%lx page_size: 0x%x", - paddr_min, vm->page_size); - - region = memslot2region(vm, memslot); - base = pg = paddr_min >> vm->page_shift; - - do { - for (; pg < base + num; ++pg) { - if (!sparsebit_is_set(region->unused_phy_pages, pg)) { - base = pg = sparsebit_next_set(region->unused_phy_pages, pg); - break; - } - } - } while (pg && pg != base + num); - - if (pg == 0) { - fprintf(stderr, "No guest physical page available, " - "paddr_min: 0x%lx page_size: 0x%x memslot: %u\n", - paddr_min, vm->page_size, memslot); - fputs("---- vm dump ----\n", stderr); - vm_dump(stderr, vm, 2); - abort(); - } - - for (pg = base; pg < base + num; ++pg) - sparsebit_clear(region->unused_phy_pages, pg); - - return base * vm->page_size; -} - -vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min, - uint32_t memslot) -{ - return vm_phy_pages_alloc(vm, 1, paddr_min, memslot); -} - -/* Arbitrary minimum physical address used for virtual translation tables. 
*/ -#define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000 - -vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm) -{ - return vm_phy_page_alloc(vm, KVM_GUEST_PAGE_TABLE_MIN_PADDR, 0); -} - /* * Address Guest Virtual to Host Virtual *
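The allocator being moved in this patch searches for a run of unused pages at or above paddr_min and marks them as in use. As a rough, self-contained sketch of that idea (not the selftest code itself): the names toy_region and toy_phy_pages_alloc, the fixed NPAGES, and the plain bool array standing in for the sparsebit map are all assumptions made for illustration, and the sketch returns UINT64_MAX where the library would dump the VM and abort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES     64	/* hypothetical size of the toy memslot, in pages */
#define PAGE_SHIFT 12	/* 4K pages */

/* One flag per guest page; true means the page is still unused. */
struct toy_region {
	bool unused[NPAGES];
};

/*
 * Find 'num' contiguous unused pages at or above paddr_min, mark them as
 * used, and return the base physical address. Returns UINT64_MAX when no
 * such run exists.
 */
uint64_t toy_phy_pages_alloc(struct toy_region *r, size_t num,
			     uint64_t paddr_min)
{
	uint64_t pg, base, run = 0;

	assert(num > 0);
	assert((paddr_min & ((1ULL << PAGE_SHIFT) - 1)) == 0);

	for (pg = paddr_min >> PAGE_SHIFT; pg < NPAGES; pg++) {
		/* Extend the current run, or restart it on a used page. */
		run = r->unused[pg] ? run + 1 : 0;
		if (run == num) {
			base = pg + 1 - num;
			for (pg = base; pg < base + num; pg++)
				r->unused[pg] = false;
			return base << PAGE_SHIFT;
		}
	}
	return UINT64_MAX;
}
```

With page 2 already in use, two back-to-back requests for two pages from address 0 would land on pages 0-1 and then 3-4, since the second request has to skip over the used page.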
On Tue, Oct 5, 2021 at 4:46 PM Michael Roth michael.roth@amd.com wrote:
Why move the function implementation? Maybe just adding a declaration at the top of kvm_util.c should suffice.
On Mon, Oct 18, 2021 at 08:00:00AM -0700, Mingwei Zhang wrote:
At least from working on other projects I'd gotten the impression that forward function declarations should be avoided if they can be solved by moving the function above the caller. Certainly don't mind taking your suggestion and dropping this patch if that's not the case here though.
On Wed, Oct 20, 2021 at 8:47 PM Michael Roth michael.roth@amd.com wrote:
Understood. Yes, I think it would be better to follow your experience then. I was thinking that if you move the code, git blame on that function might then point to you :)
Thanks. -Mingwei
On Tue, Oct 26, 2021 at 8:52 AM Mingwei Zhang mizhang@google.com wrote:
Reviewed-by: Mingwei Zhang mizhang@google.com
Thanks. -Mingwei
VM implementations that make use of encrypted memory need a way to configure things like the encryption/shared bit position for page table handling, the default encryption policy for internal allocations made by the core library, and a way to fetch the list/bitmap of encrypted pages to do the actual memory encryption. Add an interface to configure these parameters. Also introduce a sparsebit map to track allocations/mappings that should be treated as encrypted, and provide a way for VM implementations to retrieve it to handle operations related to memory encryption.
Signed-off-by: Michael Roth michael.roth@amd.com --- .../testing/selftests/kvm/include/kvm_util.h | 6 ++ tools/testing/selftests/kvm/lib/kvm_util.c | 63 +++++++++++++++++-- .../selftests/kvm/lib/kvm_util_internal.h | 10 +++ 3 files changed, 75 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 010b59b13917..f417de80596c 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -348,6 +348,12 @@ int vm_create_device(struct kvm_vm *vm, struct kvm_create_device *cd);
void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
+void vm_set_memory_encryption(struct kvm_vm *vm, bool enc_by_default, bool has_enc_bit, + uint8_t enc_bit); +struct sparsebit *vm_get_encrypted_phy_pages(struct kvm_vm *vm, int slot, + vm_paddr_t *gpa_start, + uint64_t *size); + /* Common ucalls */ enum { UCALL_NONE, diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 92f59adddebe..c58f930dedd2 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -631,6 +631,7 @@ static void __vm_mem_region_delete(struct kvm_vm *vm, "rc: %i errno: %i", ret, errno);
sparsebit_free(&region->unused_phy_pages); + sparsebit_free(&region->encrypted_phy_pages); ret = munmap(region->mmap_start, region->mmap_size); TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno);
@@ -924,6 +925,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, }
region->unused_phy_pages = sparsebit_alloc(); + region->encrypted_phy_pages = sparsebit_alloc(); sparsebit_set_num(region->unused_phy_pages, guest_paddr >> vm->page_shift, npages); region->region.slot = slot; @@ -1153,6 +1155,7 @@ void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid) * num - number of pages * paddr_min - Physical address minimum * memslot - Memory region to allocate page from + * encrypt - Whether to treat the pages as encrypted * * Output Args: None * @@ -1164,11 +1167,13 @@ void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid) * and their base address is returned. A TEST_ASSERT failure occurs if * not enough pages are available at or above paddr_min. */ -vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, - vm_paddr_t paddr_min, uint32_t memslot) +static vm_paddr_t +_vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, vm_paddr_t paddr_min, + uint32_t memslot, bool encrypt) { struct userspace_mem_region *region; sparsebit_idx_t pg, base; + vm_paddr_t gpa;
TEST_ASSERT(num > 0, "Must allocate at least one page");
@@ -1198,10 +1203,25 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, abort(); }
- for (pg = base; pg < base + num; ++pg) + for (pg = base; pg < base + num; ++pg) { sparsebit_clear(region->unused_phy_pages, pg); + if (encrypt) + sparsebit_set(region->encrypted_phy_pages, pg); + } + + gpa = base * vm->page_size;
- return base * vm->page_size; + if (encrypt && vm->memcrypt.has_enc_bit) + gpa |= (1ULL << vm->memcrypt.enc_bit); + + return gpa; +} + +vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, + vm_paddr_t paddr_min, uint32_t memslot) +{ + return _vm_phy_pages_alloc(vm, 1, paddr_min, memslot, + vm->memcrypt.enc_by_default); }
vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min, @@ -2146,6 +2166,10 @@ void vm_dump(FILE *stream, struct kvm_vm *vm, uint8_t indent) region->host_mem); fprintf(stream, "%*sunused_phy_pages: ", indent + 2, ""); sparsebit_dump(stream, region->unused_phy_pages, 0); + if (vm->memcrypt.enabled) { + fprintf(stream, "%*sencrypted_phy_pages: ", indent + 2, ""); + sparsebit_dump(stream, region->encrypted_phy_pages, 0); + } } fprintf(stream, "%*sMapped Virtual Pages:\n", indent, ""); sparsebit_dump(stream, vm->vpages_mapped, indent + 2); @@ -2343,3 +2367,34 @@ int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid)
return ioctl(vcpu->fd, KVM_GET_STATS_FD, NULL); } + +void vm_set_memory_encryption(struct kvm_vm *vm, bool enc_by_default, bool has_enc_bit, + uint8_t enc_bit) +{ + vm->memcrypt.enabled = true; + vm->memcrypt.enc_by_default = enc_by_default; + vm->memcrypt.has_enc_bit = has_enc_bit; + vm->memcrypt.enc_bit = enc_bit; +} + +struct sparsebit * +vm_get_encrypted_phy_pages(struct kvm_vm *vm, int slot, vm_paddr_t *gpa_start, + uint64_t *size) +{ + struct userspace_mem_region *region; + struct sparsebit *encrypted_phy_pages; + + if (!vm->memcrypt.enabled) + return NULL; + + region = memslot2region(vm, slot); + if (!region) + return NULL; + + encrypted_phy_pages = sparsebit_alloc(); + sparsebit_copy(encrypted_phy_pages, region->encrypted_phy_pages); + *size = region->region.memory_size; + *gpa_start = region->region.guest_phys_addr; + + return encrypted_phy_pages; +} diff --git a/tools/testing/selftests/kvm/lib/kvm_util_internal.h b/tools/testing/selftests/kvm/lib/kvm_util_internal.h index a03febc24ba6..99ccab86115c 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util_internal.h +++ b/tools/testing/selftests/kvm/lib/kvm_util_internal.h @@ -16,6 +16,7 @@ struct userspace_mem_region { struct kvm_userspace_memory_region region; struct sparsebit *unused_phy_pages; + struct sparsebit *encrypted_phy_pages; int fd; off_t offset; void *host_mem; @@ -44,6 +45,14 @@ struct userspace_mem_regions { DECLARE_HASHTABLE(slot_hash, 9); };
+/* Memory encryption policy/configuration. */ +struct vm_memcrypt { + bool enabled; + int8_t enc_by_default; + bool has_enc_bit; + int8_t enc_bit; +}; + struct kvm_vm { int mode; unsigned long type; @@ -67,6 +76,7 @@ struct kvm_vm { vm_vaddr_t idt; vm_vaddr_t handlers; uint32_t dirty_ring_size; + struct vm_memcrypt memcrypt; };
struct vcpu *vcpu_find(struct kvm_vm *vm, uint32_t vcpuid);
On 10/5/21 4:44 PM, Michael Roth wrote:
For readability, it's probably better to adopt a standard naming convention for structures, members and functions. For example:
encrypted_phy_pages -> enc_phy_pages
struct vm_memcrypt { -> struct vm_mem_enc {
struct vm_memcrypt memcrypt -> struct vm_mem_enc mem_enc
vm_get_encrypted_phy_pages() -> vm_get_enc_phy_pages
Do we have to make a copy for the sparsebit? Why not just return the pointer? By looking at your subsequent patches, I find that this data structure seems to be just read-only?
-Mingwei
On Mon, Oct 18, 2021 at 08:00:00AM -0700, Mingwei Zhang wrote:
Yes, it's only intended to be used for read access. But I'll see if I can enforce that without the need to use a copy.
-Mingwei
On Wed, Oct 20, 2021 at 8:46 PM Michael Roth michael.roth@amd.com wrote:
Understood. Thanks for the clarification. Yeah, I think both making a copy and returning a const pointer should work. I will leave that to you then.
Thanks. -Mingwei
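The const-pointer option discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: toy_region, toy_get_encrypted_pages() and the bool array are hypothetical stand-ins for the region structure, the accessor and the sparsebit map, not code from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8	/* hypothetical region size, in pages */

struct toy_region {
	bool encrypted[TOY_NPAGES];	/* stand-in for encrypted_phy_pages */
};

/*
 * Hand out a read-only view of the bitmap instead of a copy. Callers
 * cannot modify it without an explicit cast, and nothing needs freeing.
 */
const bool *toy_get_encrypted_pages(const struct toy_region *r, size_t *npages)
{
	assert(r && npages);

	*npages = TOY_NPAGES;
	return r->encrypted;
}
```

The trade-off is that a const view stays tied to the lifetime of the region, whereas a copy remains valid after the region is deleted but has to be freed by the caller.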
On Tue, Oct 26, 2021 at 8:48 AM Mingwei Zhang mizhang@google.com wrote:
Reviewed-by: Mingwei Zhang mizhang@google.com
Thanks. -Mingwei
SEV guests rely on an encryption bit which resides within the range that current code treats as address bits. Guest code will expect these bits to be set appropriately in their page tables, whereas helpers like addr_gpa2hva() will expect these bits to be masked away prior to translation. Add proper handling for these cases.
Signed-off-by: Michael Roth michael.roth@amd.com --- .../testing/selftests/kvm/include/kvm_util.h | 1 + tools/testing/selftests/kvm/lib/kvm_util.c | 23 +++++++++++++++- .../selftests/kvm/lib/x86_64/processor.c | 26 +++++++++---------- 3 files changed, 36 insertions(+), 14 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index f417de80596c..4bf686d664cc 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -152,6 +152,7 @@ void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa); void *addr_gva2hva(struct kvm_vm *vm, vm_vaddr_t gva); vm_paddr_t addr_hva2gpa(struct kvm_vm *vm, void *hva); void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa); +vm_paddr_t addr_raw2gpa(struct kvm_vm *vm, vm_vaddr_t gpa_raw);
/* * Address Guest Virtual to Guest Physical diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index c58f930dedd2..ef88fdc7e46b 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1443,6 +1443,26 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, } }
+/* + * Mask off any special bits from raw GPA + * + * Input Args: + * vm - Virtual Machine + * gpa_raw - Raw VM physical address + * + * Output Args: None + * + * Return: + * GPA with special bits (e.g. shared/encrypted) masked off. + */ +vm_paddr_t addr_raw2gpa(struct kvm_vm *vm, vm_paddr_t gpa_raw) +{ + if (!vm->memcrypt.has_enc_bit) + return gpa_raw; + + return gpa_raw & ~(1ULL << vm->memcrypt.enc_bit); +} + /* * Address VM Physical to Host Virtual * @@ -1460,9 +1480,10 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, * address providing the memory to the vm physical address is returned. * A TEST_ASSERT failure occurs if no region containing gpa exists. */ -void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa) +void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa_raw) { struct userspace_mem_region *region; + vm_paddr_t gpa = addr_raw2gpa(vm, gpa_raw);
region = userspace_mem_region_find(vm, gpa, gpa); if (!region) { diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index 28cb881f440d..0bbd88fe1127 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -198,7 +198,7 @@ static void *virt_get_pte(struct kvm_vm *vm, uint64_t pt_pfn, uint64_t vaddr, static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm, uint64_t pt_pfn, uint64_t vaddr, - uint64_t paddr, + uint64_t paddr_raw, int level, enum x86_page_size page_size) { @@ -208,10 +208,9 @@ static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm, pte->writable = true; pte->present = true; pte->page_size = (level == page_size); - if (pte->page_size) - pte->pfn = paddr >> vm->page_shift; - else - pte->pfn = vm_alloc_page_table(vm) >> vm->page_shift; + if (!pte->page_size) + paddr_raw = vm_alloc_page_table(vm); + pte->pfn = paddr_raw >> vm->page_shift; } else { /* * Entry already present. Assert that the caller doesn't want @@ -228,12 +227,13 @@ static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm, return pte; }
-void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, +void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr_raw, enum x86_page_size page_size) { const uint64_t pg_size = 1ull << ((page_size * 9) + 12); struct pageUpperEntry *pml4e, *pdpe, *pde; struct pageTableEntry *pte; + uint64_t paddr = addr_raw2gpa(vm, paddr_raw);
TEST_ASSERT(vm->mode == VM_MODE_PXXV48_4K, "Unknown or unsupported guest mode, mode: 0x%x", vm->mode); @@ -256,15 +256,15 @@ void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, * early if a hugepage was created. */ pml4e = virt_create_upper_pte(vm, vm->pgd >> vm->page_shift, - vaddr, paddr, 3, page_size); + vaddr, paddr_raw, 3, page_size); if (pml4e->page_size) return;
- pdpe = virt_create_upper_pte(vm, pml4e->pfn, vaddr, paddr, 2, page_size); + pdpe = virt_create_upper_pte(vm, pml4e->pfn, vaddr, paddr_raw, 2, page_size); if (pdpe->page_size) return;
- pde = virt_create_upper_pte(vm, pdpe->pfn, vaddr, paddr, 1, page_size); + pde = virt_create_upper_pte(vm, pdpe->pfn, vaddr, paddr_raw, 1, page_size); if (pde->page_size) return;
@@ -272,14 +272,14 @@ void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, pte = virt_get_pte(vm, pde->pfn, vaddr, 0); TEST_ASSERT(!pte->present, "PTE already present for 4k page at vaddr: 0x%lx\n", vaddr); - pte->pfn = paddr >> vm->page_shift; + pte->pfn = paddr_raw >> vm->page_shift; pte->writable = true; pte->present = 1; }
-void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr) +void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr_raw) { - __virt_pg_map(vm, vaddr, paddr, X86_PAGE_SIZE_4K); + __virt_pg_map(vm, vaddr, paddr_raw, X86_PAGE_SIZE_4K); }
static struct pageTableEntry *_vm_get_page_table_entry(struct kvm_vm *vm, int vcpuid, @@ -587,7 +587,7 @@ vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva) if (!pte[index[0]].present) goto unmapped_gva;
- return (pte[index[0]].pfn * vm->page_size) + (gva & 0xfffu); + return addr_raw2gpa(vm, ((uint64_t)pte[index[0]].pfn * vm->page_size)) + (gva & 0xfffu);
unmapped_gva: TEST_FAIL("No mapping for vm virtual address, gva: 0x%lx", gva);
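For readers skimming the diff, the two directions of the C-bit handling can be modelled in isolation. The sketch below is a self-contained approximation, not the library code: toy_memcrypt, toy_gpa2raw() and toy_raw2gpa() are hypothetical names, and the bit position used in any example is arbitrary rather than a value the hardware would necessarily report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors only the struct vm_memcrypt fields that matter here. */
struct toy_memcrypt {
	bool has_enc_bit;
	uint8_t enc_bit;
};

/* Tag a GPA with the encryption bit, as _vm_phy_pages_alloc() does. */
uint64_t toy_gpa2raw(const struct toy_memcrypt *mc, uint64_t gpa, bool encrypt)
{
	assert(mc->enc_bit < 64);

	if (encrypt && mc->has_enc_bit)
		gpa |= 1ULL << mc->enc_bit;
	return gpa;
}

/* Strip the encryption bit again, as addr_raw2gpa() does. */
uint64_t toy_raw2gpa(const struct toy_memcrypt *mc, uint64_t gpa_raw)
{
	assert(mc->enc_bit < 64);

	if (!mc->has_enc_bit)
		return gpa_raw;
	return gpa_raw & ~(1ULL << mc->enc_bit);
}
```

Page table entries would carry the tagged ("raw") value, while host-side lookups such as the region search would use the stripped one; when has_enc_bit is false both helpers are no-ops.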
On 06/10/21 01:44, Michael Roth wrote:
This is not what you're doing below in addr_gpa2hva, though---or did I misunderstand?
I may be wrong due to not actually having written the code, but I'd prefer if most of these APIs worked only if the C bit has already been stripped. In general it's quite unlikely for host code to deal with C=1 pages, so it's worth pointing out explicitly the cases where it does.
Paolo
On Thu, Oct 21, 2021 at 05:26:26PM +0200, Paolo Bonzini wrote:
The confusion is warranted, addr_gpa2hva() *doesn't* expect the C bit to be masked in advance so the wording is pretty confusing.
I think I was referring to the fact that internally it doesn't need/want the C-bit, in this case it just masks it away as a convenience to callers, as opposed to the other functions modified in the patch that actually make use of it.
It's convenient because page table walkers/mappers make use of addr_gpa2hva() to do things like silently mask away C-bits when translating PTEs to host addresses. We could easily convert those callers from:
addr_gpa2hva(paddr)
to this:
addr_gpa2hva(addr_raw2gpa(paddr))
but now all new code needs to consider whether it might be dealing with C-bits or not prior to deciding to pass it to addr_gpa2hva() (or not really think about it, and add addr_raw2gpa() "just in case"). So since it's always harmless to mask it away silently in addr_gpa2hva(), the logic/code seems to benefit a good deal if we indicate clearly that addr_gpa2hva() can accept a 'raw' GPA, and will ignore the C-bit completely.
But not a big deal either way if you prefer to keep that explicit. And commit message still needs to be clarified.
I've tried to indicate functions that expect the C-bit by adding the '_raw' suffix to the gpa/paddr parameters, but as you pointed out with addr_gpa2hva() it's already a bit inconsistent in that regard, and there's a couple cases like virt_map() where I should use the '_raw' suffix as well that I've missed here.
So that should be addressed, and maybe some additional comments/assertions might be warranted to guard against cases where the C-bit is passed in unexpectedly.
But I should probably re-assess why the C-bit is being passed around in the first place:
- vm_phy_page[s]_alloc() is the main 'source' for 'raw' GPAs with the C-bit set. It determines this based on vm_memcrypt encryption policy, and updates the encryption bitmask as well.
- vm_phy_page[s]_alloc() is callable both in kvm_util lib as well as individual tests.
- in theory, encoding the C-bit in the returned vm_paddr_t means that vm_phy_page[s]_alloc() callers can pass that directly into virt_map/virt_pg_map() and this will "just work" for both encrypted/non-encrypted guests.
- by masking it away in addr_gpa2hva(), existing tests/code flow mostly "just works" as well.
But taking a closer look, in cases where vm_phy_page[s]_alloc() is called directly by tests, like set_memory_region_test, emulator_error_test, and smm_test, that raw GPA is compared to hardcoded non-raw GPAs, so they'd still end up needing fixups to work with the proposed transparent-SEV-mode stuff. And future code would need to be written to account for this, so it doesn't really "just work" after all..
So it's worth considering the alternative approach of *not* encoding the C-bit into GPAs returned by vm_phy_page[s]_alloc(). That would likely involve introducing something like addr_gpa2raw(), which adds in the C-bit according to the encryption bitmap as-needed. If we do that:
- virt_map()/virt_pg_map() still need to accept 'raw' GPAs, since they need to deal with cases where pages are being mapped that weren't allocated by vm_phy_page[s]_alloc(), and so aren't recorded in the bitmap. in those cases it is up to test code to provide the C-bit when needed (e.g. things like separate linear mappings for pa()-like stuff in guest code).
- for cases where vm_phy_page[s]_alloc() determines whether the page is encrypted, addr_gpa2raw() needs to be used to add back the C-bit prior to passing it to virt_map()/virt_pg_map(), both in the library and the test code. vm_vaddr_* allocations would handle all this under the covers as they do now.
So test code would need to consider cases where addr_gpa2raw() needs to be used to set the C-bit (which is basically only when they want to mix usage of the vm_phy_page[s]_alloc with their own mapping of the guest page tables, which doesn't seem to be done in any existing tests anyway).
The library code would need these addr_gpa2raw() hooks in places where it calls virt_*map() internally. Probably just a handful of places though.
Assuming there's no issues with this alternative approach that I may be missing, I'll look at doing it this way for the next spin.
Even in this alternative approach though, having addr_gpa2hva() silently mask away C-bit still seems useful for the reasons above, but again, no strong feelings one way or the other on that.
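To make the alternative concrete, here is a minimal sketch of what an addr_gpa2raw()-style helper could look like if the allocator returned plain GPAs and the C-bit were added back from the encryption bitmap. Everything here is hypothetical: the thread only proposes "something like addr_gpa2raw()", so the toy_vm layout, the bool array standing in for the sparsebit map, and the fixed page count are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NPAGES     16	/* hypothetical guest size, in pages */
#define TOY_PAGE_SHIFT 12	/* 4K pages */

struct toy_vm {
	bool has_enc_bit;
	uint8_t enc_bit;
	bool encrypted[TOY_NPAGES];	/* stand-in for encrypted_phy_pages */
};

/*
 * Add the C-bit to a plain GPA if, and only if, the page containing it
 * was recorded as encrypted when it was allocated.
 */
uint64_t toy_addr_gpa2raw(const struct toy_vm *vm, uint64_t gpa)
{
	uint64_t pg = gpa >> TOY_PAGE_SHIFT;

	assert(vm->enc_bit < 64);

	if (vm->has_enc_bit && pg < TOY_NPAGES && vm->encrypted[pg])
		return gpa | (1ULL << vm->enc_bit);
	return gpa;
}
```

Under this scheme a caller would pass the result to a mapping function that accepts raw GPAs, while comparisons against hardcoded GPAs in tests would keep working on the untagged value.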
On 24/10/21 18:49, Michael Roth wrote:
Yes, and it seems like a more rare case in general.
Either that, or you can have virt_*map that consults the encryption bitmap, and virt_*map_enc (or _raw, doesn't matter) that takes the encryption state explicitly as a bool.
Ok, keep it the way you find more useful.
Paolo
Exception 29 (#VC) pushes an error_code parameter on the stack. Update the exception list to reflect this.
Signed-off-by: Michael Roth michael.roth@amd.com --- tools/testing/selftests/kvm/lib/x86_64/handlers.S | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/handlers.S b/tools/testing/selftests/kvm/lib/x86_64/handlers.S index 7629819734af..19715a58f5d2 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/handlers.S +++ b/tools/testing/selftests/kvm/lib/x86_64/handlers.S @@ -76,6 +76,8 @@ idt_handler_code: HANDLERS has_error=1 from=10 to=14 HANDLERS has_error=0 from=15 to=16 HANDLERS has_error=1 from=17 to=17 - HANDLERS has_error=0 from=18 to=255 + HANDLERS has_error=0 from=18 to=28 + HANDLERS has_error=1 from=29 to=29 + HANDLERS has_error=0 from=30 to=255
.section .note.GNU-stack, "", %progbits
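The effect of the hunk above can be summarized as a lookup from vector number to "pushes an error code". The sketch below mirrors only the HANDLERS ranges visible in that hunk (vectors 10 to 255); vectors below 10 are outside the hunk and are deliberately not modelled. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if the selftest IDT stubs treat 'vector' as pushing an
 * error code, according to the HANDLERS lines shown in the patch.
 * Only vectors 10..255 are covered.
 */
bool toy_vector_has_error_code(unsigned int vector)
{
	assert(vector >= 10 && vector <= 255);

	if (vector >= 10 && vector <= 14)
		return true;
	if (vector == 17)
		return true;
	if (vector == 29)	/* #VC, the case added by this patch */
		return true;
	return false;
}
```

This matters because a stub that assumes no error code for vector 29 would hand the common handler a frame that is off by one slot, presumably the reason the #VC entry needs its own HANDLERS line.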
Only a couple KVM_SEV_* ioctls need to be handled differently for SEV-ES. Do so when the specified policy indicates SEV-ES support.
Signed-off-by: Michael Roth michael.roth@amd.com --- tools/testing/selftests/kvm/lib/x86_64/sev.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c b/tools/testing/selftests/kvm/lib/x86_64/sev.c index adda3b396566..d01b0f637ced 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/sev.c +++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c @@ -238,13 +238,17 @@ struct sev_vm *sev_vm_create(uint32_t policy, uint64_t npages) return NULL; sev->sev_policy = policy;
- kvm_sev_ioctl(sev, KVM_SEV_INIT, NULL); + if (sev->sev_policy & SEV_POLICY_ES) + kvm_sev_ioctl(sev, KVM_SEV_ES_INIT, NULL); + else + kvm_sev_ioctl(sev, KVM_SEV_INIT, NULL);
vm_set_memory_encryption(vm, true, true, sev->enc_bit); vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, 0, 0, npages, 0); sev_register_user_range(sev, addr_gpa2hva(vm, 0), npages * vm_get_page_size(vm));
- pr_info("SEV guest created, policy: 0x%x, size: %lu KB\n", + pr_info("%s guest created, policy: 0x%x, size: %lu KB\n", + (sev->sev_policy & SEV_POLICY_ES) ? "SEV-ES" : "SEV", sev->sev_policy, npages * vm_get_page_size(vm) / 1024);
return sev; @@ -269,6 +273,9 @@ void sev_vm_launch(struct sev_vm *sev) "Unexpected guest state: %d", ksev_status.state);
sev_encrypt(sev); + + if (sev->sev_policy & SEV_POLICY_ES) + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_UPDATE_VMSA, NULL); }
void sev_vm_measure(struct sev_vm *sev, uint8_t *measurement)
Add (or copy from kernel) routines related to handling #VC exceptions (only for cpuid currently) or issuing vmgexits. These will be used mostly by guest code.
Signed-off-by: Michael Roth michael.roth@amd.com --- tools/testing/selftests/kvm/Makefile | 2 +- .../kvm/include/x86_64/sev_exitlib.h | 14 + .../selftests/kvm/include/x86_64/svm.h | 35 +++ .../selftests/kvm/include/x86_64/svm_util.h | 1 + .../selftests/kvm/lib/x86_64/sev_exitlib.c | 249 ++++++++++++++++++ 5 files changed, 300 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h create mode 100644 tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index aa8901bdbd22..7b3261cc60a3 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -35,7 +35,7 @@ endif
LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S -LIBKVM_x86_64 += lib/x86_64/sev.c +LIBKVM_x86_64 += lib/x86_64/sev.c lib/x86_64/sev_exitlib.c LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
diff --git a/tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h b/tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h new file mode 100644 index 000000000000..4b67b4004dfa --- /dev/null +++ b/tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * VC/vmgexit/GHCB-related helpers for SEV-ES/SEV-SNP guests. + * + * Copyright (C) 2021 Advanced Micro Devices + */ + +#ifndef SELFTEST_KVM_SEV_EXITLIB_H +#define SELFTEST_KVM_SEV_EXITLIB_H + +int sev_es_handle_vc(void *ghcb, u64 ghcb_gpa, struct ex_regs *regs); +void sev_es_terminate(int reason); + +#endif /* SELFTEST_KVM_SEV_EXITLIB_H */ diff --git a/tools/testing/selftests/kvm/include/x86_64/svm.h b/tools/testing/selftests/kvm/include/x86_64/svm.h index f4ea2355dbc2..d633caea4b7d 100644 --- a/tools/testing/selftests/kvm/include/x86_64/svm.h +++ b/tools/testing/selftests/kvm/include/x86_64/svm.h @@ -204,6 +204,41 @@ struct __attribute__ ((__packed__)) vmcb_save_area { u64 br_to; u64 last_excp_from; u64 last_excp_to; + + /* + * The following part of the save area is valid only for + * SEV-ES guests when referenced through the GHCB or for + * saving to the host save area. + */ + u8 reserved_7[80]; + u32 pkru; + u8 reserved_7a[20]; + u64 reserved_8; /* rax already available at 0x01f8 */ + u64 rcx; + u64 rdx; + u64 rbx; + u64 reserved_9; /* rsp already available at 0x01d8 */ + u64 rbp; + u64 rsi; + u64 rdi; + u64 r8; + u64 r9; + u64 r10; + u64 r11; + u64 r12; + u64 r13; + u64 r14; + u64 r15; + u8 reserved_10[16]; + u64 sw_exit_code; + u64 sw_exit_info_1; + u64 sw_exit_info_2; + u64 sw_scratch; + u64 sev_features; + u8 reserved_11[48]; + u64 xcr0; + u8 valid_bitmap[16]; + u64 x87_state_gpa; };
struct __attribute__ ((__packed__)) vmcb { diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h index b7531c83b8ae..4319bb6f4691 100644 --- a/tools/testing/selftests/kvm/include/x86_64/svm_util.h +++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h @@ -16,6 +16,7 @@ #define CPUID_SVM_BIT 2 #define CPUID_SVM BIT_ULL(CPUID_SVM_BIT)
+#define SVM_EXIT_CPUID 0x072 #define SVM_EXIT_VMMCALL 0x081
struct svm_test_data { diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c b/tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c new file mode 100644 index 000000000000..b3f7b0297e5b --- /dev/null +++ b/tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c @@ -0,0 +1,249 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * GHCB/#VC/instruction helpers for use with SEV-ES/SEV-SNP guests. + * + * Partially copied from arch/x86/kernel/sev*.c + * + * Copyright (C) 2021 Advanced Micro Devices + */ + +#include <linux/bitops.h> +#include <kvm_util.h> /* needed by kvm_util_internal.h */ +#include "../kvm_util_internal.h" /* needed by processor.h */ +#include "processor.h" /* for struct ex_regs */ +#include "svm_util.h" /* for additional SVM_EXIT_* definitions */ +#include "svm.h" /* for VMCB/VMSA layout */ +#include "sev_exitlib.h" + +#define PAGE_SHIFT 12 + +#define MSR_SEV_ES_GHCB 0xc0010130 + +#define VMGEXIT() { asm volatile("rep; vmmcall\n\r"); } + +#define GHCB_PROTOCOL_MAX 1 +#define GHCB_DEFAULT_USAGE 0 + +/* Guest-requested termination codes */ +#define GHCB_TERMINATE 0x100UL +#define GHCB_TERMINATE_REASON(reason_set, reason_val) \ + (((((u64)reason_set) & 0x7) << 12) | \ + ((((u64)reason_val) & 0xff) << 16)) + +#define GHCB_TERMINATE_REASON_UNSPEC 0 + +/* GHCB MSR protocol for CPUID */ +#define GHCB_CPUID_REQ_EAX 0 +#define GHCB_CPUID_REQ_EBX 1 +#define GHCB_CPUID_REQ_ECX 2 +#define GHCB_CPUID_REQ_EDX 3 +#define GHCB_CPUID_REQ_CODE 0x4UL +#define GHCB_CPUID_REQ(fn, reg) \ + (GHCB_CPUID_REQ_CODE | (((uint64_t)reg & 3) << 30) | (((uint64_t)fn) << 32)) +#define GHCB_CPUID_RESP_CODE 0x5UL +#define GHCB_CPUID_RESP(resp) ((resp) & 0xfff) + +/* GHCB MSR protocol for GHCB registration */ +#define GHCB_REG_GPA_REQ_CODE 0x12UL +#define GHCB_REG_GPA_REQ(gfn) \ + (((unsigned long)((gfn) & GENMASK_ULL(51, 0)) << 12) | GHCB_REG_GPA_REQ_CODE) +#define GHCB_REG_GPA_RESP_CODE 0x13UL +#define GHCB_REG_GPA_RESP(resp) ((resp) & GENMASK_ULL(11, 0)) +#define 
GHCB_REG_GPA_RESP_VAL(resp) ((resp) >> 12) + +/* GHCB format/accessors */ + +struct ghcb { + struct vmcb_save_area save; + u8 reserved_save[2048 - sizeof(struct vmcb_save_area)]; + u8 shared_buffer[2032]; + u8 reserved_1[10]; + u16 protocol_version; + u32 ghcb_usage; +}; + +#define GHCB_BITMAP_IDX(field) \ + (offsetof(struct vmcb_save_area, field) / sizeof(u64)) + +#define DEFINE_GHCB_ACCESSORS(field) \ + static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb) \ + { \ + return test_bit(GHCB_BITMAP_IDX(field), \ + (unsigned long *)&ghcb->save.valid_bitmap); \ + } \ + \ + static inline u64 ghcb_get_##field(struct ghcb *ghcb) \ + { \ + return ghcb->save.field; \ + } \ + \ + static inline u64 ghcb_get_##field##_if_valid(struct ghcb *ghcb) \ + { \ + return ghcb_##field##_is_valid(ghcb) ? ghcb->save.field : 0; \ + } \ + \ + static inline void ghcb_set_##field(struct ghcb *ghcb, u64 value) \ + { \ + __set_bit(GHCB_BITMAP_IDX(field), \ + (unsigned long *)&ghcb->save.valid_bitmap); \ + ghcb->save.field = value; \ + } + +DEFINE_GHCB_ACCESSORS(cpl) +DEFINE_GHCB_ACCESSORS(rip) +DEFINE_GHCB_ACCESSORS(rsp) +DEFINE_GHCB_ACCESSORS(rax) +DEFINE_GHCB_ACCESSORS(rcx) +DEFINE_GHCB_ACCESSORS(rdx) +DEFINE_GHCB_ACCESSORS(rbx) +DEFINE_GHCB_ACCESSORS(rbp) +DEFINE_GHCB_ACCESSORS(rsi) +DEFINE_GHCB_ACCESSORS(rdi) +DEFINE_GHCB_ACCESSORS(r8) +DEFINE_GHCB_ACCESSORS(r9) +DEFINE_GHCB_ACCESSORS(r10) +DEFINE_GHCB_ACCESSORS(r11) +DEFINE_GHCB_ACCESSORS(r12) +DEFINE_GHCB_ACCESSORS(r13) +DEFINE_GHCB_ACCESSORS(r14) +DEFINE_GHCB_ACCESSORS(r15) +DEFINE_GHCB_ACCESSORS(sw_exit_code) +DEFINE_GHCB_ACCESSORS(sw_exit_info_1) +DEFINE_GHCB_ACCESSORS(sw_exit_info_2) +DEFINE_GHCB_ACCESSORS(sw_scratch) +DEFINE_GHCB_ACCESSORS(xcr0) + +static uint64_t sev_es_rdmsr_ghcb(void) +{ + uint64_t lo, hi; + + asm volatile("rdmsr" + : "=a" (lo), "=d" (hi) + : "c" (MSR_SEV_ES_GHCB)); + + return ((hi << 32) | lo); +} + +static void sev_es_wrmsr_ghcb(uint64_t val) +{ + uint64_t lo, hi; + + lo = val & 0xFFFFFFFF; + hi = 
val >> 32; + + asm volatile("wrmsr" + :: "c" (MSR_SEV_ES_GHCB), "a" (lo), "d" (hi) + : "memory"); +} + +void sev_es_terminate(int reason) +{ + uint64_t val = GHCB_TERMINATE; + + val |= GHCB_TERMINATE_REASON(2, reason); + + sev_es_wrmsr_ghcb(val); + VMGEXIT(); + + while (true) + asm volatile("hlt" : : : "memory"); +} + +static int sev_es_ghcb_hv_call(struct ghcb *ghcb, u64 ghcb_gpa, u64 exit_code) +{ + ghcb->protocol_version = GHCB_PROTOCOL_MAX; + ghcb->ghcb_usage = GHCB_DEFAULT_USAGE; + + ghcb_set_sw_exit_code(ghcb, exit_code); + ghcb_set_sw_exit_info_1(ghcb, 0); + ghcb_set_sw_exit_info_2(ghcb, 0); + + sev_es_wrmsr_ghcb(ghcb_gpa); + + VMGEXIT(); + + /* Only #VC exceptions are currently handled. */ + if ((ghcb->save.sw_exit_info_1 & 0xffffffff) == 1) + sev_es_terminate(GHCB_TERMINATE_REASON_UNSPEC); + + return 0; +} + +static int handle_vc_cpuid(struct ghcb *ghcb, u64 ghcb_gpa, struct ex_regs *regs) +{ + int ret; + + ghcb_set_rax(ghcb, regs->rax); + ghcb_set_rcx(ghcb, regs->rcx); + + /* ignore additional XSAVE states for now */ + ghcb_set_xcr0(ghcb, 1); + + ret = sev_es_ghcb_hv_call(ghcb, ghcb_gpa, SVM_EXIT_CPUID); + if (ret) + return ret; + + if (!(ghcb_rax_is_valid(ghcb) && + ghcb_rbx_is_valid(ghcb) && + ghcb_rcx_is_valid(ghcb) && + ghcb_rdx_is_valid(ghcb))) + return 1; + + regs->rax = ghcb->save.rax; + regs->rbx = ghcb->save.rbx; + regs->rcx = ghcb->save.rcx; + regs->rdx = ghcb->save.rdx; + + regs->rip += 2; + + return 0; +} + +static int handle_msr_vc_cpuid(struct ex_regs *regs) +{ + uint32_t fn = regs->rax & 0xFFFFFFFF; + uint64_t resp; + + sev_es_wrmsr_ghcb(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX)); + VMGEXIT(); + resp = sev_es_rdmsr_ghcb(); + if (GHCB_CPUID_RESP(resp) != GHCB_CPUID_RESP_CODE) + return 1; + regs->rax = resp >> 32; + + sev_es_wrmsr_ghcb(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX)); + VMGEXIT(); + resp = sev_es_rdmsr_ghcb(); + if (GHCB_CPUID_RESP(resp) != GHCB_CPUID_RESP_CODE) + return 1; + regs->rbx = resp >> 32; + + 
sev_es_wrmsr_ghcb(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX)); + VMGEXIT(); + resp = sev_es_rdmsr_ghcb(); + if (GHCB_CPUID_RESP(resp) != GHCB_CPUID_RESP_CODE) + return 1; + regs->rcx = resp >> 32; + + sev_es_wrmsr_ghcb(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX)); + VMGEXIT(); + resp = sev_es_rdmsr_ghcb(); + if (GHCB_CPUID_RESP(resp) != GHCB_CPUID_RESP_CODE) + return 1; + regs->rdx = resp >> 32; + + regs->rip += 2; + + return 0; +} + +int sev_es_handle_vc(void *ghcb, u64 ghcb_gpa, struct ex_regs *regs) +{ + if (regs->error_code != SVM_EXIT_CPUID) + return 1; + + if (!ghcb) + return handle_msr_vc_cpuid(regs); + + return handle_vc_cpuid(ghcb, ghcb_gpa, regs); +}
Extend the existing SEV boot tests to also cover SEV-ES guests. Also add some tests for handling #VC exceptions for cpuid instructions using both MSR-based and GHCB-based vmgexits.
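For reference, the guest-side checks boil down to two bit tests: CPUID 0x8000001f EAX advertises SEV (bit 1) and SEV-ES (bit 3), and the SEV status MSR reports them as enabled (bits 0 and 1). The sketch below restates those checks as plain C predicates; the macro and function names are hypothetical and the values are passed in rather than read from hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID 0x8000001f EAX feature bits checked by the guest code. */
#define CPUID_8000001F_EAX_SEV		(1u << 1)
#define CPUID_8000001F_EAX_SEV_ES	(1u << 3)

/* SEV status MSR enabled bits checked by the guest code. */
#define SEV_STATUS_SEV_ENABLED		(1ULL << 0)
#define SEV_STATUS_ES_ENABLED		(1ULL << 1)

static_assert((SEV_STATUS_SEV_ENABLED | SEV_STATUS_ES_ENABLED) == 0x3,
	      "SEV and SEV-ES enabled bits are the low two status bits");

/* Hypothetical predicate: CPUID advertises both SEV and SEV-ES. */
static bool cpuid_reports_sev_es(uint32_t eax)
{
	uint32_t want = CPUID_8000001F_EAX_SEV | CPUID_8000001F_EAX_SEV_ES;

	return (eax & want) == want;
}

/* Hypothetical predicate: status MSR shows both SEV and SEV-ES enabled. */
static bool sev_status_reports_sev_es(uint64_t sev_status)
{
	uint64_t want = SEV_STATUS_SEV_ENABLED | SEV_STATUS_ES_ENABLED;

	return (sev_status & want) == want;
}
```

In the actual test the CPUID check is run twice, first with no GHCB registered (forcing the MSR protocol) and then again with the GHCB page set up.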
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../selftests/kvm/x86_64/sev_all_boot_test.c  | 63 ++++++++++++++++++-
 1 file changed, 62 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c b/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c index 8df7143ac17d..58c57c4c0ec1 100644 --- a/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c +++ b/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c @@ -18,6 +18,7 @@ #include "svm_util.h" #include "linux/psp-sev.h" #include "sev.h" +#include "sev_exitlib.h"
#define VCPU_ID 2 #define PAGE_SIZE 4096 @@ -31,6 +32,10 @@
#define TOTAL_PAGES (512 + SHARED_PAGES + PRIVATE_PAGES)
+/* Globals for use by #VC handler. */ +static void *ghcb0_gva; +static vm_paddr_t ghcb0_gpa; + static void fill_buf(uint8_t *buf, size_t pages, size_t stride, uint8_t val) { int i, j; @@ -171,6 +176,47 @@ guest_sev_code(struct sev_sync_data *sync, uint8_t *shared_buf, uint8_t *private guest_test_done(sync); }
+static void vc_handler(struct ex_regs *regs) +{ + sev_es_handle_vc(ghcb0_gva, ghcb0_gpa, regs); +} + +static void __attribute__((__flatten__)) +guest_sev_es_code(struct sev_sync_data *sync, uint8_t *shared_buf, + uint8_t *private_buf, uint64_t ghcb_gpa, void *ghcb_gva) +{ + uint32_t eax, ebx, ecx, edx, token = 1; + uint64_t sev_status; + + guest_test_start(sync); + +again: + /* Check CPUID values via GHCB MSR protocol. */ + eax = 0x8000001f; + ecx = 0; + cpuid(&eax, &ebx, &ecx, &edx); + + /* Check SEV bit. */ + SEV_GUEST_ASSERT(sync, token++, eax & (1 << 1)); + /* Check SEV-ES bit. */ + SEV_GUEST_ASSERT(sync, token++, eax & (1 << 3)); + + if (!ghcb0_gva) { + ghcb0_gva = ghcb_gva; + ghcb0_gpa = ghcb_gpa; + /* Check CPUID bits again using GHCB-based protocol. */ + goto again; + } + + /* Check SEV and SEV-ES enabled bits (bits 0 and 1, respectively). */ + sev_status = rdmsr(MSR_AMD64_SEV); + SEV_GUEST_ASSERT(sync, token++, (sev_status & 0x3) == 3); + + guest_test_common(sync, shared_buf, private_buf); + + guest_test_done(sync); +} + static void setup_test_common(struct sev_vm *sev, void *guest_code, vm_vaddr_t *sync_vaddr, vm_vaddr_t *shared_vaddr, vm_vaddr_t *private_vaddr) @@ -216,7 +262,18 @@ static void test_sev(void *guest_code, uint64_t policy) setup_test_common(sev, guest_code, &sync_vaddr, &shared_vaddr, &private_vaddr);
/* Set up guest params. */ - vcpu_args_set(vm, VCPU_ID, 4, sync_vaddr, shared_vaddr, private_vaddr); + if (policy & SEV_POLICY_ES) { + vm_vaddr_t ghcb_vaddr = vm_vaddr_alloc_shared(vm, PAGE_SIZE, 0); + + vcpu_args_set(vm, VCPU_ID, 6, sync_vaddr, shared_vaddr, private_vaddr, + addr_gva2gpa(vm, ghcb_vaddr), ghcb_vaddr); + /* Set up VC handler. */ + vm_init_descriptor_tables(vm); + vm_install_exception_handler(vm, 29, vc_handler); + vcpu_init_descriptor_tables(vm, VCPU_ID); + } else { + vcpu_args_set(vm, VCPU_ID, 4, sync_vaddr, shared_vaddr, private_vaddr); + }
sync = addr_gva2hva(vm, sync_vaddr); shared_buf = addr_gva2hva(vm, shared_vaddr); @@ -248,5 +305,9 @@ int main(int argc, char *argv[]) test_sev(guest_sev_code, SEV_POLICY_NO_DBG); test_sev(guest_sev_code, 0);
+ /* SEV-ES tests */ + test_sev(guest_sev_es_code, SEV_POLICY_ES | SEV_POLICY_NO_DBG); + test_sev(guest_sev_es_code, SEV_POLICY_ES); + return 0; }
SEV-SNP uses an entirely different set of KVM_SEV_* ioctls to manage guests. The needed vm_memcrypt callbacks are different as well. Address these differences by extending the SEV library with a new set of interfaces specific to creating/managing SEV-SNP guests.
These guests will still use a struct sev_vm under the covers, so some existing sev_*() helpers are still applicable.
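A minimal sketch of how the shared structure can tell the two guest flavors apart, assuming (as this patch does) that the reserved policy bit 17 is always set for SNP guests and never set otherwise. The struct guest_policy type and both helpers are hypothetical stand-ins for the relevant part of struct sev_vm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SNP_POLICY_SMT	(1ULL << 16)
#define SNP_POLICY_RSVD	(1ULL << 17)
#define SNP_POLICY_DBG	(1ULL << 19)

static_assert(SNP_POLICY_RSVD == 0x20000, "RSVD is policy bit 17");

/* Hypothetical stand-in for the policy fields kept in struct sev_vm. */
struct guest_policy {
	uint32_t sev_policy;	/* used by SEV/SEV-ES guests */
	uint64_t snp_policy;	/* used by SEV-SNP guests */
};

/* Build an SNP policy; the reserved bit is forced on for every SNP guest. */
static struct guest_policy snp_guest_policy(uint64_t requested)
{
	struct guest_policy p = { 0 };

	p.snp_policy = requested | SNP_POLICY_RSVD;
	return p;
}

/* A guest counts as SNP if and only if the reserved policy bit is set. */
static bool guest_is_snp(const struct guest_policy *p)
{
	return (p->snp_policy & SNP_POLICY_RSVD) != 0;
}
```

This lets common paths such as the encryption loop pick the SNP or non-SNP ioctl without carrying a separate guest-type field.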
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../selftests/kvm/include/x86_64/sev.h        |  8 ++
 tools/testing/selftests/kvm/lib/x86_64/sev.c  | 77 ++++++++++++++++++-
 2 files changed, 82 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/x86_64/sev.h b/tools/testing/selftests/kvm/include/x86_64/sev.h index d2f41b131ecc..f3e088c03bdd 100644 --- a/tools/testing/selftests/kvm/include/x86_64/sev.h +++ b/tools/testing/selftests/kvm/include/x86_64/sev.h @@ -18,6 +18,10 @@ #define SEV_POLICY_NO_DBG (1UL << 0) #define SEV_POLICY_ES (1UL << 2)
+#define SNP_POLICY_SMT (1ULL << 16) +#define SNP_POLICY_RSVD (1ULL << 17) +#define SNP_POLICY_DBG (1ULL << 19) + #define SEV_GUEST_ASSERT(sync, token, _cond) do { \ if (!(_cond)) \ sev_guest_abort(sync, token, 0); \ @@ -59,4 +63,8 @@ void sev_vm_launch(struct sev_vm *sev); void sev_vm_measure(struct sev_vm *sev, uint8_t *measurement); void sev_vm_launch_finish(struct sev_vm *sev);
+struct sev_vm *sev_snp_vm_create(uint64_t policy, uint64_t npages); +void sev_snp_vm_free(struct sev_vm *sev); +void sev_snp_vm_launch(struct sev_vm *sev); + #endif /* SELFTEST_KVM_SEV_H */ diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c b/tools/testing/selftests/kvm/lib/x86_64/sev.c index d01b0f637ced..939d7d5dff41 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/sev.c +++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c @@ -20,6 +20,7 @@ struct sev_vm { int fd; int enc_bit; uint32_t sev_policy; + uint64_t snp_policy; };
/* Helpers for coordinating between guests and test harness. */ @@ -119,6 +120,12 @@ void kvm_sev_ioctl(struct sev_vm *sev, int cmd, void *data)
/* Local helpers. */
+static bool sev_snp_enabled(struct sev_vm *sev) +{ + /* RSVD is always 1 for SNP guests. */ + return sev->snp_policy & SNP_POLICY_RSVD; +} + static void sev_register_user_range(struct sev_vm *sev, void *hva, uint64_t size) { @@ -147,6 +154,21 @@ sev_encrypt_phy_range(struct sev_vm *sev, vm_paddr_t gpa, uint64_t size) kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_UPDATE_DATA, &ksev_update_data); }
+static void +sev_snp_encrypt_phy_range(struct sev_vm *sev, vm_paddr_t gpa, uint64_t size) +{ + struct kvm_sev_snp_launch_update update_data = {0}; + + pr_debug("encrypt_phy_range: addr: 0x%lx, size: %lu\n", gpa, size); + + update_data.uaddr = (__u64)addr_gpa2hva(sev->vm, gpa); + update_data.start_gfn = gpa >> PAGE_SHIFT; + update_data.len = size; + update_data.page_type = KVM_SEV_SNP_PAGE_TYPE_NORMAL; + + kvm_sev_ioctl(sev, KVM_SEV_SNP_LAUNCH_UPDATE, &update_data); +} + static void sev_encrypt(struct sev_vm *sev) { struct sparsebit *enc_phy_pages; @@ -171,9 +193,14 @@ static void sev_encrypt(struct sev_vm *sev) if (pg_cnt <= 0) pg_cnt = 1;
- sev_encrypt_phy_range(sev, - gpa_start + pg * vm_get_page_size(vm), - pg_cnt * vm_get_page_size(vm)); + if (sev_snp_enabled(sev)) + sev_snp_encrypt_phy_range(sev, + gpa_start + pg * vm_get_page_size(vm), + pg_cnt * vm_get_page_size(vm)); + else + sev_encrypt_phy_range(sev, + gpa_start + pg * vm_get_page_size(vm), + pg_cnt * vm_get_page_size(vm)); pg += pg_cnt; }
@@ -308,3 +335,47 @@ void sev_vm_launch_finish(struct sev_vm *sev) TEST_ASSERT(ksev_status.state == SEV_GSTATE_RUNNING, "Unexpected guest state: %d", ksev_status.state); } + +/* SEV-SNP VM implementation. */ + +struct sev_vm *sev_snp_vm_create(uint64_t policy, uint64_t npages) +{ + struct kvm_snp_init init = {0}; + struct sev_vm *sev; + struct kvm_vm *vm; + + vm = vm_create(VM_MODE_DEFAULT, 0, O_RDWR); + sev = sev_common_create(vm); + if (!sev) + return NULL; + sev->snp_policy = policy | SNP_POLICY_RSVD; + + kvm_sev_ioctl(sev, KVM_SEV_SNP_INIT, &init); + vm_set_memory_encryption(vm, true, true, sev->enc_bit); + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, 0, 0, npages, 0); + sev_register_user_range(sev, addr_gpa2hva(vm, 0), npages * vm_get_page_size(vm)); + + pr_info("SEV-SNP guest created, policy: 0x%lx, size: %lu KB\n", + sev->snp_policy, npages * vm_get_page_size(vm) / 1024); + + return sev; +} + +void sev_snp_vm_free(struct sev_vm *sev) +{ + kvm_vm_free(sev->vm); + sev_common_free(sev); +} + +void sev_snp_vm_launch(struct sev_vm *sev) +{ + struct kvm_sev_snp_launch_start launch_start = {0}; + struct kvm_sev_snp_launch_update launch_finish = {0}; + + launch_start.policy = sev->snp_policy; + kvm_sev_ioctl(sev, KVM_SEV_SNP_LAUNCH_START, &launch_start); + + sev_encrypt(sev); + + kvm_sev_ioctl(sev, KVM_SEV_SNP_LAUNCH_FINISH, &launch_finish); +}
Extend the existing sev_exitlib with helpers for executing the pvalidate instruction and for issuing page-state change requests via the GHCB MSR protocol.
Subsequent SEV-SNP-related tests will make use of these in guest code.
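To make the page-state change request format concrete, here is a standalone sketch of the MSR encoding: request code 0x14 in the low 12 bits, the GFN starting at bit 12, and the operation at bit 52. It assumes the GFN field is 40 bits wide (bits 51:12); the helper names and the GHCB_PSC_GFN_MASK macro are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GHCB_PSC_REQ_CODE	0x14ULL
#define GHCB_PSC_RESP_CODE	0x15ULL
#define GHCB_INFO_MASK		0xfffULL
/* Assumption: the GFN occupies 40 bits, i.e. MSR bits 51:12. */
#define GHCB_PSC_GFN_MASK	((1ULL << 40) - 1)

enum psc_op {
	PSC_OP_PRIVATE = 1,
	PSC_OP_SHARED = 2,
};

/* Hypothetical helper: build the MSR value for one 4K page-state change. */
static uint64_t ghcb_psc_req(uint64_t gfn, enum psc_op op)
{
	/* A GFN wider than the field would spill into the operation bits. */
	assert((gfn & ~GHCB_PSC_GFN_MASK) == 0);

	return (((uint64_t)op & 0xf) << 52) |
	       ((gfn & GHCB_PSC_GFN_MASK) << 12) |
	       GHCB_PSC_REQ_CODE;
}

/* Hypothetical helper: true if the response carries the PSC response code. */
static bool ghcb_psc_resp_ok(uint64_t resp)
{
	return (resp & GHCB_INFO_MASK) == GHCB_PSC_RESP_CODE;
}
```

Like the helpers in this patch, the sketch only checks the response code; a stricter caller could also inspect the remaining response bits.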
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../kvm/include/x86_64/sev_exitlib.h          |  6 ++
 .../selftests/kvm/lib/x86_64/sev_exitlib.c    | 77 +++++++++++++++++++
 2 files changed, 83 insertions(+)
diff --git a/tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h b/tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h index 4b67b4004dfa..5c7356f9e925 100644 --- a/tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h +++ b/tools/testing/selftests/kvm/include/x86_64/sev_exitlib.h @@ -8,7 +8,13 @@ #ifndef SELFTEST_KVM_SEV_EXITLIB_H #define SELFTEST_KVM_SEV_EXITLIB_H
+#define PVALIDATE_NO_UPDATE 255 + int sev_es_handle_vc(void *ghcb, u64 ghcb_gpa, struct ex_regs *regs); void sev_es_terminate(int reason); +void snp_register_ghcb(u64 ghcb_gpa); +void snp_psc_set_shared(u64 gpa); +void snp_psc_set_private(u64 gpa); +int snp_pvalidate(void *ptr, bool rmp_psize, bool validate);
#endif /* SELFTEST_KVM_SEV_EXITLIB_H */ diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c b/tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c index b3f7b0297e5b..546b402d5015 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c +++ b/tools/testing/selftests/kvm/lib/x86_64/sev_exitlib.c @@ -51,6 +51,19 @@ #define GHCB_REG_GPA_RESP(resp) ((resp) & GENMASK_ULL(11, 0)) #define GHCB_REG_GPA_RESP_VAL(resp) ((resp) >> 12)
+/* GHCB MSR protocol for Page State Change */ +#define GHCB_PSC_REQ_PRIVATE 1 +#define GHCB_PSC_REQ_SHARED 2 +#define GHCB_PSC_REQ_PSMASH 3 +#define GHCB_PSC_REQ_UNSMASH 4 +#define GHCB_PSC_REQ_CODE 0x14UL +#define GHCB_PSC_REQ(gfn, op) \ + (((unsigned long)((op) & 0xf) << 52) | \ + ((unsigned long)((gfn) & GENMASK_ULL(39, 0)) << 12) | \ + GHCB_PSC_REQ_CODE) +#define GHCB_PSC_RESP_CODE 0x15UL +#define GHCB_PSC_RESP(resp) ((resp) & GENMASK_ULL(11, 0)) + /* GHCB format/accessors */
struct ghcb { @@ -247,3 +260,67 @@ int sev_es_handle_vc(void *ghcb, u64 ghcb_gpa, struct ex_regs *regs)
return handle_vc_cpuid(ghcb, ghcb_gpa, regs); } + +void snp_register_ghcb(u64 ghcb_gpa) +{ + u64 gfn = ghcb_gpa >> PAGE_SHIFT; + u64 resp; + + sev_es_wrmsr_ghcb(GHCB_REG_GPA_REQ(gfn)); + VMGEXIT(); + + resp = sev_es_rdmsr_ghcb(); + if (GHCB_REG_GPA_RESP(resp) != GHCB_REG_GPA_RESP_CODE || + GHCB_REG_GPA_RESP_VAL(resp) != gfn) + sev_es_terminate(GHCB_TERMINATE_REASON_UNSPEC); +} + +static void snp_psc_request(u64 gfn, int op) +{ + u64 resp; + + sev_es_wrmsr_ghcb(GHCB_PSC_REQ(gfn, op)); + VMGEXIT(); + + resp = sev_es_rdmsr_ghcb(); + if (GHCB_PSC_RESP(resp) != GHCB_PSC_RESP_CODE) + sev_es_terminate(GHCB_TERMINATE_REASON_UNSPEC); +} + +void snp_psc_set_shared(u64 gpa) +{ + snp_psc_request(gpa >> PAGE_SHIFT, GHCB_PSC_REQ_SHARED); +} + +void snp_psc_set_private(u64 gpa) +{ + snp_psc_request(gpa >> PAGE_SHIFT, GHCB_PSC_REQ_PRIVATE); +} + +/* From arch/x86/include/asm/asm.h */ +#ifdef __GCC_ASM_FLAG_OUTPUTS__ +# define CC_SET(c) "\n\t/* output condition code " #c "*/\n" +# define CC_OUT(c) "=@cc" #c +#else +# define CC_SET(c) "\n\tset" #c " %[_cc_" #c "]\n" +# define CC_OUT(c) [_cc_ ## c] "=qm" +#endif + +int snp_pvalidate(void *ptr, bool rmp_psize, bool validate) +{ + uint64_t gva = (uint64_t)ptr; + bool no_rmpupdate; + int rc; + + /* "pvalidate" mnemonic support in binutils 2.36 and newer */ + asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFF\n\t" + CC_SET(c) + : CC_OUT(c) (no_rmpupdate), "=a"(rc) + : "a"(gva), "c"(rmp_psize), "d"(validate) + : "memory", "cc"); + + if (no_rmpupdate) + return PVALIDATE_NO_UPDATE; + + return rc; +}
Extend the existing SEV/SEV-ES boot tests to also cover SEV-SNP guests. Also add a basic test to check validation state of initial guest memory.
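The validation-state check relies on pvalidate reporting "no update" when the page is already in the requested state. The toy model below illustrates that behaviour in plain C; it is a simplified sketch, not the instruction itself, and the struct boot_page type and model_pvalidate() helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PVALIDATE_NO_UPDATE 255

/* Toy model of a single page's validated bit in the RMP. */
struct boot_page {
	bool validated;
};

/*
 * Simplified model of pvalidate: if the page is already in the requested
 * state nothing changes and "no update" is reported (the real helper
 * derives this from the carry flag); otherwise the state flips and 0 is
 * returned.
 */
static int model_pvalidate(struct boot_page *page, bool validate)
{
	assert(page != NULL);

	if (page->validated == validate)
		return PVALIDATE_NO_UPDATE;

	page->validated = validate;
	return 0;
}
```

A page that was validated during launch therefore yields PVALIDATE_NO_UPDATE on the guest's first pvalidate(validate=true), which is what the new boot test asserts.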
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../selftests/kvm/include/x86_64/svm_util.h   |  1 +
 .../selftests/kvm/x86_64/sev_all_boot_test.c  | 86 +++++++++++++++----
 2 files changed, 71 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h index 4319bb6f4691..6c51fc304ce9 100644 --- a/tools/testing/selftests/kvm/include/x86_64/svm_util.h +++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h @@ -18,6 +18,7 @@
#define SVM_EXIT_CPUID 0x072 #define SVM_EXIT_VMMCALL 0x081 +#define SVM_EXIT_NOT_VALIDATED 0x404
struct svm_test_data { /* VMCB */ diff --git a/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c b/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c index 58c57c4c0ec1..3d8048efa25f 100644 --- a/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c +++ b/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c @@ -217,6 +217,48 @@ guest_sev_es_code(struct sev_sync_data *sync, uint8_t *shared_buf, guest_test_done(sync); }
+static void __attribute__((__flatten__)) +guest_sev_snp_code(struct sev_sync_data *sync, uint8_t *shared_buf, + uint8_t *private_buf, uint64_t ghcb_gpa, void *ghcb_gva) +{ + uint32_t eax, ebx, ecx, edx, token = 1; + uint64_t sev_status; + int ret; + + guest_test_start(sync); + +again: + /* Check CPUID values via GHCB MSR protocol. */ + eax = 0x8000001f; + ecx = 0; + cpuid(&eax, &ebx, &ecx, &edx); + + /* Check SEV bit. */ + SEV_GUEST_ASSERT(sync, token++, eax & (1 << 1)); + /* Check SEV-ES bit. */ + SEV_GUEST_ASSERT(sync, token++, eax & (1 << 3)); + + if (!ghcb0_gva) { + ghcb0_gva = ghcb_gva; + ghcb0_gpa = ghcb_gpa; + snp_register_ghcb(ghcb0_gpa); + /* Check CPUID bits again using GHCB-based protocol. */ + goto again; + } + + /* Check SEV/SEV-ES/SEV-SNP enabled bits (bits 0, 1, and 2, respectively). */ + sev_status = rdmsr(MSR_AMD64_SEV); + SEV_GUEST_ASSERT(sync, token++, (sev_status & 0x7) == 7); + + /* Confirm private data was validated by FW prior to boot. */ + ret = snp_pvalidate(private_buf, 0, true); + SEV_GUEST_ASSERT(sync, token++, ret == PVALIDATE_NO_UPDATE); + + guest_test_common(sync, shared_buf, private_buf); + + guest_test_done(sync); +} + static void setup_test_common(struct sev_vm *sev, void *guest_code, vm_vaddr_t *sync_vaddr, vm_vaddr_t *shared_vaddr, vm_vaddr_t *private_vaddr) @@ -244,7 +286,7 @@ setup_test_common(struct sev_vm *sev, void *guest_code, vm_vaddr_t *sync_vaddr, fill_buf(private_buf, PRIVATE_PAGES, PAGE_STRIDE, 0x42); }
-static void test_sev(void *guest_code, uint64_t policy) +static void test_sev(void *guest_code, bool snp, uint64_t policy) { vm_vaddr_t sync_vaddr, shared_vaddr, private_vaddr; uint8_t *shared_buf, *private_buf; @@ -254,7 +296,8 @@ static void test_sev(void *guest_code, uint64_t policy) struct kvm_vm *vm; int i;
- sev = sev_vm_create(policy, TOTAL_PAGES); + sev = snp ? sev_snp_vm_create(policy, TOTAL_PAGES) + : sev_vm_create(policy, TOTAL_PAGES); if (!sev) return; vm = sev_get_vm(sev); @@ -262,7 +305,7 @@ static void test_sev(void *guest_code, uint64_t policy) setup_test_common(sev, guest_code, &sync_vaddr, &shared_vaddr, &private_vaddr);
/* Set up guest params. */ - if (policy & SEV_POLICY_ES) { + if (snp || (policy & SEV_POLICY_ES)) { vm_vaddr_t ghcb_vaddr = vm_vaddr_alloc_shared(vm, PAGE_SIZE, 0);
vcpu_args_set(vm, VCPU_ID, 6, sync_vaddr, shared_vaddr, private_vaddr, @@ -280,34 +323,45 @@ static void test_sev(void *guest_code, uint64_t policy) private_buf = addr_gva2hva(vm, private_vaddr);
/* Allocations/setup done. Encrypt initial guest payload. */ - sev_vm_launch(sev); + if (snp) { + sev_snp_vm_launch(sev); + } else { + sev_vm_launch(sev);
- /* Dump the initial measurement. A test to actually verify it would be nice. */ - sev_vm_measure(sev, measurement); - pr_info("guest measurement: "); - for (i = 0; i < 32; ++i) - pr_info("%02x", measurement[i]); - pr_info("\n"); + /* Dump the initial measurement. A test to actually verify it would be nice. */ + sev_vm_measure(sev, measurement); + pr_info("guest measurement: "); + for (i = 0; i < 32; ++i) + pr_info("%02x", measurement[i]); + pr_info("\n");
- sev_vm_launch_finish(sev); + sev_vm_launch_finish(sev); + }
/* Guest is ready to run. Do the tests. */ check_test_start(vm, sync); check_test_common(vm, sync, shared_buf, private_buf); check_test_done(vm, sync);
- sev_vm_free(sev); + if (snp) + sev_snp_vm_free(sev); + else + sev_vm_free(sev); }
int main(int argc, char *argv[]) { /* SEV tests */ - test_sev(guest_sev_code, SEV_POLICY_NO_DBG); - test_sev(guest_sev_code, 0); + test_sev(guest_sev_code, false, SEV_POLICY_NO_DBG); + test_sev(guest_sev_code, false, 0);
/* SEV-ES tests */ - test_sev(guest_sev_es_code, SEV_POLICY_ES | SEV_POLICY_NO_DBG); - test_sev(guest_sev_es_code, SEV_POLICY_ES); + test_sev(guest_sev_es_code, false, SEV_POLICY_ES | SEV_POLICY_NO_DBG); + test_sev(guest_sev_es_code, false, SEV_POLICY_ES); + + /* SEV-SNP tests */ + test_sev(guest_sev_snp_code, true, SNP_POLICY_SMT); + test_sev(guest_sev_snp_code, true, SNP_POLICY_SMT | SNP_POLICY_DBG);
return 0; }
With SEV-SNP guests, memory is tracked in the RMP table as private or shared and as validated or unvalidated. A guest can transition memory between these states using the pvalidate instruction together with page-state change requests issued via the GHCB protocol. Add a number of tests covering various permutations of these operations across shared and private guest memory.
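The outcomes the tests expect can be summarized with a deliberately simplified model: a page has an RMP ownership state, a validated bit, and a C-bit in the guest page table, and the combination determines what a guest access does. The sketch below encodes only the cases exercised by this test; all names are hypothetical and real hardware has more states than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum access_result {
	ACCESS_OK_PRIVATE,		/* encrypted access succeeds */
	ACCESS_OK_SHARED,		/* plaintext access succeeds */
	ACCESS_VC_NOT_VALIDATED,	/* #VC with the 0x404 error code */
	ACCESS_RMP_FAULT,		/* exit to the VMM */
};

/* Toy model of one guest page; not a real RMP entry layout. */
struct psc_page {
	bool rmp_private;	/* RMP marks the page as guest-private */
	bool validated;		/* guest has pvalidate'd the page */
	bool c_bit;		/* C-bit set in the guest PTE */
};

/* Guest sequence: rescind validation, request PSC to shared, optionally clear C-bit. */
static void model_make_shared(struct psc_page *p, bool clear_c_bit)
{
	assert(p != NULL);
	p->validated = false;
	p->rmp_private = false;
	if (clear_c_bit)
		p->c_bit = false;
}

/* Guest sequence: request PSC to private, set C-bit, then pvalidate. */
static void model_make_private(struct psc_page *p)
{
	assert(p != NULL);
	p->rmp_private = true;
	p->c_bit = true;
	p->validated = true;
}

/* What a guest access to the page results in under this model. */
static enum access_result model_access(const struct psc_page *p)
{
	if (p->c_bit)
		return (p->rmp_private && p->validated) ?
			ACCESS_OK_PRIVATE : ACCESS_VC_NOT_VALIDATED;

	return p->rmp_private ? ACCESS_RMP_FAULT : ACCESS_OK_SHARED;
}
```

The test walks through each of these outcomes in turn: a clean private-to-shared flip, a flip that leaves the C-bit set (expecting one 0x404 #VC per page), a flip back to private, and finally clearing the C-bit on private pages so that the VMM handles the resulting fault.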
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/sev_snp_psc_test.c   | 378 ++++++++++++++++++
 3 files changed, 380 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/sev_snp_psc_test.c
diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore index 824f100bec2a..cad9ebe7728d 100644 --- a/tools/testing/selftests/kvm/.gitignore +++ b/tools/testing/selftests/kvm/.gitignore @@ -39,6 +39,7 @@ /x86_64/xss_msr_test /x86_64/vmx_pmu_msrs_test /x86_64/sev_all_boot_test +/x86_64/sev_snp_psc_test /access_tracking_perf_test /demand_paging_test /dirty_log_test diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 7b3261cc60a3..b95fb86f12aa 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -73,6 +73,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/vmx_pmu_msrs_test TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test TEST_GEN_PROGS_x86_64 += x86_64/sev_all_boot_test +TEST_GEN_PROGS_x86_64 += x86_64/sev_snp_psc_test TEST_GEN_PROGS_x86_64 += access_tracking_perf_test TEST_GEN_PROGS_x86_64 += demand_paging_test TEST_GEN_PROGS_x86_64 += dirty_log_test diff --git a/tools/testing/selftests/kvm/x86_64/sev_snp_psc_test.c b/tools/testing/selftests/kvm/x86_64/sev_snp_psc_test.c new file mode 100644 index 000000000000..695abcd14792 --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/sev_snp_psc_test.c @@ -0,0 +1,378 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * SEV-SNP tests for pvalidate and page-state changes. + * + * Copyright (C) 2021 Advanced Micro Devices + */ +#define _GNU_SOURCE /* for program_invocation_short_name */ +#include <fcntl.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/ioctl.h> + +#include "test_util.h" +#include "kvm_util.h" +#include "processor.h" +#include "svm_util.h" +#include "linux/psp-sev.h" +#include "sev.h" +#include "sev_exitlib.h" + +#define VCPU_ID 0 +#define PAGE_SHIFT 12 +#define PAGE_SIZE (1UL << PAGE_SHIFT) +#define PAGE_STRIDE 64 + +/* NOTE: private/shared pages must each number at least 4 and be power of 2. 
*/ + +#define SHARED_PAGES 512 +#define SHARED_VADDR_MIN 0x1000000 + +#define PRIVATE_PAGES 512 +#define PRIVATE_VADDR_MIN (SHARED_VADDR_MIN + SHARED_PAGES * PAGE_SIZE) + +#define TOTAL_PAGES (512 + SHARED_PAGES + PRIVATE_PAGES) +#define LINEAR_MAP_GVA (PRIVATE_VADDR_MIN + PRIVATE_PAGES * PAGE_SIZE) + +struct pageTableEntry { + uint64_t present:1; + uint64_t ignored_11_01:11; + uint64_t pfn:40; + uint64_t ignored_63_52:12; +}; + +/* Globals for use by #VC handler and helpers. */ +static int page_not_validated_count; +static struct sev_sync_data *guest_sync; +static uint8_t enc_bit; + +static void fill_buf(uint8_t *buf, size_t pages, size_t stride, uint8_t val) +{ + int i, j; + + for (i = 0; i < pages; i++) + for (j = 0; j < PAGE_SIZE; j += stride) + buf[i * PAGE_SIZE + j] = val; +} + +static bool check_buf_nostop(uint8_t *buf, size_t pages, size_t stride, uint8_t val) +{ + bool matches = true; + int i, j; + + for (i = 0; i < pages; i++) + for (j = 0; j < PAGE_SIZE; j += stride) + if (buf[i * PAGE_SIZE + j] != val) + matches = false; + return matches; +} + +static bool check_buf(uint8_t *buf, size_t pages, size_t stride, uint8_t val) +{ + int i, j; + + for (i = 0; i < pages; i++) + for (j = 0; j < PAGE_SIZE; j += stride) + if (buf[i * PAGE_SIZE + j] != val) + return false; + + return true; +} + +static void vc_handler(struct ex_regs *regs) +{ + int ret; + + if (regs->error_code == SVM_EXIT_NOT_VALIDATED) { + unsigned long gva; + + page_not_validated_count++; + + asm volatile("mov %%cr2,%0" : "=r" (gva)); + ret = snp_pvalidate((void *)gva, 0, true); + SEV_GUEST_ASSERT(guest_sync, 9001, !ret); + + return; + } + + ret = sev_es_handle_vc(NULL, 0, regs); + SEV_GUEST_ASSERT(guest_sync, 20000 + regs->error_code, !ret); +} + +#define gpa_mask(gpa) (gpa & ~(1ULL << enc_bit)) +#define gfn_mask(gfn) (gfn & ~((1ULL << enc_bit) >> PAGE_SHIFT)) +#define va(gpa) ((void *)(LINEAR_MAP_GVA + (gpa & ~(1ULL << enc_bit)))) +#define gfn2va(gfn) va(gfn_mask(gfn) * PAGE_SIZE) + +static 
void set_pte_bit(void *ptr, uint8_t pos, bool enable) +{ + struct pageTableEntry *pml4e, *pdpe, *pde, *pte; + uint16_t index[4]; + uint64_t *pte_val; + uint64_t gva = (uint64_t)ptr; + + index[0] = (gva >> 12) & 0x1FFU; + index[1] = (gva >> 21) & 0x1FFU; + index[2] = (gva >> 30) & 0x1FFU; + index[3] = (gva >> 39) & 0x1FFU; + + pml4e = (struct pageTableEntry *)va(gpa_mask(get_cr3())); + SEV_GUEST_ASSERT(guest_sync, 1001, pml4e[index[3]].present); + + pdpe = (struct pageTableEntry *)gfn2va(pml4e[index[3]].pfn); + SEV_GUEST_ASSERT(guest_sync, 1002, pdpe[index[2]].present); + + pde = (struct pageTableEntry *)gfn2va(pdpe[index[2]].pfn); + SEV_GUEST_ASSERT(guest_sync, 1003, pde[index[1]].present); + + pte = (struct pageTableEntry *)gfn2va(pde[index[1]].pfn); + SEV_GUEST_ASSERT(guest_sync, 1004, pte[index[0]].present); + + pte_val = (uint64_t *)&pte[index[0]]; + if (enable) + *pte_val |= (1UL << pos); + else + *pte_val &= ~(1UL << pos); + + asm volatile("invlpg (%0)" ::"r" (gva) : "memory"); +} + +static void guest_test_psc(uint64_t shared_buf_gpa, uint8_t *shared_buf, + uint64_t private_buf_gpa, uint8_t *private_buf) +{ + bool success; + int rc, i; + + sev_guest_sync(guest_sync, 100, 0); + + /* Flip 1st half of private pages to shared and verify VMM can read them. */ + for (i = 0; i < (PRIVATE_PAGES / 2); i++) { + rc = snp_pvalidate(&private_buf[i * PAGE_SIZE], 0, false); + SEV_GUEST_ASSERT(guest_sync, 101, !rc); + snp_psc_set_shared(private_buf_gpa + i * PAGE_SIZE); + set_pte_bit(&private_buf[i * PAGE_SIZE], enc_bit, false); + } + fill_buf(private_buf, PRIVATE_PAGES / 2, PAGE_STRIDE, 0x43); + + sev_guest_sync(guest_sync, 200, 0); + + /* + * Flip 2nd half of private pages to shared and hand them to the VMM. + * + * This time leave the C-bit set, which should cause a 0x404 + * (PAGE_NOT_VALIDATED) #VC when guest later attempts to access each + * page. 
+ */ + for (i = PRIVATE_PAGES / 2; i < PRIVATE_PAGES; i++) { + rc = snp_pvalidate(&private_buf[i * PAGE_SIZE], 0, false); + if (rc) + sev_guest_abort(guest_sync, rc, 0); + snp_psc_set_shared(private_buf_gpa + i * PAGE_SIZE); + } + + sev_guest_sync(guest_sync, 300, 0); + + /* + * VMM has filled up the newly-shared pages, but C-bit is still set, so + * verify the contents still show up as encrypted, and make sure to + * access each to verify #VC records the PAGE_NOT_VALIDATED exceptions. + */ + WRITE_ONCE(page_not_validated_count, 0); + success = check_buf_nostop(&private_buf[(PRIVATE_PAGES / 2) * PAGE_SIZE], + PRIVATE_PAGES / 2, PAGE_STRIDE, 0x44); + SEV_GUEST_ASSERT(guest_sync, 301, !success); + SEV_GUEST_ASSERT(guest_sync, 302, + READ_ONCE(page_not_validated_count) == (PRIVATE_PAGES / 2)); + + /* Now flip the C-bit off and verify the VMM-provided values are intact. */ + for (i = PRIVATE_PAGES / 2; i < PRIVATE_PAGES; i++) + set_pte_bit(&private_buf[i * PAGE_SIZE], enc_bit, false); + success = check_buf(&private_buf[(PRIVATE_PAGES / 2) * PAGE_SIZE], + PRIVATE_PAGES / 2, PAGE_STRIDE, 0x44); + SEV_GUEST_ASSERT(guest_sync, 303, success); + + /* Flip the 1st half back to private pages. */ + for (i = 0; i < (PRIVATE_PAGES / 2); i++) { + snp_psc_set_private(private_buf_gpa + i * PAGE_SIZE); + set_pte_bit(&private_buf[i * PAGE_SIZE], enc_bit, true); + rc = snp_pvalidate(&private_buf[i * PAGE_SIZE], 0, true); + SEV_GUEST_ASSERT(guest_sync, 304, !rc); + } + /* Pages are private again, write over them with new encrypted data. */ + fill_buf(private_buf, PRIVATE_PAGES / 2, PAGE_STRIDE, 0x45); + + sev_guest_sync(guest_sync, 400, 0); + + /* + * Take some private pages and flip the C-bit off. Subsequent access + * should cause an RMP fault, which should lead to the VMM doing a + * PSC to shared on our behalf. 
+ */ + for (i = 0; i < (PRIVATE_PAGES / 4); i++) + set_pte_bit(&private_buf[i * PAGE_SIZE], enc_bit, false); + fill_buf(private_buf, PRIVATE_PAGES / 4, PAGE_STRIDE, 0x46); + + sev_guest_sync(guest_sync, 500, 0); + + /* Flip all even-numbered shared pages to private. */ + for (i = 0; i < SHARED_PAGES; i++) { + if ((i % 2) != 0) + continue; + + snp_psc_set_private(shared_buf_gpa + i * PAGE_SIZE); + set_pte_bit(&shared_buf[i * PAGE_SIZE], enc_bit, true); + rc = snp_pvalidate(&shared_buf[i * PAGE_SIZE], 0, true); + SEV_GUEST_ASSERT(guest_sync, 501, !rc); + } + + /* Write across the entire range and hand it back to VMM to verify. */ + fill_buf(shared_buf, SHARED_PAGES, PAGE_STRIDE, 0x47); + + sev_guest_sync(guest_sync, 600, 0); +} + +static void check_test_psc(struct kvm_vm *vm, struct sev_sync_data *sync, + uint8_t *shared_buf, uint8_t *private_buf) +{ + struct kvm_run *run = vcpu_state(vm, VCPU_ID); + bool success; + int i; + + /* Initial check-in for PSC tests. */ + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 100); + + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 200); + + /* 1st half of private buffer should be shared now, check contents. */ + success = check_buf(private_buf, PRIVATE_PAGES / 2, PAGE_STRIDE, 0x43); + TEST_ASSERT(success, "Unexpected contents in newly-shared buffer."); + + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 300); + + /* 2nd half of private buffer should be shared now, write to it. */ + fill_buf(&private_buf[(PRIVATE_PAGES / 2) * PAGE_SIZE], + PRIVATE_PAGES / 2, PAGE_STRIDE, 0x44); + + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 400); + + /* 1st half of private buffer should no longer be shared. Verify. */ + success = check_buf(private_buf, PRIVATE_PAGES / 2, PAGE_STRIDE, 0x45); + TEST_ASSERT(!success, "Unexpected contents in newly-private buffer."); + + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 500); + + /* 1st quarter of private buffer should be shared again. Verify. 
*/ + success = check_buf(private_buf, PRIVATE_PAGES / 4, PAGE_STRIDE, 0x46); + TEST_ASSERT(success, "Unexpected contents in newly-shared buffer."); + + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 600); + + /* Verify even-numbered pages in shared_buf are now private. */ + for (i = 0; i < SHARED_PAGES; i++) { + success = check_buf(&shared_buf[i * PAGE_SIZE], 1, PAGE_STRIDE, 0x47); + if ((i % 2) == 0) + TEST_ASSERT(!success, "Private buffer contains plain-text."); + else + TEST_ASSERT(success, "Shared buffer contains cipher-text."); + } +} + +static void __attribute__((__flatten__)) +guest_code(struct sev_sync_data *sync, uint64_t shared_buf_gpa, uint8_t *shared_buf, + uint64_t private_buf_gpa, uint8_t *private_buf) +{ + uint32_t eax, ebx, ecx, edx; + + /* Initial check-in. */ + guest_sync = sync; + sev_guest_sync(guest_sync, 1, 0); + + /* Get encryption bit via CPUID. */ + eax = 0x8000001f; + ecx = 0; + cpuid(&eax, &ebx, &ecx, &edx); + enc_bit = ebx & 0x3F; + + /* Do the tests. */ + guest_test_psc(shared_buf_gpa, shared_buf, private_buf_gpa, private_buf); + + sev_guest_done(guest_sync, 10000, 0); +} + +int main(int argc, char *argv[]) +{ + vm_vaddr_t shared_vaddr, private_vaddr, sync_vaddr; + uint8_t *shared_buf, *private_buf; + struct sev_sync_data *sync; + struct kvm_run *run; + struct sev_vm *sev; + struct kvm_vm *vm; + + /* Create VM and main memslot/region. */ + sev = sev_snp_vm_create(SNP_POLICY_SMT, TOTAL_PAGES); + if (!sev) + exit(KSFT_SKIP); + vm = sev_get_vm(sev); + + /* Set up VCPU and #VC handler. */ + vm_vcpu_add_default(vm, VCPU_ID, guest_code); + kvm_vm_elf_load(vm, program_invocation_name); + vm_init_descriptor_tables(vm); + vm_install_exception_handler(vm, 29, vc_handler); + vcpu_init_descriptor_tables(vm, VCPU_ID); + + /* Set up shared page for sync buffer. */ + sync_vaddr = vm_vaddr_alloc_shared(vm, PAGE_SIZE, 0); + sync = addr_gva2hva(vm, sync_vaddr); + + /* Set up additional buffer for reserved shared memory. 
*/ + shared_vaddr = vm_vaddr_alloc_shared(vm, SHARED_PAGES * PAGE_SIZE, + SHARED_VADDR_MIN); + shared_buf = addr_gva2hva(vm, shared_vaddr); + memset(shared_buf, 0, SHARED_PAGES * PAGE_SIZE); + + /* Set up additional buffer for reserved private memory. */ + private_vaddr = vm_vaddr_alloc(vm, PRIVATE_PAGES * PAGE_SIZE, + PRIVATE_VADDR_MIN); + private_buf = addr_gva2hva(vm, private_vaddr); + memset(private_buf, 0, PRIVATE_PAGES * PAGE_SIZE); + + /* + * Create a linear mapping of all guest memory. This will map all pages + * as encrypted, which is okay in this case, because the linear mapping + * will only be used to access page tables, which are always treated + * as encrypted. + */ + virt_map(vm, LINEAR_MAP_GVA, 1UL << sev_get_enc_bit(sev), TOTAL_PAGES); + + /* Set up guest params. */ + vcpu_args_set(vm, VCPU_ID, 5, sync_vaddr, + addr_gva2gpa(vm, shared_vaddr), shared_vaddr, + addr_gva2gpa(vm, private_vaddr), private_vaddr); + + /* Encrypt initial guest payload and prepare to run it. */ + sev_snp_vm_launch(sev); + + /* Initial guest check-in. */ + run = vcpu_state(vm, VCPU_ID); + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 1); + + /* Do the tests. */ + check_test_psc(vm, sync, shared_buf, private_buf); + + /* Wait for guest to finish up. */ + vcpu_run(vm, VCPU_ID); + sev_check_guest_done(run, sync, 10000); + + sev_snp_vm_free(sev); + + return 0; +}
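For readers following set_pte_bit() in the test above, the index arithmetic of a 4-level x86-64 page-table walk can be looked at in isolation. The following is a standalone sketch, not part of the patch: pt_indices() and strip_enc_bit() are hypothetical names introduced here for illustration, and the C-bit position of 51 mentioned afterwards is only an example value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): split a guest virtual address
 * into the four 9-bit table indices of a 4-level x86-64 page-table walk.
 * idx[0] selects the PTE and idx[3] the PML4 entry, which is the same
 * ordering set_pte_bit() uses for its index[] array.
 */
static void pt_indices(uint64_t gva, uint16_t idx[4])
{
	idx[0] = (gva >> 12) & 0x1FFU;	/* page table */
	idx[1] = (gva >> 21) & 0x1FFU;	/* page directory */
	idx[2] = (gva >> 30) & 0x1FFU;	/* page directory pointer table */
	idx[3] = (gva >> 39) & 0x1FFU;	/* PML4 */
}

/*
 * Hypothetical helper: clear the encryption bit from a guest physical
 * address, mirroring what the gpa_mask() macro does with enc_bit.
 */
static uint64_t strip_enc_bit(uint64_t gpa, uint8_t enc_bit)
{
	assert(enc_bit < 64);
	return gpa & ~(1ULL << enc_bit);
}
```

With SHARED_VADDR_MIN (0x1000000) as input, only the page-directory index is non-zero (8); with an example C-bit position of 51, strip_enc_bit((1ULL << 51) | 0x2000, 51) yields 0x2000.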
Add interfaces to allow tests to create/manage SEV guests. The additional state associated with these guests is encapsulated in a new struct sev_vm, which is a light wrapper around struct kvm_vm. These VMs will use vm_set_memory_encryption() and vm_get_encrypted_phy_pages() under the covers to configure and sync up with the core kvm_util library on what should/shouldn't be treated as encrypted memory.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/sev.h        |  62 ++++
 tools/testing/selftests/kvm/lib/x86_64/sev.c  | 303 ++++++++++++++++++
 3 files changed, 366 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/sev.h
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/sev.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 5832f510a16c..c7a5e1c69e0c 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -35,6 +35,7 @@ endif

 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
 LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
+LIBKVM_x86_64 += lib/x86_64/sev.c
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
diff --git a/tools/testing/selftests/kvm/include/x86_64/sev.h b/tools/testing/selftests/kvm/include/x86_64/sev.h new file mode 100644 index 000000000000..d2f41b131ecc --- /dev/null +++ b/tools/testing/selftests/kvm/include/x86_64/sev.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Helpers used for SEV guests + * + * Copyright (C) 2021 Advanced Micro Devices + */ +#ifndef SELFTEST_KVM_SEV_H +#define SELFTEST_KVM_SEV_H + +#include <stdint.h> +#include <stdbool.h> +#include "kvm_util.h" + +#define SEV_DEV_PATH "/dev/sev" +#define SEV_FW_REQ_VER_MAJOR 1 +#define SEV_FW_REQ_VER_MINOR 30 + +#define SEV_POLICY_NO_DBG (1UL << 0) +#define SEV_POLICY_ES (1UL << 2) + +#define SEV_GUEST_ASSERT(sync, token, _cond) do { \ + if (!(_cond)) \ + sev_guest_abort(sync, token, 0); \ +} while (0) + +enum { + SEV_GSTATE_UNINIT = 0, + SEV_GSTATE_LUPDATE, + SEV_GSTATE_LSECRET, + SEV_GSTATE_RUNNING, +}; + +struct sev_sync_data { + uint32_t token; + bool pending; + bool done; + bool aborted; + uint64_t info; +}; + +struct sev_vm; + +void sev_guest_sync(struct sev_sync_data *sync, uint32_t token, uint64_t info); +void sev_guest_done(struct sev_sync_data *sync, uint32_t token, uint64_t info); +void sev_guest_abort(struct sev_sync_data *sync, uint32_t token, uint64_t info); + +void sev_check_guest_sync(struct kvm_run *run, struct sev_sync_data *sync, + uint32_t token); +void sev_check_guest_done(struct kvm_run *run, struct sev_sync_data *sync, + uint32_t token); + +void kvm_sev_ioctl(struct sev_vm *sev, int cmd, void *data); +struct kvm_vm *sev_get_vm(struct sev_vm *sev); +uint8_t sev_get_enc_bit(struct sev_vm *sev); + +struct sev_vm *sev_vm_create(uint32_t policy, uint64_t npages); +void sev_vm_free(struct sev_vm *sev); +void sev_vm_launch(struct sev_vm *sev); +void sev_vm_measure(struct sev_vm *sev, uint8_t *measurement); +void sev_vm_launch_finish(struct sev_vm *sev); + +#endif /* SELFTEST_KVM_SEV_H */ diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c 
b/tools/testing/selftests/kvm/lib/x86_64/sev.c new file mode 100644 index 000000000000..adda3b396566 --- /dev/null +++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c @@ -0,0 +1,303 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Helpers used for SEV guests + * + * Copyright (C) 2021 Advanced Micro Devices + */ + +#include <stdint.h> +#include <stdbool.h> +#include "kvm_util.h" +#include "linux/psp-sev.h" +#include "processor.h" +#include "sev.h" + +#define PAGE_SHIFT 12 +#define PAGE_SIZE (1UL << PAGE_SHIFT) + +struct sev_vm { + struct kvm_vm *vm; + int fd; + int enc_bit; + uint32_t sev_policy; +}; + +/* Helpers for coordinating between guests and test harness. */ + +void sev_guest_sync(struct sev_sync_data *sync, uint32_t token, uint64_t info) +{ + sync->token = token; + sync->info = info; + sync->pending = true; + + asm volatile("hlt" : : : "memory"); +} + +void sev_guest_done(struct sev_sync_data *sync, uint32_t token, uint64_t info) +{ + while (true) { + sync->done = true; + sev_guest_sync(sync, token, info); + } +} + +void sev_guest_abort(struct sev_sync_data *sync, uint32_t token, uint64_t info) +{ + while (true) { + sync->aborted = true; + sev_guest_sync(sync, token, info); + } +} + +void sev_check_guest_sync(struct kvm_run *run, struct sev_sync_data *sync, + uint32_t token) +{ + TEST_ASSERT(run->exit_reason == KVM_EXIT_HLT, + "unexpected exit reason: %u (%s)", + run->exit_reason, exit_reason_str(run->exit_reason)); + TEST_ASSERT(sync->token == token, + "unexpected guest token, expected %d, got: %d", token, + sync->token); + TEST_ASSERT(!sync->done, "unexpected guest state"); + TEST_ASSERT(!sync->aborted, "unexpected guest state"); + sync->pending = false; +} + +void sev_check_guest_done(struct kvm_run *run, struct sev_sync_data *sync, + uint32_t token) +{ + TEST_ASSERT(run->exit_reason == KVM_EXIT_HLT, + "unexpected exit reason: %u (%s)", + run->exit_reason, exit_reason_str(run->exit_reason)); + TEST_ASSERT(sync->token == token, + "unexpected guest 
token, expected %d, got: %d", token, + sync->token); + TEST_ASSERT(sync->done, "unexpected guest state"); + TEST_ASSERT(!sync->aborted, "unexpected guest state"); + sync->pending = false; +} + +/* Common SEV helpers/accessors. */ + +struct kvm_vm *sev_get_vm(struct sev_vm *sev) +{ + return sev->vm; +} + +uint8_t sev_get_enc_bit(struct sev_vm *sev) +{ + return sev->enc_bit; +} + +void sev_ioctl(int sev_fd, int cmd, void *data) +{ + int ret; + struct sev_issue_cmd arg; + + arg.cmd = cmd; + arg.data = (unsigned long)data; + ret = ioctl(sev_fd, SEV_ISSUE_CMD, &arg); + TEST_ASSERT(ret == 0, + "SEV ioctl %d failed, error: %d, fw_error: %d", + cmd, ret, arg.error); +} + +void kvm_sev_ioctl(struct sev_vm *sev, int cmd, void *data) +{ + struct kvm_sev_cmd arg = {0}; + int ret; + + arg.id = cmd; + arg.sev_fd = sev->fd; + arg.data = (__u64)data; + + ret = ioctl(vm_get_fd(sev->vm), KVM_MEMORY_ENCRYPT_OP, &arg); + TEST_ASSERT(ret == 0, + "SEV KVM ioctl %d failed, rc: %i errno: %i (%s), fw_error: %d", + cmd, ret, errno, strerror(errno), arg.error); +} + +/* Local helpers. 
*/ + +static void +sev_register_user_range(struct sev_vm *sev, void *hva, uint64_t size) +{ + struct kvm_enc_region range = {0}; + int ret; + + pr_debug("register_user_range: hva: %p, size: %lu\n", hva, size); + + range.addr = (__u64)hva; + range.size = size; + + ret = ioctl(vm_get_fd(sev->vm), KVM_MEMORY_ENCRYPT_REG_REGION, &range); + TEST_ASSERT(ret == 0, "failed to register user range, errno: %i\n", errno); +} + +static void +sev_encrypt_phy_range(struct sev_vm *sev, vm_paddr_t gpa, uint64_t size) +{ + struct kvm_sev_launch_update_data ksev_update_data = {0}; + + pr_debug("encrypt_phy_range: addr: 0x%lx, size: %lu\n", gpa, size); + + ksev_update_data.uaddr = (__u64)addr_gpa2hva(sev->vm, gpa); + ksev_update_data.len = size; + + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_UPDATE_DATA, &ksev_update_data); +} + +static void sev_encrypt(struct sev_vm *sev) +{ + struct sparsebit *enc_phy_pages; + struct kvm_vm *vm = sev->vm; + sparsebit_idx_t pg = 0; + vm_paddr_t gpa_start; + uint64_t memory_size; + + /* Only memslot 0 supported for now. */ + enc_phy_pages = vm_get_encrypted_phy_pages(sev->vm, 0, &gpa_start, &memory_size); + TEST_ASSERT(enc_phy_pages, "Unable to retrieve encrypted pages bitmap"); + while (pg < (memory_size / vm_get_page_size(vm))) { + sparsebit_idx_t pg_cnt; + + if (sparsebit_is_clear(enc_phy_pages, pg)) { + pg = sparsebit_next_set(enc_phy_pages, pg); + if (!pg) + break; + } + + pg_cnt = sparsebit_next_clear(enc_phy_pages, pg) - pg; + if (pg_cnt <= 0) + pg_cnt = 1; + + sev_encrypt_phy_range(sev, + gpa_start + pg * vm_get_page_size(vm), + pg_cnt * vm_get_page_size(vm)); + pg += pg_cnt; + } + + sparsebit_free(&enc_phy_pages); +} + +/* SEV VM implementation. 
*/ + +static struct sev_vm *sev_common_create(struct kvm_vm *vm) +{ + struct sev_user_data_status sev_status = {0}; + uint32_t eax, ebx, ecx, edx; + struct sev_vm *sev; + int sev_fd; + + sev_fd = open(SEV_DEV_PATH, O_RDWR); + if (sev_fd < 0) { + pr_info("Failed to open SEV device, path: %s, error: %d, skipping test.\n", + SEV_DEV_PATH, sev_fd); + return NULL; + } + + sev_ioctl(sev_fd, SEV_PLATFORM_STATUS, &sev_status); + + if (!(sev_status.api_major > SEV_FW_REQ_VER_MAJOR || + (sev_status.api_major == SEV_FW_REQ_VER_MAJOR && + sev_status.api_minor >= SEV_FW_REQ_VER_MINOR))) { + pr_info("SEV FW version too old. Have API %d.%d (build: %d), need %d.%d, skipping test.\n", + sev_status.api_major, sev_status.api_minor, sev_status.build, + SEV_FW_REQ_VER_MAJOR, SEV_FW_REQ_VER_MINOR); + return NULL; + } + + sev = calloc(1, sizeof(*sev)); + sev->fd = sev_fd; + sev->vm = vm; + + /* Get encryption bit via CPUID. */ + eax = 0x8000001f; + ecx = 0; + cpuid(&eax, &ebx, &ecx, &edx); + sev->enc_bit = ebx & 0x3F; + + return sev; +} + +static void sev_common_free(struct sev_vm *sev) +{ + close(sev->fd); + free(sev); +} + +struct sev_vm *sev_vm_create(uint32_t policy, uint64_t npages) +{ + struct sev_vm *sev; + struct kvm_vm *vm; + + /* Need to handle memslots after init, and after setting memcrypt. 
*/ + vm = vm_create(VM_MODE_DEFAULT, 0, O_RDWR); + sev = sev_common_create(vm); + if (!sev) + return NULL; + sev->sev_policy = policy; + + kvm_sev_ioctl(sev, KVM_SEV_INIT, NULL); + + vm_set_memory_encryption(vm, true, true, sev->enc_bit); + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, 0, 0, npages, 0); + sev_register_user_range(sev, addr_gpa2hva(vm, 0), npages * vm_get_page_size(vm)); + + pr_info("SEV guest created, policy: 0x%x, size: %lu KB\n", + sev->sev_policy, npages * vm_get_page_size(vm) / 1024); + + return sev; +} + +void sev_vm_free(struct sev_vm *sev) +{ + kvm_vm_free(sev->vm); + sev_common_free(sev); +} + +void sev_vm_launch(struct sev_vm *sev) +{ + struct kvm_sev_launch_start ksev_launch_start = {0}; + struct kvm_sev_guest_status ksev_status = {0}; + + ksev_launch_start.policy = sev->sev_policy; + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_START, &ksev_launch_start); + kvm_sev_ioctl(sev, KVM_SEV_GUEST_STATUS, &ksev_status); + TEST_ASSERT(ksev_status.policy == sev->sev_policy, "Incorrect guest policy."); + TEST_ASSERT(ksev_status.state == SEV_GSTATE_LUPDATE, + "Unexpected guest state: %d", ksev_status.state); + + sev_encrypt(sev); +} + +void sev_vm_measure(struct sev_vm *sev, uint8_t *measurement) +{ + struct kvm_sev_launch_measure ksev_launch_measure = {0}; + struct kvm_sev_guest_status ksev_guest_status = {0}; + + ksev_launch_measure.len = 256; + ksev_launch_measure.uaddr = (__u64)measurement; + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_MEASURE, &ksev_launch_measure); + + /* Measurement causes a state transition, check that. 
*/ + kvm_sev_ioctl(sev, KVM_SEV_GUEST_STATUS, &ksev_guest_status); + TEST_ASSERT(ksev_guest_status.state == SEV_GSTATE_LSECRET, + "Unexpected guest state: %d", ksev_guest_status.state); +} + +void sev_vm_launch_finish(struct sev_vm *sev) +{ + struct kvm_sev_guest_status ksev_status = {0}; + + kvm_sev_ioctl(sev, KVM_SEV_GUEST_STATUS, &ksev_status); + TEST_ASSERT(ksev_status.state == SEV_GSTATE_LUPDATE || + ksev_status.state == SEV_GSTATE_LSECRET, + "Unexpected guest state: %d", ksev_status.state); + + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_FINISH, NULL); + + kvm_sev_ioctl(sev, KVM_SEV_GUEST_STATUS, &ksev_status); + TEST_ASSERT(ksev_status.state == SEV_GSTATE_RUNNING, + "Unexpected guest state: %d", ksev_status.state); +}
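The state checks in sev_vm_launch(), sev_vm_measure() and sev_vm_launch_finish() above imply a small launch state machine. The sketch below models only what those TEST_ASSERTs expect, under the assumption that each command either advances the state or is rejected; it is not a description of the firmware's full state machine, and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Values follow the SEV_GSTATE_* enum declared in sev.h above. */
enum toy_gstate { TOY_UNINIT = 0, TOY_LUPDATE, TOY_LSECRET, TOY_RUNNING };

/* Hypothetical stand-ins for the three launch commands used by the helpers. */
enum toy_cmd { TOY_LAUNCH_START, TOY_LAUNCH_MEASURE, TOY_LAUNCH_FINISH };

/*
 * Advance *state if cmd is acceptable in the current state and return true;
 * otherwise leave *state untouched and return false.
 */
static bool toy_sev_step(enum toy_gstate *state, enum toy_cmd cmd)
{
	assert(state);

	switch (cmd) {
	case TOY_LAUNCH_START:
		/* sev_vm_launch() expects LUPDATE right after LAUNCH_START. */
		if (*state != TOY_UNINIT)
			return false;
		*state = TOY_LUPDATE;
		return true;
	case TOY_LAUNCH_MEASURE:
		/* sev_vm_measure() expects LSECRET after measuring. */
		if (*state != TOY_LUPDATE)
			return false;
		*state = TOY_LSECRET;
		return true;
	case TOY_LAUNCH_FINISH:
		/* sev_vm_launch_finish() accepts LUPDATE or LSECRET on entry. */
		if (*state != TOY_LUPDATE && *state != TOY_LSECRET)
			return false;
		*state = TOY_RUNNING;
		return true;
	}
	return false;
}
```

A test that skips the measurement step would go straight from TOY_LUPDATE to TOY_RUNNING, which matches the two entry states that sev_vm_launch_finish() tolerates.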
On Wed, Oct 06, 2021 at 03:28:05PM -0500, Michael Roth wrote:
Please ignore this patch, it is a dupe of 06/16. Some patches got caught in our spam filter and I messed up the numbering when resending this. Hopefully now all the patches are present.
Recent kernels have checks to ensure the GPA values in special-purpose registers like CR3 are within the maximum physical address range and don't overlap with anything in the upper/reserved range. In the case of SEV kselftest guests booting directly into 64-bit mode, CR3 needs to be initialized to the GPA of the page table root, with the encryption bit set. The kernel accounts for this encryption bit by removing it from the reserved bit range when the guest advertises the bit position via KVM_SET_CPUID*, but kselftests currently call KVM_SET_SREGS as part of vm_vcpu_add_default(), *prior* to vCPU creation, so there's no opportunity to call KVM_SET_CPUID* in advance. As a result, KVM_SET_SREGS will return an error in these cases.
Address this by moving vcpu_set_cpuid() (which calls KVM_SET_CPUID*) ahead of vcpu_setup() (which calls KVM_SET_SREGS).
While there, address a typo in the assertion that triggers when KVM_SET_SREGS fails.
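The ordering problem described above can be illustrated with a toy model. This is a sketch under stated assumptions, not KVM's actual implementation: the toy_* names are hypothetical, and the address width of 48 and C-bit position of 51 used in the note below are example values only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-vCPU state that matters for the CR3 check. */
struct toy_vcpu {
	uint8_t maxphyaddr;	/* physical address width seen by the guest */
	bool enc_bit_known;	/* true once "CPUID" has advertised the C-bit */
	uint8_t enc_bit;	/* C-bit position, valid if enc_bit_known */
};

/* Stand-in for KVM_SET_CPUID*: records the advertised C-bit position. */
static void toy_set_cpuid(struct toy_vcpu *v, uint8_t enc_bit)
{
	assert(enc_bit < 64);
	v->enc_bit = enc_bit;
	v->enc_bit_known = true;
}

/* Bits above maxphyaddr are reserved, except a C-bit that was advertised. */
static uint64_t toy_reserved_gpa_bits(const struct toy_vcpu *v)
{
	uint64_t rsvd;

	assert(v->maxphyaddr > 0 && v->maxphyaddr < 64);
	rsvd = ~((1ULL << v->maxphyaddr) - 1);
	if (v->enc_bit_known)
		rsvd &= ~(1ULL << v->enc_bit);
	return rsvd;
}

/* Stand-in for KVM_SET_SREGS: fails if CR3 touches a reserved bit. */
static int toy_set_sregs(const struct toy_vcpu *v, uint64_t cr3)
{
	return (cr3 & toy_reserved_gpa_bits(v)) ? -1 : 0;
}
```

With maxphyaddr = 48 and a CR3 of 0x1000 | (1ULL << 51), toy_set_sregs() fails until toy_set_cpuid(&v, 51) has been called, which is the reason for moving vcpu_set_cpuid() ahead of vcpu_setup().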
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c         | 2 +-
 tools/testing/selftests/kvm/lib/x86_64/processor.c | 4 +---
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index ef88fdc7e46b..646cffd86d09 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1906,7 +1906,7 @@ void vcpu_sregs_get(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_sregs *sregs)
 void vcpu_sregs_set(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_sregs *sregs)
 {
 	int ret = _vcpu_sregs_set(vm, vcpuid, sregs);
-	TEST_ASSERT(ret == 0, "KVM_RUN IOCTL failed, "
+	TEST_ASSERT(ret == 0, "KVM_SET_SREGS IOCTL failed, "
 		"rc: %i errno: %i", ret, errno);
 }

diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 0bbd88fe1127..1ab4c20f5d12 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -660,6 +660,7 @@ void vm_vcpu_add_default(struct kvm_vm *vm, uint32_t vcpuid, void *guest_code)

 	/* Create VCPU */
 	vm_vcpu_add(vm, vcpuid);
+	vcpu_set_cpuid(vm, vcpuid, kvm_get_supported_cpuid());
 	vcpu_setup(vm, vcpuid);

 	/* Setup guest general purpose registers */
@@ -672,9 +673,6 @@ void vm_vcpu_add_default(struct kvm_vm *vm, uint32_t vcpuid, void *guest_code)
 	/* Setup the MP state */
 	mp_state.mp_state = 0;
 	vcpu_set_mp_state(vm, vcpuid, &mp_state);
-
-	/* Setup supported CPUIDs */
-	vcpu_set_cpuid(vm, vcpuid, kvm_get_supported_cpuid());
 }

 /*
On Wed, Oct 6, 2021 at 1:39 PM Michael Roth <michael.roth@amd.com> wrote:
Good catch.

Reviewed-by: Nathan Tempelman <natet@google.com>
On 10/6/21 1:36 PM, Michael Roth wrote:
In the current code, I see that KVM_SET_SREGS is called by vcpu_setup() which is called after vm_vcpu_add() that calls KVM_CREATE_VCPU. Since you mentioned "prior", I wanted to check if the wording was wrong or if I missed something.
On Tue, Oct 12, 2021 at 06:45:09PM -0700, Krish Sadhukhan wrote:
Ah, yes, just poorly worded. What I meant to convey is that, from the perspective of the test program, the vm_vcpu_add* call that creates the vcpu also does the KVM_SET_SREGS, so there's no way to call KVM_SET_CPUID in advance other than to have vm_vcpu_add* do it as part of creating the vcpu. I'll get the wording fixed up on that.
The default policy for whether to handle allocations as encrypted or shared pages is currently determined by vm_phy_pages_alloc(), which in turn uses the policy defined by vm->memcrypt.enc_by_default.
Test programs may wish to allocate shared vaddrs for things like sharing memory with the guest. Since enc_by_default will be true in the case of SEV guests (since it's required in order to have the initial ELF binary and page table become part of the initial guest payload), an interface is needed to explicitly request shared pages.
Implement this by splitting the common code out from vm_vaddr_alloc() and introducing a new vm_vaddr_alloc_shared().
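The split described above, one common helper taking an explicit encrypt flag plus two thin wrappers, can be sketched with a toy allocator. This is an illustration only, not the kvm_util code: struct toy_vm and the toy_alloc*() functions are hypothetical, and the fixed 16-page pool is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 16	/* arbitrary pool size for the example */

/* Hypothetical stand-in for the VM state relevant to allocation policy. */
struct toy_vm {
	bool enc_by_default;		/* policy used by the default allocator */
	bool encrypted[TOY_NPAGES];	/* per-page "treat as encrypted" flag */
	size_t next_free;		/* simple bump allocator cursor */
};

/* Common path: the caller states explicitly whether pages are encrypted. */
static size_t toy_alloc_common(struct toy_vm *vm, size_t pages, bool encrypt)
{
	size_t first = vm->next_free;

	assert(pages > 0 && first + pages <= TOY_NPAGES);
	for (size_t i = 0; i < pages; i++)
		vm->encrypted[first + i] = encrypt;
	vm->next_free += pages;
	return first;
}

/* Default allocator: follows the VM-wide policy. */
static size_t toy_alloc(struct toy_vm *vm, size_t pages)
{
	return toy_alloc_common(vm, pages, vm->enc_by_default);
}

/* Shared allocator: always unencrypted, regardless of the VM-wide policy. */
static size_t toy_alloc_shared(struct toy_vm *vm, size_t pages)
{
	return toy_alloc_common(vm, pages, false);
}
```

Keeping the policy decision in the wrappers means existing callers of the default allocator are unaffected, while a test that needs a buffer visible to the host can ask for shared pages explicitly.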
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../testing/selftests/kvm/include/kvm_util.h  |  1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 23 ++++++++++++++-----
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 4bf686d664cc..d96e89ee4f40 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -143,6 +143,7 @@ void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
 void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot);
 void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid);
 vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
+vm_vaddr_t vm_vaddr_alloc_shared(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min);
 vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages);
 vm_vaddr_t vm_vaddr_alloc_page(struct kvm_vm *vm);

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 646cffd86d09..f6df50012c8d 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1325,14 +1325,13 @@ static vm_vaddr_t vm_vaddr_unused_gap(struct kvm_vm *vm, size_t sz,
 }

 /*
- * VM Virtual Address Allocate
+ * VM Virtual Address Allocate Shared/Encrypted
  *
  * Input Args:
  *   vm - Virtual Machine
  *   sz - Size in bytes
  *   vaddr_min - Minimum starting virtual address
- *   data_memslot - Memory region slot for data pages
- *   pgd_memslot - Memory region slot for new virtual translation tables
+ *   encrypt - Whether the region should be handled as encrypted
  *
  * Output Args: None
  *
@@ -1345,13 +1344,15 @@ static vm_vaddr_t vm_vaddr_unused_gap(struct kvm_vm *vm, size_t sz,
  * a unique set of pages, with the minimum real allocation being at least
  * a page.
  */
-vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min)
+static vm_vaddr_t
+_vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min, bool encrypt)
 {
 	uint64_t pages = (sz >> vm->page_shift) + ((sz % vm->page_size) != 0);

 	virt_pgd_alloc(vm);
-	vm_paddr_t paddr = vm_phy_pages_alloc(vm, pages,
-					      KVM_UTIL_MIN_PFN * vm->page_size, 0);
+	vm_paddr_t paddr = _vm_phy_pages_alloc(vm, pages,
+					       KVM_UTIL_MIN_PFN * vm->page_size,
+					       0, encrypt);

 	/*
 	 * Find an unused range of virtual page addresses of at least
@@ -1372,6 +1373,16 @@ vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min)
 	return vaddr_start;
 }

+vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min)
+{
+	return _vm_vaddr_alloc(vm, sz, vaddr_min, vm->memcrypt.enc_by_default);
+}
+
+vm_vaddr_t vm_vaddr_alloc_shared(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min)
+{
+	return _vm_vaddr_alloc(vm, sz, vaddr_min, false);
+}
+
 /*
  * VM Virtual Address Allocate Pages
  *
Add interfaces to allow tests to create/manage SEV guests. The additional state associated with these guests is encapsulated in a new struct sev_vm, which is a light wrapper around struct kvm_vm. These VMs will use vm_set_memory_encryption() and vm_get_encrypted_phy_pages() under the covers to configure and sync up with the core kvm_util library on what should/shouldn't be treated as encrypted memory.
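One way to picture how a library like this could consume the encrypted-pages bitmap is as a walk over contiguous runs of set bits, issuing one encrypt operation per run. The sketch below uses a plain bool array in place of the sparsebit returned by vm_get_encrypted_phy_pages(); the toy_* names and the callback signature are assumptions made for illustration, not the library's interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback invoked once per contiguous run of encrypted pages. */
typedef void (*toy_range_fn)(size_t first_page, size_t npages, void *ctx);

/*
 * Walk enc[0..npages) and call fn for every maximal run of true entries.
 * Returns the number of runs found.
 */
static size_t toy_for_each_encrypted_run(const bool *enc, size_t npages,
					 toy_range_fn fn, void *ctx)
{
	size_t pg = 0, runs = 0;

	assert(enc && fn);
	while (pg < npages) {
		size_t start;

		/* Skip pages that are not marked as encrypted. */
		while (pg < npages && !enc[pg])
			pg++;
		if (pg == npages)
			break;

		/* Extend the run until the next clear entry or the end. */
		start = pg;
		while (pg < npages && enc[pg])
			pg++;

		fn(start, pg - start, ctx);
		runs++;
	}
	return runs;
}

/* Example callback: accumulate the total number of pages covered. */
static void toy_count_pages(size_t first_page, size_t npages, void *ctx)
{
	(void)first_page;
	*(size_t *)ctx += npages;
}
```

Batching by run keeps the number of per-range calls proportional to the number of encrypted regions rather than the number of pages.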
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/sev.h        |  62 ++++
 tools/testing/selftests/kvm/lib/x86_64/sev.c  | 303 ++++++++++++++++++
 3 files changed, 366 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/sev.h
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/sev.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 5832f510a16c..c7a5e1c69e0c 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -35,6 +35,7 @@ endif

 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
 LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
+LIBKVM_x86_64 += lib/x86_64/sev.c
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
diff --git a/tools/testing/selftests/kvm/include/x86_64/sev.h b/tools/testing/selftests/kvm/include/x86_64/sev.h new file mode 100644 index 000000000000..d2f41b131ecc --- /dev/null +++ b/tools/testing/selftests/kvm/include/x86_64/sev.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Helpers used for SEV guests + * + * Copyright (C) 2021 Advanced Micro Devices + */ +#ifndef SELFTEST_KVM_SEV_H +#define SELFTEST_KVM_SEV_H + +#include <stdint.h> +#include <stdbool.h> +#include "kvm_util.h" + +#define SEV_DEV_PATH "/dev/sev" +#define SEV_FW_REQ_VER_MAJOR 1 +#define SEV_FW_REQ_VER_MINOR 30 + +#define SEV_POLICY_NO_DBG (1UL << 0) +#define SEV_POLICY_ES (1UL << 2) + +#define SEV_GUEST_ASSERT(sync, token, _cond) do { \ + if (!(_cond)) \ + sev_guest_abort(sync, token, 0); \ +} while (0) + +enum { + SEV_GSTATE_UNINIT = 0, + SEV_GSTATE_LUPDATE, + SEV_GSTATE_LSECRET, + SEV_GSTATE_RUNNING, +}; + +struct sev_sync_data { + uint32_t token; + bool pending; + bool done; + bool aborted; + uint64_t info; +}; + +struct sev_vm; + +void sev_guest_sync(struct sev_sync_data *sync, uint32_t token, uint64_t info); +void sev_guest_done(struct sev_sync_data *sync, uint32_t token, uint64_t info); +void sev_guest_abort(struct sev_sync_data *sync, uint32_t token, uint64_t info); + +void sev_check_guest_sync(struct kvm_run *run, struct sev_sync_data *sync, + uint32_t token); +void sev_check_guest_done(struct kvm_run *run, struct sev_sync_data *sync, + uint32_t token); + +void kvm_sev_ioctl(struct sev_vm *sev, int cmd, void *data); +struct kvm_vm *sev_get_vm(struct sev_vm *sev); +uint8_t sev_get_enc_bit(struct sev_vm *sev); + +struct sev_vm *sev_vm_create(uint32_t policy, uint64_t npages); +void sev_vm_free(struct sev_vm *sev); +void sev_vm_launch(struct sev_vm *sev); +void sev_vm_measure(struct sev_vm *sev, uint8_t *measurement); +void sev_vm_launch_finish(struct sev_vm *sev); + +#endif /* SELFTEST_KVM_SEV_H */ diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c 
b/tools/testing/selftests/kvm/lib/x86_64/sev.c new file mode 100644 index 000000000000..adda3b396566 --- /dev/null +++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c @@ -0,0 +1,303 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Helpers used for SEV guests + * + * Copyright (C) 2021 Advanced Micro Devices + */ + +#include <stdint.h> +#include <stdbool.h> +#include "kvm_util.h" +#include "linux/psp-sev.h" +#include "processor.h" +#include "sev.h" + +#define PAGE_SHIFT 12 +#define PAGE_SIZE (1UL << PAGE_SHIFT) + +struct sev_vm { + struct kvm_vm *vm; + int fd; + int enc_bit; + uint32_t sev_policy; +}; + +/* Helpers for coordinating between guests and test harness. */ + +void sev_guest_sync(struct sev_sync_data *sync, uint32_t token, uint64_t info) +{ + sync->token = token; + sync->info = info; + sync->pending = true; + + asm volatile("hlt" : : : "memory"); +} + +void sev_guest_done(struct sev_sync_data *sync, uint32_t token, uint64_t info) +{ + while (true) { + sync->done = true; + sev_guest_sync(sync, token, info); + } +} + +void sev_guest_abort(struct sev_sync_data *sync, uint32_t token, uint64_t info) +{ + while (true) { + sync->aborted = true; + sev_guest_sync(sync, token, info); + } +} + +void sev_check_guest_sync(struct kvm_run *run, struct sev_sync_data *sync, + uint32_t token) +{ + TEST_ASSERT(run->exit_reason == KVM_EXIT_HLT, + "unexpected exit reason: %u (%s)", + run->exit_reason, exit_reason_str(run->exit_reason)); + TEST_ASSERT(sync->token == token, + "unexpected guest token, expected %d, got: %d", token, + sync->token); + TEST_ASSERT(!sync->done, "unexpected guest state"); + TEST_ASSERT(!sync->aborted, "unexpected guest state"); + sync->pending = false; +} + +void sev_check_guest_done(struct kvm_run *run, struct sev_sync_data *sync, + uint32_t token) +{ + TEST_ASSERT(run->exit_reason == KVM_EXIT_HLT, + "unexpected exit reason: %u (%s)", + run->exit_reason, exit_reason_str(run->exit_reason)); + TEST_ASSERT(sync->token == token, + "unexpected guest 
token, expected %d, got: %d", token, + sync->token); + TEST_ASSERT(sync->done, "unexpected guest state"); + TEST_ASSERT(!sync->aborted, "unexpected guest state"); + sync->pending = false; +} + +/* Common SEV helpers/accessors. */ + +struct kvm_vm *sev_get_vm(struct sev_vm *sev) +{ + return sev->vm; +} + +uint8_t sev_get_enc_bit(struct sev_vm *sev) +{ + return sev->enc_bit; +} + +void sev_ioctl(int sev_fd, int cmd, void *data) +{ + int ret; + struct sev_issue_cmd arg; + + arg.cmd = cmd; + arg.data = (unsigned long)data; + ret = ioctl(sev_fd, SEV_ISSUE_CMD, &arg); + TEST_ASSERT(ret == 0, + "SEV ioctl %d failed, error: %d, fw_error: %d", + cmd, ret, arg.error); +} + +void kvm_sev_ioctl(struct sev_vm *sev, int cmd, void *data) +{ + struct kvm_sev_cmd arg = {0}; + int ret; + + arg.id = cmd; + arg.sev_fd = sev->fd; + arg.data = (__u64)data; + + ret = ioctl(vm_get_fd(sev->vm), KVM_MEMORY_ENCRYPT_OP, &arg); + TEST_ASSERT(ret == 0, + "SEV KVM ioctl %d failed, rc: %i errno: %i (%s), fw_error: %d", + cmd, ret, errno, strerror(errno), arg.error); +} + +/* Local helpers. 
*/ + +static void +sev_register_user_range(struct sev_vm *sev, void *hva, uint64_t size) +{ + struct kvm_enc_region range = {0}; + int ret; + + pr_debug("register_user_range: hva: %p, size: %lu\n", hva, size); + + range.addr = (__u64)hva; + range.size = size; + + ret = ioctl(vm_get_fd(sev->vm), KVM_MEMORY_ENCRYPT_REG_REGION, &range); + TEST_ASSERT(ret == 0, "failed to register user range, errno: %i\n", errno); +} + +static void +sev_encrypt_phy_range(struct sev_vm *sev, vm_paddr_t gpa, uint64_t size) +{ + struct kvm_sev_launch_update_data ksev_update_data = {0}; + + pr_debug("encrypt_phy_range: addr: 0x%lx, size: %lu\n", gpa, size); + + ksev_update_data.uaddr = (__u64)addr_gpa2hva(sev->vm, gpa); + ksev_update_data.len = size; + + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_UPDATE_DATA, &ksev_update_data); +} + +static void sev_encrypt(struct sev_vm *sev) +{ + struct sparsebit *enc_phy_pages; + struct kvm_vm *vm = sev->vm; + sparsebit_idx_t pg = 0; + vm_paddr_t gpa_start; + uint64_t memory_size; + + /* Only memslot 0 supported for now. */ + enc_phy_pages = vm_get_encrypted_phy_pages(sev->vm, 0, &gpa_start, &memory_size); + TEST_ASSERT(enc_phy_pages, "Unable to retrieve encrypted pages bitmap"); + while (pg < (memory_size / vm_get_page_size(vm))) { + sparsebit_idx_t pg_cnt; + + if (sparsebit_is_clear(enc_phy_pages, pg)) { + pg = sparsebit_next_set(enc_phy_pages, pg); + if (!pg) + break; + } + + pg_cnt = sparsebit_next_clear(enc_phy_pages, pg) - pg; + if (pg_cnt <= 0) + pg_cnt = 1; + + sev_encrypt_phy_range(sev, + gpa_start + pg * vm_get_page_size(vm), + pg_cnt * vm_get_page_size(vm)); + pg += pg_cnt; + } + + sparsebit_free(&enc_phy_pages); +} + +/* SEV VM implementation. 
*/ + +static struct sev_vm *sev_common_create(struct kvm_vm *vm) +{ + struct sev_user_data_status sev_status = {0}; + uint32_t eax, ebx, ecx, edx; + struct sev_vm *sev; + int sev_fd; + + sev_fd = open(SEV_DEV_PATH, O_RDWR); + if (sev_fd < 0) { + pr_info("Failed to open SEV device, path: %s, error: %d, skipping test.\n", + SEV_DEV_PATH, sev_fd); + return NULL; + } + + sev_ioctl(sev_fd, SEV_PLATFORM_STATUS, &sev_status); + + if (!(sev_status.api_major > SEV_FW_REQ_VER_MAJOR || + (sev_status.api_major == SEV_FW_REQ_VER_MAJOR && + sev_status.api_minor >= SEV_FW_REQ_VER_MINOR))) { + pr_info("SEV FW version too old. Have API %d.%d (build: %d), need %d.%d, skipping test.\n", + sev_status.api_major, sev_status.api_minor, sev_status.build, + SEV_FW_REQ_VER_MAJOR, SEV_FW_REQ_VER_MINOR); + return NULL; + } + + sev = calloc(1, sizeof(*sev)); + sev->fd = sev_fd; + sev->vm = vm; + + /* Get encryption bit via CPUID. */ + eax = 0x8000001f; + ecx = 0; + cpuid(&eax, &ebx, &ecx, &edx); + sev->enc_bit = ebx & 0x3F; + + return sev; +} + +static void sev_common_free(struct sev_vm *sev) +{ + close(sev->fd); + free(sev); +} + +struct sev_vm *sev_vm_create(uint32_t policy, uint64_t npages) +{ + struct sev_vm *sev; + struct kvm_vm *vm; + + /* Need to handle memslots after init, and after setting memcrypt. 
*/ + vm = vm_create(VM_MODE_DEFAULT, 0, O_RDWR); + sev = sev_common_create(vm); + if (!sev) + return NULL; + sev->sev_policy = policy; + + kvm_sev_ioctl(sev, KVM_SEV_INIT, NULL); + + vm_set_memory_encryption(vm, true, true, sev->enc_bit); + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, 0, 0, npages, 0); + sev_register_user_range(sev, addr_gpa2hva(vm, 0), npages * vm_get_page_size(vm)); + + pr_info("SEV guest created, policy: 0x%x, size: %lu KB\n", + sev->sev_policy, npages * vm_get_page_size(vm) / 1024); + + return sev; +} + +void sev_vm_free(struct sev_vm *sev) +{ + kvm_vm_free(sev->vm); + sev_common_free(sev); +} + +void sev_vm_launch(struct sev_vm *sev) +{ + struct kvm_sev_launch_start ksev_launch_start = {0}; + struct kvm_sev_guest_status ksev_status = {0}; + + ksev_launch_start.policy = sev->sev_policy; + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_START, &ksev_launch_start); + kvm_sev_ioctl(sev, KVM_SEV_GUEST_STATUS, &ksev_status); + TEST_ASSERT(ksev_status.policy == sev->sev_policy, "Incorrect guest policy."); + TEST_ASSERT(ksev_status.state == SEV_GSTATE_LUPDATE, + "Unexpected guest state: %d", ksev_status.state); + + sev_encrypt(sev); +} + +void sev_vm_measure(struct sev_vm *sev, uint8_t *measurement) +{ + struct kvm_sev_launch_measure ksev_launch_measure = {0}; + struct kvm_sev_guest_status ksev_guest_status = {0}; + + ksev_launch_measure.len = 256; + ksev_launch_measure.uaddr = (__u64)measurement; + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_MEASURE, &ksev_launch_measure); + + /* Measurement causes a state transition, check that. 
*/ + kvm_sev_ioctl(sev, KVM_SEV_GUEST_STATUS, &ksev_guest_status); + TEST_ASSERT(ksev_guest_status.state == SEV_GSTATE_LSECRET, + "Unexpected guest state: %d", ksev_guest_status.state); +} + +void sev_vm_launch_finish(struct sev_vm *sev) +{ + struct kvm_sev_guest_status ksev_status = {0}; + + kvm_sev_ioctl(sev, KVM_SEV_GUEST_STATUS, &ksev_status); + TEST_ASSERT(ksev_status.state == SEV_GSTATE_LUPDATE || + ksev_status.state == SEV_GSTATE_LSECRET, + "Unexpected guest state: %d", ksev_status.state); + + kvm_sev_ioctl(sev, KVM_SEV_LAUNCH_FINISH, NULL); + + kvm_sev_ioctl(sev, KVM_SEV_GUEST_STATUS, &ksev_status); + TEST_ASSERT(ksev_status.state == SEV_GSTATE_RUNNING, + "Unexpected guest state: %d", ksev_status.state); +}
On Wed, Oct 6, 2021 at 1:40 PM Michael Roth <michael.roth@amd.com> wrote:
Regarding RFC-level feedback: First off, I'm super jazzed with what I'm seeing so far! (While this is my first review, I've been studying the patches up through the SEV boot test, i.e., patch #7). One thing I'm wondering is: the way this is structured is to essentially split the test cases into non-SEV and SEV. I'm wondering how hard it would be to add some flag or environment variable to set up pre-existing tests to run under SEV. Or is this something you all thought about, and decided that it does not make sense?
Looking at how the guest memory is handled, it seems like it's not far off from handling SEV transparently across all test cases. I'd think that we could just default all memory to use the encryption bit, and then have test cases, such as the test case in patch #7, clear the encryption bit for shared pages. However, I think the VM creation would need a bit more refactoring to work with other test cases.
Not necessary for the initial patches, but eventually, it would be nice to make this configurable. On the machine I was using to test this, the sev device appears at `/mnt/devtmpfs/sev`
+#define SEV_FW_REQ_VER_MAJOR 1 +#define SEV_FW_REQ_VER_MINOR 30
Where does the requirement for this minimum version come from? Maybe add a comment?
Edit: Is this for patches later on in the series that exercise SNP? If so, I think it would be better to add a check like this in the test itself, rather than globally. I happened to test this on a machine with a very old PSP FW, 0.22, and the SEV test added in patch #7 seems to work fine with this ancient PSP FW.
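One way to move the minimum-version requirement out of the common path would be a small comparison helper that each test calls with its own requirement. The following is only a sketch: `struct fw_version` and `fw_version_at_least()` are hypothetical names that do not appear in the posted series, and the actual per-type minimums would still need to be determined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for an SEV firmware API version (major.minor). */
struct fw_version {
	uint8_t major;
	uint8_t minor;
};

/* Return true if 'have' is at least 'need': compare major first, then minor. */
static bool fw_version_at_least(const struct fw_version *have,
				const struct fw_version *need)
{
	assert(have != NULL && need != NULL);

	if (have->major != need->major)
		return have->major > need->major;
	return have->minor >= need->minor;
}
```

A test that only needs plain SEV could then pass a lower requirement than one that exercises SNP, instead of every caller inheriting the 1.30 check.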
nit: This struct could use some comments. For example, `token` is not totally self-explanatory. In general, a comment explaining how this struct is intended to be used by test cases seems useful.
Maybe this should have some logic to make sure it only gets executed once instead of a while loop. Something like:
void sev_guest_done(struct sev_sync_data *sync, uint32_t token, uint64_t info)
{
	static bool abort = false;

	SEV_GUEST_ASSERT(sync, 0xDEADBEEF, !abort);
	abort = true;

	sync->done = true;
	sev_guest_sync(sync, token, info);
}
Check that `pending` is `true` before setting to `false`?
nit: This function is nearly identical to `sev_check_guest_sync()`, other than the ASSERT for `sync->done`. Might be worth splitting out a common static helper or using a single function for both cases, and distinguishing between them with a function parameter.
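As a rough illustration of the common-helper idea, the sketch below folds both checks into one function selected by a parameter, and also checks `pending` before clearing it. It is a simplification under stated assumptions: `struct sync_flags` is a stand-in whose field names follow their use in sev.c, the helper names are hypothetical, the `exit_reason` check is left out, and a `bool` return takes the place of `TEST_ASSERT()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct sev_sync_data; field names follow their use in sev.c. */
struct sync_flags {
	uint32_t token;
	uint64_t info;
	bool pending;
	bool done;
	bool aborted;
};

/*
 * Shared body for the "sync" and "done" checks. Returns true when the
 * guest state matches what the host expects, and clears 'pending' only
 * in that case.
 */
static bool check_guest_state(struct sync_flags *sync, uint32_t token,
			      bool expect_done)
{
	assert(sync != NULL);

	if (!sync->pending || sync->token != token)
		return false;
	if (sync->done != expect_done || sync->aborted)
		return false;

	sync->pending = false;
	return true;
}

/* Thin wrappers keep the two call sites readable. */
static bool check_guest_sync(struct sync_flags *sync, uint32_t token)
{
	return check_guest_state(sync, token, false);
}

static bool check_guest_done(struct sync_flags *sync, uint32_t token)
{
	return check_guest_state(sync, token, true);
}
```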
nit: Should the cast be `(__u64)`, rather than `(unsigned long)`? This is how `data` is defined in `struct sev_issue_cmd`, and also how the `data` field is cast in `kvm_sev_ioctl`, below.
nit: Technically, the `KVM_MEMORY_ENCRYPT_OP` ioctl failed. The failure message should reflect this. Maybe something like:
"SEV KVM_MEMORY_ENCRYPT_OP ioctl w/ cmd: %d failed, ..."
I don't see any code to call `KVM_MEMORY_ENCRYPT_UNREG_REGION` in this code. Shouldn't we be calling it to unpin the memory when the test is done?
The `pg_cnt <= 0` case doesn't seem correct. First off, I'd just get rid of the `<`, because `pg_cnt` is unsigned. Second, the comment header for `sparsebit_next_clear()` says that a return value of `0` indicates that no bits following `prev` are cleared. Thus, in this case, I'd think that `pg_cnt` should be set to `memory_size - pg`. So maybe something like:
end = sparsebit_next_clear(enc_phy_pages, pg);
if (end == 0)
	end = memory_size;
pg_cnt = end - pg;
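To make the end-of-bitmap case concrete, here is a self-contained sketch of the run-collection loop discussed above. It does not use the real sparsebit library: `next_set()` and `next_clear()` are small stand-ins over a byte array that only mirror the "return 0 when nothing follows `prev`" convention described in this review, and `collect_runs()` is a hypothetical name. The sketch also assumes the fallback end is expressed in pages (the bitmap length `n`) rather than bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of set bits: first index and length. */
struct run {
	size_t start;
	size_t count;
};

/* Stand-in: index of the next set bit after 'prev', or 0 if there is none. */
static size_t next_set(const uint8_t *bits, size_t n, size_t prev)
{
	for (size_t i = prev + 1; i < n; i++)
		if (bits[i])
			return i;
	return 0;
}

/* Stand-in: index of the next clear bit after 'prev', or 0 if there is none. */
static size_t next_clear(const uint8_t *bits, size_t n, size_t prev)
{
	for (size_t i = prev + 1; i < n; i++)
		if (!bits[i])
			return i;
	return 0;
}

/* Collect runs of set bits into 'out'; returns the number of runs found. */
static size_t collect_runs(const uint8_t *bits, size_t n,
			   struct run *out, size_t max_runs)
{
	size_t pg = 0, nruns = 0;

	assert(bits != NULL && out != NULL);

	while (pg < n && nruns < max_runs) {
		size_t end;

		if (!bits[pg]) {
			pg = next_set(bits, n, pg);
			if (!pg)
				break;	/* no set bits remain */
		}

		end = next_clear(bits, n, pg);
		if (end == 0)
			end = n;	/* run extends to the end of the bitmap */

		out[nruns].start = pg;
		out[nruns].count = end - pg;
		nruns++;
		pg = end;
	}

	return nruns;
}
```

With this shape, a trailing run of any length is reported as a single range rather than one page at a time.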
nit: If we fail here, we leak `sev_fd`. Should we call `close(sev_fd)` here?
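The usual way to avoid this kind of leak is a single cleanup label that every failure path after `open()` jumps to. The sketch below is generic and hypothetical: `struct dev_ctx` and `dev_ctx_create()` are illustrative names, and the `fw_ok` flag merely stands in for the firmware-version check.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical context that owns an open device fd. */
struct dev_ctx {
	int fd;
};

/*
 * Open 'path' and wrap the fd in a context. 'fw_ok' stands in for the
 * firmware-version check; any failure after open() closes the fd.
 */
static struct dev_ctx *dev_ctx_create(const char *path, bool fw_ok)
{
	struct dev_ctx *ctx;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;

	if (!fw_ok)
		goto err_close;

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		goto err_close;

	ctx->fd = fd;
	return ctx;

err_close:
	close(fd);
	return NULL;
}
```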
This is phenomenal. I'll try to send feedback on the other patches in the first batch of 7 sooner rather than later. I haven't looked past patch #7 yet. Let me know what you think about my first comment, on whether we can get all pre-existing tests to run under SEV, or that's a bad idea. I probably haven't given it as much thought as you have.
On Sun, Oct 10, 2021 at 08:17:00PM -0700, Marc Orr wrote:
I think it's possible, but there's a few missing pieces:
1) As you indicated, existing tests which rely on vm_create(), vm_create_default(), vm_create_default_with_vcpus(), etc. would either need to be updated with whatever new interface provides this 'use-sev' flag, or it would need to happen underneath the covers based on said environment variable/global/etc. There's also the question of where to hook in the sev_vm_launch_start() hooks. Maybe the first time a vcpu_run() is issued? Or maybe some explicit call each test will need to be updated to call just prior to initial execution.
2) Many of the existing tests use the GUEST_SYNC/ucall stuff to handle synchronization between host userspace and guest kernel, which relies on guests issuing PIO instructions to particular port addresses to cause an exit back to host userspace, with various parameters passed via register arguments.
- For SEV this would almost work as-is, but some tests might rely on things like memory addresses being passed in this way so would need to audit the code and mark that memory as shared where needed.
- For SEV-ES/SEV-SNP, there's a bit more work since:
- The registers will not be accessible through the existing KVM_GET_REGS mechanism. It may be possible to set some flag/hook to set/access arguments through some other mechanism like a shared buffer for certain VM types though.
- Additionally, the #VC handler only supports CPUID currently, and leverages that fact to avoid doing any significant instruction decoding. Instead the SEV tests use HLT instructions to handle exits to host userspace, which may not work for some tests. So unless there's some other mechanism that SEV/non-SEV tests could utilize rather than PIO, the #VC handler would need to support PIO, which would be nice to have either way, but would likely involve pulling in the instruction decoder library used in the kernel, or some subset/re-implementation of it at least.
3) Similar to SEV-ES/SEV-SNP requirements for 1), tests which generate PIO/MMIO and other NAE events would need appropriate support for those events in the #VC handler. Nice-to-have either way, but not sure atm how much it would be to implement all of that. Also any tests relying on things like KVM_GET_REGS/KVM_GET_SREGS are non-starters.
Maybe with 1) and 2) in place, some tests can start incrementally being 'whitelisted' to run under SEV without needing to support anything and everything right off the bat. Might also be good to see how the TDX stuff settles since there may be similar considerations there as well.
Hmm, don't see any use of getenv() currently, so I guess maybe a compile-time option would be the preferred approach. I'll take a look at that.
Ah, yes, this was mostly for SNP support. I'll implement a separate minimum version for SEV/SEV-ES.
Will do.
That makes sense.
Yes, that should be the case.
Will do.
Yes, I should probably do that in sev_vm_free() for the main memslot at least.
Ah, yes I think I was assuming I'd only hit the pg_cnt <= 0 case if I was looking at the last page, but it could also be hit for any number of contiguous pages at the end of the bitmap.
That should do the trick, thanks!
Thanks for looking! I think it would be great if we could get to a point where all the SEV/CoCo stuff can happen transparently, but might take some time to get all the groundwork in place, and likely some churn in the affected test cases.
-Mike
On Mon, Oct 11, 2021 at 08:15:37PM -0500, Michael Roth wrote:
One more I should mention:
4) After encryption, the page table is no longer usable for translations by stuff like addr_gva2gpa(), so tests would either need to be audited/updated to do these translations upfront and only rely on cached/stored values thereafter, or perhaps a "shadow" copy could be maintained by kvm_util so the translations will continue to work after encryption.
On 12/10/21 14:55, Michael Roth wrote:
Yeah, this is a big one. Considering that a lot of the selftests are for specific bugs, the benefit in running them with SEV is relatively low. That said, there could be some simple tests where it makes sense, so it'd be nice to plan a little ahead so that it isn't _too_ difficult.
Paolo
I wanted to ask the same thing: I tried to run the SEV selftest today and was blocked by this minimum version number. BTW, I suspect that if I want to update the SEV firmware I have to update the BIOS myself? So it would be good to know what the actual minimum for SEV is.
In addition, and maybe as a side effect, I see a warning when building the kernel:
"module ccp.ko requires firmware amd/amd_sev_fam19h_model0xh.sbin"
Maybe I need some hints from you? Or maybe it is just harmless. I double-checked and it looks like I was using either amd_sev_fam17h_model3xh.sbin or amd_sev_fam17h_model0xh.sbin
Thanks. -Mingwei
On 11/4/21 12:25 AM, Mingwei Zhang wrote:
The SEV firmware is updatable at module load time through the DOWNLOAD_FIRMWARE command.
The firmware images reside (typically) in /lib/firmware/amd/. There is a new version for fam19h that you can copy into that directory at:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/...
or
https://developer.amd.com/sev/ under the Links & Downloads section (Note, if retrieved from here you will/may need to rename the .sbin file to match the name mentioned above).
If you're on a fam19h machine, the fam17h builds won't be used.
Thanks, Tom
Thanks. -Mingwei
On 10/6/21 1:37 PM, Michael Roth wrote:
Since 'region' is used in all the related names, it's better to call it sev_register_user_region or sev_register_userspace_region for the sake of consistency.
vm_get_encrypted_phy_pages() allocates a duplicate sparsebit. I am wondering if it is possible for the function to just give a pointer to 'region->encrypted_phy_pages' so the allocation and freeing of memory can be avoided.
Could this have a better name, like sev_vm_alloc, since it's essentially allocating and initializing a sev_vm data structure?
On 10/6/21 1:37 PM, Michael Roth wrote:
Instead of having three different members, 'pending', 'done' and 'aborted', and their corresponding functions, is it possible to use a single member and a single function with #defines for the three states? This may be better if we need to add more sync states in the future.
Just like you named it after the command LAUNCH_FINISH, it's better to name the other functions based on the command they are executing, for the sake of consistency:
sev_vm_launch_measure, sev_vm_launch_start
On 06/10/21 22:37, Michael Roth wrote:
Please add a comment explaining roughly the design and what the fields are for. Maybe the bools can be replaced by an enum { DONE, ABORT, SYNC, RUNNING } (running is for pending==false)?
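A minimal sketch of the enum-based variant might look like the following. Everything here is hypothetical: the `sync_state` values reuse the names suggested above, while `struct sync_msg`, `sync_post()`, and `sync_check()` are illustrative stand-ins for the existing struct and helpers, with a `bool` return in place of `TEST_ASSERT()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replacement for the pending/done/aborted bools. */
enum sync_state {
	SYNC_STATE_RUNNING,	/* no message outstanding (pending == false) */
	SYNC_STATE_SYNC,	/* guest posted a message (pending == true) */
	SYNC_STATE_DONE,	/* guest finished */
	SYNC_STATE_ABORT,	/* guest hit an assertion */
};

/* Shared between guest and host; 'token' identifies the sync point. */
struct sync_msg {
	uint32_t token;
	uint64_t info;
	enum sync_state state;
};

/* Guest side: post a message in the given (non-RUNNING) state. */
static void sync_post(struct sync_msg *msg, enum sync_state state,
		      uint32_t token, uint64_t info)
{
	assert(msg != NULL && state != SYNC_STATE_RUNNING);

	msg->token = token;
	msg->info = info;
	msg->state = state;
}

/*
 * Host side: check that the guest posted 'expected' with 'token'.
 * A plain sync is acknowledged by returning to RUNNING; DONE and
 * ABORT are terminal and left in place.
 */
static bool sync_check(struct sync_msg *msg, enum sync_state expected,
		       uint32_t token)
{
	assert(msg != NULL);

	if (msg->state != expected || msg->token != token)
		return false;

	if (expected == SYNC_STATE_SYNC)
		msg->state = SYNC_STATE_RUNNING;
	return true;
}
```

A single state field also rules out contradictory combinations such as `done` and `aborted` both being set.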
Also, for the part that you can feel free to ignore: this seems to be similar to the ucall mechanism. Is it possible to implement the ucall interface in terms of this one (or vice versa)?
One idea could be to:
- move ucall to the main lib/ directory
- make it use a struct of function pointers, whose default implementation would be in the existing lib/ARCH/ucall.c files
- add a function to register the struct for the desired implementation
- make sev.c register its own implementation
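The steps listed above could be sketched roughly as follows. This is not the existing ucall API: `struct demo_ucall_ops` and the `demo_ucall_*()` functions are hypothetical, the default backend just writes to a static variable as a stand-in for the PIO path, and the opaque pointer passed at registration is an added assumption so that a backend can locate its shared page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table; the real ucall interface differs. */
struct demo_ucall_ops {
	const char *name;
	void (*send)(void *opaque, uint64_t cmd);
	uint64_t (*recv)(void *opaque);
};

/* Default backend: a static slot standing in for the PIO-based path. */
static uint64_t default_slot;

static void default_send(void *opaque, uint64_t cmd)
{
	(void)opaque;
	default_slot = cmd;
}

static uint64_t default_recv(void *opaque)
{
	(void)opaque;
	return default_slot;
}

static const struct demo_ucall_ops default_ops = {
	.name = "default",
	.send = default_send,
	.recv = default_recv,
};

/* SEV-style backend: commands travel through a shared page. */
struct shared_page {
	uint64_t cmd;
};

static void shared_send(void *opaque, uint64_t cmd)
{
	((struct shared_page *)opaque)->cmd = cmd;
}

static uint64_t shared_recv(void *opaque)
{
	return ((struct shared_page *)opaque)->cmd;
}

static const struct demo_ucall_ops shared_ops = {
	.name = "shared-page",
	.send = shared_send,
	.recv = shared_recv,
};

/* Currently selected implementation and its private data. */
static const struct demo_ucall_ops *cur_ops = &default_ops;
static void *cur_opaque;

/* Select the implementation used by demo_ucall_send()/demo_ucall_recv(). */
static void demo_ucall_register(const struct demo_ucall_ops *ops, void *opaque)
{
	assert(ops != NULL && ops->send != NULL && ops->recv != NULL);

	cur_ops = ops;
	cur_opaque = opaque;
}

static void demo_ucall_send(uint64_t cmd)
{
	cur_ops->send(cur_opaque, cmd);
}

static uint64_t demo_ucall_recv(void)
{
	return cur_ops->recv(cur_opaque);
}
```

Tests would keep calling the generic send/receive entry points, and only the setup code would differ between a regular and an SEV guest.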
Thanks,
Paolo
On Thu, Oct 21, 2021 at 05:39:34PM +0200, Paolo Bonzini wrote:
That makes sense. And SYNC is basically pending==true.
Though since it would be the test code calling sev_check_guest_sync() that sets pending to true after each sync, "RUNNABLE" might be a better name since whether it's actually running depends on a separate call to vcpu_run().
That seems doable. We'd also need to register an opaque pointer that, in the case of SEV, would be for the shared page used for syncing. The guest would also need the pointers for the ops/opaque, but ucall_init() could additionally set some globals to provide the GVAs for them in the guest environment.
Will look at that for the next spin.
Thanks,
Paolo
A common aspect of booting SEV guests is checking related CPUID/MSR bits and accessing shared/private memory. Add a basic test to cover this.
This test will be expanded to cover basic boot of SEV-ES and SEV-SNP in subsequent patches.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/.gitignore | 1 +
 tools/testing/selftests/kvm/Makefile | 1 +
 .../selftests/kvm/x86_64/sev_all_boot_test.c | 252 ++++++++++++++++++
 3 files changed, 254 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c
diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore index 0709af0144c8..824f100bec2a 100644 --- a/tools/testing/selftests/kvm/.gitignore +++ b/tools/testing/selftests/kvm/.gitignore @@ -38,6 +38,7 @@ /x86_64/xen_vmcall_test /x86_64/xss_msr_test /x86_64/vmx_pmu_msrs_test +/x86_64/sev_all_boot_test /access_tracking_perf_test /demand_paging_test /dirty_log_test diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index c7a5e1c69e0c..aa8901bdbd22 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -72,6 +72,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/tsc_msrs_test TEST_GEN_PROGS_x86_64 += x86_64/vmx_pmu_msrs_test TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test +TEST_GEN_PROGS_x86_64 += x86_64/sev_all_boot_test TEST_GEN_PROGS_x86_64 += access_tracking_perf_test TEST_GEN_PROGS_x86_64 += demand_paging_test TEST_GEN_PROGS_x86_64 += dirty_log_test diff --git a/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c b/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c new file mode 100644 index 000000000000..8df7143ac17d --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/sev_all_boot_test.c @@ -0,0 +1,252 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Basic SEV boot tests. 
+ * + * Copyright (C) 2021 Advanced Micro Devices + */ +#define _GNU_SOURCE /* for program_invocation_short_name */ +#include <fcntl.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/ioctl.h> + +#include "test_util.h" + +#include "kvm_util.h" +#include "processor.h" +#include "svm_util.h" +#include "linux/psp-sev.h" +#include "sev.h" + +#define VCPU_ID 2 +#define PAGE_SIZE 4096 +#define PAGE_STRIDE 32 + +#define SHARED_PAGES 8192 +#define SHARED_VADDR_MIN 0x1000000 + +#define PRIVATE_PAGES 2048 +#define PRIVATE_VADDR_MIN (SHARED_VADDR_MIN + SHARED_PAGES * PAGE_SIZE) + +#define TOTAL_PAGES (512 + SHARED_PAGES + PRIVATE_PAGES) + +static void fill_buf(uint8_t *buf, size_t pages, size_t stride, uint8_t val) +{ + int i, j; + + for (i = 0; i < pages; i++) + for (j = 0; j < PAGE_SIZE; j += stride) + buf[i * PAGE_SIZE + j] = val; +} + +static bool check_buf(uint8_t *buf, size_t pages, size_t stride, uint8_t val) +{ + int i, j; + + for (i = 0; i < pages; i++) + for (j = 0; j < PAGE_SIZE; j += stride) + if (buf[i * PAGE_SIZE + j] != val) + return false; + + return true; +} + +static void guest_test_start(struct sev_sync_data *sync) +{ + /* Initial guest check-in. */ + sev_guest_sync(sync, 1, 0); +} + +static void check_test_start(struct kvm_vm *vm, struct sev_sync_data *sync) +{ + struct kvm_run *run; + + run = vcpu_state(vm, VCPU_ID); + vcpu_run(vm, VCPU_ID); + + /* Initial guest check-in. */ + sev_check_guest_sync(run, sync, 1); +} + +static void +guest_test_common(struct sev_sync_data *sync, uint8_t *shared_buf, uint8_t *private_buf) +{ + bool success; + + /* Initial check-in for common. */ + sev_guest_sync(sync, 100, 0); + + /* Ensure initial shared pages are intact. */ + success = check_buf(shared_buf, SHARED_PAGES, PAGE_STRIDE, 0x41); + SEV_GUEST_ASSERT(sync, 103, success); + + /* Ensure initial private pages are intact/encrypted. 
*/ + success = check_buf(private_buf, PRIVATE_PAGES, PAGE_STRIDE, 0x42); + SEV_GUEST_ASSERT(sync, 104, success); + + /* Ensure host userspace can't read newly-written encrypted data. */ + fill_buf(private_buf, PRIVATE_PAGES, PAGE_STRIDE, 0x43); + + sev_guest_sync(sync, 200, 0); + + /* Ensure guest can read newly-written shared data from host. */ + success = check_buf(shared_buf, SHARED_PAGES, PAGE_STRIDE, 0x44); + SEV_GUEST_ASSERT(sync, 201, success); + + /* Ensure host can read newly-written shared data from guest. */ + fill_buf(shared_buf, SHARED_PAGES, PAGE_STRIDE, 0x45); + + sev_guest_sync(sync, 300, 0); +} + +static void +check_test_common(struct kvm_vm *vm, struct sev_sync_data *sync, + uint8_t *shared_buf, uint8_t *private_buf) +{ + struct kvm_run *run = vcpu_state(vm, VCPU_ID); + bool success; + + /* Initial guest check-in. */ + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 100); + + /* Ensure initial private pages are intact/encrypted. */ + success = check_buf(private_buf, PRIVATE_PAGES, PAGE_STRIDE, 0x42); + TEST_ASSERT(!success, "Initial guest memory not encrypted!"); + + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 200); + + /* Ensure host userspace can't read newly-written encrypted data. */ + success = check_buf(private_buf, PRIVATE_PAGES, PAGE_STRIDE, 0x43); + TEST_ASSERT(!success, "Modified guest memory not encrypted!"); + + /* Ensure guest can read newly-written shared data from host. */ + fill_buf(shared_buf, SHARED_PAGES, PAGE_STRIDE, 0x44); + + vcpu_run(vm, VCPU_ID); + sev_check_guest_sync(run, sync, 300); + + /* Ensure host can read newly-written shared data from guest. 
*/ + success = check_buf(shared_buf, SHARED_PAGES, PAGE_STRIDE, 0x45); + TEST_ASSERT(success, "Host can't read shared guest memory!"); +} + +static void +guest_test_done(struct sev_sync_data *sync) +{ + sev_guest_done(sync, 10000, 0); +} + +static void +check_test_done(struct kvm_vm *vm, struct sev_sync_data *sync) +{ + struct kvm_run *run = vcpu_state(vm, VCPU_ID); + + vcpu_run(vm, VCPU_ID); + sev_check_guest_done(run, sync, 10000); +} + +static void __attribute__((__flatten__)) +guest_sev_code(struct sev_sync_data *sync, uint8_t *shared_buf, uint8_t *private_buf) +{ + uint32_t eax, ebx, ecx, edx; + uint64_t sev_status; + + guest_test_start(sync); + + /* Check SEV CPUID bit. */ + eax = 0x8000001f; + ecx = 0; + cpuid(&eax, &ebx, &ecx, &edx); + SEV_GUEST_ASSERT(sync, 2, eax & (1 << 1)); + + /* Check SEV MSR bit. */ + sev_status = rdmsr(MSR_AMD64_SEV); + SEV_GUEST_ASSERT(sync, 3, (sev_status & 0x1) == 1); + + guest_test_common(sync, shared_buf, private_buf); + + guest_test_done(sync); +} + +static void +setup_test_common(struct sev_vm *sev, void *guest_code, vm_vaddr_t *sync_vaddr, + vm_vaddr_t *shared_vaddr, vm_vaddr_t *private_vaddr) +{ + struct kvm_vm *vm = sev_get_vm(sev); + uint8_t *shared_buf, *private_buf; + + /* Set up VCPU and initial guest kernel. */ + vm_vcpu_add_default(vm, VCPU_ID, guest_code); + kvm_vm_elf_load(vm, program_invocation_name); + + /* Set up shared sync buffer. */ + *sync_vaddr = vm_vaddr_alloc_shared(vm, PAGE_SIZE, 0); + + /* Set up buffer for reserved shared memory. */ + *shared_vaddr = vm_vaddr_alloc_shared(vm, SHARED_PAGES * PAGE_SIZE, + SHARED_VADDR_MIN); + shared_buf = addr_gva2hva(vm, *shared_vaddr); + fill_buf(shared_buf, SHARED_PAGES, PAGE_STRIDE, 0x41); + + /* Set up buffer for reserved private memory. 
*/ + *private_vaddr = vm_vaddr_alloc(vm, PRIVATE_PAGES * PAGE_SIZE, + PRIVATE_VADDR_MIN); + private_buf = addr_gva2hva(vm, *private_vaddr); + fill_buf(private_buf, PRIVATE_PAGES, PAGE_STRIDE, 0x42); +} + +static void test_sev(void *guest_code, uint64_t policy) +{ + vm_vaddr_t sync_vaddr, shared_vaddr, private_vaddr; + uint8_t *shared_buf, *private_buf; + struct sev_sync_data *sync; + uint8_t measurement[512]; + struct sev_vm *sev; + struct kvm_vm *vm; + int i; + + sev = sev_vm_create(policy, TOTAL_PAGES); + if (!sev) + return; + vm = sev_get_vm(sev); + + setup_test_common(sev, guest_code, &sync_vaddr, &shared_vaddr, &private_vaddr); + + /* Set up guest params. */ + vcpu_args_set(vm, VCPU_ID, 4, sync_vaddr, shared_vaddr, private_vaddr); + + sync = addr_gva2hva(vm, sync_vaddr); + shared_buf = addr_gva2hva(vm, shared_vaddr); + private_buf = addr_gva2hva(vm, private_vaddr); + + /* Allocations/setup done. Encrypt initial guest payload. */ + sev_vm_launch(sev); + + /* Dump the initial measurement. A test to actually verify it would be nice. */ + sev_vm_measure(sev, measurement); + pr_info("guest measurement: "); + for (i = 0; i < 32; ++i) + pr_info("%02x", measurement[i]); + pr_info("\n"); + + sev_vm_launch_finish(sev); + + /* Guest is ready to run. Do the tests. */ + check_test_start(vm, sync); + check_test_common(vm, sync, shared_buf, private_buf); + check_test_done(vm, sync); + + sev_vm_free(sev); +} + +int main(int argc, char *argv[]) +{ + /* SEV tests */ + test_sev(guest_sev_code, SEV_POLICY_NO_DBG); + test_sev(guest_sev_code, 0); + + return 0; +}
On 10/6/21 1:37 PM, Michael Roth wrote:
Is there any need to do the cpuid and MSR tests every time the guest code is executed ?
Since the above section of this function is actually setup code and is not running the test yet, it is best placed in a setup function. My suggestion is that you place the above section into a function called setup_test_common() and within that function you further separate out the SEV-related setup into a function called setup_test_sev() or something similar. Then call the top-level setup function from within main().
These function names could be better. These functions are not just checking the result, they are actually running the guest code. So maybe calling them 'test_start', 'test_common' and 'test_done' would be better. I would just collapse them and place their code in test_sev() only if you separate out the setup code as I suggested above.
On Fri, Oct 15, 2021 at 07:55:53PM -0700, Krish Sadhukhan wrote:
It seems like a good sanity check that KVM is setting the expected bits (via KVM_GET_SUPPORTED_CPUID) for SEV guests. This is also a fairly common example of the sort of things a guest would do during boot to detect/initialize SEV-related functionality.
It becomes a little more useful for the SEV-ES/SEV-SNP tests, where cpuid instructions cause a #VC exception, which in turn leads to a VMGExit and exercises the host KVM's handling of host-guest communication via GHCB MSR/GHCB page.
That makes sense. I'll try to rework this according to your suggestions.
Will do.
collapse them and place their code in test_sev() only if you separate out the setup code as I suggested above.
Hmm, I did it that way because in the guest_{sev,sev_es,sev_snp}_code routines there are some type-specific checks that happen before guest_test_done(), so it made sense to have that in a separate routine, and seemed more readable to then also have check_test_done() separate to pair with it. But I'll see if I can rework that a bit.
Normally guests will set up CR3 themselves, but some guests, such as kselftests, and potentially CONFIG_PVH guests, rely on being booted with paging enabled and CR3 initialized to a pre-allocated page table.
Currently CR3 updates via KVM_SET_SREGS* are not loaded into the guest VMCB until just prior to entering the guest. For SEV-ES/SEV-SNP, this is too late, since it will have switched over to using the VMSA page prior to that point, with the VMSA CR3 copied from the VMCB initial CR3 value: 0.
Address this by sync'ing the CR3 value into the VMCB save area immediately when KVM_SET_SREGS* is issued so it will find its way into the initial VMSA.
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/kvm-x86-ops.h | 1 +
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/svm/svm.c | 22 ++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c | 8 ++++++++
 arch/x86/kvm/x86.c | 3 +--
 5 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 3d25a86840db..653659e20614 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -35,6 +35,7 @@ KVM_X86_OP(get_cpl) KVM_X86_OP(set_segment) KVM_X86_OP_NULL(get_cs_db_l_bits) KVM_X86_OP(set_cr0) +KVM_X86_OP(set_cr3) KVM_X86_OP(is_valid_cr4) KVM_X86_OP(set_cr4) KVM_X86_OP(set_efer) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 09ec1ff5bd83..232e997acae6 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1324,6 +1324,7 @@ struct kvm_x86_ops { struct kvm_segment *var, int seg); void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l); void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0); + void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3); bool (*is_valid_cr4)(struct kvm_vcpu *vcpu, unsigned long cr0); void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4); int (*set_efer)(struct kvm_vcpu *vcpu, u64 efer); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 7e19f5f6d0d8..2c3bc7a667c8 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1716,6 +1716,27 @@ static void svm_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) vmcb_mark_dirty(svm->vmcb, VMCB_DT); }
+static void svm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + vcpu->arch.cr3 = cr3; + kvm_register_mark_available(vcpu, VCPU_EXREG_CR3); + + /* + * For guests that don't set guest_state_protected, the cr3 update is + * handled via kvm_mmu_load() while entering the guest. For guests + * that do (SEV-ES/SEV-SNP), the cr3 update needs to be written to + * VMCB save area now, since the save area will become the initial + * contents of the VMSA, and future VMCB save area updates won't be + * seen. + */ + if (sev_es_guest(vcpu->kvm)) { + svm->vmcb->save.cr3 = cr3; + vmcb_mark_dirty(svm->vmcb, VMCB_CR); + } +} + void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { struct vcpu_svm *svm = to_svm(vcpu); @@ -4564,6 +4585,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .get_cpl = svm_get_cpl, .get_cs_db_l_bits = kvm_get_cs_db_l_bits, .set_cr0 = svm_set_cr0, + .set_cr3 = svm_set_cr3, .is_valid_cr4 = svm_is_valid_cr4, .set_cr4 = svm_set_cr4, .set_efer = svm_set_efer, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index fada1055f325..4f233d0b05bf 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3130,6 +3130,13 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, vmcs_writel(GUEST_CR3, guest_cr3); }
+ +void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) +{ + vcpu->arch.cr3 = cr3; + kvm_register_mark_available(vcpu, VCPU_EXREG_CR3); +} + static bool vmx_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { /* @@ -7578,6 +7585,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .get_cpl = vmx_get_cpl, .get_cs_db_l_bits = vmx_get_cs_db_l_bits, .set_cr0 = vmx_set_cr0, + .set_cr3 = vmx_set_cr3, .is_valid_cr4 = vmx_is_valid_cr4, .set_cr4 = vmx_set_cr4, .set_efer = vmx_set_efer, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c1a2dd0024b2..d724fa185bef 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10400,8 +10400,7 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
vcpu->arch.cr2 = sregs->cr2; *mmu_reset_needed |= kvm_read_cr3(vcpu) != sregs->cr3; - vcpu->arch.cr3 = sregs->cr3; - kvm_register_mark_available(vcpu, VCPU_EXREG_CR3); + static_call(kvm_x86_set_cr3)(vcpu, sregs->cr3);
kvm_set_cr8(vcpu, sregs->cr8);
On 06/10/21 01:44, Michael Roth wrote:
Hi Mike,
this SEV/SEV-ES testing (both your series and kvm-unit-tests) is good stuff. :) If you fix up patches 1-12, I will commit them pretty much straight away. The only thing that possibly needs some thought is the integration with ucall.
Thanks,
Paolo
On Thu, Oct 21, 2021 at 06:48:24PM +0200, Paolo Bonzini wrote:
Hi Paolo,
Glad to hear :) For v2 I'll work on getting SEV/SEV-ES broken out into a separate series with all the review comments addressed. Still a little unsure about the best way to address some things in patch #3, but outlined a tentative plan that hopefully seems reasonable. Can re-visit in v2 as well.
Thanks!
-Mike