On Wed, Jan 05, 2022 at 05:43:21PM +0000, Sean Christopherson wrote:
On Wed, Jan 05, 2022, Michael Roth wrote:
On Wed, Jan 05, 2022 at 12:17:33AM +0000, Sean Christopherson wrote:
PIO shouldn't require instruction decoding or a #VC handler. What I was thinking is that the guest in the selftest would make a direct #VMGEXIT/TDCALL to request PIO instead of executing an OUT.
That seems like a nicer approach. But it sort of lends itself to having test-specific ucall implementations in some form. How are you thinking vm_create() should decide what implementation to use? With this series in place it could be something like:
  vm_create(..., struct ucall_ops *ops)
    ucall_init_ops(ops)
and with the SEV selftests in their current form it would look something like:
  sev_vm_create(...)
    vm_create_with_ucall(..., ops=ucall_ops_pio_vmgexit)
      ucall_init_ops(ops)
is that sort of what you're thinking, or something else?
I keep forgetting ucall() doesn't have access to the VM. But, since we're restricting ucall() to a single VM, we can have a global that sets ucall_ops during ucall_init() based on the VM type, or skip the ops and just open code the behavior in x86's ucall() by snapshotting the VM type. Either way, the goal is to avoid having to pass in ucall_ops at the test level.
Yeah, I was thinking it could be done at the lowest level vm_create() helper. We'll need to expand vm_create() (or add yet another layer to avoid modifying a pile of tests) to allow opting out of initializing ucall, e.g. sev_migrate_tests.c needs to create multiple concurrent VMs, but happily doesn't need ucall support.
Why does sev_migrate_tests need to opt out? Couldn't it use ucall_ops_pio_vmgexit like that SEV case above?
Because it uses multiple VMs, and my rough sketch only allows for a single VM to use ucall. Though I suppose we could simply keep appending to the ucall list for every VM. The requirement would then be that all VMs are of the same type, i.e. utilize the same ucall_ops.
Hmm, maybe I misread your patch. Not supporting multiple VMs was the reason I gave up on having the ucall structs allocated on-demand and went with requiring them to be passed as arguments to ucall().
I thought with your patch you had solved that by having each vm have it's own pool, via vm->ucall_list, and then mapping each pool into each guest separately via:
  ucall_init(vm):
    ucall_list = vm->ucall_list
    sync_global_to_guest(ucall_list)
then as long as that ucall_init() is done *after* kvm_vm_elf_load() for that guest, it will end up with a 'ucall_list' global that points to its own specific vm->ucall_list. Then on the test side it doesn't matter what the 'ucall_list' global is currently set to, since you have the GPA and know which VM exited.
Or am I missing something there?
Although even if that is the case, now that we're proposing doing the ucall_init() inside vm_create(), we run the risk of a test calling kvm_vm_elf_load() afterwards, which might clobber the guest's copy of the ucall_list global if ucall_init() has since been called for another VM. But that could maybe be worked around by having whatever vm_create() variant we use also call kvm_vm_elf_load() unconditionally as part of creation.
I ask because there is a ucall() in the exception handling code where some unhandled exceptions result in the guest automatically issuing a ucall(UCALL_UNHANDLED), so even tests that don't use ucall() directly might still rely on it if they enable exception handling. So that might be an argument for always setting up at least the default ucall_ops_pio implementation and creating a pool just in case (or an argument for dropping the UCALL_UNHANDLED handling).
The sev_migrate_tests don't even run a guest, hence the quick-and-dirty "solution". Though thinking toward the future, that may be too dirty as it would prevent tests from having multiple "real" VMs.
To reduce the burden on tests and avoid ordering issues with creating vCPUs, allocate a ucall struct for every possible vCPU when the VM is created and stuff the GPA of the struct in the struct itself so that the guest can communicate the GPA instead of the GVA. Then confidential VMs just need to make all structs shared.
So a separate call like:
ucall_make_shared(vm->ucall_list)
? Might need some good documentation/assertions to make sure it gets called at the right place for confidential VMs, and may need some extra hooks in the SEV selftest implementation for switching from private to shared after the memory has already been allocated, but that seems reasonable.
Again, I was thinking that it would be done unconditionally by ucall_init(), i.e. would be automatically handled by the selftest framework and would Just Work for individual tests.
Ok, I'll have to think that through more. Currently with the SEV selftests as they are, we have:
  sev_vm_create(policy, npages)
    vm = vm_create(...)
    vm_set_memory_encryption(vm, encrypt_by_default, enc_bit)
    //vm_vaddr_alloc_shared() can be used now
The ucall struct allocations would need to go through vm_vaddr_alloc_shared() to make sure the selftest library tracks/maps the pages as shared, but that vm_set_memory_encryption() happens too late if the ucall_init() stuff is done in vm_create(). It should be possible to pass the vm_set_memory_encryption() arguments directly to vm_create() to allow for what you're proposing, but I guess we'd need a new vm_create() wrapper that handles both the vm_set_memory_encryption() args, along with the ucall_ops above, something like:
  sev_vm_create(policy, npages)
    vm = vm_create_coco(..., encrypt_by_default, enc_bit/shared_bit, ucall_ops)
Or were you thinking something else? Just trying to get an idea of how this will all need to tie in with the SEV selftests and what needs to change on that end.
Hmm, I was thinking the selftest framework would only need to be told the VM type, e.g. DEFAULT, SEV, SEV-ES, SEV-SNP, or TDX, and would then handle setting everything up, e.g. enumerating the C-bit location and encrypting memory as needed.
One thought would be to extend "enum vm_guest_mode" with flags above NUM_VM_MODES to specify the VM type. That way tests that use VM_MODE_DEFAULT would continue to work without any updates.
Ok, let me see what that approach looks like on the SEV selftest side.