On Fri, Oct 31, 2025 at 10:15 AM Sean Christopherson <seanjc@google.com> wrote:
On Fri, Oct 31, 2025, Ira Weiny wrote:
Sagi Shahar wrote:
From: Erdem Aktas <erdemaktas@google.com>
Add support for TDX guests to issue TDCALLs to the TDX module.
Generally it is nice to have more details. As someone new to TDX I have to remind myself what a TDCALL is. And any random kernel developer reading this in the future will likely have even less clue than me.
Paraphrased from the spec:
TDCALL is the instruction used by the guest TD software (in TDX non-root mode) to invoke guest-side TDX functions. TDG.VP.VMCALL helps invoke services from the host VMM.
Add support for TDX guests to invoke services from the host VMM.
Eh, at some point a baseline amount of knowledge is required. I highly doubt regurgitating the spec is going to make a huge difference.
I also dislike the above wording, because it doesn't help understand _why_ KVM selftests need to support TDCALL, or _how_ the functionality will be utilized. E.g. strictly speaking, we could write KVM selftests without ever doing a single TDG.VP.VMCALL, because we control both sides (guest and VMM). And I have a hard time believing that name-dropping TDG.VP.VMCALL is going to connect the dots between TDCALL and the "tunneling" scheme defined by the GHCI for requesting emulation of "legacy" functionality.
What I would like to know is why selftests are copy-pasting the kernel's scheme for marshalling data to/from the registers used by TDCALL, how selftests are expected to utilize TDCALL, etc. I'm confident that if someone actually took the time to write a changelog explaining those details, then what TDCALL "is" will be fairly clear, even if the reader doesn't know exactly what it is.
E.g. IMO this is ugly and lazy on multiple fronts:
To give some context on why this was done this way: part of the reason for the selftests is to test the GHCI protocol itself. Some of the selftests will issue calls with purposely invalid arguments to ensure KVM handles these cases properly. For example, issuing port IO calls with sizes other than 1, 2 or 4 and ensuring we get an error on the guest side.
The code was intentionally written to be specific to TDX so we can test the TDX GHCI spec itself.
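To make the negative-test idea concrete, here is a minimal sketch of a raw call path that deliberately does no argument validation, so a test can probe which sizes the other side rejects. Everything in it is hypothetical: the `mock_*` names, the status values, and the port number are stand-ins for illustration, not the selftests' or the GHCI's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch only; not taken from any header. */
#define MOCK_VMCALL_SUCCESS		0x0ULL
#define MOCK_VMCALL_INVALID_OPERAND	0x8000000000000000ULL

/* Stand-in for the host side: accept only 1-, 2- and 4-byte port accesses. */
static uint64_t mock_host_handle_io(uint64_t size, bool write, uint64_t port,
				    uint64_t *data)
{
	(void)port;

	if (size != 1 && size != 2 && size != 4)
		return MOCK_VMCALL_INVALID_OPERAND;

	if (!write)
		*data = 0;	/* a real device would supply the value */

	return MOCK_VMCALL_SUCCESS;
}

/* Raw guest-side call: forwards its arguments verbatim, with no checks. */
static uint64_t mock_raw_io_write(uint64_t port, uint64_t size, uint64_t data)
{
	return mock_host_handle_io(size, true, port, &data);
}

/* Negative test: count how many sizes in [0, 8] the mock host rejects. */
static int count_rejected_sizes(void)
{
	int rejected = 0;

	for (uint64_t size = 0; size <= 8; size++) {
		if (mock_raw_io_write(0x3f8, size, 0xab) ==
		    MOCK_VMCALL_INVALID_OPERAND)
			rejected++;
	}

	return rejected;
}
```

The point of the raw helper is that it must stay able to express invalid requests; a wrapper that derives the size from the value's type could never generate a 3-byte access.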
As I understand it, you want the selftests to operate at a higher level and abstract away the specific GHCI details so that the code can be shared between TDX and SEV. I can refactor the code to abstract away implementation details. However, tests that want to exercise the API at a fine-grained level to test different arguments will need to define these TDCALLs themselves.
These calls were placed in a header that can be included in the guest code. I can add higher level wrappers that can be used for common code.
uint64_t tdg_vp_vmcall_ve_request_mmio_write(uint64_t address, uint64_t size, uint64_t data_in)
{
	struct tdx_tdcall_args args = {
		.r10 = TDG_VP_VMCALL,
		.r11 = TDG_VP_VMCALL_VE_REQUEST_MMIO,
		.r12 = size,
		.r13 = MMIO_WRITE,
		.r14 = address,
		.r15 = data_in,
	};

	return __tdx_tdcall(&args, 0);
}
First, these are KVM selftests, there's no need to provide a super fancy namespace because we are not "competing" with thousands upon thousands of lines of code from other components and subsystems.
Similarly, tdg_vp_vmcall_ve_request_mmio_write() is absurdly verbose. Referencing #VE in any way is also flat out wrong.
This name was taken from the GHCI spec: TDG.VP.VMCALL<#VE.RequestMMIO> ("Intel TDX Guest-Hypervisor Communication Interface v1.5" section 3.7)
It's also far too specific to TDX, which is going to be problematic when full support for SEV-ES+ selftests comes along. I.e. calling this from common code is going to be a pain in the rear, bordering on unworkable.
And related to your comment about having enums for the sizes, there's absolutely zero reason the caller should have to specify the size.
In short, don't simply copy what was done for the kernel. The kernel is operating under constraints that do not and should not ever apply to KVM selftests. Except for tests like set_memory_region_test.c that delete memslots while a vCPU is running and thus _may_ generate MMIO accesses, our selftests should never, ever take a #VE (or #VC) and then request MMIO in the handler. If a test wants to do MMIO, then do MMIO.
So, I want to see GUEST_MMIO_WRITE() and GUEST_MMIO_READ(), or probably even just MMIO_WRITE() and MMIO_READ(). And then under the hood, wire up kvm_arch_mmio_write() and kvm_arch_mmio_read() in kvm_util_arch.h. And from there have x86 globally track if it's TDX, SEV-ES+, or "normal". That'd also give us a good reason+way to assert on s390 if a test attempts MMIO, as s390 doesn't support emulated MMIO.
One potential hiccup is if/when KVM selftests get access to actual MMIO, i.e. don't want to trigger emulation, e.g. for VFIO related selftests when accessing BARs. Though the answer there is probably to just use WRITE/READ_ONCE() and call it good.
E.g.
#define MMIO_WRITE(addr, val) \
	kvm_arch_mmio_write(addr, val);

#define kvm_arch_mmio_write(addr, val) \
({ \
	if (guest_needs_tdvmcall) \
		tdx_mmio_write(addr, val, sizeof(val)); \
	else if (guest_needs_vmgexit) \
		sev_mmio_write(addr, val, sizeof(val)); \
	else \
		WRITE_ONCE(addr, val); \
})
#define MMIO_READ(addr, val) \
	kvm_arch_mmio_read(addr, val);

#define kvm_arch_mmio_read(addr, val) \
({ \
	if (guest_needs_tdvmcall) \
		tdx_mmio_read(addr, &(val), sizeof(val)); \
	else if (guest_needs_vmgexit) \
		sev_mmio_read(addr, &(val), sizeof(val)); \
	else \
		(val) = READ_ONCE(addr); \
})
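For anyone who wants to poke at the dispatch idea outside of a guest, the following is a rough, self-contained mock of it. It is only a sketch under stated assumptions: the `guest_type` enum, the `mock_mmio` buffer, the call counters, and the backend bodies are all made up for illustration (real backends would issue a TDG.VP.VMCALL or VMGEXIT), offsets into the mock buffer stand in for addresses, and the byte copies assume a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical guest flavors; assumed to be set once when the VM is created. */
enum guest_type { GUEST_NORMAL, GUEST_TDX, GUEST_SEV_ES };
static enum guest_type guest_type = GUEST_NORMAL;

/* Mock "device" memory and per-backend call counters, for illustration only. */
static uint8_t mock_mmio[64];
static int tdx_calls, sev_calls;

static void mock_store(uint64_t off, uint64_t val, size_t size)
{
	assert(size <= sizeof(val) && off + size <= sizeof(mock_mmio));
	memcpy(&mock_mmio[off], &val, size);	/* low bytes on little-endian */
}

static uint64_t mock_load(uint64_t off, size_t size)
{
	uint64_t val = 0;

	assert(size <= sizeof(val) && off + size <= sizeof(mock_mmio));
	memcpy(&val, &mock_mmio[off], size);
	return val;
}

/* Mock backends; they only count calls and touch the mock buffer. */
static void tdx_mmio_write(uint64_t off, uint64_t val, size_t size)
{
	tdx_calls++;
	mock_store(off, val, size);
}

static void tdx_mmio_read(uint64_t off, uint64_t *val, size_t size)
{
	tdx_calls++;
	*val = mock_load(off, size);
}

static void sev_mmio_write(uint64_t off, uint64_t val, size_t size)
{
	sev_calls++;
	mock_store(off, val, size);
}

static void sev_mmio_read(uint64_t off, uint64_t *val, size_t size)
{
	sev_calls++;
	*val = mock_load(off, size);
}

/* The access size comes from the type of val, never from the caller. */
#define MMIO_WRITE(off, val)						\
do {									\
	if (guest_type == GUEST_TDX)					\
		tdx_mmio_write(off, (uint64_t)(val), sizeof(val));	\
	else if (guest_type == GUEST_SEV_ES)				\
		sev_mmio_write(off, (uint64_t)(val), sizeof(val));	\
	else								\
		mock_store(off, (uint64_t)(val), sizeof(val));		\
} while (0)

#define MMIO_READ(off, val)						\
do {									\
	uint64_t v_ = 0;						\
	if (guest_type == GUEST_TDX)					\
		tdx_mmio_read(off, &v_, sizeof(val));			\
	else if (guest_type == GUEST_SEV_ES)				\
		sev_mmio_read(off, &v_, sizeof(val));			\
	else								\
		v_ = mock_load(off, sizeof(val));			\
	(val) = v_;							\
} while (0)
```

A test body would then just declare, say, a `uint32_t` and call `MMIO_WRITE(8, w)` without knowing which flavor of guest it runs in. The mock uses `do { } while (0)` rather than a statement expression so that it builds as plain C; that is a choice made for this sketch, not a suggestion for the selftests.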