Actually, the current version lets you delay the allocation until later (e.g. page fault time) if you don't call fallocate() on the private fd. fallocate() was necessary in previous versions because we treated a page's existence in the fd as meaning 'private', but in this version we track private/shared state in KVM, so we no longer rely on that information from the memory backing store.
Does this also mean that reservation of guest physical memory with the secure processor (for both SEV-SNP and TDX) will happen at page fault time?
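To illustrate the difference between eager and deferred allocation, here is a minimal userspace sketch. It assumes an ordinary memfd as a stand-in for the private fd (the real private fd is not accessed this way), and the helper names make_backing_fd(), preallocate() and allocated_bytes() are made up for the example. Skipping preallocate() leaves allocation to first touch, i.e. fault time.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for memfd_create() and fallocate() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create an anonymous backing file of `size` bytes.
 * ASSUMPTION: a plain memfd stands in for the private fd discussed here.
 * ftruncate() only sets the file size; no host pages are allocated yet.
 */
static int make_backing_fd(off_t size)
{
    assert(size > 0);

    int fd = memfd_create("guest-mem-demo", MFD_CLOEXEC);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Eagerly allocate host pages for [0, size). Without this call the
 * pages are only allocated when they are first touched.
 */
static int preallocate(int fd, off_t size)
{
    return fallocate(fd, 0, 0, size);
}

/* Bytes currently backed by host memory, as reported by fstat(). */
static long long allocated_bytes(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return -1;
    return (long long)st.st_blocks * 512;   /* st_blocks is in 512-byte units */
}
```

Calling allocated_bytes() before and after preallocate() shows where the host memory actually gets committed. Note this only covers host page allocation, not any RMP/PAMT bookkeeping.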
Do we plan to keep it this way?
If you are talking about the guest accepting memory, that is initiated by the guest and has nothing to do with page fault time vs. fallocate() allocation of host memory. I mean, acceptance happens after host memory allocation, but the two are not in lockstep; acceptance can happen much later.
No, I meant reserving the guest physical memory range from the hypervisor, e.g. with RMPUPDATE for SEV-SNP or the equivalent on the TDX side (PAMTs?).
As proposed, RMP/PAMT updates will occur in the fault path, i.e. there is no way for userspace to pre-map guest memory.
I think the best approach is to turn KVM_TDX_INIT_MEM_REGION into a generic vCPU-scoped ioctl() that allows userspace to pre-map guest memory. Supporting initializing guest private memory with a source page can be implemented via a flag. That also gives KVM line of sight to in-place "conversion", e.g. another flag could be added to say that the dest is also the source.
Questions to clarify *my* understanding here:
- Are you suggesting turning KVM_TDX_INIT_MEM_REGION into a generic ioctl that pre-maps guest private memory in addition to initializing the payload (in-place encryption, or just copying a page into guest private memory)?
- To clarify "pre-map": are you suggesting using the ioctl to avoid RMP/PAMT registration at guest page fault time, and instead pre-map guest private memory, i.e. allocate it and do the RMP/PAMT registration before running the actual guest vCPUs?
Thanks, Pankaj
The TDX and SNP restrictions would then become additional restrictions on when initializing with a source is allowed (and VMs that don't have guest private memory wouldn't allow the flag at all).
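To make the proposal a bit more concrete, below is a rough sketch of how the argument checks for such a generic pre-map ioctl() could look. Everything in it is hypothetical: struct premap_args, the PREMAP_F_* flags, struct vm_model and premap_validate() are invented for illustration and are not existing KVM uAPI. It also assumes that the "in place" flag is a modifier on top of the "init from source" flag, and that the TDX/SNP restriction boils down to "only before the guest launch is finalized".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* HYPOTHETICAL flags, not part of any existing KVM uAPI. */
#define PREMAP_F_INIT_FROM_SOURCE (1u << 0)  /* initialize from a source page */
#define PREMAP_F_IN_PLACE         (1u << 1)  /* the dest is also the source   */
#define PREMAP_F_ALL (PREMAP_F_INIT_FROM_SOURCE | PREMAP_F_IN_PLACE)

#define PREMAP_PAGE_SIZE 4096ull

/* HYPOTHETICAL argument of a generic, vCPU-scoped pre-map ioctl(). */
struct premap_args {
    uint64_t gpa;           /* start of the guest physical range */
    uint64_t nr_pages;      /* number of pages to pre-map        */
    uint64_t source_uaddr;  /* used only for a non-in-place init */
    uint32_t flags;
};

/* ASSUMED minimal model of the VM state the checks depend on. */
struct vm_model {
    bool has_private_mem;   /* VM type supports guest private memory   */
    bool launch_finalized;  /* initial image can no longer be extended */
};

static int premap_validate(const struct vm_model *vm,
                           const struct premap_args *a)
{
    assert(vm && a);

    if (a->flags & ~PREMAP_F_ALL)
        return -EINVAL;
    if (a->nr_pages == 0 || a->gpa % PREMAP_PAGE_SIZE)
        return -EINVAL;

    /* Assumption: IN_PLACE only modifies INIT_FROM_SOURCE. */
    if ((a->flags & PREMAP_F_IN_PLACE) &&
        !(a->flags & PREMAP_F_INIT_FROM_SOURCE))
        return -EINVAL;

    if (a->flags & PREMAP_F_INIT_FROM_SOURCE) {
        /* VMs without guest private memory don't allow the flag at all. */
        if (!vm->has_private_mem)
            return -EINVAL;
        /* Stand-in for the TDX/SNP "when is a source allowed" rules. */
        if (vm->launch_finalized)
            return -EPERM;
        /* A separate source page must be given unless converting in place. */
        if (!(a->flags & PREMAP_F_IN_PLACE) &&
            (a->source_uaddr == 0 || a->source_uaddr % PREMAP_PAGE_SIZE))
            return -EINVAL;
    }

    /* A plain pre-map (flags == 0) is allowed for any VM type. */
    return 0;
}
```

The point of the split is that the plain pre-map path stays generic, while the source-related flags carry all of the vendor-specific restrictions.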