On Wed, Jul 06, 2022 at 04:20:02PM +0800, Chao Peng wrote:
This is v7 of this series, which tries to implement fd-based KVM guest private memory. The patches are based on the latest kvm/queue branch commit:
b9b71f43683a (kvm/queue) KVM: x86/mmu: Buffer nested MMU split_desc_cache only by default capacity
Introduction
In general, this patch series introduces an fd-based memslot, which provides guest memory through a memory file descriptor fd[offset,size] instead of hva/size. The fd can be created from a supported memory filesystem such as tmpfs or hugetlbfs, which we refer to as the memory backing store. KVM and the memory backing store exchange callbacks when such a memslot is created. At runtime, KVM calls into callbacks provided by the backing store to get the pfn for a given fd+offset. The memory backing store also calls into KVM callbacks when userspace punches a hole in the fd, to notify KVM to unmap secondary MMU page table entries.
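The callback exchange described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name in it (memslot_sketch, backing_store_ops, kvm_notifier_ops, toy_store and the helper functions) is hypothetical and is not taken from the patches, and the invalidate callback is simplified to count pages rather than unmap anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the interfaces from the series. */
#define SKETCH_PAGE_SHIFT 12
#define TOY_PAGES 8

struct memslot_sketch;

/* Callbacks the backing store hands to KVM when the memslot is created. */
struct backing_store_ops {
    /* Translate an offset within the fd into a pfn; returns 0 on success. */
    int (*get_pfn)(void *store, uint64_t offset, uint64_t *pfn);
};

/* Callbacks KVM hands to the backing store when the memslot is created. */
struct kvm_notifier_ops {
    /* Called after userspace punches a hole in [offset, offset + size). */
    void (*invalidate)(struct memslot_sketch *slot, uint64_t offset,
                       uint64_t size);
};

/* An fd-based memslot: guest frames map to fd[offset, size], not hva/size. */
struct memslot_sketch {
    uint64_t base_gfn;
    uint64_t npages;
    uint64_t fd_offset;
    void *store;
    const struct backing_store_ops *store_ops;
    const struct kvm_notifier_ops *kvm_ops;
    uint64_t invalidated_pages; /* bookkeeping for this sketch only */
};

/* Toy backing store standing in for a tmpfs/hugetlbfs file. */
struct toy_store {
    bool present[TOY_PAGES];
    uint64_t first_pfn;
    struct memslot_sketch *bound_slot;
};

static int toy_get_pfn(void *store, uint64_t offset, uint64_t *pfn)
{
    struct toy_store *ts = store;
    uint64_t idx = offset >> SKETCH_PAGE_SHIFT;

    if (idx >= TOY_PAGES || !ts->present[idx])
        return -1;
    *pfn = ts->first_pfn + idx;
    return 0;
}

/* KVM side: a real implementation would zap secondary MMU entries here. */
static void toy_kvm_invalidate(struct memslot_sketch *slot, uint64_t offset,
                               uint64_t size)
{
    (void)offset;
    slot->invalidated_pages += size >> SKETCH_PAGE_SHIFT;
}

/* KVM side: resolve a gfn through fd+offset instead of through an hva. */
static int slot_gfn_to_pfn(struct memslot_sketch *slot, uint64_t gfn,
                           uint64_t *pfn)
{
    uint64_t offset;

    assert(slot != NULL && pfn != NULL);
    if (gfn < slot->base_gfn || gfn - slot->base_gfn >= slot->npages)
        return -1;
    offset = slot->fd_offset +
             ((gfn - slot->base_gfn) << SKETCH_PAGE_SHIFT);
    return slot->store_ops->get_pfn(slot->store, offset, pfn);
}

/* Backing store side: drop the pages, then notify KVM via its callback. */
static void toy_punch_hole(struct toy_store *ts, uint64_t offset,
                           uint64_t size)
{
    uint64_t first = offset >> SKETCH_PAGE_SHIFT;
    uint64_t last = (offset + size) >> SKETCH_PAGE_SHIFT;

    for (uint64_t i = first; i < last && i < TOY_PAGES; i++)
        ts->present[i] = false;
    if (ts->bound_slot)
        ts->bound_slot->kvm_ops->invalidate(ts->bound_slot, offset, size);
}
```

In this model, a lookup goes from KVM to the backing store (slot_gfn_to_pfn calls get_pfn), while a hole punch goes the other way (toy_punch_hole calls invalidate), which mirrors the two directions of the exchange in the paragraph above.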
Compared to the existing hva-based memslot, this new type of memslot allows guest memory to be unmapped from host userspace such as QEMU, and even from the kernel itself, thereby reducing the attack surface and preventing bugs.
Based on this fd-based memslot, we can build guest private memory that is going to be used in confidential computing environments such as Intel TDX and AMD SEV. When supported, the memory backing store can provide more enforcement on the fd and KVM can use a single memslot to hold both the private and shared parts of the guest memory.
Hi everyone,
Just wanted to let you all know that I reserved a slot at the LPC Confidential Computing Microconference to discuss some topics related to unmapped/inaccessible private memory support:
"Unmapped Private Memory for Confidential Guests"
Tuesday, Sep 13th, 10:00am (Dublin time)
https://lpc.events/event/16/sessions/133/#20220913
The discussion agenda is still a bit in flux, but one topic I really wanted to cover is how we intend to deal with the kernel directmap for TDX/SNP, where there is a need to either remove or split mappings so that KVM or other kernel threads writing to non-private pages don't run into issues due to mappings overlapping with private pages.[1]
Other possible discussion topics:
- guarding against shared->private conversions while KVM is attempting to access a shared page (separate PFN pools for shared/private seems to resolve this nicely, but may not be compatible with things like pKVM where the underlying PFN is the same for shared/private)[2]
- extending KVM_EXIT_MEMORY_FAULT to handle batched requests to better handle things like explicit batched conversions initiated by the guest
It's a short session so not sure how much time we'll actually have to discuss things in detail, but maybe this can at least be a good jumping off point for other discussions.
Thanks, and hope to see you there!
[1] https://lore.kernel.org/all/YWb8WG6Ravbs1nbx@google.com/
[2] https://lore.kernel.org/lkml/CA+EHjTy6NF=BkCqK0vhXLdtKZMahp55JUMSfxN96-NT3Yi...