On Wed, Aug 31, 2022 at 10:12:12AM +0100, Fuad Tabba wrote:
Moreover, something which was discussed here before [3] is the ability to share in-place. For pKVM/arm64, the conversion between shared and private involves only changes to the stage-2 page tables, which are controlled by the hypervisor. Android supports this in-place conversion already, and I think that the cost of copying would be big for the many use-cases that involve large amounts of data. We will measure the relative costs in due course, but in the meantime we’re nervous about adopting a new user ABI which doesn’t appear to cater for in-place conversion; having just the fd would simplify that somewhat.
I understand it is difficult to achieve that with the current private_fd + userspace_addr (they are basically in two separate fds), but is it possible for pKVM to extend this? Brainstorming, for example: pKVM could ignore userspace_addr and use only private_fd to cover both shared and private memory, or pKVM could introduce a new KVM memslot flag?
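To make the two options above concrete, here is a minimal sketch of how a memslot might select its backing. Everything in it is hypothetical and for illustration only: the struct sketch_memslot layout, the SKETCH_MEM_* flags and sketch_backing_for() are not the real KVM uAPI. Only the userspace_addr and private_fd names come from the discussion; private_offset is an assumed field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags for illustration; NOT the real KVM uAPI values. */
#define SKETCH_MEM_PRIVATE  (1u << 0)  /* slot carries a private_fd */
#define SKETCH_MEM_FD_ONLY  (1u << 1)  /* assumed new flag: fd backs shared pages too */

/* Hypothetical memslot: one address handle plus one fd handle. */
struct sketch_memslot {
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;  /* shared memory: host virtual address */
	int32_t  private_fd;      /* private memory: file descriptor */
	uint64_t private_offset;  /* assumed: offset of the slot within private_fd */
};

enum sketch_backing {
	SKETCH_BACKING_HVA,  /* resolve the page through userspace_addr */
	SKETCH_BACKING_FD,   /* resolve the page through private_fd */
};

/*
 * Pick the handle that backs a page, given whether the guest currently
 * treats that page as private.
 */
static enum sketch_backing
sketch_backing_for(const struct sketch_memslot *slot, bool page_is_private)
{
	assert(slot != NULL);

	/* pKVM-style option: the fd covers shared and private pages alike. */
	if (slot->flags & SKETCH_MEM_FD_ONLY)
		return SKETCH_BACKING_FD;

	/* Two-handle option: only private pages come from the fd. */
	if ((slot->flags & SKETCH_MEM_PRIVATE) && page_is_private)
		return SKETCH_BACKING_FD;

	return SKETCH_BACKING_HVA;
}
```

With the assumed fd-only flag set, a shared-to-private conversion never changes which handle backs the page, which is the property in-place conversion needs. Without it, a conversion moves the page between two different backings.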
It's not that there's anything blocking pKVM from doing that. It's that the disconnect of using a memory address for the shared memory and a file descriptor for the private memory doesn't really make sense for pKVM. I see how it makes sense for TDX and the Intel-specific implementation. It just seems that this is baking an implementation-specific aspect into the general KVM API, and the worry is that this might have some unintended consequences in the future.
It's true this API originates from supporting TDX and probably other similar confidential computing (CC) technologies. But if we ever get a chance to make it more general so that it covers more usages like pKVM, I would like to do that as well. The challenge is that pKVM diverges a lot from the CC usages; putting both shared and private memory in the same fd complicates the CC usages. If the two things are different enough, I also think an implementation-specific approach may not be that bad.
Chao