On 15.11.24 17:59, Patrick Roy wrote:
On Tue, 2024-11-12 at 14:52 +0000, David Hildenbrand wrote:
On 12.11.24 15:40, Patrick Roy wrote:
I remember talking to someone at some point about whether we could reuse the proc-local stuff for guest memory, but I cannot remember the outcome of that discussion... (or maybe I just wanted to have a discussion about it, but forgot to follow up on that thought?). I guess we wouldn't use proc-local _allocations_, but rather just set up proc-local mappings of the gmem allocations that have been removed from the direct map.
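To sketch what I mean (mm_local_map_page() below is entirely made up; only the set_memory/TLB helpers are existing kernel APIs): the allocation would stay an ordinary gmem page, and we'd only add a per-mm alias after pulling it out of the direct map, roughly like

/*
 * Rough sketch; mm_local_map_page() is hypothetical, the rest are
 * real kernel APIs (asm/set_memory.h, asm/tlbflush.h).
 */
static int gmem_setup_local_alias(struct kvm *kvm, struct page *page)
{
	unsigned long addr = (unsigned long)page_address(page);
	int err;

	/* Drop the kernel-wide direct-map alias, as secretmem does. */
	err = set_direct_map_invalid_noflush(page);
	if (err)
		return err;
	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

	/* Re-establish the mapping only in this VM's mm (made up). */
	return mm_local_map_page(kvm->mm, page) ? 0 : -ENOMEM;
}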
Yes. And likely only for memory we really access / try to access, if possible.
Well, if we start on-demand mm-local mapping the things we want to access, we're back in TLB flush hell, no?
At least the on-demand mapping shouldn't require a TLB flush? Only the unmapping would, if we want to keep a "mapped pool" of restricted size.
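The asymmetry I mean, using the direct-map helpers secretmem uses as the model (whether an mm-local variant would look the same is speculation on my part):

/* Mapping: no CPU can hold a TLB entry for a previously non-present
 * PTE, so installing one needs no flush. */
static int local_map(struct page *page)
{
	return set_direct_map_default_noflush(page);
}

/* Unmapping: stale translations may be cached on any CPU, so this is
 * where the (potentially cross-CPU) flush cost lives. */
static int local_unmap(struct page *page)
{
	unsigned long addr = (unsigned long)page_address(page);
	int err = set_direct_map_invalid_noflush(page);

	if (err)
		return err;
	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
	return 0;
}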
Anyhow, this would be a pure optimization, to avoid the expense of mapping everything when in practice you'd likely not access most of it.
(my theory, happy to be told I'm wrong :) )
And we can't know ahead of time what needs to be mapped, so everything would need to be mapped (unless we do something like mm-local mapping a page on first access, and then just never unmapping it again, under the assumption that establishing the mapping won't be expensive).
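Something like the following is what I'd imagine for the map-on-first-access variant (all names here are made up for illustration):

struct gmem_page_state {
	void *local_va;		/* mm-local alias, NULL until first access */
	struct mutex lock;	/* serializes first-time mapping */
};

/* Establish the mm-local mapping on first access and cache it; we
 * deliberately never unmap, so the TLB-flush cost is never paid. */
static void *gmem_local_va(struct gmem_page_state *ps, struct kvm *kvm,
			   struct page *page)
{
	void *va = READ_ONCE(ps->local_va);

	if (va)			/* fast path: mapped by an earlier access */
		return va;

	mutex_lock(&ps->lock);
	if (!ps->local_va)
		ps->local_va = mm_local_map_page(kvm->mm, page);
	va = ps->local_va;
	mutex_unlock(&ps->lock);

	return va;		/* NULL if the hypothetical mapping failed */
}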
Right, the whole problem is that we don't know that upfront.
I'm wondering: conceptually, how exactly would this differ from Sean's idea of messing with the CR3 register inside KVM to temporarily install page tables that contain all the gmem stuff? Wouldn't we run into the same interrupt problems that Sean foresaw for the CR3 approach? (which, admittedly, I still don't quite understand :( )
I'd need some more details on that. If anything would rely on the direct mapping (from IRQ context?) then ... we obviously cannot remove the direct mapping :)
I've talked to Fares internally, and it seems that generally doing mm-local mappings of guest memory would work for us. We also figured out what the "interrupt problem" is, namely that if we receive an interrupt while executing in a context that has mm-local mappings available, those mappings will continue to be available while the interrupt is being handled.
Isn't that likely also the case with secretmem, where we removed the direct map but have an effective per-mm mapping in the user-space portion of the page table?
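For reference, condensed from mm/secretmem.c:secretmem_fault() (error handling and the -EEXIST retry trimmed):

static vm_fault_t secretmem_fault(struct vm_fault *vmf)
{
	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
	pgoff_t offset = vmf->pgoff;
	unsigned long addr;
	struct page *page;

	page = find_lock_page(mapping, offset);
	if (!page) {
		page = alloc_page(vmf->gfp_mask | __GFP_ZERO);

		/* Remove the kernel-wide direct-map alias ... */
		set_direct_map_invalid_noflush(page);

		__SetPageUptodate(page);
		add_to_page_cache_lru(page, mapping, offset, vmf->gfp_mask);

		addr = (unsigned long)page_address(page);
		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
	}

	/* ... but hand the page to the generic fault code, which
	 * installs it into this mm's (user-space) page tables, so a
	 * per-mm alias remains even without the direct map. */
	vmf->page = page;
	return VM_FAULT_LOCKED;
}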
I'm talking to my security folks to see how much of a concern this is for the speculation hardening we're trying to achieve. Will keep you in the loop there :)
Thanks!