On 08/04/20 15:01, Thomas Gleixner wrote:
And it comes with restrictions:
The Do Other Stuff event can only be delivered when guest IF=1. If guest IF=0 then the host has to suspend the guest until the situation is resolved. The 'Situation resolved' event must also wait for a guest IF=1 slot.
Additionally:
- the do other stuff event must be delivered to the same CPU that is causing the host-side page fault
- the do other stuff event provides a token that identifies the cause and the situation resolved event provides a matching token
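To make the restrictions above concrete, here is a minimal user-space sketch of the delivery rules: same CPU, guest IF=1, and a token that the later 'situation resolved' event has to match. Everything in it (struct vcpu_sim, struct guest_sim, deliver_do_other_stuff(), deliver_resolved(), the guest-wide token table) is a hypothetical model for illustration, not KVM code. A false return stands for "the host has to suspend the vCPU or retry later".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* Hypothetical model state; none of these names come from KVM. */
struct vcpu_sim {
	int  id;
	bool if_flag;                   /* models guest IF */
};

struct guest_sim {
	uint32_t pending[MAX_PENDING];  /* outstanding tokens, 0 = free slot */
};

/*
 * 'Do other stuff' event: only deliverable to the CPU that caused the
 * host-side fault, and only while that CPU has IF=1.
 */
bool deliver_do_other_stuff(struct guest_sim *g, const struct vcpu_sim *v,
			    int faulting_cpu, uint32_t token)
{
	assert(token != 0);             /* 0 is reserved for "free slot" */

	if (v->id != faulting_cpu || !v->if_flag)
		return false;           /* host must suspend the vCPU instead */

	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (g->pending[i] == 0) {
			g->pending[i] = token;
			return true;
		}
	}
	return false;                   /* table full: fall back to suspend */
}

/*
 * 'Situation resolved' event: also needs an IF=1 slot, and its token
 * must match one handed out earlier.
 */
bool deliver_resolved(struct guest_sim *g, const struct vcpu_sim *v,
		      uint32_t token)
{
	if (!v->if_flag)
		return false;           /* wait for an IF=1 slot */

	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (g->pending[i] == token) {
			g->pending[i] = 0;
			return true;
		}
	}
	return false;                   /* unknown token */
}
```

The sketch keeps the tokens in one guest-wide table so that it does not assume anything about which CPU receives the 'situation resolved' event; that is a modelling choice, not something stated above.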
This stuff is why I think the do other stuff event looks very much like a #VE. But I think we're in violent agreement after all.
If you just want to solve Vivek's problem, then it's good enough. I.e. the file truncation turns the EPT entries into #VE convertible entries and the guest #VE handler can figure it out. This one can be injected directly by the hardware, i.e. you don't need a VMEXIT.
If you want the opportunistic do other stuff mechanism, then #VE has exactly the same problems as the existing async "PF". It's not magically making that go away.
You can inject #VE from the hypervisor too, with PV magic to distinguish the two. However that's not necessarily a good idea because it makes it harder to switch to hardware delivery in the future.
One possible solution might be to make all recoverable EPT entries convertible and let the HW inject #VE for those.
So the #VE handler in the guest would have to do:
	if (!recoverable()) {
		if (user_mode)
			send_signal();
		else if (!fixup_exception())
			die_hard();
		goto done;
	}

	store_ve_info_in_pv_page();

	if (!user_mode(regs) || !preemptible()) {
		hypercall_resolve_ept(can_continue = false);
	} else {
		init_completion();
		hypercall_resolve_ept(can_continue = true);
		wait_for_completion();
	}
or something like that.
Yes, pretty much. The VE info can also be passed down to the hypercall as arguments.
Paolo
The hypercall to resolve the EPT fail on the host acts on the can_continue argument.
If false, it suspends the guest vCPU and only returns when done.
If true it kicks the resolve process and returns to the guest which suspends the task and tries to do something else.
The wakeup side needs to be a regular interrupt and cannot go through #VE.
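As a rough illustration of the two can_continue paths and the interrupt-based wakeup, the host side can be modelled in plain C as below. This is only a sketch under assumptions: struct host_sim, host_resolve_done() and guest_take_wakeup_irq() are made-up names, the actual fault resolution is reduced to a counter, and hypercall_resolve_ept() merely borrows the name from the pseudocode above.

```c
#include <stdbool.h>

enum vcpu_state { VCPU_RUNNING, VCPU_SUSPENDED };

/* Hypothetical host-side model; not KVM data structures. */
struct host_sim {
	enum vcpu_state vcpu;
	bool resolve_in_flight;   /* async resolve kicked, not yet finished */
	bool wakeup_irq_pending;  /* regular interrupt queued for the guest */
	int  faults_resolved;     /* stands in for the real EPT fixup work */
};

/*
 * Returns true if the fault is already resolved when the hypercall
 * returns, false if the guest has to wait for the wakeup interrupt.
 */
bool hypercall_resolve_ept(struct host_sim *h, bool can_continue)
{
	if (!can_continue) {
		/* Suspend the vCPU and only return when done. */
		h->vcpu = VCPU_SUSPENDED;
		h->faults_resolved++;
		h->vcpu = VCPU_RUNNING;
		return true;
	}

	/* Kick the resolve process and return to the guest right away. */
	h->resolve_in_flight = true;
	return false;
}

/* Host finished the asynchronous resolve: queue a regular interrupt. */
void host_resolve_done(struct host_sim *h)
{
	if (!h->resolve_in_flight)
		return;

	h->resolve_in_flight = false;
	h->faults_resolved++;
	h->wakeup_irq_pending = true;   /* not a #VE */
}

/*
 * Guest interrupt handler side: returns true if a wakeup was consumed,
 * which is where the waiting task's completion would be signalled.
 */
bool guest_take_wakeup_irq(struct host_sim *h)
{
	if (!h->wakeup_irq_pending)
		return false;

	h->wakeup_irq_pending = false;
	return true;
}
```

In the guest handler pseudocode, the true return from guest_take_wakeup_irq() would correspond to the point where the task blocked in wait_for_completion() gets woken.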