Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS

9 Apr 2020

On 09/04/2020 05:50, Andy Lutomirski wrote:
...
On Wed, Apr 8, 2020 at 11:01 AM Thomas Gleixner <tglx@linutronix.de> wrote:
...
Paolo Bonzini <pbonzini@redhat.com> writes:
...
On 08/04/20 17:34, Sean Christopherson wrote:
...
On Wed, Apr 08, 2020 at 10:23:58AM +0200, Paolo Bonzini wrote:
...
Page-not-present async page faults are almost a perfect match for the
hardware use of #VE (and it might even be possible to let the processor
deliver the exceptions).
My "async" page fault knowledge is limited, but if the desired behavior is
to reflect a fault into the guest for select EPT Violations, then yes,
enabling EPT Violation #VEs in hardware is doable.  The big gotcha is that
KVM needs to set the suppress #VE bit for all EPTEs when allocating a new
MMU page, otherwise not-present faults on zero-initialized EPTEs will get
reflected.
Attached a patch that does the prep work in the MMU.  The VMX usage would be:
 kvm_mmu_set_spte_init_value(VMX_EPT_SUPPRESS_VE_BIT);

when EPT Violation #VEs are enabled.  It's 64-bit only as it uses stosq to
initialize EPTEs.  32-bit could also be supported by doing memcpy() from
a static page.
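A minimal sketch of the initialization described above, using a plain loop
instead of stosq so it also builds on 32-bit.  The helper names
(set_spte_init_value, init_ept_page, epte_would_raise_ve) are hypothetical
and are not the functions from the attached patch; the only architectural
fact assumed is that bit 63 of an EPT entry is the suppress #VE bit.  Real
hardware also checks the "EPT-violation #VE" execution control and other
conditions, which this model ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 63 of an EPT entry is the "suppress #VE" bit. */
#define EPT_SUPPRESS_VE_BIT (1ULL << 63)
#define EPTES_PER_PAGE      512

/* Hypothetical stand-in for the configurable init value. */
static uint64_t spte_init_value;

static void set_spte_init_value(uint64_t value)
{
    spte_init_value = value;
}

/* Fill a freshly allocated page-table page with the init value. */
static void init_ept_page(uint64_t *page)
{
    assert(page != NULL);
    for (size_t i = 0; i < EPTES_PER_PAGE; i++)
        page[i] = spte_init_value;
}

/*
 * Simplified model: a not-present EPTE can be reflected as #VE only
 * when its suppress #VE bit is clear.
 */
static int epte_would_raise_ve(uint64_t epte)
{
    return (epte & EPT_SUPPRESS_VE_BIT) == 0;
}
```

With the init value left at 0, every entry of a new page would be eligible
for #VE; after set_spte_init_value(EPT_SUPPRESS_VE_BIT), none would be until
KVM explicitly clears the bit in an entry.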
The complication is that (at least according to the current ABI) we
would not want #VE to kick if the guest currently has IF=0 (and possibly
CPL=0).  But the ABI is not set in stone, and anyway the #VE protocol is
a decent one and worth using as a base for whatever PV protocol we design.
Forget the current async PF semantics (or the lack thereof). You really want
to start from scratch and ignore the whole thing.
The charm of #VE is that the hardware can inject it and it does not nest
until the guest has cleared the second word in the VE information area. If
that word is not 0 then you get a regular vmexit where you suspend the
vcpu until the nested problem is solved.
Can you point me at where the SDM says this?
Vol3 25.5.6.1 Convertible EPT Violations
...
Anyway, I see two problems with #VE, one big and one small.  The small
(or maybe small) one is that any fancy protocol where the guest
returns from an exception by doing, logically:
Hey I'm done;  /* MOV somewhere, hypercall, MOV to CR4, whatever */
IRET;
is fundamentally racy.  After we say we're done and before IRET, we
can be recursively reentered.  Hi, NMI!
Correct.  There is no way to atomically end the #VE handler.  (This
causes "fun" even when using #VE for its intended purpose.)
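A toy model of that window, for anyone following along.  It tracks only the
busy word and how many #VE handlers have been entered without reaching
IRET; all names are made up for illustration, and the "NMI" is simply a
second fault injected between the done step and the IRET step.

```c
#include <assert.h>
#include <stdint.h>

struct ve_state {
    uint32_t busy;      /* models the second word of the info area */
    int depth;          /* #VE handlers entered but not yet IRETed */
    int max_depth;      /* deepest nesting observed */
};

/* Returns 1 if a #VE is delivered, 0 if it would be a VM exit. */
static int fault(struct ve_state *s)
{
    if (s->busy)
        return 0;
    s->busy = 0xFFFFFFFFu;
    s->depth++;
    if (s->depth > s->max_depth)
        s->max_depth = s->depth;
    return 1;
}

static void say_done(struct ve_state *s) { s->busy = 0; }

static void iret(struct ve_state *s)
{
    assert(s->depth > 0);
    s->depth--;
}

/*
 * Run one #VE handler.  If nmi_in_window is set, an NMI whose handler
 * faults arrives after "done" but before IRET.
 */
static int run_handler(struct ve_state *s, int nmi_in_window)
{
    int delivered = fault(s);
    assert(delivered);              /* caller starts with busy == 0 */
    say_done(s);
    if (nmi_in_window && fault(s)) {
        say_done(s);                /* nested handler runs to completion */
        iret(s);
    }
    iret(s);
    return s->max_depth;
}
```

Without the NMI the handler never nests; with it, a second #VE is delivered
while the first handler is still on the stack, because the busy word was
already cleared.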
~Andrew
