The ABI is broken and we cannot support it properly. Turn it off.
If this causes a meaningful performance regression for someone, KVM can introduce an improved ABI that is supportable.
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/kvm.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 93ab0cbd304e..e6f2aefa298b 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -318,11 +318,26 @@ static void kvm_guest_cpu_init(void)
 
 		pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
 
-#ifdef CONFIG_PREEMPTION
-		pa |= KVM_ASYNC_PF_SEND_ALWAYS;
-#endif
 		pa |= KVM_ASYNC_PF_ENABLED;
 
+		/*
+		 * We do not set KVM_ASYNC_PF_SEND_ALWAYS. With the current
+		 * KVM paravirt ABI, the following scenario is possible:
+		 *
+		 * #PF: async page fault (KVM_PV_REASON_PAGE_NOT_PRESENT)
+		 *  NMI before CR2 or KVM_PF_REASON_PAGE_NOT_PRESENT
+		 *   NMI accesses user memory, e.g. due to perf
+		 *    #PF: normal page fault
+		 *     #PF reads CR2 and apf_reason -- apf_reason should be 0
+		 *
+		 *  outer #PF reads CR2 and apf_reason -- apf_reason should be
+		 *  KVM_PV_REASON_PAGE_NOT_PRESENT
+		 *
+		 * There is no possible way that both reads of CR2 and
+		 * apf_reason get the correct values. Fixing this would
+		 * require paravirt ABI changes.
+		 */
+
 		if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_VMEXIT))
 			pa |= KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT;
 