The following commit fixes freezes in virtio device drivers when KVM is nested under VMWare Workstation/ESXi or Hyper-V. I've encountered problems running KVM inside VMWare since upgrading to Debian 9 (currently testing 4.9.88-1+deb9u1).
d391f1207067268261add0485f0f34503539c5b0
The same issue affects 4.4.y as well. A git-bisect within my environment stopped at e9ea5069d9e569c32ab913c39467df32e056b3a7, where the KVM capability was added that QEMU checks before enabling fast mmio.
Thanks, Mike
On Fri, May 25, 2018 at 09:03:07AM -0400, Mike Haboustak wrote:
> The following commit fixes freezes in virtio device drivers when KVM is nested under VMWare Workstation/ESXi or Hyper-V. I've encountered problems running KVM inside VMWare since upgrading to Debian 9 (currently testing 4.9.88-1+deb9u1).
> d391f1207067268261add0485f0f34503539c5b0
> The same issue affects 4.4.y as well. A git-bisect within my environment stopped at e9ea5069d9e569c32ab913c39467df32e056b3a7, where the KVM capability was added that QEMU checks before enabling fast mmio.
The patch does not apply to 4.9.y or 4.4.y, can you please provide a working backport? I will be glad to queue it up then.
thanks,
greg k-h
On Sat, May 26, 2018 at 03:22:26PM +0200, Greg KH wrote:
> On Fri, May 25, 2018 at 09:03:07AM -0400, Mike Haboustak wrote:
> > The following commit fixes freezes in virtio device drivers when KVM is nested under VMWare Workstation/ESXi or Hyper-V. I've encountered problems running KVM inside VMWare since upgrading to Debian 9 (currently testing 4.9.88-1+deb9u1).
> > d391f1207067268261add0485f0f34503539c5b0
> > The same issue affects 4.4.y as well. A git-bisect within my environment stopped at e9ea5069d9e569c32ab913c39467df32e056b3a7, where the KVM capability was added that QEMU checks before enabling fast mmio.
> The patch does not apply to 4.9.y or 4.4.y, can you please provide a working backport? I will be glad to queue it up then.
Greg,
Sorry for taking so long to get back to you. I wanted to ensure the backport was tested before sending, and it took me a while to prioritize it.
I've run the patch I'll be sending in 4.9.y extensively, and in 4.4.y enough to be confident that the issue is resolved.
Thanks, mike
[ Backport of upstream commit d391f1207067268261add0485f0f34503539c5b0 ]
I was investigating an issue with seabios >= 1.10 which stopped working for nested KVM on Hyper-V. The problem appears to be in handle_ept_violation() function: when we do fast mmio we need to skip the instruction so we do kvm_skip_emulated_instruction(). This, however, depends on VM_EXIT_INSTRUCTION_LEN field being set correctly in VMCS. However, this is not the case.
Intel's manual doesn't mandate VM_EXIT_INSTRUCTION_LEN to be set when EPT MISCONFIG occurs. While on real hardware it was observed to be set, some hypervisors follow the spec and don't set it; we end up advancing IP with some random value.
I checked with Microsoft and they confirmed they don't fill VM_EXIT_INSTRUCTION_LEN on EPT MISCONFIG.
Fix the issue by doing instruction skip through emulator when running nested.
Fixes: 68c3b4d1676d870f0453c31d5a52e7e65c7448ae
Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
[mhaboustak: backport to 4.9.y]
Signed-off-by: Mike Haboustak <haboustak@gmail.com>
---
 arch/x86/kvm/vmx.c | 19 +++++++++++++++++--
 arch/x86/kvm/x86.c |  3 ++-
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 011050820608..9446a3a2fc69 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6548,9 +6548,24 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
 
 	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
 	if (!kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
-		skip_emulated_instruction(vcpu);
 		trace_kvm_fast_mmio(gpa);
-		return 1;
+		/*
+		 * Doing kvm_skip_emulated_instruction() depends on undefined
+		 * behavior: Intel's manual doesn't mandate
+		 * VM_EXIT_INSTRUCTION_LEN to be set in VMCS when EPT MISCONFIG
+		 * occurs and while on real hardware it was observed to be set,
+		 * other hypervisors (namely Hyper-V) don't set it, we end up
+		 * advancing IP with some random value. Disable fast mmio when
+		 * running nested and keep it for real hardware in hope that
+		 * VM_EXIT_INSTRUCTION_LEN will always be set correctly.
+		 */
+		if (!static_cpu_has(X86_FEATURE_HYPERVISOR)) {
+			skip_emulated_instruction(vcpu);
+			return 1;
+		}
+		else
+			return x86_emulate_instruction(vcpu, gpa, EMULTYPE_SKIP,
+						       NULL, 0) == EMULATE_DONE;
 	}
 
 	ret = handle_mmio_page_fault(vcpu, gpa, true);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 27d13b870e07..46e0ad71b4da 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5707,7 +5707,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		 * handle watchpoints yet, those would be handled in
 		 * the emulate_ops.
 		 */
-		if (kvm_vcpu_check_breakpoint(vcpu, &r))
+		if (!(emulation_type & EMULTYPE_SKIP) &&
+		    kvm_vcpu_check_breakpoint(vcpu, &r))
 			return r;
 
 		ctxt->interruptibility = 0;
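For readers following the thread, the idea behind the fix can be sketched outside the kernel. The snippet below is a toy model only, not kernel code: struct toy_vcpu, toy_decode_len() and toy_skip_insn() are hypothetical names invented for illustration, and the "decoder" understands just one instruction form (an optional 0x66 prefix followed by opcode 0x89 and a ModRM byte). It shows the same decision the patch makes: trust the exit-provided instruction length on bare metal, and derive the length by decoding the instruction when running under another hypervisor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; every name here is hypothetical, not a KVM symbol. */
struct toy_vcpu {
	uint64_t rip;           /* guest instruction pointer */
	uint32_t exit_insn_len; /* stands in for VM_EXIT_INSTRUCTION_LEN */
	bool nested;            /* stands in for X86_FEATURE_HYPERVISOR */
};

/*
 * Stand-in for the emulator's decoder. It only recognizes
 * [0x66] 0x89 <ModRM> and returns 0 for anything else.
 */
static uint32_t toy_decode_len(const uint8_t *insn)
{
	uint32_t len = 0;

	if (insn[len] == 0x66)	/* optional operand-size prefix */
		len++;
	if (insn[len] != 0x89)	/* unknown opcode: cannot decode */
		return 0;
	return len + 2;		/* opcode byte + ModRM byte */
}

/* Advance rip past the faulting instruction; false means "could not skip". */
static bool toy_skip_insn(struct toy_vcpu *v, const uint8_t *insn)
{
	uint32_t len;

	if (!v->nested) {
		/* Bare metal: the length field was observed to be valid. */
		v->rip += v->exit_insn_len;
		return true;
	}

	/* Nested: the field may hold garbage, so decode the length instead. */
	len = toy_decode_len(insn);
	if (!len)
		return false;
	v->rip += len;
	return true;
}
```

With nested set and exit_insn_len holding a garbage value, calling toy_skip_insn() on the bytes 66 89 03 advances rip by 3 regardless of the stale field, which is the behavior the emulator-based skip is meant to provide.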
On Sun, Jan 06, 2019 at 01:57:24PM -0500, Mike Haboustak wrote:
> [ Backport of upstream commit d391f1207067268261add0485f0f34503539c5b0 ]
>
> I was investigating an issue with seabios >= 1.10 which stopped working for nested KVM on Hyper-V. The problem appears to be in handle_ept_violation() function: when we do fast mmio we need to skip the instruction so we do kvm_skip_emulated_instruction(). This, however, depends on VM_EXIT_INSTRUCTION_LEN field being set correctly in VMCS. However, this is not the case.
>
> Intel's manual doesn't mandate VM_EXIT_INSTRUCTION_LEN to be set when EPT MISCONFIG occurs. While on real hardware it was observed to be set, some hypervisors follow the spec and don't set it; we end up advancing IP with some random value.
>
> I checked with Microsoft and they confirmed they don't fill VM_EXIT_INSTRUCTION_LEN on EPT MISCONFIG.
>
> Fix the issue by doing instruction skip through emulator when running nested.
>
> Fixes: 68c3b4d1676d870f0453c31d5a52e7e65c7448ae
> Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
> [mhaboustak: backport to 4.9.y]
> Signed-off-by: Mike Haboustak <haboustak@gmail.com>
>
>  arch/x86/kvm/vmx.c | 19 +++++++++++++++++--
>  arch/x86/kvm/x86.c |  3 ++-
>  2 files changed, 19 insertions(+), 3 deletions(-)
Now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org