[PATCH 1/2] KVM: SVM: avoid infinite loop on NPF from bad address

List overview All Threads
Download

newer

older

[PATCH v2] Bluetooth: fix the...

[PATCH] infiniband: core: fix...

Paolo Bonzini

17 Apr 2020 17 Apr '20

4:38 p.m.

When a nested page fault is taken from an address that does not have a memslot associated to it, kvm_mmu_do_page_fault returns RET_PF_EMULATE (via mmu_set_spte) and kvm_mmu_page_fault then invokes svm_need_emulation_on_page_fault.

The default answer there is to return false, but in this case this just causes the page fault to be retried ad libitum. Since this is not a fast path, and the only other case where it is taken is an erratum, just stick a kvm_vcpu_gfn_to_memslot check in there to detect the common case where the erratum is not happening.

This fixes an infinite loop in the new set_memory_region_test.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com --- arch/x86/kvm/svm/svm.c | 7 +++++++ virt/kvm/kvm_main.c | 1 + 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index a91e397d6750..c86f7278509b 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3837,6 +3837,13 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu) bool smap = cr4 & X86_CR4_SMAP; bool is_user = svm_get_cpl(vcpu) == 3;

+ /* + * If RIP is invalid, go ahead with emulation which will cause an + * internal error exit. + */ + if (!kvm_vcpu_gfn_to_memslot(vcpu, kvm_rip_read(vcpu) >> PAGE_SHIFT)) + return true; + /* * Detect and workaround Errata 1096 Fam_17h_00_0Fh. * diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index e2f60e313c87..e7436d054305 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1602,6 +1602,7 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn { return __gfn_to_memslot(kvm_vcpu_memslots(vcpu), gfn); } +EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);

bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn) {

-- 2.18.2

Show replies by date

Sasha Levin

21 Apr 21 Apr

7:56 p.m.

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag fixing commit: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)").

The bot has tested the following trees: v5.6.5, v5.5.18, v5.4.33.

v5.6.5: Build OK! v5.5.18: Failed to apply! Possible dependencies: Unable to calculate

v5.4.33: Failed to apply! Possible dependencies: Unable to calculate

NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

-- Thanks Sasha

Wanpeng Li

8 Jul 8 Jul

8:17 a.m.

On Sat, 18 Apr 2020 at 00:39, Paolo Bonzini pbonzini@redhat.com wrote:

...

When a nested page fault is taken from an address that does not have a memslot associated to it, kvm_mmu_do_page_fault returns RET_PF_EMULATE (via mmu_set_spte) and kvm_mmu_page_fault then invokes svm_need_emulation_on_page_fault.

The default answer there is to return false, but in this case this just causes the page fault to be retried ad libitum. Since this is not a fast path, and the only other case where it is taken is an erratum, just stick a kvm_vcpu_gfn_to_memslot check in there to detect the common case where the erratum is not happening.

This fixes an infinite loop in the new set_memory_region_test.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com

arch/x86/kvm/svm/svm.c | 7 +++++++ virt/kvm/kvm_main.c | 1 + 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index a91e397d6750..c86f7278509b 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3837,6 +3837,13 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu) bool smap = cr4 & X86_CR4_SMAP; bool is_user = svm_get_cpl(vcpu) == 3;
  /*
   * If RIP is invalid, go ahead with emulation which will cause an
   * internal error exit.
   */
  if (!kvm_vcpu_gfn_to_memslot(vcpu, kvm_rip_read(vcpu) >> PAGE_SHIFT))
          return true;
  /*
   * Detect and workaround Errata 1096 Fam_17h_00_0Fh.
   *
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index e2f60e313c87..e7436d054305 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1602,6 +1602,7 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn { return __gfn_to_memslot(kvm_vcpu_memslots(vcpu), gfn); } +EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);

This commit incurs the linux guest fails to boot once add --overcommit cpu-pm=on or not intercept hlt instruction, any thoughts?

<...>-35787 [038] .... 2825.959082: kvm_exit: vcpu 1 reason npf rip 0xfd11d info 100000014 fd000 <...>-35788 [037] .... 2825.959082: kvm_exit: vcpu 2 reason npf rip 0xfd11d info 100000014 fd000 <...>-35789 [036] .... 2825.959082: kvm_exit: vcpu 3 reason npf rip 0xfd11d info 100000014 fd000 <...>-35788 [037] .... 2825.959082: kvm_page_fault: address fd000 error_code 14 <...>-35789 [036] .... 2825.959082: kvm_page_fault: address fd000 error_code 14 <...>-35787 [038] .... 2825.959083: kvm_page_fault: address fd000 error_code 14 <...>-35788 [037] .... 2825.959086: kvm_emulate_insn: 0:fd11d: (prot32) <...>-35788 [037] .... 2825.959086: kvm_emulate_insn: 0:fd11d: (prot32) failed <...>-35789 [036] .... 2825.959087: kvm_emulate_insn: 0:fd11d: (prot32) <...>-35789 [036] .... 2825.959087: kvm_emulate_insn: 0:fd11d: (prot32) failed <...>-35788 [037] .... 2825.959087: kvm_fpu: unload <...>-35787 [038] .... 2825.959087: kvm_emulate_insn: 0:fd11d: (prot32) <...>-35787 [038] .... 2825.959087: kvm_emulate_insn: 0:fd11d: (prot32) failed <...>-35789 [036] .... 2825.959087: kvm_fpu: unload <...>-35787 [038] .... 2825.959088: kvm_fpu: unload <...>-35788 [037] .... 2825.959088: kvm_userspace_exit: reason KVM_EXIT_INTERNAL_ERROR (17) <...>-35789 [036] .... 2825.959089: kvm_userspace_exit: reason KVM_EXIT_INTERNAL_ERROR (17) <...>-35787 [038] .... 2825.959089: kvm_userspace_exit: reason KVM_EXIT_INTERNAL_ERROR (17)

Paolo Bonzini

8:38 a.m.

On 08/07/20 10:17, Wanpeng Li wrote:

...

On Sat, 18 Apr 2020 at 00:39, Paolo Bonzini pbonzini@redhat.com wrote:

...
When a nested page fault is taken from an address that does not have a memslot associated to it, kvm_mmu_do_page_fault returns RET_PF_EMULATE (via mmu_set_spte) and kvm_mmu_page_fault then invokes svm_need_emulation_on_page_fault.

The default answer there is to return false, but in this case this just causes the page fault to be retried ad libitum. Since this is not a fast path, and the only other case where it is taken is an erratum, just stick a kvm_vcpu_gfn_to_memslot check in there to detect the common case where the erratum is not happening.

This fixes an infinite loop in the new set_memory_region_test.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com

arch/x86/kvm/svm/svm.c | 7 +++++++ virt/kvm/kvm_main.c | 1 + 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index a91e397d6750..c86f7278509b 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3837,6 +3837,13 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu) bool smap = cr4 & X86_CR4_SMAP; bool is_user = svm_get_cpl(vcpu) == 3;
  /*
   * If RIP is invalid, go ahead with emulation which will cause an
   * internal error exit.
   */
  if (!kvm_vcpu_gfn_to_memslot(vcpu, kvm_rip_read(vcpu) >> PAGE_SHIFT))
          return true;
  /*
   * Detect and workaround Errata 1096 Fam_17h_00_0Fh.
   *
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index e2f60e313c87..e7436d054305 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1602,6 +1602,7 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn { return __gfn_to_memslot(kvm_vcpu_memslots(vcpu), gfn); } +EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);
This commit incurs the linux guest fails to boot once add --overcommit cpu-pm=on or not intercept hlt instruction, any thoughts?

Can you write a selftest?

Paolo

Wanpeng Li

9:08 a.m.

On Wed, 8 Jul 2020 at 16:38, Paolo Bonzini pbonzini@redhat.com wrote:

...

On 08/07/20 10:17, Wanpeng Li wrote:

...
On Sat, 18 Apr 2020 at 00:39, Paolo Bonzini pbonzini@redhat.com wrote:

...
When a nested page fault is taken from an address that does not have a memslot associated to it, kvm_mmu_do_page_fault returns RET_PF_EMULATE (via mmu_set_spte) and kvm_mmu_page_fault then invokes svm_need_emulation_on_page_fault.

The default answer there is to return false, but in this case this just causes the page fault to be retried ad libitum. Since this is not a fast path, and the only other case where it is taken is an erratum, just stick a kvm_vcpu_gfn_to_memslot check in there to detect the common case where the erratum is not happening.

This fixes an infinite loop in the new set_memory_region_test.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com

arch/x86/kvm/svm/svm.c | 7 +++++++ virt/kvm/kvm_main.c | 1 + 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index a91e397d6750..c86f7278509b 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3837,6 +3837,13 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu) bool smap = cr4 & X86_CR4_SMAP; bool is_user = svm_get_cpl(vcpu) == 3;
  /*
   * If RIP is invalid, go ahead with emulation which will cause an
   * internal error exit.
   */
  if (!kvm_vcpu_gfn_to_memslot(vcpu, kvm_rip_read(vcpu) >> PAGE_SHIFT))
          return true;
  /*
   * Detect and workaround Errata 1096 Fam_17h_00_0Fh.
   *
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index e2f60e313c87..e7436d054305 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1602,6 +1602,7 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn { return __gfn_to_memslot(kvm_vcpu_memslots(vcpu), gfn); } +EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);
This commit incurs the linux guest fails to boot once add --overcommit cpu-pm=on or not intercept hlt instruction, any thoughts?
Can you write a selftest?

Actually I don't know what's happening here(why not intercept hlt instruction has associated with this commit), otherwise, it has already been fixed. :)

Wanpeng

Paolo Bonzini

11:10 a.m.

On 08/07/20 11:08, Wanpeng Li wrote:

...

...
...
...
+EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);

This commit incurs the linux guest fails to boot once add --overcommit cpu-pm=on or not intercept hlt instruction, any thoughts?

Can you write a selftest?

Actually I don't know what's happening here(why not intercept hlt instruction has associated with this commit), otherwise, it has already been fixed. :)

I don't understand, what has been fixed and where?

Paolo

Salvatore Bonaccorso

8 Jun 8 Jun

4:39 a.m.

Hi Paolo,

On Fri, Apr 17, 2020 at 12:38:42PM -0400, Paolo Bonzini wrote:

...

When a nested page fault is taken from an address that does not have a memslot associated to it, kvm_mmu_do_page_fault returns RET_PF_EMULATE (via mmu_set_spte) and kvm_mmu_page_fault then invokes svm_need_emulation_on_page_fault.

The default answer there is to return false, but in this case this just causes the page fault to be retried ad libitum. Since this is not a fast path, and the only other case where it is taken is an erratum, just stick a kvm_vcpu_gfn_to_memslot check in there to detect the common case where the erratum is not happening.

This fixes an infinite loop in the new set_memory_region_test.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com

arch/x86/kvm/svm/svm.c | 7 +++++++ virt/kvm/kvm_main.c | 1 + 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index a91e397d6750..c86f7278509b 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3837,6 +3837,13 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu) bool smap = cr4 & X86_CR4_SMAP; bool is_user = svm_get_cpl(vcpu) == 3;
/*
* If RIP is invalid, go ahead with emulation which will cause an
* internal error exit.
*/
if (!kvm_vcpu_gfn_to_memslot(vcpu, kvm_rip_read(vcpu) >> PAGE_SHIFT))
return true;
/*

Detect and workaround Errata 1096 Fam_17h_00_0Fh.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index e2f60e313c87..e7436d054305 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1602,6 +1602,7 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn { return __gfn_to_memslot(kvm_vcpu_memslots(vcpu), gfn); } +EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot); bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn) { -- 2.18.2

I noticed that this patch, whilst beeing CC'ed for stable, appers to not have been applied to stable branches back then. There was first a mail from Sasha's bot that patches do not apply and Wanpeng Li.

Did this simply felt through the cracks here or is it not worth backporting to older series? At least https://bugzilla.redhat.com/show_bug.cgi?id=1947982#c3 seem to indicate it might not be worth of if there is risk for regression if I understand Wanpeng Li. Is this right?

Regards, Salvatore

Paolo Bonzini

7:17 a.m.

On 08/06/21 06:39, Salvatore Bonaccorso wrote:

...

Did this simply felt through the cracks here or is it not worth backporting to older series? At least https://bugzilla.redhat.com/show_bug.cgi?id=1947982#c3 seem to indicate it might not be worth of if there is risk for regression if I understand Wanpeng Li. Is this right?

It's not particularly interesting, because the loop can be broken with just Ctrl-C (or any signal for that matter) and the guest was misbehaving anyway. You can read from that bugzilla link my opinion on this "vulnerability": if you run a VM for somebody and they want to waste your CPU time, they can just run a while(1) loop.

It's a bug and it is caught by the kvm-unit-tests, so I marked it for stable at the time because it can be useful to run kvm-unit-tests on stable kernels and hanging is a bit impolite (the test harness has a timeout, but of course tests that hang have the risk missing other regressions).

I will review gladly a backport, but if it is just because of that CVE report, documenting that the vulnerability is bogus would be time spent better that doing and testing the backport.

Paolo

1661

days inactive

2078

days old

linux-stable-mirror@lists.linaro.org

7 comments

participants

tags (0)

participants (4)

Paolo Bonzini
Salvatore Bonaccorso
Sasha Levin
Wanpeng Li