Changes since v3 [1]:

* Drop 'ifence_array_ptr' and associated compile-time + run-time switching and just use the masking approach all the time.
* Convert 'get_user' to use pointer sanitization via masking rather than lfence. '__get_user' and associated paths still rely on lfence. (Linus)
"Basically, the rule is trivial: find all 'stac' users, and use address masking if those users already integrate the limit check, and lfence they don't."
* At syscall entry sanitize the syscall number under speculation to remove a user controlled pointer de-reference in kernel space. (Linus)
* Fix a raw lfence in the kvm code (added for v4.15-rc8) to use 'array_ptr'.
* Propose 'array_idx' as a way to sanitize user input that is later used as an array index, but where the validation is happening in a different code block than the array reference. (Christian).
* Fix grammar in speculation.txt (Kees)
---
Quoting Mark's original RFC:
"Recently, Google Project Zero discovered several classes of attack against speculative execution. One of these, known as variant-1, allows explicit bounds checks to be bypassed under speculation, providing an arbitrary read gadget. Further details can be found on the GPZ blog [2] and the Documentation patch in this series."
A precondition of using this attack on the kernel is to get a user-controlled pointer de-referenced (under speculation) in privileged code. The primary source of user-controlled pointers in the kernel is the arguments passed to 'get_user' and '__get_user'. Other examples of user-controlled pointers are user-controlled array / pointer offsets.
Better tooling is needed to find more arrays / pointers with user controlled indices / offsets that can be converted to use 'array_ptr' or 'array_idx'. A few are included in this set, and these are not expected to be complete. That said, the 'get_user' protections raise the bar on finding a vulnerable gadget in the kernel.
These patches are also available via the 'nospec-v4' git branch here:
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v4
Note that the BPF fix for Spectre variant1 is merged for 4.15-rc8.
[2]: https://googleprojectzero.blogspot.co.uk/2018/01/reading-privileged-memory-w...
---
Dan Williams (9):
      asm/nospec, array_ptr: sanitize speculative array de-references
      x86: implement array_ptr_mask()
      x86: introduce __uaccess_begin_nospec and ifence
      x86, __get_user: use __uaccess_begin_nospec
      x86, get_user: use pointer masking to limit speculation
      x86: narrow out of bounds syscalls to sys_read under speculation
      vfs, fdtable: prevent bounds-check bypass via speculative execution
      kvm, x86: fix spectre-v1 mitigation
      nl80211: sanitize array index in parse_txq_params

Mark Rutland (1):
      Documentation: document array_ptr

 Documentation/speculation.txt     | 143 +++++++++++++++++++++++++++++++++++++
 arch/x86/entry/entry_64.S         |   2 +
 arch/x86/include/asm/barrier.h    |  28 +++
 arch/x86/include/asm/msr.h        |   3 -
 arch/x86/include/asm/smap.h       |  24 ++++++
 arch/x86/include/asm/uaccess.h    |  15 +++-
 arch/x86/include/asm/uaccess_32.h |   6 +-
 arch/x86/include/asm/uaccess_64.h |  12 ++-
 arch/x86/kvm/vmx.c                |  19 ++--
 arch/x86/lib/getuser.S            |   5 +
 arch/x86/lib/usercopy_32.c        |   8 +-
 include/linux/fdtable.h           |   7 +-
 include/linux/nospec.h            |  65 +++++++++++++++++
 net/wireless/nl80211.c            |  10 ++-
 14 files changed, 312 insertions(+), 35 deletions(-)
 create mode 100644 Documentation/speculation.txt
 create mode 100644 include/linux/nospec.h
Commit 75f139aaf896 "KVM: x86: Add memory barrier on vmcs field lookup" added a raw 'asm("lfence");' to prevent a bounds check bypass of 'vmcs_field_to_offset_table'. This does not work for some AMD cpus (see the 'ifence' helper), and it otherwise does not use the common 'array_ptr' helper designed for these types of fixes. Convert this to use 'array_ptr'.
Cc: Andrew Honig <ahonig@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/kvm/vmx.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c829d89e2e63..20b9b0b5e336 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -34,6 +34,7 @@
 #include <linux/tboot.h>
 #include <linux/hrtimer.h>
 #include <linux/frame.h>
+#include <linux/nospec.h>

 #include "kvm_cache_regs.h"
 #include "x86.h"
@@ -898,21 +899,15 @@ static const unsigned short vmcs_field_to_offset_table[] = {

 static inline short vmcs_field_to_offset(unsigned long field)
 {
-	BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);
-
-	if (field >= ARRAY_SIZE(vmcs_field_to_offset_table))
-		return -ENOENT;
+	const unsigned short *offset;

-	/*
-	 * FIXME: Mitigation for CVE-2017-5753.  To be replaced with a
-	 * generic mechanism.
-	 */
-	asm("lfence");
+	BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);

-	if (vmcs_field_to_offset_table[field] == 0)
+	offset = array_ptr(vmcs_field_to_offset_table, field,
+			ARRAY_SIZE(vmcs_field_to_offset_table));
+	if (!offset || *offset == 0)
 		return -ENOENT;
-
-	return vmcs_field_to_offset_table[field];
+	return *offset;
 }

 static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu)
On 19/01/2018 01:02, Dan Williams wrote:
Commit 75f139aaf896 "KVM: x86: Add memory barrier on vmcs field lookup" added a raw 'asm("lfence");' to prevent a bounds check bypass of 'vmcs_field_to_offset_table'. This does not work for some AMD cpus, see the 'ifence' helper,
The code never runs on AMD cpus (it's for Intel virtualization extensions), so it'd be nice if you could fix up the commit message.
Apart from this, obviously
Acked-by: Paolo Bonzini pbonzini@redhat.com
Thanks!
Paolo
[..]
On Thu, Jan 18, 2018 at 4:01 PM, Dan Williams dan.j.williams@intel.com wrote:
[..]
I've pushed out a nospec-v4.1 with the below minor cleanup, a fixup of the changelog for "kvm, x86: fix spectre-v1 mitigation", and added Paolo's ack.
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v4.1
diff --git a/include/linux/nospec.h b/include/linux/nospec.h
index 8af35be1869e..b8a9222e34d1 100644
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -37,7 +37,7 @@ static inline unsigned long array_ptr_mask(unsigned long idx, unsigned long sz)
 	unsigned long _i = (idx);					\
 	unsigned long _mask = array_ptr_mask(_i, (sz));			\
 									\
-	__u._ptr = _arr + (_i & _mask);					\
+	__u._ptr = _arr + _i;						\
 	__u._bit &= _mask;						\
 	__u._ptr;							\
 })
On Fri, Jan 19, 2018 at 10:58:44PM -0800, Dan Williams wrote:
On Thu, Jan 18, 2018 at 4:01 PM, Dan Williams dan.j.williams@intel.com wrote:
[..]
diff --git a/include/linux/nospec.h b/include/linux/nospec.h
index 8af35be1869e..b8a9222e34d1 100644
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -37,7 +37,7 @@ static inline unsigned long array_ptr_mask(unsigned long idx, unsigned long sz)
 	unsigned long _i = (idx);					\
 	unsigned long _mask = array_ptr_mask(_i, (sz));			\
 									\
-	__u._ptr = _arr + (_i & _mask);					\
+	__u._ptr = _arr + _i;						\
 	__u._bit &= _mask;						\
 	__u._ptr;							\
hmm. I'm not sure it's the right thing to do, since the macro is forcing cpu to speculate subsequent load from null instead of valid pointer. As Linus said:

"So that __u._bit masking wasn't masking the pointer, it was masking the value that was *read* from the pointer, so that you could know that an invalid access returned 0/NULL, not just the first value in the array."

imo just

	return _arr + (_i & _mask);

is enough. No need for union games. The cpu will speculate the load from _arr[0] if _i is out of bounds, which is the same as if user passed _i == 0 which would have passed bounds check anyway, so I don't see any data leak from populating cache with _arr[0] data. In-bounds access can do that just as well without any speculation.
On Sat, Jan 20, 2018 at 8:56 AM, Alexei Starovoitov alexei.starovoitov@gmail.com wrote:
On Fri, Jan 19, 2018 at 10:58:44PM -0800, Dan Williams wrote:
[..]
hmm. I'm not sure it's the right thing to do, since the macro is forcing cpu to speculate subsequent load from null instead of valid pointer. As Linus said:

"So that __u._bit masking wasn't masking the pointer, it was masking the value that was *read* from the pointer, so that you could know that an invalid access returned 0/NULL, not just the first value in the array."

imo just

	return _arr + (_i & _mask);

is enough. No need for union games. The cpu will speculate the load from _arr[0] if _i is out of bounds, which is the same as if user passed _i == 0 which would have passed bounds check anyway, so I don't see any data leak from populating cache with _arr[0] data. In-bounds access can do that just as well without any speculation.
scratch that. It's array_ptr, not array_access. The code will do if (!ptr) later, so yeah this api is fine.