This is a note to let you know that I've just added the patch titled
x86/nospec: Fix header guards names
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-nospec-fix-header-guards-names.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:30:27 CET 2018
From: Borislav Petkov <bp(a)suse.de>
Date: Fri, 26 Jan 2018 13:11:37 +0100
Subject: x86/nospec: Fix header guards names
From: Borislav Petkov <bp(a)suse.de>
(cherry picked from commit 7a32fc51ca938e67974cbb9db31e1a43f98345a9)
... to adhere to the _ASM_X86_ naming scheme.
No functional change.
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: riel(a)redhat.com
Cc: ak(a)linux.intel.com
Cc: peterz(a)infradead.org
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: jikos(a)kernel.org
Cc: luto(a)amacapital.net
Cc: dave.hansen(a)intel.com
Cc: torvalds(a)linux-foundation.org
Cc: keescook(a)google.com
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: tim.c.chen(a)linux.intel.com
Cc: gregkh(a)linux-foundation.org
Cc: pjt(a)google.com
Link: https://lkml.kernel.org/r/20180126121139.31959-3-bp@alien8.de
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/nospec-branch.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __NOSPEC_BRANCH_H__
-#define __NOSPEC_BRANCH_H__
+#ifndef _ASM_X86_NOSPEC_BRANCH_H_
+#define _ASM_X86_NOSPEC_BRANCH_H_
#include <asm/alternative.h>
#include <asm/alternative-asm.h>
@@ -232,4 +232,4 @@ static inline void indirect_branch_predi
}
#endif /* __ASSEMBLY__ */
-#endif /* __NOSPEC_BRANCH_H__ */
+#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-retpoline-simplify-vmexit_fill_rsb.patch
queue-4.9/x86-cpufeatures-clean-up-spectre-v2-related-cpuid-flags.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/x86-microcode-amd-do-not-load-when-running-on-a-hypervisor.patch
queue-4.9/x86-nospec-fix-header-guards-names.patch
queue-4.9/x86-alternative-print-unadorned-pointers.patch
queue-4.9/x86-spectre-fix-spelling-mistake-vunerable-vulnerable.patch
queue-4.9/x86-pti-mark-constant-arrays-as-__initconst.patch
queue-4.9/x86-bugs-drop-one-mitigation-from-dmesg.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
This is a note to let you know that I've just added the patch titled
x86/kvm: Update spectre-v1 mitigation
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-kvm-update-spectre-v1-mitigation.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Wed, 31 Jan 2018 17:47:03 -0800
Subject: x86/kvm: Update spectre-v1 mitigation
From: Dan Williams <dan.j.williams(a)intel.com>
(cherry picked from commit 085331dfc6bbe3501fb936e657331ca943827600)
Commit 75f139aaf896 "KVM: x86: Add memory barrier on vmcs field lookup"
added a raw 'asm("lfence");' to prevent a bounds check bypass of
'vmcs_field_to_offset_table'.
The lfence can be avoided in this path by using the array_index_nospec()
helper designed for these types of fixes.
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Andrew Honig <ahonig(a)google.com>
Cc: kvm(a)vger.kernel.org
Cc: Jim Mattson <jmattson(a)google.com>
Link: https://lkml.kernel.org/r/151744959670.6342.3001723920950249067.stgit@dwill…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -33,6 +33,7 @@
#include <linux/slab.h>
#include <linux/tboot.h>
#include <linux/hrtimer.h>
+#include <linux/nospec.h>
#include "kvm_cache_regs.h"
#include "x86.h"
@@ -856,21 +857,18 @@ static const unsigned short vmcs_field_t
static inline short vmcs_field_to_offset(unsigned long field)
{
- BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);
+ const size_t size = ARRAY_SIZE(vmcs_field_to_offset_table);
+ unsigned short offset;
- if (field >= ARRAY_SIZE(vmcs_field_to_offset_table))
+ BUILD_BUG_ON(size > SHRT_MAX);
+ if (field >= size)
return -ENOENT;
- /*
- * FIXME: Mitigation for CVE-2017-5753. To be replaced with a
- * generic mechanism.
- */
- asm("lfence");
-
- if (vmcs_field_to_offset_table[field] == 0)
+ field = array_index_nospec(field, size);
+ offset = vmcs_field_to_offset_table[field];
+ if (offset == 0)
return -ENOENT;
-
- return vmcs_field_to_offset_table[field];
+ return offset;
}
static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu)
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
x86: Introduce barrier_nospec
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-introduce-barrier_nospec.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 29 Jan 2018 17:02:33 -0800
Subject: x86: Introduce barrier_nospec
From: Dan Williams <dan.j.williams(a)intel.com>
(cherry picked from commit b3d7ad85b80bbc404635dca80f5b129f6242bc7a)
Rename the open coded form of this instruction sequence from
rdtsc_ordered() into a generic barrier primitive, barrier_nospec().
One of the mitigations for Spectre variant1 vulnerabilities is to fence
speculative execution after successfully validating a bounds check. I.e.
force the result of a bounds check to resolve in the instruction pipeline
to ensure speculative execution honors that result before potentially
operating on out-of-bounds data.
No functional changes.
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Suggested-by: Andi Kleen <ak(a)linux.intel.com>
Suggested-by: Ingo Molnar <mingo(a)redhat.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-arch(a)vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: kernel-hardening(a)lists.openwall.com
Cc: gregkh(a)linuxfoundation.org
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: alan(a)linux.intel.com
Link: https://lkml.kernel.org/r/151727415361.33451.9049453007262764675.stgit@dwil…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/barrier.h | 4 ++++
arch/x86/include/asm/msr.h | 3 +--
2 files changed, 5 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -47,6 +47,10 @@ static inline unsigned long array_index_
/* Override the default implementation from linux/nospec.h. */
#define array_index_mask_nospec array_index_mask_nospec
+/* Prevent speculative execution past this barrier. */
+#define barrier_nospec() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
+ "lfence", X86_FEATURE_LFENCE_RDTSC)
+
#ifdef CONFIG_X86_PPRO_FENCE
#define dma_rmb() rmb()
#else
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -188,8 +188,7 @@ static __always_inline unsigned long lon
* that some other imaginary CPU is updating continuously with a
* time stamp.
*/
- alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
- "lfence", X86_FEATURE_LFENCE_RDTSC);
+ barrier_nospec();
return rdtsc();
}
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 29 Jan 2018 17:02:39 -0800
Subject: x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec
From: Dan Williams <dan.j.williams(a)intel.com>
(cherry picked from commit b3bbfb3fb5d25776b8e3f361d2eedaabb0b496cd)
For __get_user() paths, do not allow the kernel to speculate on the value
of a user controlled pointer. In addition to the 'stac' instruction for
Supervisor Mode Access Protection (SMAP), a barrier_nospec() causes the
access_ok() result to resolve in the pipeline before the CPU might take any
speculative action on the pointer value. Given the cost of 'stac' the
speculation barrier is placed after 'stac' to hopefully overlap the cost of
disabling SMAP with the cost of flushing the instruction pipeline.
Since __get_user is a major kernel interface that deals with user
controlled pointers, the __uaccess_begin_nospec() mechanism will prevent
speculative execution past an access_ok() permission check. While
speculative execution past access_ok() is not enough to lead to a kernel
memory leak, it is a necessary precondition.
To be clear, __uaccess_begin_nospec() is addressing a class of potential
problems near __get_user() usages.
Note, that while the barrier_nospec() in __uaccess_begin_nospec() is used
to protect __get_user(), pointer masking similar to array_index_nospec()
will be used for get_user() since it incorporates a bounds check near the
usage.
uaccess_try_nospec provides the same mechanism for get_user_try.
No functional changes.
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Suggested-by: Andi Kleen <ak(a)linux.intel.com>
Suggested-by: Ingo Molnar <mingo(a)redhat.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-arch(a)vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: kernel-hardening(a)lists.openwall.com
Cc: gregkh(a)linuxfoundation.org
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: alan(a)linux.intel.com
Link: https://lkml.kernel.org/r/151727415922.33451.5796614273104346583.stgit@dwil…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/uaccess.h | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -123,6 +123,11 @@ extern int __get_user_bad(void);
#define __uaccess_begin() stac()
#define __uaccess_end() clac()
+#define __uaccess_begin_nospec() \
+({ \
+ stac(); \
+ barrier_nospec(); \
+})
/*
* This is a type: either unsigned long, if the argument fits into
@@ -474,6 +479,10 @@ struct __large_struct { unsigned long bu
__uaccess_begin(); \
barrier();
+#define uaccess_try_nospec do { \
+ current->thread.uaccess_err = 0; \
+ __uaccess_begin_nospec(); \
+
#define uaccess_catch(err) \
__uaccess_end(); \
(err) |= (current->thread.uaccess_err ? -EFAULT : 0); \
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
x86: Implement array_index_mask_nospec
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-implement-array_index_mask_nospec.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 29 Jan 2018 17:02:28 -0800
Subject: x86: Implement array_index_mask_nospec
From: Dan Williams <dan.j.williams(a)intel.com>
(cherry picked from commit babdde2698d482b6c0de1eab4f697cf5856c5859)
array_index_nospec() uses a mask to sanitize user controllable array
indexes, i.e. generate a 0 mask if 'index' >= 'size', and a ~0 mask
otherwise. While the default array_index_mask_nospec() handles the
carry-bit from the (index - size) result in software.
The x86 array_index_mask_nospec() does the same, but the carry-bit is
handled in the processor CF flag without conditional instructions in the
control flow.
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-arch(a)vger.kernel.org
Cc: kernel-hardening(a)lists.openwall.com
Cc: gregkh(a)linuxfoundation.org
Cc: alan(a)linux.intel.com
Link: https://lkml.kernel.org/r/151727414808.33451.1873237130672785331.stgit@dwil…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/barrier.h | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -23,6 +23,30 @@
#define wmb() asm volatile("sfence" ::: "memory")
#endif
+/**
+ * array_index_mask_nospec() - generate a mask that is ~0UL when the
+ * bounds check succeeds and 0 otherwise
+ * @index: array element index
+ * @size: number of elements in array
+ *
+ * Returns:
+ * 0 - (index < size)
+ */
+static inline unsigned long array_index_mask_nospec(unsigned long index,
+ unsigned long size)
+{
+ unsigned long mask;
+
+ asm ("cmp %1,%2; sbb %0,%0;"
+ :"=r" (mask)
+ :"r"(size),"r" (index)
+ :"cc");
+ return mask;
+}
+
+/* Override the default implementation from linux/nospec.h. */
+#define array_index_mask_nospec array_index_mask_nospec
+
#ifdef CONFIG_X86_PPRO_FENCE
#define dma_rmb() rmb()
#else
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
x86/get_user: Use pointer masking to limit speculation
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-get_user-use-pointer-masking-to-limit-speculation.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 29 Jan 2018 17:02:54 -0800
Subject: x86/get_user: Use pointer masking to limit speculation
From: Dan Williams <dan.j.williams(a)intel.com>
(cherry picked from commit c7f631cb07e7da06ac1d231ca178452339e32a94)
Quoting Linus:
I do think that it would be a good idea to very expressly document
the fact that it's not that the user access itself is unsafe. I do
agree that things like "get_user()" want to be protected, but not
because of any direct bugs or problems with get_user() and friends,
but simply because get_user() is an excellent source of a pointer
that is obviously controlled from a potentially attacking user
space. So it's a prime candidate for then finding _subsequent_
accesses that can then be used to perturb the cache.
Unlike the __get_user() case get_user() includes the address limit check
near the pointer de-reference. With that locality the speculation can be
mitigated with pointer narrowing rather than a barrier, i.e.
array_index_nospec(). Where the narrowing is performed by:
cmp %limit, %ptr
sbb %mask, %mask
and %mask, %ptr
With respect to speculation the value of %ptr is either less than %limit
or NULL.
Co-developed-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-arch(a)vger.kernel.org
Cc: Kees Cook <keescook(a)chromium.org>
Cc: kernel-hardening(a)lists.openwall.com
Cc: gregkh(a)linuxfoundation.org
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: torvalds(a)linux-foundation.org
Cc: alan(a)linux.intel.com
Link: https://lkml.kernel.org/r/151727417469.33451.11804043010080838495.stgit@dwi…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/lib/getuser.S | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -39,6 +39,8 @@ ENTRY(__get_user_1)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
1: movzbl (%_ASM_AX),%edx
xor %eax,%eax
@@ -53,6 +55,8 @@ ENTRY(__get_user_2)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
2: movzwl -1(%_ASM_AX),%edx
xor %eax,%eax
@@ -67,6 +71,8 @@ ENTRY(__get_user_4)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
3: movl -3(%_ASM_AX),%edx
xor %eax,%eax
@@ -82,6 +88,8 @@ ENTRY(__get_user_8)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
4: movq -7(%_ASM_AX),%rdx
xor %eax,%eax
@@ -93,6 +101,8 @@ ENTRY(__get_user_8)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user_8
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
4: movl -7(%_ASM_AX),%edx
5: movl -3(%_ASM_AX),%ecx
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
x86/entry/64: Remove the SYSCALL64 fast path
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-entry-64-remove-the-syscall64-fast-path.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:30:27 CET 2018
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 Jan 2018 10:38:49 -0800
Subject: x86/entry/64: Remove the SYSCALL64 fast path
From: Andy Lutomirski <luto(a)kernel.org>
(cherry picked from commit 21d375b6b34ff511a507de27bf316b3dde6938d9)
The SYCALLL64 fast path was a nice, if small, optimization back in the good
old days when syscalls were actually reasonably fast. Now there is PTI to
slow everything down, and indirect branches are verboten, making everything
messier. The retpoline code in the fast path is particularly nasty.
Just get rid of the fast path. The slow path is barely slower.
[ tglx: Split out the 'push all extra regs' part ]
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Kernel Hardening <kernel-hardening(a)lists.openwall.com>
Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.15171644…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/entry_64.S | 123 --------------------------------------------
arch/x86/entry/syscall_64.c | 7 --
2 files changed, 3 insertions(+), 127 deletions(-)
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -179,94 +179,11 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
pushq %r11 /* pt_regs->r11 */
sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */
- /*
- * If we need to do entry work or if we guess we'll need to do
- * exit work, go straight to the slow path.
- */
- movq PER_CPU_VAR(current_task), %r11
- testl $_TIF_WORK_SYSCALL_ENTRY|_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
- jnz entry_SYSCALL64_slow_path
-
-entry_SYSCALL_64_fastpath:
- /*
- * Easy case: enable interrupts and issue the syscall. If the syscall
- * needs pt_regs, we'll call a stub that disables interrupts again
- * and jumps to the slow path.
- */
- TRACE_IRQS_ON
- ENABLE_INTERRUPTS(CLBR_NONE)
-#if __SYSCALL_MASK == ~0
- cmpq $__NR_syscall_max, %rax
-#else
- andl $__SYSCALL_MASK, %eax
- cmpl $__NR_syscall_max, %eax
-#endif
- ja 1f /* return -ENOSYS (already in pt_regs->ax) */
- movq %r10, %rcx
-
- /*
- * This call instruction is handled specially in stub_ptregs_64.
- * It might end up jumping to the slow path. If it jumps, RAX
- * and all argument registers are clobbered.
- */
-#ifdef CONFIG_RETPOLINE
- movq sys_call_table(, %rax, 8), %rax
- call __x86_indirect_thunk_rax
-#else
- call *sys_call_table(, %rax, 8)
-#endif
-.Lentry_SYSCALL_64_after_fastpath_call:
-
- movq %rax, RAX(%rsp)
-1:
-
- /*
- * If we get here, then we know that pt_regs is clean for SYSRET64.
- * If we see that no exit work is required (which we are required
- * to check with IRQs off), then we can go straight to SYSRET64.
- */
- DISABLE_INTERRUPTS(CLBR_NONE)
- TRACE_IRQS_OFF
- movq PER_CPU_VAR(current_task), %r11
- testl $_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
- jnz 1f
-
- LOCKDEP_SYS_EXIT
- TRACE_IRQS_ON /* user mode is traced as IRQs on */
- movq RIP(%rsp), %rcx
- movq EFLAGS(%rsp), %r11
- RESTORE_C_REGS_EXCEPT_RCX_R11
- /*
- * This opens a window where we have a user CR3, but are
- * running in the kernel. This makes using the CS
- * register useless for telling whether or not we need to
- * switch CR3 in NMIs. Normal interrupts are OK because
- * they are off here.
- */
- SWITCH_USER_CR3
- movq RSP(%rsp), %rsp
- USERGS_SYSRET64
-
-1:
- /*
- * The fast path looked good when we started, but something changed
- * along the way and we need to switch to the slow path. Calling
- * raise(3) will trigger this, for example. IRQs are off.
- */
- TRACE_IRQS_ON
- ENABLE_INTERRUPTS(CLBR_NONE)
- SAVE_EXTRA_REGS
- movq %rsp, %rdi
- call syscall_return_slowpath /* returns with IRQs disabled */
- jmp return_from_SYSCALL_64
-
-entry_SYSCALL64_slow_path:
/* IRQs are off. */
SAVE_EXTRA_REGS
movq %rsp, %rdi
call do_syscall_64 /* returns with IRQs disabled */
-return_from_SYSCALL_64:
RESTORE_EXTRA_REGS
TRACE_IRQS_IRETQ /* we're about to change IF */
@@ -339,6 +256,7 @@ return_from_SYSCALL_64:
syscall_return_via_sysret:
/* rcx and r11 are already restored (see code above) */
RESTORE_C_REGS_EXCEPT_RCX_R11
+
/*
* This opens a window where we have a user CR3, but are
* running in the kernel. This makes using the CS
@@ -363,45 +281,6 @@ opportunistic_sysret_failed:
jmp restore_c_regs_and_iret
END(entry_SYSCALL_64)
-ENTRY(stub_ptregs_64)
- /*
- * Syscalls marked as needing ptregs land here.
- * If we are on the fast path, we need to save the extra regs,
- * which we achieve by trying again on the slow path. If we are on
- * the slow path, the extra regs are already saved.
- *
- * RAX stores a pointer to the C function implementing the syscall.
- * IRQs are on.
- */
- cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
- jne 1f
-
- /*
- * Called from fast path -- disable IRQs again, pop return address
- * and jump to slow path
- */
- DISABLE_INTERRUPTS(CLBR_NONE)
- TRACE_IRQS_OFF
- popq %rax
- jmp entry_SYSCALL64_slow_path
-
-1:
- JMP_NOSPEC %rax /* Called from C */
-END(stub_ptregs_64)
-
-.macro ptregs_stub func
-ENTRY(ptregs_\func)
- leaq \func(%rip), %rax
- jmp stub_ptregs_64
-END(ptregs_\func)
-.endm
-
-/* Instantiate ptregs_stub for each ptregs-using syscall */
-#define __SYSCALL_64_QUAL_(sym)
-#define __SYSCALL_64_QUAL_ptregs(sym) ptregs_stub sym
-#define __SYSCALL_64(nr, sym, qual) __SYSCALL_64_QUAL_##qual(sym)
-#include <asm/syscalls_64.h>
-
/*
* %rdi: prev task
* %rsi: next task
--- a/arch/x86/entry/syscall_64.c
+++ b/arch/x86/entry/syscall_64.c
@@ -6,14 +6,11 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>
-#define __SYSCALL_64_QUAL_(sym) sym
-#define __SYSCALL_64_QUAL_ptregs(sym) ptregs_##sym
-
-#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long __SYSCALL_64_QUAL_##qual(sym)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
+#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
#include <asm/syscalls_64.h>
#undef __SYSCALL_64
-#define __SYSCALL_64(nr, sym, qual) [nr] = __SYSCALL_64_QUAL_##qual(sym),
+#define __SYSCALL_64(nr, sym, qual) [nr] = sym,
extern long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-entry-64-push-extra-regs-right-away.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/x86-asm-move-status-from-thread_struct-to-thread_info.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-entry-64-remove-the-syscall64-fast-path.patch
queue-4.9/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-spectre-fix-spelling-mistake-vunerable-vulnerable.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/x86-pti-mark-constant-arrays-as-__initconst.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
This is a note to let you know that I've just added the patch titled
x86/entry/64: Push extra regs right away
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-entry-64-push-extra-regs-right-away.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:30:27 CET 2018
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 Jan 2018 10:38:49 -0800
Subject: x86/entry/64: Push extra regs right away
From: Andy Lutomirski <luto(a)kernel.org>
(cherry picked from commit d1f7732009e0549eedf8ea1db948dc37be77fd46)
With the fast path removed there is no point in splitting the push of the
normal and the extra register set. Just push the extra regs right away.
[ tglx: Split out from 'x86/entry/64: Remove the SYSCALL64 fast path' ]
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Kernel Hardening <kernel-hardening(a)lists.openwall.com>
Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.15171644…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/entry_64.S | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -177,10 +177,14 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
pushq %r9 /* pt_regs->r9 */
pushq %r10 /* pt_regs->r10 */
pushq %r11 /* pt_regs->r11 */
- sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */
+ pushq %rbx /* pt_regs->rbx */
+ pushq %rbp /* pt_regs->rbp */
+ pushq %r12 /* pt_regs->r12 */
+ pushq %r13 /* pt_regs->r13 */
+ pushq %r14 /* pt_regs->r14 */
+ pushq %r15 /* pt_regs->r15 */
/* IRQs are off. */
- SAVE_EXTRA_REGS
movq %rsp, %rdi
call do_syscall_64 /* returns with IRQs disabled */
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-entry-64-push-extra-regs-right-away.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/x86-asm-move-status-from-thread_struct-to-thread_info.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-entry-64-remove-the-syscall64-fast-path.patch
queue-4.9/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-spectre-fix-spelling-mistake-vunerable-vulnerable.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/x86-pti-mark-constant-arrays-as-__initconst.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
This is a note to let you know that I've just added the patch titled
x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-cpuid-fix-up-virtual-ibrs-ibpb-stibp-feature-bits-on-intel.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Tue, 30 Jan 2018 14:30:23 +0000
Subject: x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
From: David Woodhouse <dwmw(a)amazon.co.uk>
(cherry picked from commit 7fcae1118f5fd44a862aa5c3525248e35ee67c3b)
Despite the fact that all the other code there seems to be doing it, just
using set_cpu_cap() in early_intel_init() doesn't actually work.
For CPUs with PKU support, setup_pku() calls get_cpu_cap() after
c->c_init() has set those feature bits. That resets those bits back to what
was queried from the hardware.
Turning the bits off for bad microcode is easy to fix. That can just use
setup_clear_cpu_cap() to force them off for all CPUs.
I was less keen on forcing the feature bits *on* that way, just in case
of inconsistencies. I appreciate that the kernel is going to get this
utterly wrong if CPU features are not consistent, because it has already
applied alternatives by the time secondary CPUs are brought up.
But at least if setup_force_cpu_cap() isn't being used, we might have a
chance of *detecting* the lack of the corresponding bit and either
panicking or refusing to bring the offending CPU online.
So ensure that the appropriate feature bits are set within get_cpu_cap()
regardless of how many extra times it's called.
Fixes: 2961298e ("x86/cpufeatures: Clean up Spectre v2 related CPUID flags")
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: karahmed(a)amazon.de
Cc: peterz(a)infradead.org
Cc: bp(a)alien8.de
Link: https://lkml.kernel.org/r/1517322623-15261-1-git-send-email-dwmw@amazon.co.…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/common.c | 21 +++++++++++++++++++++
arch/x86/kernel/cpu/intel.c | 27 ++++++++-------------------
2 files changed, 29 insertions(+), 19 deletions(-)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -718,6 +718,26 @@ static void apply_forced_caps(struct cpu
}
}
+static void init_speculation_control(struct cpuinfo_x86 *c)
+{
+ /*
+ * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support,
+ * and they also have a different bit for STIBP support. Also,
+ * a hypervisor might have set the individual AMD bits even on
+ * Intel CPUs, for finer-grained selection of what's available.
+ *
+ * We use the AMD bits in 0x8000_0008 EBX as the generic hardware
+ * features, which are visible in /proc/cpuinfo and used by the
+ * kernel. So set those accordingly from the Intel bits.
+ */
+ if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
+ set_cpu_cap(c, X86_FEATURE_IBRS);
+ set_cpu_cap(c, X86_FEATURE_IBPB);
+ }
+ if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
+ set_cpu_cap(c, X86_FEATURE_STIBP);
+}
+
void get_cpu_cap(struct cpuinfo_x86 *c)
{
u32 eax, ebx, ecx, edx;
@@ -812,6 +832,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
init_scattered_cpuid_features(c);
+ init_speculation_control(c);
}
static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -140,28 +140,17 @@ static void early_init_intel(struct cpui
rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode);
}
- /*
- * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support,
- * and they also have a different bit for STIBP support. Also,
- * a hypervisor might have set the individual AMD bits even on
- * Intel CPUs, for finer-grained selection of what's available.
- */
- if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
- set_cpu_cap(c, X86_FEATURE_IBRS);
- set_cpu_cap(c, X86_FEATURE_IBPB);
- }
- if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
- set_cpu_cap(c, X86_FEATURE_STIBP);
-
/* Now if any of them are set, check the blacklist and clear the lot */
- if ((cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) ||
+ if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) ||
+ cpu_has(c, X86_FEATURE_INTEL_STIBP) ||
+ cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) ||
cpu_has(c, X86_FEATURE_STIBP)) && bad_spectre_microcode(c)) {
pr_warn("Intel Spectre v2 broken microcode detected; disabling Speculation Control\n");
- clear_cpu_cap(c, X86_FEATURE_IBRS);
- clear_cpu_cap(c, X86_FEATURE_IBPB);
- clear_cpu_cap(c, X86_FEATURE_STIBP);
- clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL);
- clear_cpu_cap(c, X86_FEATURE_INTEL_STIBP);
+ setup_clear_cpu_cap(X86_FEATURE_IBRS);
+ setup_clear_cpu_cap(X86_FEATURE_IBPB);
+ setup_clear_cpu_cap(X86_FEATURE_STIBP);
+ setup_clear_cpu_cap(X86_FEATURE_SPEC_CTRL);
+ setup_clear_cpu_cap(X86_FEATURE_INTEL_STIBP);
}
/*
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.9/x86-entry-64-push-extra-regs-right-away.patch
queue-4.9/kvm-vmx-introduce-alloc_loaded_vmcs.patch
queue-4.9/kvm-nvmx-eliminate-vmcs02-pool.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-retpoline-simplify-vmexit_fill_rsb.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/x86-cpufeatures-clean-up-spectre-v2-related-cpuid-flags.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/x86-asm-move-status-from-thread_struct-to-thread_info.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-entry-64-remove-the-syscall64-fast-path.patch
queue-4.9/x86-cpufeature-blacklist-spec_ctrl-pred_cmd-on-early-spectre-v2-microcodes.patch
queue-4.9/x86-nospec-fix-header-guards-names.patch
queue-4.9/x86-retpoline-avoid-retpolines-for-built-in-__init-functions.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-cpu-bugs-make-retpoline-module-warning-conditional.patch
queue-4.9/x86-spectre-check-config_retpoline-in-command-line-parser.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/x86-alternative-print-unadorned-pointers.patch
queue-4.9/x86-cpuid-fix-up-virtual-ibrs-ibpb-stibp-feature-bits-on-intel.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-cpufeatures-add-amd-feature-bits-for-speculation-control.patch
queue-4.9/x86-spectre-fix-spelling-mistake-vunerable-vulnerable.patch
queue-4.9/module-retpoline-warn-about-missing-retpoline-in-module.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/kvm-nvmx-vmx_complete_nested_posted_interrupt-can-t-fail.patch
queue-4.9/x86-spectre-simplify-spectre_v2-command-line-parsing.patch
queue-4.9/x86-msr-add-definitions-for-new-speculation-control-msrs.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/kvm-vmx-make-msr-bitmaps-per-vcpu.patch
queue-4.9/x86-speculation-add-basic-ibpb-indirect-branch-prediction-barrier-support.patch
queue-4.9/kvm-nvmx-mark-vmcs12-pages-dirty-on-l2-exit.patch
queue-4.9/x86-pti-mark-constant-arrays-as-__initconst.patch
queue-4.9/x86-speculation-fix-typo-ibrs_att-which-should-be-ibrs_all.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-bugs-drop-one-mitigation-from-dmesg.patch
queue-4.9/x86-retpoline-remove-the-esp-rsp-thunk.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
x86/asm: Move 'status' from thread_struct to thread_info
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-asm-move-status-from-thread_struct-to-thread_info.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 Jan 2018 10:38:50 -0800
Subject: x86/asm: Move 'status' from thread_struct to thread_info
From: Andy Lutomirski <luto(a)kernel.org>
(cherry picked from commit 37a8f7c38339b22b69876d6f5a0ab851565284e3)
The TS_COMPAT bit is very hot and is accessed from code paths that mostly
also touch thread_info::flags. Move it into struct thread_info to improve
cache locality.
The only reason it was in thread_struct is that there was a brief period
during which arch-specific fields were not allowed in struct thread_info.
Linus suggested further changing:
ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
to:
if (unlikely(ti->status & (TS_COMPAT|TS_I386_REGS_POKED)))
ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
on the theory that frequently dirtying the cacheline even in pure 64-bit
code that never needs to modify status hurts performance. That could be a
reasonable followup patch, but I suspect it matters less on top of this
patch.
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Ingo Molnar <mingo(a)kernel.org>
Acked-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Kernel Hardening <kernel-hardening(a)lists.openwall.com>
Link: https://lkml.kernel.org/r/03148bcc1b217100e6e8ecf6a5468c45cf4304b6.15171644…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/common.c | 4 ++--
arch/x86/include/asm/processor.h | 2 --
arch/x86/include/asm/syscall.h | 6 +++---
arch/x86/include/asm/thread_info.h | 3 ++-
arch/x86/kernel/process_64.c | 4 ++--
arch/x86/kernel/ptrace.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
7 files changed, 11 insertions(+), 12 deletions(-)
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -201,7 +201,7 @@ __visible inline void prepare_exit_to_us
* special case only applies after poking regs and before the
* very next return to user mode.
*/
- current->thread.status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
+ ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
#endif
user_enter_irqoff();
@@ -299,7 +299,7 @@ static __always_inline void do_syscall_3
unsigned int nr = (unsigned int)regs->orig_ax;
#ifdef CONFIG_IA32_EMULATION
- current->thread.status |= TS_COMPAT;
+ ti->status |= TS_COMPAT;
#endif
if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY) {
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -391,8 +391,6 @@ struct thread_struct {
unsigned short gsindex;
#endif
- u32 status; /* thread synchronous flags */
-
#ifdef CONFIG_X86_64
unsigned long fsbase;
unsigned long gsbase;
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -60,7 +60,7 @@ static inline long syscall_get_error(str
* TS_COMPAT is set for 32-bit syscall entries and then
* remains set until we return to user mode.
*/
- if (task->thread.status & (TS_COMPAT|TS_I386_REGS_POKED))
+ if (task->thread_info.status & (TS_COMPAT|TS_I386_REGS_POKED))
/*
* Sign-extend the value so (int)-EFOO becomes (long)-EFOO
* and will match correctly in comparisons.
@@ -116,7 +116,7 @@ static inline void syscall_get_arguments
unsigned long *args)
{
# ifdef CONFIG_IA32_EMULATION
- if (task->thread.status & TS_COMPAT)
+ if (task->thread_info.status & TS_COMPAT)
switch (i) {
case 0:
if (!n--) break;
@@ -177,7 +177,7 @@ static inline void syscall_set_arguments
const unsigned long *args)
{
# ifdef CONFIG_IA32_EMULATION
- if (task->thread.status & TS_COMPAT)
+ if (task->thread_info.status & TS_COMPAT)
switch (i) {
case 0:
if (!n--) break;
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -54,6 +54,7 @@ struct task_struct;
struct thread_info {
unsigned long flags; /* low level flags */
+ u32 status; /* thread synchronous flags */
};
#define INIT_THREAD_INFO(tsk) \
@@ -213,7 +214,7 @@ static inline int arch_within_stack_fram
#define in_ia32_syscall() true
#else
#define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
- current->thread.status & TS_COMPAT)
+ current_thread_info()->status & TS_COMPAT)
#endif
/*
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -538,7 +538,7 @@ void set_personality_ia32(bool x32)
current->personality &= ~READ_IMPLIES_EXEC;
/* in_compat_syscall() uses the presence of the x32
syscall bit flag to determine compat status */
- current->thread.status &= ~TS_COMPAT;
+ current_thread_info()->status &= ~TS_COMPAT;
} else {
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_X32);
@@ -546,7 +546,7 @@ void set_personality_ia32(bool x32)
current->mm->context.ia32_compat = TIF_IA32;
current->personality |= force_personality32;
/* Prepare the first "return" to user space */
- current->thread.status |= TS_COMPAT;
+ current_thread_info()->status |= TS_COMPAT;
}
}
EXPORT_SYMBOL_GPL(set_personality_ia32);
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -934,7 +934,7 @@ static int putreg32(struct task_struct *
*/
regs->orig_ax = value;
if (syscall_get_nr(child, regs) >= 0)
- child->thread.status |= TS_I386_REGS_POKED;
+ child->thread_info.status |= TS_I386_REGS_POKED;
break;
case offsetof(struct user32, regs.eflags):
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -785,7 +785,7 @@ static inline unsigned long get_nr_resta
* than the tracee.
*/
#ifdef CONFIG_IA32_EMULATION
- if (current->thread.status & (TS_COMPAT|TS_I386_REGS_POKED))
+ if (current_thread_info()->status & (TS_COMPAT|TS_I386_REGS_POKED))
return __NR_ia32_restart_syscall;
#endif
#ifdef CONFIG_X86_X32_ABI
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-entry-64-push-extra-regs-right-away.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/x86-asm-move-status-from-thread_struct-to-thread_info.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-entry-64-remove-the-syscall64-fast-path.patch
queue-4.9/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-spectre-fix-spelling-mistake-vunerable-vulnerable.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/x86-pti-mark-constant-arrays-as-__initconst.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
This is a note to let you know that I've just added the patch titled
x86/bugs: Drop one "mitigation" from dmesg
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-bugs-drop-one-mitigation-from-dmesg.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:30:27 CET 2018
From: Borislav Petkov <bp(a)suse.de>
Date: Fri, 26 Jan 2018 13:11:39 +0100
Subject: x86/bugs: Drop one "mitigation" from dmesg
From: Borislav Petkov <bp(a)suse.de>
(cherry picked from commit 55fa19d3e51f33d9cd4056d25836d93abf9438db)
Make
[ 0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline
into
[ 0.031118] Spectre V2: Mitigation: Full generic retpoline
to reduce the mitigation mitigations strings.
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: riel(a)redhat.com
Cc: ak(a)linux.intel.com
Cc: peterz(a)infradead.org
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: jikos(a)kernel.org
Cc: luto(a)amacapital.net
Cc: dave.hansen(a)intel.com
Cc: torvalds(a)linux-foundation.org
Cc: keescook(a)google.com
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: tim.c.chen(a)linux.intel.com
Cc: pjt(a)google.com
Link: https://lkml.kernel.org/r/20180126121139.31959-5-bp@alien8.de
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/bugs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -90,7 +90,7 @@ static const char *spectre_v2_strings[]
};
#undef pr_fmt
-#define pr_fmt(fmt) "Spectre V2 mitigation: " fmt
+#define pr_fmt(fmt) "Spectre V2 : " fmt
static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
static bool spectre_v2_bad_module;
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-retpoline-simplify-vmexit_fill_rsb.patch
queue-4.9/x86-cpufeatures-clean-up-spectre-v2-related-cpuid-flags.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/x86-microcode-amd-do-not-load-when-running-on-a-hypervisor.patch
queue-4.9/x86-nospec-fix-header-guards-names.patch
queue-4.9/x86-alternative-print-unadorned-pointers.patch
queue-4.9/x86-spectre-fix-spelling-mistake-vunerable-vulnerable.patch
queue-4.9/x86-pti-mark-constant-arrays-as-__initconst.patch
queue-4.9/x86-bugs-drop-one-mitigation-from-dmesg.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
This is a note to let you know that I've just added the patch titled
nl80211: Sanitize array index in parse_txq_params
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nl80211-sanitize-array-index-in-parse_txq_params.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 29 Jan 2018 17:03:15 -0800
Subject: nl80211: Sanitize array index in parse_txq_params
From: Dan Williams <dan.j.williams(a)intel.com>
(cherry picked from commit 259d8c1e984318497c84eef547bbb6b1d9f4eb05)
Wireless drivers rely on parse_txq_params to validate that txq_params->ac
is less than NL80211_NUM_ACS by the time the low-level driver's ->conf_tx()
handler is called. Use a new helper, array_index_nospec(), to sanitize
txq_params->ac with respect to speculation. I.e. ensure that any
speculation into ->conf_tx() handlers is done with a value of
txq_params->ac that is within the bounds of [0, NL80211_NUM_ACS).
Reported-by: Christian Lamparter <chunkeey(a)gmail.com>
Reported-by: Elena Reshetova <elena.reshetova(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Johannes Berg <johannes(a)sipsolutions.net>
Cc: linux-arch(a)vger.kernel.org
Cc: kernel-hardening(a)lists.openwall.com
Cc: gregkh(a)linuxfoundation.org
Cc: linux-wireless(a)vger.kernel.org
Cc: torvalds(a)linux-foundation.org
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: alan(a)linux.intel.com
Link: https://lkml.kernel.org/r/151727419584.33451.7700736761686184303.stgit@dwil…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/wireless/nl80211.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -16,6 +16,7 @@
#include <linux/nl80211.h>
#include <linux/rtnetlink.h>
#include <linux/netlink.h>
+#include <linux/nospec.h>
#include <linux/etherdevice.h>
#include <net/net_namespace.h>
#include <net/genetlink.h>
@@ -2014,20 +2015,22 @@ static const struct nla_policy txq_param
static int parse_txq_params(struct nlattr *tb[],
struct ieee80211_txq_params *txq_params)
{
+ u8 ac;
+
if (!tb[NL80211_TXQ_ATTR_AC] || !tb[NL80211_TXQ_ATTR_TXOP] ||
!tb[NL80211_TXQ_ATTR_CWMIN] || !tb[NL80211_TXQ_ATTR_CWMAX] ||
!tb[NL80211_TXQ_ATTR_AIFS])
return -EINVAL;
- txq_params->ac = nla_get_u8(tb[NL80211_TXQ_ATTR_AC]);
+ ac = nla_get_u8(tb[NL80211_TXQ_ATTR_AC]);
txq_params->txop = nla_get_u16(tb[NL80211_TXQ_ATTR_TXOP]);
txq_params->cwmin = nla_get_u16(tb[NL80211_TXQ_ATTR_CWMIN]);
txq_params->cwmax = nla_get_u16(tb[NL80211_TXQ_ATTR_CWMAX]);
txq_params->aifs = nla_get_u8(tb[NL80211_TXQ_ATTR_AIFS]);
- if (txq_params->ac >= NL80211_NUM_ACS)
+ if (ac >= NL80211_NUM_ACS)
return -EINVAL;
-
+ txq_params->ac = array_index_nospec(ac, NL80211_NUM_ACS);
return 0;
}
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
vfs, fdtable: Prevent bounds-check bypass via speculative execution
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 29 Jan 2018 17:03:05 -0800
Subject: vfs, fdtable: Prevent bounds-check bypass via speculative execution
From: Dan Williams <dan.j.williams(a)intel.com>
(cherry picked from commit 56c30ba7b348b90484969054d561f711ba196507)
'fd' is a user controlled value that is used as a data dependency to
read from the 'fdt->fd' array. In order to avoid potential leaks of
kernel memory values, block speculative execution of the instruction
stream that could issue reads based on an invalid 'file *' returned from
__fcheck_files.
Co-developed-by: Elena Reshetova <elena.reshetova(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-arch(a)vger.kernel.org
Cc: kernel-hardening(a)lists.openwall.com
Cc: gregkh(a)linuxfoundation.org
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: torvalds(a)linux-foundation.org
Cc: alan(a)linux.intel.com
Link: https://lkml.kernel.org/r/151727418500.33451.17392199002892248656.stgit@dwi…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/fdtable.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -9,6 +9,7 @@
#include <linux/compiler.h>
#include <linux/spinlock.h>
#include <linux/rcupdate.h>
+#include <linux/nospec.h>
#include <linux/types.h>
#include <linux/init.h>
#include <linux/fs.h>
@@ -81,8 +82,10 @@ static inline struct file *__fcheck_file
{
struct fdtable *fdt = rcu_dereference_raw(files->fdt);
- if (fd < fdt->max_fds)
+ if (fd < fdt->max_fds) {
+ fd = array_index_nospec(fd, fdt->max_fds);
return rcu_dereference_raw(fdt->fd[fd]);
+ }
return NULL;
}
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
array_index_nospec: Sanitize speculative array de-references
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
array_index_nospec-sanitize-speculative-array-de-references.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 29 Jan 2018 17:02:22 -0800
Subject: array_index_nospec: Sanitize speculative array de-references
From: Dan Williams <dan.j.williams(a)intel.com>
(cherry picked from commit f3804203306e098dae9ca51540fcd5eb700d7f40)
array_index_nospec() is proposed as a generic mechanism to mitigate
against Spectre-variant-1 attacks, i.e. an attack that bypasses boundary
checks via speculative execution. The array_index_nospec()
implementation is expected to be safe for current generation CPUs across
multiple architectures (ARM, x86).
Based on an original implementation by Linus Torvalds, tweaked to remove
speculative flows by Alexei Starovoitov, and tweaked again by Linus to
introduce an x86 assembly implementation for the mask generation.
Co-developed-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Co-developed-by: Alexei Starovoitov <ast(a)kernel.org>
Suggested-by: Cyril Novikov <cnovikov(a)lynx.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-arch(a)vger.kernel.org
Cc: kernel-hardening(a)lists.openwall.com
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: gregkh(a)linuxfoundation.org
Cc: torvalds(a)linux-foundation.org
Cc: alan(a)linux.intel.com
Link: https://lkml.kernel.org/r/151727414229.33451.18411580953862676575.stgit@dwi…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/nospec.h | 72 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 72 insertions(+)
create mode 100644 include/linux/nospec.h
--- /dev/null
+++ b/include/linux/nospec.h
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright(c) 2018 Linus Torvalds. All rights reserved.
+// Copyright(c) 2018 Alexei Starovoitov. All rights reserved.
+// Copyright(c) 2018 Intel Corporation. All rights reserved.
+
+#ifndef _LINUX_NOSPEC_H
+#define _LINUX_NOSPEC_H
+
+/**
+ * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
+ * @index: array element index
+ * @size: number of elements in array
+ *
+ * When @index is out of bounds (@index >= @size), the sign bit will be
+ * set. Extend the sign bit to all bits and invert, giving a result of
+ * zero for an out of bounds index, or ~0 if within bounds [0, @size).
+ */
+#ifndef array_index_mask_nospec
+static inline unsigned long array_index_mask_nospec(unsigned long index,
+ unsigned long size)
+{
+ /*
+ * Warn developers about inappropriate array_index_nospec() usage.
+ *
+ * Even if the CPU speculates past the WARN_ONCE branch, the
+ * sign bit of @index is taken into account when generating the
+ * mask.
+ *
+ * This warning is compiled out when the compiler can infer that
+ * @index and @size are less than LONG_MAX.
+ */
+ if (WARN_ONCE(index > LONG_MAX || size > LONG_MAX,
+ "array_index_nospec() limited to range of [0, LONG_MAX]\n"))
+ return 0;
+
+ /*
+ * Always calculate and emit the mask even if the compiler
+ * thinks the mask is not needed. The compiler does not take
+ * into account the value of @index under speculation.
+ */
+ OPTIMIZER_HIDE_VAR(index);
+ return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
+}
+#endif
+
+/*
+ * array_index_nospec - sanitize an array index after a bounds check
+ *
+ * For a code sequence like:
+ *
+ * if (index < size) {
+ * index = array_index_nospec(index, size);
+ * val = array[index];
+ * }
+ *
+ * ...if the CPU speculates past the bounds check then
+ * array_index_nospec() will clamp the index within the range of [0,
+ * size).
+ */
+#define array_index_nospec(index, size) \
+({ \
+ typeof(index) _i = (index); \
+ typeof(size) _s = (size); \
+ unsigned long _mask = array_index_mask_nospec(_i, _s); \
+ \
+ BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \
+ BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \
+ \
+ _i &= _mask; \
+ _i; \
+})
+#endif /* _LINUX_NOSPEC_H */
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-paravirt-remove-noreplace-paravirt-cmdline-option.patch
queue-4.9/documentation-document-array_index_nospec.patch
queue-4.9/x86-usercopy-replace-open-coded-stac-clac-with-__uaccess_-begin-end.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/vfs-fdtable-prevent-bounds-check-bypass-via-speculative-execution.patch
queue-4.9/x86-uaccess-use-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/x86-implement-array_index_mask_nospec.patch
queue-4.9/array_index_nospec-sanitize-speculative-array-de-references.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-kvm-update-spectre-v1-mitigation.patch
queue-4.9/x86-get_user-use-pointer-masking-to-limit-speculation.patch
queue-4.9/x86-syscall-sanitize-syscall-table-de-references-under-speculation.patch
queue-4.9/x86-spectre-report-get_user-mitigation-for-spectre_v1.patch
queue-4.9/x86-introduce-barrier_nospec.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-introduce-__uaccess_begin_nospec-and-uaccess_try_nospec.patch
queue-4.9/nl80211-sanitize-array-index-in-parse_txq_params.patch
This is a note to let you know that I've just added the patch titled
Documentation: Document array_index_nospec
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
documentation-document-array_index_nospec.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 8 03:32:24 CET 2018
From: Mark Rutland <mark.rutland(a)arm.com>
Date: Mon, 29 Jan 2018 17:02:16 -0800
Subject: Documentation: Document array_index_nospec
From: Mark Rutland <mark.rutland(a)arm.com>
(cherry picked from commit f84a56f73dddaeac1dba8045b007f742f61cd2da)
Document the rationale and usage of the new array_index_nospec() helper.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Signed-off-by: Will Deacon <will.deacon(a)arm.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: linux-arch(a)vger.kernel.org
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: gregkh(a)linuxfoundation.org
Cc: kernel-hardening(a)lists.openwall.com
Cc: torvalds(a)linux-foundation.org
Cc: alan(a)linux.intel.com
Link: https://lkml.kernel.org/r/151727413645.33451.15878817161436755393.stgit@dwi…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/speculation.txt | 90 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 90 insertions(+)
create mode 100644 Documentation/speculation.txt
--- /dev/null
+++ b/Documentation/speculation.txt
@@ -0,0 +1,90 @@
+This document explains potential effects of speculation, and how undesirable
+effects can be mitigated portably using common APIs.
+
+===========
+Speculation
+===========
+
+To improve performance and minimize average latencies, many contemporary CPUs
+employ speculative execution techniques such as branch prediction, performing
+work which may be discarded at a later stage.
+
+Typically speculative execution cannot be observed from architectural state,
+such as the contents of registers. However, in some cases it is possible to
+observe its impact on microarchitectural state, such as the presence or
+absence of data in caches. Such state may form side-channels which can be
+observed to extract secret information.
+
+For example, in the presence of branch prediction, it is possible for bounds
+checks to be ignored by code which is speculatively executed. Consider the
+following code:
+
+ int load_array(int *array, unsigned int index)
+ {
+ if (index >= MAX_ARRAY_ELEMS)
+ return 0;
+ else
+ return array[index];
+ }
+
+Which, on arm64, may be compiled to an assembly sequence such as:
+
+ CMP <index>, #MAX_ARRAY_ELEMS
+ B.LT less
+ MOV <returnval>, #0
+ RET
+ less:
+ LDR <returnval>, [<array>, <index>]
+ RET
+
+It is possible that a CPU mis-predicts the conditional branch, and
+speculatively loads array[index], even if index >= MAX_ARRAY_ELEMS. This
+value will subsequently be discarded, but the speculated load may affect
+microarchitectural state which can be subsequently measured.
+
+More complex sequences involving multiple dependent memory accesses may
+result in sensitive information being leaked. Consider the following
+code, building on the prior example:
+
+ int load_dependent_arrays(int *arr1, int *arr2, int index)
+ {
+ int val1, val2,
+
+ val1 = load_array(arr1, index);
+ val2 = load_array(arr2, val1);
+
+ return val2;
+ }
+
+Under speculation, the first call to load_array() may return the value
+of an out-of-bounds address, while the second call will influence
+microarchitectural state dependent on this value. This may provide an
+arbitrary read primitive.
+
+====================================
+Mitigating speculation side-channels
+====================================
+
+The kernel provides a generic API to ensure that bounds checks are
+respected even under speculation. Architectures which are affected by
+speculation-based side-channels are expected to implement these
+primitives.
+
+The array_index_nospec() helper in <linux/nospec.h> can be used to
+prevent information from being leaked via side-channels.
+
+A call to array_index_nospec(index, size) returns a sanitized index
+value that is bounded to [0, size) even under cpu speculation
+conditions.
+
+This can be used to protect the earlier load_array() example:
+
+ int load_array(int *array, unsigned int index)
+ {
+ if (index >= MAX_ARRAY_ELEMS)
+ return 0;
+ else {
+ index = array_index_nospec(index, MAX_ARRAY_ELEMS);
+ return array[index];
+ }
+ }
Patches currently in stable-queue which might be from mark.rutland(a)arm.com are
queue-4.9/documentation-document-array_index_nospec.patch
This is the start of the stable review cycle for the 4.14.9 release.
There are 159 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Dec 24 08:45:36 UTC 2017.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.9-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.9-rc1
Peter Hutterer <peter.hutterer(a)who-t.net>
platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes
Daniel Lezcano <daniel.lezcano(a)linaro.org>
thermal/drivers/hisi: Fix multiple alarm interrupts firing
Daniel Lezcano <daniel.lezcano(a)linaro.org>
thermal/drivers/hisi: Simplify the temperature/step computation
Daniel Lezcano <daniel.lezcano(a)linaro.org>
thermal/drivers/hisi: Fix kernel panic on alarm interrupt
Daniel Lezcano <daniel.lezcano(a)linaro.org>
thermal/drivers/hisi: Fix missing interrupt enablement
Niranjana Vishwanathapura <niranjana.vishwanathapura(a)intel.com>
IB/opa_vnic: Properly return the total MACs in UC MAC list
Scott Franco <safranco(a)intel.com>
IB/opa_vnic: Properly clear Mac Table Digest
Eric Anholt <eric(a)anholt.net>
drm/vc4: Avoid using vrefresh==0 mode in DSI htotal math.
Nicholas Piggin <npiggin(a)gmail.com>
cpuidle: fix broadcast control when broadcast can not be entered
Alexandre Belloni <alexandre.belloni(a)free-electrons.com>
rtc: set the alarm to the next expiring timer
Hoang Tran <tranviethoang.vn(a)gmail.com>
tcp: fix under-evaluated ssthresh in TCP Vegas
Chen-Yu Tsai <wens(a)csie.org>
clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision
Arvind Yadav <arvind.yadav.cs(a)gmail.com>
staging: greybus: light: Release memory obtained by kasprintf
Wei Hu(Xavier) <xavier.huwei(a)huawei.com>
RDMA/hns: Avoid NULL pointer exception
Mike Manning <mmanning(a)brocade.com>
net: ipv6: send NS for DAD when link operationally up
Mick Tarsel <mjtarsel(a)linux.vnet.ibm.com>
ibmvnic: Set state UP
Jacob Keller <jacob.e.keller(a)intel.com>
fm10k: ensure we process SM mbx when processing VF mbx
Marek Szyprowski <m.szyprowski(a)samsung.com>
ARM: exynos_defconfig: Enable UAS support for Odroid HC1 board
Alex Williamson <alex.williamson(a)redhat.com>
vfio/pci: Virtualize Maximum Payload Size
Alan Brady <alan.brady(a)intel.com>
i40e: fix client notify of VF reset
Dick Kennedy <dick.kennedy(a)broadcom.com>
scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined
Dick Kennedy <dick.kennedy(a)broadcom.com>
scsi: lpfc: PLOGI failures during NPIV testing
Dick Kennedy <dick.kennedy(a)broadcom.com>
scsi: lpfc: Fix secure firmware updates
Jacob Keller <jacob.e.keller(a)intel.com>
fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw
Nicolas Dechesne <nicolas.dechesne(a)linaro.org>
ASoC: codecs: msm8916-wcd-analog: fix module autoload
Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
sctp: silence warns on sctp_stream_init allocations
Nicholas Piggin <npiggin(a)gmail.com>
powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog
Nicholas Piggin <npiggin(a)gmail.com>
powerpc/xmon: Avoid tripping SMP hardlockup watchdog
Ed Blake <ed.blake(a)sondrel.com>
ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback
Jean-François Têtu <jean-francois.tetu(a)savoirfairelinux.com>
ASoC: codecs: msm8916-wcd-analog: fix micbias level
Tom Zanussi <tom.zanussi(a)linux.intel.com>
tracing: Exclude 'generic fields' from histograms
Gabriele Paoloni <gabriele.paoloni(a)huawei.com>
PCI/AER: Report non-fatal errors only to the affected endpoint
Jacob Keller <jacob.e.keller(a)intel.com>
i40e/i40evf: spread CPU affinity hints across online CPUs only
Hans de Goede <hdegoede(a)redhat.com>
Bluetooth: hci_bcm: Fix setting of irq trigger type
Hans de Goede <hdegoede(a)redhat.com>
Bluetooth: hci_uart_set_flow_control: Fix NULL deref when using serdev
Andrew Jeffery <andrew(a)aj.id.au>
leds: pca955x: Don't invert requested value in pca955x_gpio_set_value()
Wei Wang <weiwan(a)google.com>
ipv6: grab rt->rt6i_ref before allocating pcpu rt
William Tu <u9012063(a)gmail.com>
ip_gre: check packet length and mtu correctly in erspan tx
Guoqing Jiang <gqjiang(a)suse.com>
md: always set THREAD_WAKEUP and wake up wqueue if thread existed
Luca Miccio <lucmiccio(a)gmail.com>
block,bfq: Disable writeback throttling
Colin Ian King <colin.king(a)canonical.com>
IB/rxe: check for allocation failure on elem
Emil Tantilov <emil.s.tantilov(a)intel.com>
ixgbe: fix use of uninitialized padding
Lorenzo Bianconi <lorenzo.bianconi83(a)gmail.com>
iio: st_sensors: add register mask for status register
Lihong Yang <lihong.yang(a)intel.com>
i40e: use the safe hash table iterator when deleting mac filters
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
igb: check memory allocation failure
Fabio Estevam <fabio.estevam(a)nxp.com>
PM / OPP: Move error message to debug level
Stuart Hayes <stuart.w.hayes(a)gmail.com>
PCI: Create SR-IOV virtfn/physfn links before attaching driver
Sreekanth Reddy <sreekanth.reddy(a)broadcom.com>
scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive
Varun Prakash <varun(a)chelsio.com>
scsi: cxgb4i: fix Tx skb leak
David Daney <david.daney(a)cavium.com>
PCI: Avoid bus reset if bridge itself is broken
Dan Murphy <dmurphy(a)ti.com>
net: phy: at803x: Change error to EINVAL for invalid MAC
Shakeel Butt <shakeelb(a)google.com>
kvm, mm: account kvm related kmem slabs to kmemcg
Russell King <rmk+kernel(a)armlinux.org.uk>
rtc: pl031: make interrupt optional
Christophe Jaillet <christophe.jaillet(a)wanadoo.fr>
crypto: lrw - Fix an error handling path in 'create()'
Christian Lamparter <chunkeey(a)gmail.com>
crypto: crypto4xx - increase context and scatter ring buffer elements
Chen-Yu Tsai <wens(a)csie.org>
clk: sunxi-ng: sun5i: Fix bit offset of audio PLL post-divider
Chen-Yu Tsai <wens(a)csie.org>
clk: sunxi-ng: nm: Check if requested rate is supported by fractional clock
Shashank Sharma <shashank.sharma(a)intel.com>
drm: Add retries for lspcon mode detection
Derek Basehore <dbasehore(a)chromium.org>
backlight: pwm_bl: Fix overflow condition
Jens Wiklander <jens.wiklander(a)linaro.org>
optee: fix invalid of_node_put() in optee_driver_init()
Thomas Gleixner <tglx(a)linutronix.de>
x86/cpufeatures: Make CPU bugs sticky
Thomas Gleixner <tglx(a)linutronix.de>
x86/paravirt: Provide a way to check for hypervisors
Thomas Gleixner <tglx(a)linutronix.de>
x86/paravirt: Dont patch flush_tlb_single
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Make cpu_entry_area.tss read-only
Andy Lutomirski <luto(a)kernel.org>
x86/entry: Clean up the SYSENTER_stack code
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Remove the SYSENTER stack canary
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Move the IST stacks into struct cpu_entry_area
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Create a per-CPU SYSCALL entry trampoline
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Return to userspace from the trampoline stack
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Use a per-CPU trampoline stack for IDT entries
Andy Lutomirski <luto(a)kernel.org>
x86/espfix/64: Stop assuming that pt_regs is on the entry stack
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
Andy Lutomirski <luto(a)kernel.org>
x86/entry: Remap the TSS into the CPU entry area
Andy Lutomirski <luto(a)kernel.org>
x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct
Andy Lutomirski <luto(a)kernel.org>
x86/dumpstack: Handle stack overflow on all stacks
Andy Lutomirski <luto(a)kernel.org>
x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss
Andy Lutomirski <luto(a)kernel.org>
x86/kasan/64: Teach KASAN about the cpu_entry_area
Andy Lutomirski <luto(a)kernel.org>
x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area
Andy Lutomirski <luto(a)kernel.org>
x86/entry/gdt: Put per-CPU GDT remaps in ascending order
Andy Lutomirski <luto(a)kernel.org>
x86/dumpstack: Add get_stack_info() support for the SYSENTER stack
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Allocate and enable the SYSENTER stack
Andy Lutomirski <luto(a)kernel.org>
x86/irq/64: Print the offending IP in the stack overflow warning
Andy Lutomirski <luto(a)kernel.org>
x86/irq: Remove an old outdated comment about context tracking races
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/unwinder: Handle stack overflows more gracefully
Andy Lutomirski <luto(a)kernel.org>
x86/unwinder/orc: Dont bail on stack overflow
Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
x86/entry/64/paravirt: Use paravirt-safe macro to access eflags
Andrey Ryabinin <aryabinin(a)virtuozzo.com>
x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
Will Deacon <will.deacon(a)arm.com>
locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
Will Deacon <will.deacon(a)arm.com>
locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: fix build issues on um due to mising bpf_perf_event.h
Andi Kleen <ak(a)linux.intel.com>
perf/x86: Enable free running PEBS for REGS_USER/INTR
Rudolf Marek <r.marek(a)assembler.cz>
x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
x86/cpufeature: Add User-Mode Instruction Prevention definitions
Ingo Molnar <mingo(a)kernel.org>
drivers/misc/intel/pti: Rename the header file to free up the namespace
Juergen Gross <jgross(a)suse.com>
x86/virt: Add enum for hypervisors to replace x86_hyper
Juergen Gross <jgross(a)suse.com>
x86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init'
James Morse <james.morse(a)arm.com>
ACPI / APEI: Replace ioremap_page_range() with fixmap
Andy Lutomirski <luto(a)kernel.org>
selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well
Andy Lutomirski <luto(a)kernel.org>
selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area()
Ingo Molnar <mingo(a)kernel.org>
x86/cpufeatures: Fix various details in the feature definitions
Ingo Molnar <mingo(a)kernel.org>
x86/cpufeatures: Re-tabulate the X86_FEATURE definitions
Borislav Petkov <bp(a)suse.de>
x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE
Thomas Gleixner <tglx(a)linutronix.de>
bitops: Revert cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h")
Thomas Gleixner <tglx(a)linutronix.de>
x86/cpuid: Replace set/clear_bit32()
Borislav Petkov <bp(a)suse.de>
x86/entry/64: Shorten TEST instructions
Andy Lutomirski <luto(a)kernel.org>
x86/traps: Use a new on_thread_stack() helper to clean up an assertion
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Remove thread_struct::sp0
Andy Lutomirski <luto(a)kernel.org>
x86/entry/32: Fix cpu_current_top_of_stack initialization at boot
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Remove all remaining direct thread_struct::sp0 reads
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Stop initializing TSS.sp0 at boot
Andy Lutomirski <luto(a)kernel.org>
x86/xen/64, x86/entry/64: Clean up SP code in cpu_initialize_context()
Andy Lutomirski <luto(a)kernel.org>
x86/entry: Add task_top_of_stack() to find the top of a task's stack
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Pass SP0 directly to load_sp0()
Andy Lutomirski <luto(a)kernel.org>
x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0()
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: De-Xen-ify our NMI code
Juergen Gross <jgross(a)suse.com>
xen, x86/entry/64: Add xen NMI trap entry
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Remove the RESTORE_..._REGS infrastructure
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Use POP instead of MOV to restore regs on NMI return
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Merge the fast and slow SYSRET paths
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Use pop instead of movq in syscall_return_via_sysret
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Shrink paranoid_exit_restore and make labels local
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Simplify reg restore code in the standard IRET paths
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Move SWAPGS into the common IRET-to-usermode path
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Remove the restore_c_regs_and_iret label
Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
ptrace,x86: Make user_64bit_mode() available to 32-bit builds
Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
x86/boot: Relocate definition of the initial state of CR0
Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com>
x86/mm: Relocate page fault error codes to traps.h
Gayatri Kammela <gayatri.kammela(a)intel.com>
x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features
Baoquan He <bhe(a)redhat.com>
x86/mm/64: Rename the register_page_bootmem_memmap() 'size' parameter to 'nr_pages'
Masahiro Yamada <yamada.masahiro(a)socionext.com>
x86/build: Beautify build log of syscall headers
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/asm: Don't use the confusing '.ifeq' directive
Dongjiu Geng <gengdongjiu(a)huawei.com>
ACPI / APEI: remove the unused dead-code for SEA/NMI notification type
Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
x86/xen: Drop 5-level paging support code from the XEN_PV code
Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
x86/xen: Provide pre-built page tables only for CONFIG_XEN_PV=y and CONFIG_XEN_PVH=y
Andrey Ryabinin <aryabinin(a)virtuozzo.com>
x86/kasan: Use the same shadow offset for 4- and 5-level paging
Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
Thomas Gleixner <tglx(a)linutronix.de>
x86/cpuid: Prevent out of bound access in do_clear_cpu_cap()
Kamalesh Babulal <kamalesh(a)linux.vnet.ibm.com>
objtool: Print top level commands on incorrect usage
Kees Cook <keescook(a)chromium.org>
x86/platform/UV: Convert timers to use timer_setup()
Andi Kleen <ak(a)linux.intel.com>
x86/fpu: Remove the explicit clearing of XSAVE dependent features
Andi Kleen <ak(a)linux.intel.com>
x86/fpu: Make XSAVE check the base CPUID features before enabling
Andi Kleen <ak(a)linux.intel.com>
x86/fpu: Parse clearcpuid= as early XSAVE argument
Andi Kleen <ak(a)linux.intel.com>
x86/cpuid: Add generic table for CPUID dependencies
Andi Kleen <ak(a)linux.intel.com>
bitops: Add clear/set_bit32() to linux/bitops.h
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/unwind: Rename unwinder config options to 'CONFIG_UNWINDER_*'
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
x86/fpu/debug: Remove unused 'x86_fpu_state' and 'x86_fpu_deactivate_state' tracepoints
Ingo Molnar <mingo(a)kernel.org>
x86/unwinder: Make CONFIG_UNWINDER_ORC=y the default in the 64-bit defconfig
Jan Beulich <JBeulich(a)suse.com>
ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq()
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/head: Add unwind hint annotations
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/xen: Add unwind hint annotations
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/xen: Fix xen head ELF annotations
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/boot: Annotate verify_cpu() as a callable function
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/head: Fix head ELF function annotations
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/head: Remove unused 'bad_address' code
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/head: Remove confusing comment
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't report end of section error after an empty unwind hint
Uros Bizjak <ubizjak(a)gmail.com>
x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates
-------------
Diffstat:
Documentation/x86/orc-unwinder.txt | 2 +-
Documentation/x86/x86_64/mm.txt | 2 +-
Makefile | 8 +-
arch/arm/configs/exynos_defconfig | 2 +-
arch/arm64/include/asm/fixmap.h | 7 +
arch/powerpc/kernel/watchdog.c | 7 +-
arch/powerpc/xmon/xmon.c | 17 +-
arch/um/include/asm/Kbuild | 1 +
arch/x86/Kconfig | 5 +-
arch/x86/Kconfig.debug | 39 +-
arch/x86/configs/tiny.config | 4 +-
arch/x86/configs/x86_64_defconfig | 1 +
arch/x86/entry/calling.h | 69 +--
arch/x86/entry/entry_32.S | 6 +-
arch/x86/entry/entry_64.S | 322 +++++++++---
arch/x86/entry/entry_64_compat.S | 10 +-
arch/x86/entry/syscalls/Makefile | 4 +-
arch/x86/events/core.c | 2 +-
arch/x86/events/intel/core.c | 4 +
arch/x86/events/perf_event.h | 24 +-
arch/x86/hyperv/hv_init.c | 2 +-
arch/x86/include/asm/archrandom.h | 8 +-
arch/x86/include/asm/bitops.h | 10 +-
arch/x86/include/asm/compat.h | 1 +
arch/x86/include/asm/cpufeature.h | 11 +-
arch/x86/include/asm/cpufeatures.h | 538 +++++++++++----------
arch/x86/include/asm/desc.h | 11 +-
arch/x86/include/asm/fixmap.h | 74 ++-
arch/x86/include/asm/hypervisor.h | 53 +-
arch/x86/include/asm/irqflags.h | 3 +
arch/x86/include/asm/kdebug.h | 1 +
arch/x86/include/asm/mmu_context.h | 4 +-
arch/x86/include/asm/module.h | 2 +-
arch/x86/include/asm/paravirt.h | 14 +-
arch/x86/include/asm/paravirt_types.h | 2 +-
arch/x86/include/asm/percpu.h | 2 +-
arch/x86/include/asm/pgtable_types.h | 3 +-
arch/x86/include/asm/processor.h | 109 +++--
arch/x86/include/asm/ptrace.h | 6 +-
arch/x86/include/asm/rmwcc.h | 2 +-
arch/x86/include/asm/stacktrace.h | 3 +
arch/x86/include/asm/switch_to.h | 26 +
arch/x86/include/asm/thread_info.h | 2 +-
arch/x86/include/asm/trace/fpu.h | 10 -
arch/x86/include/asm/traps.h | 21 +-
arch/x86/include/asm/unwind.h | 15 +-
arch/x86/include/asm/x86_init.h | 24 +
arch/x86/include/uapi/asm/processor-flags.h | 3 +
arch/x86/kernel/Makefile | 10 +-
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/apic/x2apic_uv_x.c | 5 +-
arch/x86/kernel/asm-offsets.c | 6 +
arch/x86/kernel/asm-offsets_32.c | 9 +-
arch/x86/kernel/asm-offsets_64.c | 4 +
arch/x86/kernel/cpu/Makefile | 1 +
arch/x86/kernel/cpu/amd.c | 7 +-
arch/x86/kernel/cpu/common.c | 195 +++++---
arch/x86/kernel/cpu/cpuid-deps.c | 121 +++++
arch/x86/kernel/cpu/hypervisor.c | 64 +--
arch/x86/kernel/cpu/mshyperv.c | 6 +-
arch/x86/kernel/cpu/vmware.c | 8 +-
arch/x86/kernel/doublefault.c | 36 +-
arch/x86/kernel/dumpstack.c | 74 ++-
arch/x86/kernel/dumpstack_32.c | 6 +
arch/x86/kernel/dumpstack_64.c | 6 +
arch/x86/kernel/fpu/init.c | 11 +
arch/x86/kernel/fpu/xstate.c | 43 +-
arch/x86/kernel/head_32.S | 5 +-
arch/x86/kernel/head_64.S | 45 +-
arch/x86/kernel/ioport.c | 2 +-
arch/x86/kernel/irq.c | 12 -
arch/x86/kernel/irq_64.c | 4 +-
arch/x86/kernel/kvm.c | 6 +-
arch/x86/kernel/ldt.c | 2 +-
arch/x86/kernel/paravirt_patch_64.c | 2 -
arch/x86/kernel/process.c | 27 +-
arch/x86/kernel/process_32.c | 8 +-
arch/x86/kernel/process_64.c | 19 +-
arch/x86/kernel/smpboot.c | 3 +-
arch/x86/kernel/traps.c | 72 +--
arch/x86/kernel/unwind_orc.c | 88 ++--
arch/x86/kernel/verify_cpu.S | 3 +-
arch/x86/kernel/vm86_32.c | 20 +-
arch/x86/kernel/vmlinux.lds.S | 9 +
arch/x86/kernel/x86_init.c | 9 +
arch/x86/kvm/mmu.c | 4 +-
arch/x86/kvm/vmx.c | 2 +-
arch/x86/lib/delay.c | 4 +-
arch/x86/mm/fault.c | 88 ++--
arch/x86/mm/init.c | 2 +-
arch/x86/mm/init_64.c | 10 +-
arch/x86/mm/kasan_init_64.c | 262 ++++++++--
arch/x86/power/cpu.c | 16 +-
arch/x86/xen/enlighten_hvm.c | 12 +-
arch/x86/xen/enlighten_pv.c | 15 +-
arch/x86/xen/mmu_pv.c | 161 +++---
arch/x86/xen/smp_pv.c | 17 +-
arch/x86/xen/xen-asm_64.S | 2 +-
arch/x86/xen/xen-head.S | 11 +-
block/bfq-iosched.c | 3 +-
block/blk-wbt.c | 2 +-
crypto/lrw.c | 6 +-
drivers/acpi/apei/ghes.c | 78 +--
drivers/base/power/opp/core.c | 2 +-
drivers/bluetooth/hci_bcm.c | 23 +-
drivers/bluetooth/hci_ldisc.c | 7 +
drivers/clk/sunxi-ng/ccu-sun5i.c | 4 +-
drivers/clk/sunxi-ng/ccu-sun6i-a31.c | 2 +-
drivers/clk/sunxi-ng/ccu_nm.c | 3 +
drivers/cpuidle/cpuidle.c | 1 +
drivers/crypto/amcc/crypto4xx_core.h | 10 +-
drivers/gpu/drm/drm_dp_dual_mode_helper.c | 16 +-
drivers/gpu/drm/vc4/vc4_dsi.c | 3 +-
drivers/hv/vmbus_drv.c | 2 +-
drivers/iio/accel/st_accel_core.c | 35 +-
drivers/iio/common/st_sensors/st_sensors_core.c | 2 +-
drivers/iio/common/st_sensors/st_sensors_trigger.c | 16 +-
drivers/iio/gyro/st_gyro_core.c | 15 +-
drivers/iio/magnetometer/st_magn_core.c | 10 +-
drivers/iio/pressure/st_pressure_core.c | 15 +-
drivers/infiniband/hw/hns/hns_roce_hw_v1.c | 5 +
drivers/infiniband/sw/rxe/rxe_pool.c | 2 +
drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.c | 1 +
.../infiniband/ulp/opa_vnic/opa_vnic_vema_iface.c | 8 +-
drivers/input/mouse/vmmouse.c | 10 +-
drivers/leds/leds-pca955x.c | 17 +-
drivers/md/dm-mpath.c | 20 +-
drivers/md/md.c | 4 +-
drivers/misc/pti.c | 2 +-
drivers/misc/vmw_balloon.c | 2 +-
drivers/net/ethernet/ibm/ibmvnic.c | 2 +
drivers/net/ethernet/intel/fm10k/fm10k.h | 4 +-
drivers/net/ethernet/intel/fm10k/fm10k_iov.c | 12 +-
drivers/net/ethernet/intel/i40e/i40e_main.c | 16 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 7 +-
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 9 +-
drivers/net/ethernet/intel/igb/igb_main.c | 2 +
drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 4 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 2 +
drivers/net/phy/at803x.c | 2 +-
drivers/pci/iov.c | 3 +-
drivers/pci/pci.c | 4 +
drivers/pci/pcie/aer/aerdrv_core.c | 9 +-
drivers/platform/x86/asus-wireless.c | 1 +
drivers/rtc/interface.c | 2 +-
drivers/rtc/rtc-pl031.c | 14 +-
drivers/scsi/cxgbi/cxgb4i/cxgb4i.c | 1 +
drivers/scsi/lpfc/lpfc_hbadisc.c | 3 +-
drivers/scsi/lpfc/lpfc_hw4.h | 2 +-
drivers/scsi/lpfc/lpfc_nvmet.c | 2 +
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 5 +
drivers/staging/greybus/light.c | 2 +
drivers/tee/optee/core.c | 1 -
drivers/thermal/hisi_thermal.c | 74 ++-
drivers/vfio/pci/vfio_pci_config.c | 6 +-
drivers/video/backlight/pwm_bl.c | 7 +-
fs/dcache.c | 4 +-
fs/overlayfs/ovl_entry.h | 2 +-
fs/overlayfs/readdir.c | 2 +-
include/asm-generic/vmlinux.lds.h | 2 +-
include/linux/compiler.h | 1 +
include/linux/hypervisor.h | 8 +-
include/linux/iio/common/st_sensors.h | 7 +-
include/linux/{pti.h => intel-pti.h} | 6 +-
include/linux/mm.h | 2 +-
include/linux/mmzone.h | 6 +-
include/linux/rculist.h | 4 +-
include/linux/rcupdate.h | 4 +-
kernel/events/core.c | 4 +-
kernel/seccomp.c | 2 +-
kernel/task_work.c | 2 +-
kernel/trace/trace_events_hist.c | 4 +-
lib/Kconfig.debug | 2 +-
mm/page_alloc.c | 10 +
mm/slab.h | 2 +-
mm/sparse.c | 17 +-
net/ipv4/ip_gre.c | 8 +-
net/ipv4/tcp_vegas.c | 2 +-
net/ipv6/addrconf.c | 12 +-
net/ipv6/route.c | 58 +--
net/sctp/stream.c | 8 +-
scripts/Makefile.build | 2 +-
sound/soc/codecs/msm8916-wcd-analog.c | 9 +-
sound/soc/img/img-parallel-out.c | 2 +
tools/objtool/check.c | 7 +-
tools/objtool/objtool.c | 6 +-
tools/testing/selftests/x86/ldt_gdt.c | 61 ++-
virt/kvm/kvm_main.c | 2 +-
188 files changed, 2414 insertions(+), 1428 deletions(-)
This is a note to let you know that I've just added the patch titled
x86/retpoline: Remove the esp/rsp thunk
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-remove-the-esp-rsp-thunk.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 19:38:23 CST 2018
From: Waiman Long <longman(a)redhat.com>
Date: Mon, 22 Jan 2018 17:09:34 -0500
Subject: x86/retpoline: Remove the esp/rsp thunk
From: Waiman Long <longman(a)redhat.com>
(cherry picked from commit 1df37383a8aeabb9b418698f0bcdffea01f4b1b2)
It doesn't make sense to have an indirect call thunk with esp/rsp as
retpoline code won't work correctly with the stack pointer register.
Removing it will help compiler writers to catch error in case such
a thunk call is emitted incorrectly.
Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
Suggested-by: Jeff Law <law(a)redhat.com>
Signed-off-by: Waiman Long <longman(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/asm-prototypes.h | 1 -
arch/x86/lib/retpoline.S | 1 -
2 files changed, 2 deletions(-)
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -37,5 +37,4 @@ INDIRECT_THUNK(dx)
INDIRECT_THUNK(si)
INDIRECT_THUNK(di)
INDIRECT_THUNK(bp)
-INDIRECT_THUNK(sp)
#endif /* CONFIG_RETPOLINE */
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -36,7 +36,6 @@ GENERATE_THUNK(_ASM_DX)
GENERATE_THUNK(_ASM_SI)
GENERATE_THUNK(_ASM_DI)
GENERATE_THUNK(_ASM_BP)
-GENERATE_THUNK(_ASM_SP)
#ifdef CONFIG_64BIT
GENERATE_THUNK(r8)
GENERATE_THUNK(r9)
Patches currently in stable-queue which might be from longman(a)redhat.com are
queue-4.9/x86-retpoline-remove-the-esp-rsp-thunk.patch
This is a note to let you know that I've just added the patch titled
x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 19:38:23 CST 2018
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 25 Jan 2018 16:14:13 +0000
Subject: x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
From: David Woodhouse <dwmw(a)amazon.co.uk>
(cherry picked from commit fec9434a12f38d3aeafeb75711b71d8a1fdef621)
Also, for CPUs which don't speculate at all, don't report that they're
vulnerable to the Spectre variants either.
Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it
for now, even though that could be done with a simple comparison, on the
assumption that we'll have more to add.
Based on suggestions from Dave Hansen and Alan Cox.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Acked-by: Dave Hansen <dave.hansen(a)intel.com>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: ak(a)linux.intel.com
Cc: ashok.raj(a)intel.com
Cc: karahmed(a)amazon.de
Cc: arjan(a)linux.intel.com
Cc: torvalds(a)linux-foundation.org
Cc: peterz(a)infradead.org
Cc: bp(a)alien8.de
Cc: pbonzini(a)redhat.com
Cc: tim.c.chen(a)linux.intel.com
Cc: gregkh(a)linux-foundation.org
Link: https://lkml.kernel.org/r/1516896855-7642-6-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/common.c | 48 ++++++++++++++++++++++++++++++++++++++-----
1 file changed, 43 insertions(+), 5 deletions(-)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -44,6 +44,8 @@
#include <asm/pat.h>
#include <asm/microcode.h>
#include <asm/microcode_intel.h>
+#include <asm/intel-family.h>
+#include <asm/cpu_device_id.h>
#ifdef CONFIG_X86_LOCAL_APIC
#include <asm/uv/uv.h>
@@ -838,6 +840,41 @@ static void identify_cpu_without_cpuid(s
#endif
}
+static const __initdata struct x86_cpu_id cpu_no_speculation[] = {
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_CENTAUR, 5 },
+ { X86_VENDOR_INTEL, 5 },
+ { X86_VENDOR_NSC, 5 },
+ { X86_VENDOR_ANY, 4 },
+ {}
+};
+
+static const __initdata struct x86_cpu_id cpu_no_meltdown[] = {
+ { X86_VENDOR_AMD },
+ {}
+};
+
+static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c)
+{
+ u64 ia32_cap = 0;
+
+ if (x86_match_cpu(cpu_no_meltdown))
+ return false;
+
+ if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+ /* Rogue Data Cache Load? No! */
+ if (ia32_cap & ARCH_CAP_RDCL_NO)
+ return false;
+
+ return true;
+}
+
/*
* Do minimum CPU detection early.
* Fields really needed: vendor, cpuid_level, family, model, mask,
@@ -884,11 +921,12 @@ static void __init early_identify_cpu(st
setup_force_cpu_cap(X86_FEATURE_ALWAYS);
- if (c->x86_vendor != X86_VENDOR_AMD)
- setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
-
- setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
- setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+ if (!x86_match_cpu(cpu_no_speculation)) {
+ if (cpu_vulnerable_to_meltdown(c))
+ setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+ }
fpu__init_system(c);
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.9/kvm-vmx-introduce-alloc_loaded_vmcs.patch
queue-4.9/kvm-nvmx-eliminate-vmcs02-pool.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-cpufeature-blacklist-spec_ctrl-pred_cmd-on-early-spectre-v2-microcodes.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-cpufeatures-add-amd-feature-bits-for-speculation-control.patch
queue-4.9/module-retpoline-warn-about-missing-retpoline-in-module.patch
queue-4.9/kvm-nvmx-vmx_complete_nested_posted_interrupt-can-t-fail.patch
queue-4.9/x86-msr-add-definitions-for-new-speculation-control-msrs.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/kvm-vmx-make-msr-bitmaps-per-vcpu.patch
queue-4.9/kvm-nvmx-mark-vmcs12-pages-dirty-on-l2-exit.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-retpoline-remove-the-esp-rsp-thunk.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
This is a note to let you know that I've just added the patch titled
x86/msr: Add definitions for new speculation control MSRs
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-msr-add-definitions-for-new-speculation-control-msrs.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 19:38:23 CST 2018
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 25 Jan 2018 16:14:12 +0000
Subject: x86/msr: Add definitions for new speculation control MSRs
From: David Woodhouse <dwmw(a)amazon.co.uk>
(cherry picked from commit 1e340c60d0dd3ae07b5bedc16a0469c14b9f3410)
Add MSR and bit definitions for SPEC_CTRL, PRED_CMD and ARCH_CAPABILITIES.
See Intel's 336996-Speculative-Execution-Side-Channel-Mitigations.pdf
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: ak(a)linux.intel.com
Cc: ashok.raj(a)intel.com
Cc: dave.hansen(a)intel.com
Cc: karahmed(a)amazon.de
Cc: arjan(a)linux.intel.com
Cc: torvalds(a)linux-foundation.org
Cc: peterz(a)infradead.org
Cc: bp(a)alien8.de
Cc: pbonzini(a)redhat.com
Cc: tim.c.chen(a)linux.intel.com
Cc: gregkh(a)linux-foundation.org
Link: https://lkml.kernel.org/r/1516896855-7642-5-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/msr-index.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -37,6 +37,13 @@
#define EFER_FFXSR (1<<_EFER_FFXSR)
/* Intel MSRs. Some also available on other CPUs */
+#define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */
+#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */
+#define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */
+
+#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
+#define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */
+
#define MSR_IA32_PERFCTR0 0x000000c1
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
@@ -50,6 +57,11 @@
#define SNB_C3_AUTO_UNDEMOTE (1UL << 28)
#define MSR_MTRRcap 0x000000fe
+
+#define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
+#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */
+#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */
+
#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.9/kvm-vmx-introduce-alloc_loaded_vmcs.patch
queue-4.9/kvm-nvmx-eliminate-vmcs02-pool.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-cpufeature-blacklist-spec_ctrl-pred_cmd-on-early-spectre-v2-microcodes.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-cpufeatures-add-amd-feature-bits-for-speculation-control.patch
queue-4.9/module-retpoline-warn-about-missing-retpoline-in-module.patch
queue-4.9/kvm-nvmx-vmx_complete_nested_posted_interrupt-can-t-fail.patch
queue-4.9/x86-msr-add-definitions-for-new-speculation-control-msrs.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/kvm-vmx-make-msr-bitmaps-per-vcpu.patch
queue-4.9/kvm-nvmx-mark-vmcs12-pages-dirty-on-l2-exit.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-retpoline-remove-the-esp-rsp-thunk.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
This is a note to let you know that I've just added the patch titled
x86/cpufeatures: Add Intel feature bits for Speculation Control
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 19:38:23 CST 2018
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 25 Jan 2018 16:14:10 +0000
Subject: x86/cpufeatures: Add Intel feature bits for Speculation Control
From: David Woodhouse <dwmw(a)amazon.co.uk>
(cherry picked from commit fc67dd70adb711a45d2ef34e12d1a8be75edde61)
Add three feature bits exposed by new microcode on Intel CPUs for
speculation control.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: ak(a)linux.intel.com
Cc: ashok.raj(a)intel.com
Cc: dave.hansen(a)intel.com
Cc: karahmed(a)amazon.de
Cc: arjan(a)linux.intel.com
Cc: torvalds(a)linux-foundation.org
Cc: peterz(a)infradead.org
Cc: bp(a)alien8.de
Cc: pbonzini(a)redhat.com
Cc: tim.c.chen(a)linux.intel.com
Cc: gregkh(a)linux-foundation.org
Link: https://lkml.kernel.org/r/1516896855-7642-3-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/cpufeatures.h | 3 +++
1 file changed, 3 insertions(+)
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -296,6 +296,9 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_SPEC_CTRL (18*32+26) /* Speculation Control (IBRS + IBPB) */
+#define X86_FEATURE_STIBP (18*32+27) /* Single Thread Indirect Branch Predictors */
+#define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
/*
* BUG word(s)
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.9/kvm-vmx-introduce-alloc_loaded_vmcs.patch
queue-4.9/kvm-nvmx-eliminate-vmcs02-pool.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-cpufeature-blacklist-spec_ctrl-pred_cmd-on-early-spectre-v2-microcodes.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-cpufeatures-add-amd-feature-bits-for-speculation-control.patch
queue-4.9/module-retpoline-warn-about-missing-retpoline-in-module.patch
queue-4.9/kvm-nvmx-vmx_complete_nested_posted_interrupt-can-t-fail.patch
queue-4.9/x86-msr-add-definitions-for-new-speculation-control-msrs.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/kvm-vmx-make-msr-bitmaps-per-vcpu.patch
queue-4.9/kvm-nvmx-mark-vmcs12-pages-dirty-on-l2-exit.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-retpoline-remove-the-esp-rsp-thunk.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
This is a note to let you know that I've just added the patch titled
x86/cpufeatures: Add AMD feature bits for Speculation Control
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-cpufeatures-add-amd-feature-bits-for-speculation-control.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 19:38:23 CST 2018
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 25 Jan 2018 16:14:11 +0000
Subject: x86/cpufeatures: Add AMD feature bits for Speculation Control
From: David Woodhouse <dwmw(a)amazon.co.uk>
(cherry picked from commit 5d10cbc91d9eb5537998b65608441b592eec65e7)
AMD exposes the PRED_CMD/SPEC_CTRL MSRs slightly differently to Intel.
See http://lkml.kernel.org/r/2b3e25cc-286d-8bd0-aeaf-9ac4aae39de8@amd.com
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: ak(a)linux.intel.com
Cc: ashok.raj(a)intel.com
Cc: dave.hansen(a)intel.com
Cc: karahmed(a)amazon.de
Cc: arjan(a)linux.intel.com
Cc: torvalds(a)linux-foundation.org
Cc: peterz(a)infradead.org
Cc: bp(a)alien8.de
Cc: pbonzini(a)redhat.com
Cc: tim.c.chen(a)linux.intel.com
Cc: gregkh(a)linux-foundation.org
Link: https://lkml.kernel.org/r/1516896855-7642-4-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/cpufeatures.h | 3 +++
1 file changed, 3 insertions(+)
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -258,6 +258,9 @@
/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
#define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */
#define X86_FEATURE_IRPERF (13*32+1) /* Instructions Retired Count */
+#define X86_FEATURE_AMD_PRED_CMD (13*32+12) /* Prediction Command MSR (AMD) */
+#define X86_FEATURE_AMD_SPEC_CTRL (13*32+14) /* Speculation Control MSR only (AMD) */
+#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors (AMD) */
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.9/kvm-vmx-introduce-alloc_loaded_vmcs.patch
queue-4.9/kvm-nvmx-eliminate-vmcs02-pool.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-cpufeature-blacklist-spec_ctrl-pred_cmd-on-early-spectre-v2-microcodes.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-cpufeatures-add-amd-feature-bits-for-speculation-control.patch
queue-4.9/module-retpoline-warn-about-missing-retpoline-in-module.patch
queue-4.9/kvm-nvmx-vmx_complete_nested_posted_interrupt-can-t-fail.patch
queue-4.9/x86-msr-add-definitions-for-new-speculation-control-msrs.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/kvm-vmx-make-msr-bitmaps-per-vcpu.patch
queue-4.9/kvm-nvmx-mark-vmcs12-pages-dirty-on-l2-exit.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
queue-4.9/x86-retpoline-remove-the-esp-rsp-thunk.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: Make indirect calls in emulator speculation safe
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 19:38:23 CST 2018
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Thu, 25 Jan 2018 10:58:13 +0100
Subject: KVM: x86: Make indirect calls in emulator speculation safe
From: Peter Zijlstra <peterz(a)infradead.org>
(cherry picked from commit 1a29b5b7f347a1a9230c1e0af5b37e3e571588ab)
Replace the indirect calls with CALL_NOSPEC.
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: Jun Nakajima <jun.nakajima(a)intel.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: rga(a)amazon.de
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Asit Mallick <asit.k.mallick(a)intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Jason Baron <jbaron(a)akamai.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven(a)intel.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Link: https://lkml.kernel.org/r/20180125095843.595615683@infradead.org
[dwmw2: Use ASM_CALL_CONSTRAINT like upstream, now we have it]
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/emulate.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -25,6 +25,7 @@
#include <asm/kvm_emulate.h>
#include <linux/stringify.h>
#include <asm/debugreg.h>
+#include <asm/nospec-branch.h>
#include "x86.h"
#include "tss.h"
@@ -1012,8 +1013,8 @@ static __always_inline u8 test_cc(unsign
void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);
flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF;
- asm("push %[flags]; popf; call *%[fastop]"
- : "=a"(rc) : [fastop]"r"(fop), [flags]"r"(flags));
+ asm("push %[flags]; popf; " CALL_NOSPEC
+ : "=a"(rc) : [thunk_target]"r"(fop), [flags]"r"(flags));
return rc;
}
@@ -5306,15 +5307,14 @@ static void fetch_possible_mmx_operand(s
static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *))
{
- register void *__sp asm(_ASM_SP);
ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
if (!(ctxt->d & ByteOp))
fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
- asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n"
+ asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
: "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
- [fastop]"+S"(fop), "+r"(__sp)
+ [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT
: "c"(ctxt->src2.val));
ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-cpufeature-blacklist-spec_ctrl-pred_cmd-on-early-spectre-v2-microcodes.patch
queue-4.9/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-cpufeatures-add-amd-feature-bits-for-speculation-control.patch
queue-4.9/x86-msr-add-definitions-for-new-speculation-control-msrs.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/kaiser-fix-intel_bts-perf-crashes.patch
queue-4.9/x86-retpoline-remove-the-esp-rsp-thunk.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
This is a note to let you know that I've just added the patch titled
KVM: VMX: Make indirect call speculation safe
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-make-indirect-call-speculation-safe.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 19:38:23 CST 2018
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Thu, 25 Jan 2018 10:58:14 +0100
Subject: KVM: VMX: Make indirect call speculation safe
From: Peter Zijlstra <peterz(a)infradead.org>
(cherry picked from commit c940a3fb1e2e9b7d03228ab28f375fb5a47ff699)
Replace indirect call with CALL_NOSPEC.
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: Jun Nakajima <jun.nakajima(a)intel.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: rga(a)amazon.de
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Asit Mallick <asit.k.mallick(a)intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Jason Baron <jbaron(a)akamai.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven(a)intel.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Link: https://lkml.kernel.org/r/20180125095843.645776917@infradead.org
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8676,14 +8676,14 @@ static void vmx_handle_external_intr(str
#endif
"pushf\n\t"
__ASM_SIZE(push) " $%c[cs]\n\t"
- "call *%[entry]\n\t"
+ CALL_NOSPEC
:
#ifdef CONFIG_X86_64
[sp]"=&r"(tmp),
#endif
"+r"(__sp)
:
- [entry]"r"(entry),
+ THUNK_TARGET(entry),
[ss]"i"(__KERNEL_DS),
[cs]"i"(__KERNEL_CS)
);
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/x86-cpufeatures-add-intel-feature-bits-for-speculation-control.patch
queue-4.9/x86-cpufeatures-add-cpuid_7_edx-cpuid-leaf.patch
queue-4.9/kvm-x86-make-indirect-calls-in-emulator-speculation-safe.patch
queue-4.9/x86-cpufeature-blacklist-spec_ctrl-pred_cmd-on-early-spectre-v2-microcodes.patch
queue-4.9/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
queue-4.9/kvm-vmx-make-indirect-call-speculation-safe.patch
queue-4.9/x86-cpufeatures-add-amd-feature-bits-for-speculation-control.patch
queue-4.9/x86-msr-add-definitions-for-new-speculation-control-msrs.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/kaiser-fix-intel_bts-perf-crashes.patch
queue-4.9/x86-retpoline-remove-the-esp-rsp-thunk.patch
queue-4.9/x86-pti-do-not-enable-pti-on-cpus-which-are-not-vulnerable-to-meltdown.patch
Hello, I am not familiar with Linux mailing, sorry for this.
I am writing this mail because I am not sure about commit
6c16fa957e84c8b640dadc4e0264ff0d2dae7aa3 added in 3.18.93.
That backport made the arp.h changes in struct __ipv4_neigh_lookup(),
but compared with upstream commit, I think that 3 lines should be
added in __ipv4_neigh_lookup_noref(), as I go through the context.
Also the latter would be called unconditionally if former is called so
it would cover more cases.
Could you please check this and let me know? Though it would 90% be my derp.
Thanks,
Wang
As discussed earlier,
From: Wang Han <wanghan1995315(a)gmail.com>
Date: Sat, 3 Feb 2018 03:28:35 +1400
Subject: [PATCH] ipv4: Map neigh lookup keys in __ipv4_neigh_lookup_noref()
Commit 6c16fa957e84 is an incorrect backport as we map the keys in
struct __ipv4_neigh_lookup(), but the correct place to add the
code is struct __ipv4_neigh_lookup_noref(), compared to upstream.
Fix it by moving the code, or fewer cases will be covered as
__ipv4_neigh_lookup_noref() will be called unconditionally from
__ipv4_neigh_lookup(), and it can be called from other places
such as ip_output.c.
Fixes: 6c16fa957e84 (ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY)
Signed-off-by: Wang Han <wanghan1995315(a)gmail.com>
---
include/net/arp.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/net/arp.h b/include/net/arp.h
index 174014585ade..5d8c7990582f 100644
--- a/include/net/arp.h
+++ b/include/net/arp.h
@@ -22,6 +22,9 @@ static inline struct neighbour *__ipv4_neigh_lookup_noref(struct net_device *dev
struct neighbour *n;
u32 hash_val;
+ if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+ key = INADDR_ANY;
+
hash_val = arp_hashfn(key, dev, nht->hash_rnd[0]) >> (32 - nht->hash_shift);
for (n = rcu_dereference_bh(nht->hash_buckets[hash_val]);
n != NULL;
@@ -37,9 +40,6 @@ static inline struct neighbour *__ipv4_neigh_lookup(struct net_device *dev, u32
{
struct neighbour *n;
- if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
- key = INADDR_ANY;
-
rcu_read_lock_bh();
n = __ipv4_neigh_lookup_noref(dev, key);
if (n && !atomic_inc_not_zero(&n->refcnt))
--
2.14.1
On Wed, Feb 07, 2018 at 03:19:07PM +0000, Harsh Shandilya wrote:
> On Tue 6 Feb, 2018, 12:09 AM Greg Kroah-Hartman, <gregkh(a)linuxfoundation.org>
> wrote:
>
> > This is the start of the stable review cycle for the 3.18.94 release.
> > There are 36 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Wed Feb 7 18:23:41 UTC 2018.
> > Anything received after that time might be too late.
> >
>
> Let's find out if I am too late.
>
> https://www.spinics.net/lists/stable/msg213160.html will have to be lined
> up for 3.18.94, or to .95 if I'm late since it's fixing a erroneous
> backport from 3.18.93 which went unnoticed until after release. The email
> bringing it up and the subsequent patch were probably lost in the usual
> stable noise. I have applied it to the CAF tree for the oneplus3 and
> verified that it does not break compilation or userspace for arm64.
It will be queued up for the next release, I missed it this time, sorry.
greg k-h
On Sat, Feb 03, 2018 at 07:25:54PM +0100, Markus Demleitner wrote:
> On Tue, Jan 30, 2018 at 07:32:04AM +0100, Greg KH wrote:
> > On Mon, Jan 29, 2018 at 07:21:09PM +0100, Markus Demleitner wrote:
> > > A while ago I opened bug #197863 in the kernel bugzilla:
> > >
> > > https://bugzilla.kernel.org/show_bug.cgi?id=197863
> > >
> > > Essentially, xhci_hcd on a resume from RAM takes several seconds on a
> > > Thinkpad X240 that is equipped with a Sierra LTE modem after an
> > > upgrade to 4.13:
> [...]
> > > Now, interestingly, there are quick resumes in 4.15.0, too, now and
> > > then. In that case, things look pretty much like on 4.12.2. Here's
> > > a 4.15.0 fast resume example:
> > >
> > > [ 873.127570] usb 1-4: device firmware changed
> > > [ 873.277351] usb 1-8: reset high-speed USB device number 4 using xhci_hcd
> > > [ 873.515141] restoring control 00000000-0000-0000-0000-000000000101/0/2
> > > [ 873.583238] OOM killer enabled.
> > > [ 873.583240] Restarting tasks ...
> > > [ 873.583339] usb 1-4: USB disconnect, device number 10
> > > [ 873.586356] done.
> > > [ 873.604420] PM: suspend exit
> > > [ 873.737283] usb 1-4: new high-speed USB device number 11 using xhci_hcd
> > > [ 880.867398] usb 1-4: device descriptor read/64, error -110
> > > [ 881.175558] usb 1-4: New USB device found, idVendor=1199, idProduct=a001
> > > [ 881.175563] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
> > > [ 881.175566] usb 1-4: Product: Sierra Wireless EM7345 4G LTE
> > > [ 881.175568] usb 1-4: Manufacturer: Sierra Wireless Inc.
> > > [ 881.175570] usb 1-4: SerialNumber: 013937002544516
> > >
> > > Any guess what might be behind this?
> >
> > Any chance you can run 'git bisect' to find the offending commit for
> > this issue?
>
> It's 662591461c4b9a1e3b9b159dbf37648a585ebaae. To my eyes, it even
> looks plausible that it's causing the problematic behaviour, but
> since I can't say I understand what I'd be doing if I dabbled with
> the change, I've refrained from guessing how to fix it.
>
> I'm happy to try patches, though.
Ok, thanks. I've added the authors of this patch to the email here,
perhaps they have an idea of what is going on?
greg k-h
commit 9862b43624 upstream
There has been a race condition identified with initialization of the dell-laptop
driver due to changes in 4.15. This commit has been shown to resolve the race
condition.
I'd like to see this brought back to 4.15 stable. It doesn't apply to anything earlier.
Thanks,
This reverts commit dbac5d07d13e330e6706813c9fde477140fb5d80.
commit dbac5d07d13e ("usb: musb: host: don't start next rx urb if current one failed")
along with commit b5801212229f ("usb: musb: host: clear rxcsr error bit if set")
try to solve the issue described in [1], but the latter alone is sufficient,
and the former causes the issue as in [2], so now revert it.
[1] https://marc.info/?l=linux-usb&m=146173995117456&w=2
[2] https://marc.info/?l=linux-usb&m=151689238420622&w=2
Signed-off-by: Bin Liu <b-liu(a)ti.com>
---
drivers/usb/musb/musb_host.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 394b4ac86161..45ed32c2cba9 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -391,13 +391,7 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
}
}
- /*
- * The pipe must be broken if current urb->status is set, so don't
- * start next urb.
- * TODO: to minimize the risk of regression, only check urb->status
- * for RX, until we have a test case to understand the behavior of TX.
- */
- if ((!status || !is_in) && qh && qh->is_ready) {
+ if (qh != NULL && qh->is_ready) {
musb_dbg(musb, "... next ep%d %cX urb %p",
hw_ep->epnum, is_in ? 'R' : 'T', next_urb(qh));
musb_start_urb(musb, is_in, qh);
--
1.9.1
The patch titled
Subject: mm, swap, frontswap: Fix THP swap if frontswap enabled
has been added to the -mm tree. Its filename is
mm-swap-frontswap-fix-thp-swap-if-frontswap-enabled.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-frontswap-fix-thp-swap-if-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-frontswap-fix-thp-swap-if-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Huang Ying <huang.ying.caritas(a)gmail.com>
Subject: mm, swap, frontswap: Fix THP swap if frontswap enabled
It was reported by Sergey Senozhatsky that if THP (Transparent Huge Page)
and frontswap (via zswap) are both enabled, when memory goes low so that
swap is triggered, segfault and memory corruption will occur in random
user space applications as follow,
kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
#0 0x00007fc08889ae0d _int_malloc (libc.so.6)
#1 0x00007fc08889c2f3 malloc (libc.so.6)
#2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
#3 0x0000560e6005e75c n/a (urxvt)
#4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
#5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
#6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
#7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
#8 0x0000560e6005cb55 ev_run (urxvt)
#9 0x0000560e6003b9b9 main (urxvt)
#10 0x00007fc08883af4a __libc_start_main (libc.so.6)
#11 0x0000560e6003f9da _start (urxvt)
After bisection, it was found the first bad commit is bd4c82c22c367e068
("mm, THP, swap: delay splitting THP after swapped out").
The root cause is as follows:
When the pages are written to swap device during swapping out in
swap_writepage(), zswap (fontswap) is tried to compress the pages instead
to improve the performance. But zswap (frontswap) will treat THP as
normal page, so only the head page is saved. After swapping in, tail
pages will not be restored to its original contents, so cause the memory
corruption in the applications.
This is fixed via splitting THP before writing the page to swap device if
frontswap is enabled. To deal with the situation where frontswap is
enabled at runtime, whether the page is THP is checked before using
frontswap during swapping out too.
Link: http://lkml.kernel.org/r/20180207070035.30302-1-ying.huang@intel.com
Fixes: bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped out")
Signed-off-by: "Huang, Ying" <ying.huang(a)intel.com>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: Dan Streetman <ddstreet(a)ieee.org>
Cc: Seth Jennings <sjenning(a)redhat.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: Shaohua Li <shli(a)kernel.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_io.c | 2 +-
mm/swapfile.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff -puN mm/page_io.c~mm-swap-frontswap-fix-thp-swap-if-frontswap-enabled mm/page_io.c
--- a/mm/page_io.c~mm-swap-frontswap-fix-thp-swap-if-frontswap-enabled
+++ a/mm/page_io.c
@@ -250,7 +250,7 @@ int swap_writepage(struct page *page, st
unlock_page(page);
goto out;
}
- if (frontswap_store(page) == 0) {
+ if (!PageTransHuge(page) && frontswap_store(page) == 0) {
set_page_writeback(page);
unlock_page(page);
end_page_writeback(page);
diff -puN mm/swapfile.c~mm-swap-frontswap-fix-thp-swap-if-frontswap-enabled mm/swapfile.c
--- a/mm/swapfile.c~mm-swap-frontswap-fix-thp-swap-if-frontswap-enabled
+++ a/mm/swapfile.c
@@ -934,6 +934,9 @@ int get_swap_pages(int n_goal, bool clus
/* Only single cluster request supported */
WARN_ON_ONCE(n_goal > 1 && cluster);
+ /* Frontswap doesn't support THP */
+ if (frontswap_enabled() && cluster)
+ goto noswap;
avail_pgs = atomic_long_read(&nr_swap_pages) / nr_pages;
if (avail_pgs <= 0)
_
Patches currently in -mm which might be from huang.ying.caritas(a)gmail.com are
mm-swap-frontswap-fix-thp-swap-if-frontswap-enabled.patch
fontswap-thp-fix.patch
This is a note to let you know that I've just added the patch titled
KEYS: encrypted: fix buffer overread in valid_master_desc()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
keys-encrypted-fix-buffer-overread-in-valid_master_desc.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 794b4bc292f5d31739d89c0202c54e7dc9bc3add Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Thu, 8 Jun 2017 14:48:18 +0100
Subject: KEYS: encrypted: fix buffer overread in valid_master_desc()
From: Eric Biggers <ebiggers(a)google.com>
commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream.
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: Mimi Zohar <zohar(a)linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: James Morris <james.l.morris(a)oracle.com>
Signed-off-by: Jin Qian <jinqian(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
security/keys/encrypted-keys/encrypted.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -141,23 +141,22 @@ static int valid_ecryptfs_desc(const cha
*/
static int valid_master_desc(const char *new_desc, const char *orig_desc)
{
- if (!memcmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_TRUSTED_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_TRUSTED_PREFIX_LEN))
- goto out;
- } else if (!memcmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_USER_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_USER_PREFIX_LEN))
- goto out;
- } else
- goto out;
+ int prefix_len;
+
+ if (!strncmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN))
+ prefix_len = KEY_TRUSTED_PREFIX_LEN;
+ else if (!strncmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN))
+ prefix_len = KEY_USER_PREFIX_LEN;
+ else
+ return -EINVAL;
+
+ if (!new_desc[prefix_len])
+ return -EINVAL;
+
+ if (orig_desc && strncmp(new_desc, orig_desc, prefix_len))
+ return -EINVAL;
+
return 0;
-out:
- return -EINVAL;
}
/*
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.9/keys-encrypted-fix-buffer-overread-in-valid_master_desc.patch
This is a note to let you know that I've just added the patch titled
media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5331aec1bf9c9da557668174e0a4bfcee39f1121 Mon Sep 17 00:00:00 2001
From: Jesse Chan <jc(a)linux.com>
Date: Mon, 20 Nov 2017 15:56:28 -0500
Subject: media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
From: Jesse Chan <jc(a)linux.com>
commit 5331aec1bf9c9da557668174e0a4bfcee39f1121 upstream.
This change resolves a new compile-time warning
when built as a loadable module:
WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o
see include/linux/module.h for more information
This adds the license as "GPL", which matches the header of the file.
MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
Signed-off-by: Jesse Chan <jc(a)linux.com>
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/media/platform/soc_camera/soc_scale_crop.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/media/platform/soc_camera/soc_scale_crop.c
+++ b/drivers/media/platform/soc_camera/soc_scale_crop.c
@@ -418,3 +418,7 @@ void soc_camera_calc_client_output(struc
mf->height = soc_camera_shift_scale(rect->height, shift, scale_v);
}
EXPORT_SYMBOL(soc_camera_calc_client_output);
+
+MODULE_DESCRIPTION("soc-camera scaling-cropping functions");
+MODULE_AUTHOR("Guennadi Liakhovetski <kernel(a)pengutronix.de>");
+MODULE_LICENSE("GPL");
Patches currently in stable-queue which might be from jc(a)linux.com are
queue-4.9/media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
queue-4.9/auxdisplay-img-ascii-lcd-add-missing-module_description-author-license.patch
queue-4.9/asoc-pcm512x-add-missing-module_description-author-license.patch
queue-4.9/pinctrl-pxa-pxa2xx-add-missing-module_description-author-license.patch
This is a note to let you know that I've just added the patch titled
b43: Add missing MODULE_FIRMWARE()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
b43-add-missing-module_firmware.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3c89a72ad80c64bdbd5ff851ee9c328a191f7e01 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Thu, 4 May 2017 11:27:52 +0200
Subject: b43: Add missing MODULE_FIRMWARE()
From: Takashi Iwai <tiwai(a)suse.de>
commit 3c89a72ad80c64bdbd5ff851ee9c328a191f7e01 upstream.
Some firmware entries were forgotten to be added via MODULE_FIRMWARE(), which
may result in the non-functional state when the driver is loaded in initrd.
Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1037344
Fixes: 15be8e89cdd9 ("b43: add more bcma cores")
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Kalle Valo <kvalo(a)codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/wireless/broadcom/b43/main.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/drivers/net/wireless/broadcom/b43/main.c
+++ b/drivers/net/wireless/broadcom/b43/main.c
@@ -71,8 +71,18 @@ MODULE_FIRMWARE("b43/ucode11.fw");
MODULE_FIRMWARE("b43/ucode13.fw");
MODULE_FIRMWARE("b43/ucode14.fw");
MODULE_FIRMWARE("b43/ucode15.fw");
+MODULE_FIRMWARE("b43/ucode16_lp.fw");
MODULE_FIRMWARE("b43/ucode16_mimo.fw");
+MODULE_FIRMWARE("b43/ucode24_lcn.fw");
+MODULE_FIRMWARE("b43/ucode25_lcn.fw");
+MODULE_FIRMWARE("b43/ucode25_mimo.fw");
+MODULE_FIRMWARE("b43/ucode26_mimo.fw");
+MODULE_FIRMWARE("b43/ucode29_mimo.fw");
+MODULE_FIRMWARE("b43/ucode33_lcn40.fw");
+MODULE_FIRMWARE("b43/ucode30_mimo.fw");
MODULE_FIRMWARE("b43/ucode5.fw");
+MODULE_FIRMWARE("b43/ucode40.fw");
+MODULE_FIRMWARE("b43/ucode42.fw");
MODULE_FIRMWARE("b43/ucode9.fw");
static int modparam_bad_frames_preempt;
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.9/b43-add-missing-module_firmware.patch
This is a note to let you know that I've just added the patch titled
media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5331aec1bf9c9da557668174e0a4bfcee39f1121 Mon Sep 17 00:00:00 2001
From: Jesse Chan <jc(a)linux.com>
Date: Mon, 20 Nov 2017 15:56:28 -0500
Subject: media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
From: Jesse Chan <jc(a)linux.com>
commit 5331aec1bf9c9da557668174e0a4bfcee39f1121 upstream.
This change resolves a new compile-time warning
when built as a loadable module:
WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o
see include/linux/module.h for more information
This adds the license as "GPL", which matches the header of the file.
MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
Signed-off-by: Jesse Chan <jc(a)linux.com>
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/media/platform/soc_camera/soc_scale_crop.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/media/platform/soc_camera/soc_scale_crop.c
+++ b/drivers/media/platform/soc_camera/soc_scale_crop.c
@@ -405,3 +405,7 @@ void soc_camera_calc_client_output(struc
mf->height = soc_camera_shift_scale(rect->height, shift, scale_v);
}
EXPORT_SYMBOL(soc_camera_calc_client_output);
+
+MODULE_DESCRIPTION("soc-camera scaling-cropping functions");
+MODULE_AUTHOR("Guennadi Liakhovetski <kernel(a)pengutronix.de>");
+MODULE_LICENSE("GPL");
Patches currently in stable-queue which might be from jc(a)linux.com are
queue-4.4/media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
queue-4.4/asoc-pcm512x-add-missing-module_description-author-license.patch
This is a note to let you know that I've just added the patch titled
KEYS: encrypted: fix buffer overread in valid_master_desc()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
keys-encrypted-fix-buffer-overread-in-valid_master_desc.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 794b4bc292f5d31739d89c0202c54e7dc9bc3add Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Thu, 8 Jun 2017 14:48:18 +0100
Subject: KEYS: encrypted: fix buffer overread in valid_master_desc()
From: Eric Biggers <ebiggers(a)google.com>
commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream.
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: Mimi Zohar <zohar(a)linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: James Morris <james.l.morris(a)oracle.com>
Signed-off-by: Jin Qian <jinqian(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
security/keys/encrypted-keys/encrypted.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -141,23 +141,22 @@ static int valid_ecryptfs_desc(const cha
*/
static int valid_master_desc(const char *new_desc, const char *orig_desc)
{
- if (!memcmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_TRUSTED_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_TRUSTED_PREFIX_LEN))
- goto out;
- } else if (!memcmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_USER_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_USER_PREFIX_LEN))
- goto out;
- } else
- goto out;
+ int prefix_len;
+
+ if (!strncmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN))
+ prefix_len = KEY_TRUSTED_PREFIX_LEN;
+ else if (!strncmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN))
+ prefix_len = KEY_USER_PREFIX_LEN;
+ else
+ return -EINVAL;
+
+ if (!new_desc[prefix_len])
+ return -EINVAL;
+
+ if (orig_desc && strncmp(new_desc, orig_desc, prefix_len))
+ return -EINVAL;
+
return 0;
-out:
- return -EINVAL;
}
/*
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.4/keys-encrypted-fix-buffer-overread-in-valid_master_desc.patch
This is a note to let you know that I've just added the patch titled
media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5331aec1bf9c9da557668174e0a4bfcee39f1121 Mon Sep 17 00:00:00 2001
From: Jesse Chan <jc(a)linux.com>
Date: Mon, 20 Nov 2017 15:56:28 -0500
Subject: media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
From: Jesse Chan <jc(a)linux.com>
commit 5331aec1bf9c9da557668174e0a4bfcee39f1121 upstream.
This change resolves a new compile-time warning
when built as a loadable module:
WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o
see include/linux/module.h for more information
This adds the license as "GPL", which matches the header of the file.
MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
Signed-off-by: Jesse Chan <jc(a)linux.com>
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/media/platform/soc_camera/soc_scale_crop.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/media/platform/soc_camera/soc_scale_crop.c
+++ b/drivers/media/platform/soc_camera/soc_scale_crop.c
@@ -420,3 +420,7 @@ void soc_camera_calc_client_output(struc
mf->height = soc_camera_shift_scale(rect->height, shift, scale_v);
}
EXPORT_SYMBOL(soc_camera_calc_client_output);
+
+MODULE_DESCRIPTION("soc-camera scaling-cropping functions");
+MODULE_AUTHOR("Guennadi Liakhovetski <kernel(a)pengutronix.de>");
+MODULE_LICENSE("GPL");
Patches currently in stable-queue which might be from jc(a)linux.com are
queue-4.15/media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
queue-4.15/media-mtk-vcodec-add-missing-module_license-description.patch
queue-4.15/media-tegra-cec-add-missing-module_description-author-license.patch
This is a note to let you know that I've just added the patch titled
media: tegra-cec: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
media-tegra-cec-add-missing-module_description-author-license.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 20772c1a6f0e03d4231e2b539863987fa0c94e89 Mon Sep 17 00:00:00 2001
From: Jesse Chan <jc(a)linux.com>
Date: Mon, 20 Nov 2017 15:56:52 -0500
Subject: media: tegra-cec: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
From: Jesse Chan <jc(a)linux.com>
commit 20772c1a6f0e03d4231e2b539863987fa0c94e89 upstream.
This change resolves a new compile-time warning
when built as a loadable module:
WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/tegra-cec/tegra_cec.o
see include/linux/module.h for more information
This adds the license as "GPL v2", which matches the header of the file.
MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
Signed-off-by: Jesse Chan <jc(a)linux.com>
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/media/platform/tegra-cec/tegra_cec.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/media/platform/tegra-cec/tegra_cec.c
+++ b/drivers/media/platform/tegra-cec/tegra_cec.c
@@ -493,3 +493,8 @@ static struct platform_driver tegra_cec_
};
module_platform_driver(tegra_cec_driver);
+
+MODULE_DESCRIPTION("Tegra HDMI CEC driver");
+MODULE_AUTHOR("NVIDIA CORPORATION");
+MODULE_AUTHOR("Cisco Systems, Inc. and/or its affiliates");
+MODULE_LICENSE("GPL v2");
Patches currently in stable-queue which might be from jc(a)linux.com are
queue-4.15/media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
queue-4.15/media-mtk-vcodec-add-missing-module_license-description.patch
queue-4.15/media-tegra-cec-add-missing-module_description-author-license.patch
This is a note to let you know that I've just added the patch titled
media: mtk-vcodec: add missing MODULE_LICENSE/DESCRIPTION
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
media-mtk-vcodec-add-missing-module_license-description.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ccbc1e3876d5476fc23429690a683294ef0f755b Mon Sep 17 00:00:00 2001
From: Jesse Chan <jc(a)linux.com>
Date: Mon, 20 Nov 2017 02:47:54 -0500
Subject: media: mtk-vcodec: add missing MODULE_LICENSE/DESCRIPTION
From: Jesse Chan <jc(a)linux.com>
commit ccbc1e3876d5476fc23429690a683294ef0f755b upstream.
This change resolves a new compile-time warning
when built as a loadable module:
WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/mtk-vcodec/mtk-vcodec-common.o
see include/linux/module.h for more information
This adds the license as "GPL v2", which matches the header of the file.
MODULE_DESCRIPTION is also added.
Signed-off-by: Jesse Chan <jc(a)linux.com>
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/media/platform/mtk-vcodec/mtk_vcodec_util.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_util.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_util.c
@@ -115,3 +115,6 @@ struct mtk_vcodec_ctx *mtk_vcodec_get_cu
return ctx;
}
EXPORT_SYMBOL(mtk_vcodec_get_curr_ctx);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Mediatek video codec driver");
Patches currently in stable-queue which might be from jc(a)linux.com are
queue-4.15/media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
queue-4.15/media-mtk-vcodec-add-missing-module_license-description.patch
queue-4.15/media-tegra-cec-add-missing-module_description-author-license.patch
This is a note to let you know that I've just added the patch titled
gpio: uniphier: fix mismatch between license text and MODULE_LICENSE
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gpio-uniphier-fix-mismatch-between-license-text-and-module_license.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 13f9d59cef91a027bb338fc79b564fafa7680cd8 Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Date: Thu, 23 Nov 2017 19:01:49 +0900
Subject: gpio: uniphier: fix mismatch between license text and MODULE_LICENSE
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
commit 13f9d59cef91a027bb338fc79b564fafa7680cd8 upstream.
The comment block of this file indicates GPL-2.0 "only", while the
MODULE_LICENSE is GPL-2.0 "or later", as include/linux/module.h
describes as follows:
"GPL" [GNU Public License v2 or later]
"GPL v2" [GNU Public License v2]
I am the author of this driver, and my intention is GPL-2.0 "only".
Fixes: dbe776c2ca54 ("gpio: uniphier: add UniPhier GPIO controller driver")
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpio/gpio-uniphier.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpio/gpio-uniphier.c
+++ b/drivers/gpio/gpio-uniphier.c
@@ -505,4 +505,4 @@ module_platform_driver(uniphier_gpio_dri
MODULE_AUTHOR("Masahiro Yamada <yamada.masahiro(a)socionext.com>");
MODULE_DESCRIPTION("UniPhier GPIO driver");
-MODULE_LICENSE("GPL");
+MODULE_LICENSE("GPL v2");
Patches currently in stable-queue which might be from yamada.masahiro(a)socionext.com are
queue-4.15/gpio-uniphier-fix-mismatch-between-license-text-and-module_license.patch
This is a note to let you know that I've just added the patch titled
media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5331aec1bf9c9da557668174e0a4bfcee39f1121 Mon Sep 17 00:00:00 2001
From: Jesse Chan <jc(a)linux.com>
Date: Mon, 20 Nov 2017 15:56:28 -0500
Subject: media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
From: Jesse Chan <jc(a)linux.com>
commit 5331aec1bf9c9da557668174e0a4bfcee39f1121 upstream.
This change resolves a new compile-time warning
when built as a loadable module:
WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o
see include/linux/module.h for more information
This adds the license as "GPL", which matches the header of the file.
MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
Signed-off-by: Jesse Chan <jc(a)linux.com>
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/media/platform/soc_camera/soc_scale_crop.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/media/platform/soc_camera/soc_scale_crop.c
+++ b/drivers/media/platform/soc_camera/soc_scale_crop.c
@@ -419,3 +419,7 @@ void soc_camera_calc_client_output(struc
mf->height = soc_camera_shift_scale(rect->height, shift, scale_v);
}
EXPORT_SYMBOL(soc_camera_calc_client_output);
+
+MODULE_DESCRIPTION("soc-camera scaling-cropping functions");
+MODULE_AUTHOR("Guennadi Liakhovetski <kernel(a)pengutronix.de>");
+MODULE_LICENSE("GPL");
Patches currently in stable-queue which might be from jc(a)linux.com are
queue-4.14/media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
queue-4.14/media-mtk-vcodec-add-missing-module_license-description.patch
This is a note to let you know that I've just added the patch titled
media: mtk-vcodec: add missing MODULE_LICENSE/DESCRIPTION
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
media-mtk-vcodec-add-missing-module_license-description.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ccbc1e3876d5476fc23429690a683294ef0f755b Mon Sep 17 00:00:00 2001
From: Jesse Chan <jc(a)linux.com>
Date: Mon, 20 Nov 2017 02:47:54 -0500
Subject: media: mtk-vcodec: add missing MODULE_LICENSE/DESCRIPTION
From: Jesse Chan <jc(a)linux.com>
commit ccbc1e3876d5476fc23429690a683294ef0f755b upstream.
This change resolves a new compile-time warning
when built as a loadable module:
WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/mtk-vcodec/mtk-vcodec-common.o
see include/linux/module.h for more information
This adds the license as "GPL v2", which matches the header of the file.
MODULE_DESCRIPTION is also added.
Signed-off-by: Jesse Chan <jc(a)linux.com>
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/media/platform/mtk-vcodec/mtk_vcodec_util.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_util.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_util.c
@@ -115,3 +115,6 @@ struct mtk_vcodec_ctx *mtk_vcodec_get_cu
return ctx;
}
EXPORT_SYMBOL(mtk_vcodec_get_curr_ctx);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Mediatek video codec driver");
Patches currently in stable-queue which might be from jc(a)linux.com are
queue-4.14/media-soc_camera-soc_scale_crop-add-missing-module_description-author-license.patch
queue-4.14/media-mtk-vcodec-add-missing-module_license-description.patch
This is a note to let you know that I've just added the patch titled
ARM: exynos_defconfig: Enable options to mount a rootfs via NFS
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-exynos_defconfig-enable-options-to-mount-a-rootfs-via-nfs.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 19f79ccf6d77409cd138bce8db206cdac7fd5ea7 Mon Sep 17 00:00:00 2001
From: Javier Martinez Canillas <javier.martinez(a)collabora.co.uk>
Date: Fri, 27 Mar 2015 01:50:16 +0900
Subject: ARM: exynos_defconfig: Enable options to mount a rootfs via NFS
From: Javier Martinez Canillas <javier.martinez(a)collabora.co.uk>
commit 19f79ccf6d77409cd138bce8db206cdac7fd5ea7 upstream.
This patch enables the options to mount a rootfs over NFS and also
support for automatic configuration of IP addresses during boot as
needed by NFS.
Signed-off-by: Javier Martinez Canillas <javier.martinez(a)collabora.co.uk>
Signed-off-by: Kukjin Kim <kgene(a)kernel.org>
Signed-off-by: Guillaume Tucker <guillaume.tucker(a)collabora.com>
Reviewed-by: Krzysztof Kozlowski <krzk(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/configs/exynos_defconfig | 6 ++++++
1 file changed, 6 insertions(+)
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -33,6 +33,10 @@ CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_NET_KEY=y
CONFIG_INET=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_IP_PNP_BOOTP=y
+CONFIG_IP_PNP_RARP=y
CONFIG_RFKILL_REGULATOR=y
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_DEVTMPFS=y
@@ -170,6 +174,8 @@ CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_CRAMFS=y
CONFIG_ROMFS_FS=y
+CONFIG_NFS_FS=y
+CONFIG_ROOT_NFS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
Patches currently in stable-queue which might be from javier.martinez(a)collabora.co.uk are
queue-3.18/arm-exynos_defconfig-enable-options-to-mount-a-rootfs-via-nfs.patch
This is a note to let you know that I've just added the patch titled
KEYS: encrypted: fix buffer overread in valid_master_desc()
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
keys-encrypted-fix-buffer-overread-in-valid_master_desc.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 794b4bc292f5d31739d89c0202c54e7dc9bc3add Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Thu, 8 Jun 2017 14:48:18 +0100
Subject: KEYS: encrypted: fix buffer overread in valid_master_desc()
From: Eric Biggers <ebiggers(a)google.com>
commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream.
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: Mimi Zohar <zohar(a)linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: James Morris <james.l.morris(a)oracle.com>
Signed-off-by: Jin Qian <jinqian(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
security/keys/encrypted-keys/encrypted.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -141,23 +141,22 @@ static int valid_ecryptfs_desc(const cha
*/
static int valid_master_desc(const char *new_desc, const char *orig_desc)
{
- if (!memcmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_TRUSTED_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_TRUSTED_PREFIX_LEN))
- goto out;
- } else if (!memcmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_USER_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_USER_PREFIX_LEN))
- goto out;
- } else
- goto out;
+ int prefix_len;
+
+ if (!strncmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN))
+ prefix_len = KEY_TRUSTED_PREFIX_LEN;
+ else if (!strncmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN))
+ prefix_len = KEY_USER_PREFIX_LEN;
+ else
+ return -EINVAL;
+
+ if (!new_desc[prefix_len])
+ return -EINVAL;
+
+ if (orig_desc && strncmp(new_desc, orig_desc, prefix_len))
+ return -EINVAL;
+
return 0;
-out:
- return -EINVAL;
}
/*
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-3.18/keys-encrypted-fix-buffer-overread-in-valid_master_desc.patch
This is a note to let you know that I've just added the patch titled
ARM: exynos_defconfig: Enable NFSv4 client
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-exynos_defconfig-enable-nfsv4-client.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1c1fb9b0c89a2506e556114c813a606bc1508d49 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <k.kozlowski(a)samsung.com>
Date: Wed, 25 Nov 2015 13:09:43 +0900
Subject: ARM: exynos_defconfig: Enable NFSv4 client
From: Krzysztof Kozlowski <k.kozlowski(a)samsung.com>
commit 1c1fb9b0c89a2506e556114c813a606bc1508d49 upstream.
NFS client is already enabled (NFS_FS) and by default it enables clients
for version 2 and 3. Enable explicitly the version 4 client to utilize
the newer protocol.
The NFS client is especially useful for testing kernel in automated
environments (network boot with network file system).
Signed-off-by: Krzysztof Kozlowski <k.kozlowski(a)samsung.com>
Reviewed-by: Javier Martinez Canillas <javier(a)osg.samsung.com>
Signed-off-by: Guillaume Tucker <guillaume.tucker(a)collabora.com>
Reviewed-by: Krzysztof Kozlowski <krzk(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/configs/exynos_defconfig | 1 +
1 file changed, 1 insertion(+)
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -175,6 +175,7 @@ CONFIG_TMPFS_POSIX_ACL=y
CONFIG_CRAMFS=y
CONFIG_ROMFS_FS=y
CONFIG_NFS_FS=y
+CONFIG_NFS_V4=y
CONFIG_ROOT_NFS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=y
Patches currently in stable-queue which might be from k.kozlowski(a)samsung.com are
queue-3.18/arm-exynos_defconfig-enable-nfsv4-client.patch
The patch titled
Subject: locking/qrwlock: include asm/byteorder.h as needed
has been removed from the -mm tree. Its filename was
locking-qrwlock-include-asm-byteorderh-as-needed.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: locking/qrwlock: include asm/byteorder.h as needed
Moving the qrwlock struct definition into a header file introduced a
subtle bug on all little-endian machines, where some files in some
configurations would see the fields in an incorrect order. This was found
by building with an LTO enabled compiler that warns every time we try to
link together files with incompatible data structures.
A second patch changes linux/kconfig.h to always define the symbols, but
this seems to be the root cause of most of the issues, so I'd suggest we
do both.
On a current linux-next kernel, I verified that this header is responsible
for all type mismatches as a result from the endianess confusion.
Link: http://lkml.kernel.org/r/20180202154104.1522809-1-arnd@arndb.de
Fixes: e0d02285f16e ("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Nicolas Pitre <nico(a)linaro.org>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Babu Moger <babu.moger(a)amd.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/asm-generic/qrwlock_types.h | 1 +
1 file changed, 1 insertion(+)
diff -puN include/asm-generic/qrwlock_types.h~locking-qrwlock-include-asm-byteorderh-as-needed include/asm-generic/qrwlock_types.h
--- a/include/asm-generic/qrwlock_types.h~locking-qrwlock-include-asm-byteorderh-as-needed
+++ a/include/asm-generic/qrwlock_types.h
@@ -3,6 +3,7 @@
#define __ASM_GENERIC_QRWLOCK_TYPES_H
#include <linux/types.h>
+#include <asm/byteorder.h>
#include <asm/spinlock_types.h>
/*
_
Patches currently in -mm which might be from arnd(a)arndb.de are
kbuild-always-define-endianess-in-kconfigh.patch
bugh-work-around-gcc-pr82365-in-bug.patch
The patch titled
Subject: pipe: fix off-by-one error when checking buffer limits
has been removed from the -mm tree. Its filename was
pipe-fix-off-by-one-error-when-checking-buffer-limits.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Eric Biggers <ebiggers(a)google.com>
Subject: pipe: fix off-by-one error when checking buffer limits
With pipe-user-pages-hard set to 'N', users were actually only allowed up
to 'N - 1' buffers; and likewise for pipe-user-pages-soft.
Fix this to allow up to 'N' buffers, as would be expected.
Link: http://lkml.kernel.org/r/20180111052902.14409-5-ebiggers3@gmail.com
Fixes: b0b91d18e2e9 ("pipe: fix limit checking in pipe_set_size()")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Acked-by: Willy Tarreau <w(a)1wt.eu>
Acked-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Joe Lawrence <joe.lawrence(a)redhat.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof(a)kernel.org>
Cc: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/pipe.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff -puN fs/pipe.c~pipe-fix-off-by-one-error-when-checking-buffer-limits fs/pipe.c
--- a/fs/pipe.c~pipe-fix-off-by-one-error-when-checking-buffer-limits
+++ a/fs/pipe.c
@@ -605,12 +605,12 @@ static unsigned long account_pipe_buffer
static bool too_many_pipe_buffers_soft(unsigned long user_bufs)
{
- return pipe_user_pages_soft && user_bufs >= pipe_user_pages_soft;
+ return pipe_user_pages_soft && user_bufs > pipe_user_pages_soft;
}
static bool too_many_pipe_buffers_hard(unsigned long user_bufs)
{
- return pipe_user_pages_hard && user_bufs >= pipe_user_pages_hard;
+ return pipe_user_pages_hard && user_bufs > pipe_user_pages_hard;
}
static bool is_unprivileged_user(void)
_
Patches currently in -mm which might be from ebiggers(a)google.com are
The patch titled
Subject: pipe: actually allow root to exceed the pipe buffer limits
has been removed from the -mm tree. Its filename was
pipe-actually-allow-root-to-exceed-the-pipe-buffer-limits.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Eric Biggers <ebiggers(a)google.com>
Subject: pipe: actually allow root to exceed the pipe buffer limits
pipe-user-pages-hard and pipe-user-pages-soft are only supposed to apply
to unprivileged users, as documented in both Documentation/sysctl/fs.txt
and the pipe(7) man page.
However, the capabilities are actually only checked when increasing a
pipe's size using F_SETPIPE_SZ, not when creating a new pipe. Therefore,
if pipe-user-pages-hard has been set, the root user can run into it and be
unable to create pipes. Similarly, if pipe-user-pages-soft has been set,
the root user can run into it and have their pipes limited to 1 page each.
Fix this by allowing the privileged override in both cases.
Link: http://lkml.kernel.org/r/20180111052902.14409-4-ebiggers3@gmail.com
Fixes: 759c01142a5d ("pipe: limit the per-user amount of pages allocated in pipes")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Joe Lawrence <joe.lawrence(a)redhat.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof(a)kernel.org>
Cc: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: Willy Tarreau <w(a)1wt.eu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/pipe.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff -puN fs/pipe.c~pipe-actually-allow-root-to-exceed-the-pipe-buffer-limits fs/pipe.c
--- a/fs/pipe.c~pipe-actually-allow-root-to-exceed-the-pipe-buffer-limits
+++ a/fs/pipe.c
@@ -613,6 +613,11 @@ static bool too_many_pipe_buffers_hard(u
return pipe_user_pages_hard && user_bufs >= pipe_user_pages_hard;
}
+static bool is_unprivileged_user(void)
+{
+ return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
+}
+
struct pipe_inode_info *alloc_pipe_info(void)
{
struct pipe_inode_info *pipe;
@@ -629,12 +634,12 @@ struct pipe_inode_info *alloc_pipe_info(
user_bufs = account_pipe_buffers(user, 0, pipe_bufs);
- if (too_many_pipe_buffers_soft(user_bufs)) {
+ if (too_many_pipe_buffers_soft(user_bufs) && is_unprivileged_user()) {
user_bufs = account_pipe_buffers(user, pipe_bufs, 1);
pipe_bufs = 1;
}
- if (too_many_pipe_buffers_hard(user_bufs))
+ if (too_many_pipe_buffers_hard(user_bufs) && is_unprivileged_user())
goto out_revert_acct;
pipe->bufs = kcalloc(pipe_bufs, sizeof(struct pipe_buffer),
@@ -1065,7 +1070,7 @@ static long pipe_set_size(struct pipe_in
if (nr_pages > pipe->buffers &&
(too_many_pipe_buffers_hard(user_bufs) ||
too_many_pipe_buffers_soft(user_bufs)) &&
- !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) {
+ is_unprivileged_user()) {
ret = -EPERM;
goto out_revert_acct;
}
_
Patches currently in -mm which might be from ebiggers(a)google.com are
The patch titled
Subject: kasan: rework Kconfig settings
has been removed from the -mm tree. Its filename was
kasan-rework-kconfig-settings.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: kasan: rework Kconfig settings
We get a lot of very large stack frames using gcc-7.0.1 with the default
-fsanitize-address-use-after-scope --param asan-stack=1 options, which can
easily cause an overflow of the kernel stack, e.g.
drivers/gpu/drm/i915/gvt/handlers.c:2434:1: warning: the frame size of 46176 bytes is larger than 3072 bytes
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size of 23632 bytes is larger than 3072 bytes
lib/atomic64_test.c:250:1: warning: the frame size of 11200 bytes is larger than 3072 bytes
drivers/gpu/drm/i915/gvt/handlers.c:2621:1: warning: the frame size of 9208 bytes is larger than 3072 bytes
drivers/media/dvb-frontends/stv090x.c:3431:1: warning: the frame size of 6816 bytes is larger than 3072 bytes
fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than 3072 bytes
To reduce this risk, -fsanitize-address-use-after-scope is now split out
into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack frames
that are smaller than 2 kilobytes most of the time on x86_64. An earlier
version of this patch also prevented combining KASAN_EXTRA with
KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.
All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y and
CONFIG_KASAN_EXTRA=n have been merged by maintainers now, so we can bring
back that default now. KASAN_EXTRA=y still causes lots of warnings but
now defaults to !COMPILE_TEST to disable it in allmodconfig, and it
remains disabled in all other defconfigs since it is a new option. I
arbitrarily raise the warning limit for KASAN_EXTRA to 3072 to reduce the
noise, but an allmodconfig kernel still has around 50 warnings on gcc-7.
I experimented a bit more with smaller stack frames and have another
follow-up series that reduces the warning limit for 64-bit architectures
to 1280 bytes (without CONFIG_KASAN).
With earlier versions of this patch series, I also had patches to address
the warnings we get with KASAN and/or KASAN_EXTRA, using a
"noinline_if_stackbloat" annotation. That annotation now got replaced
with a gcc-8 bugfix (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715) and a workaround for
older compilers, which means that KASAN_EXTRA is now just as bad as before
and will lead to an instant stack overflow in a few extreme cases.
This reverts parts of commit commit 3f181b4 ("lib/Kconfig.debug: disable
-Wframe-larger-than warnings with KASAN=y"). Two patches in linux-next
should be merged first to avoid introducing warnings in an allmodconfig
build:
3cd890dbe2a4 ("media: dvb-frontends: fix i2c access helpers for KASAN")
16c3ada89cff ("media: r820t: fix r820t_write_reg for KASAN")
Do we really need to backport this?
I think we do: without this patch, enabling KASAN will lead to
unavoidable kernel stack overflow in certain device drivers when built
with gcc-7 or higher on linux-4.10+ or any version that contains a
backport of commit c5caf21ab0cf8. Most people are probably still on
older compilers, but it will get worse over time as they upgrade their
distros.
The warnings we get on kernels older than this should all be for code
that uses dangerously large stack frames, though most of them do not
cause an actual stack overflow by themselves.The asan-stack option was
added in linux-4.0, and commit 3f181b4d8652 ("lib/Kconfig.debug:
disable -Wframe-larger-than warnings with KASAN=y") effectively turned
off the warning for allmodconfig kernels, so I would like to see this
fix backported to any kernels later than 4.0.
I have done dozens of fixes for individual functions with stack frames
larger than 2048 bytes with asan-stack, and I plan to make sure that
all those fixes make it into the stable kernels as well (most are
already there).
Part of the complication here is that asan-stack (from 4.0) was
originally assumed to always require much larger stacks, but that
turned out to be a combination of multiple gcc bugs that we have now
worked around and fixed, but sanitize-address-use-after-scope (from
v4.10) has a much higher inherent stack usage and also suffers from at
least three other problems that we have analyzed but not yet fixed
upstream, each of them makes the stack usage more severe than it should
be.
Link: http://lkml.kernel.org/r/20171221134744.2295529-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Acked-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/Kconfig.debug | 2 +-
lib/Kconfig.kasan | 11 +++++++++++
scripts/Makefile.kasan | 2 ++
3 files changed, 14 insertions(+), 1 deletion(-)
diff -puN lib/Kconfig.debug~kasan-rework-kconfig-settings lib/Kconfig.debug
--- a/lib/Kconfig.debug~kasan-rework-kconfig-settings
+++ a/lib/Kconfig.debug
@@ -217,7 +217,7 @@ config ENABLE_MUST_CHECK
config FRAME_WARN
int "Warn for stack frames larger than (needs gcc 4.4)"
range 0 8192
- default 0 if KASAN
+ default 3072 if KASAN_EXTRA
default 2048 if GCC_PLUGIN_LATENT_ENTROPY
default 1280 if (!64BIT && PARISC)
default 1024 if (!64BIT && !PARISC)
diff -puN lib/Kconfig.kasan~kasan-rework-kconfig-settings lib/Kconfig.kasan
--- a/lib/Kconfig.kasan~kasan-rework-kconfig-settings
+++ a/lib/Kconfig.kasan
@@ -20,6 +20,17 @@ config KASAN
Currently CONFIG_KASAN doesn't work with CONFIG_DEBUG_SLAB
(the resulting kernel does not boot).
+config KASAN_EXTRA
+ bool "KAsan: extra checks"
+ depends on KASAN && DEBUG_KERNEL && !COMPILE_TEST
+ help
+ This enables further checks in the kernel address sanitizer, for now
+ it only includes the address-use-after-scope check that can lead
+ to excessive kernel stack usage, frame size warnings and longer
+ compile time.
+ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 has more
+
+
choice
prompt "Instrumentation type"
depends on KASAN
diff -puN scripts/Makefile.kasan~kasan-rework-kconfig-settings scripts/Makefile.kasan
--- a/scripts/Makefile.kasan~kasan-rework-kconfig-settings
+++ a/scripts/Makefile.kasan
@@ -38,7 +38,9 @@ else
endif
+ifdef CONFIG_KASAN_EXTRA
CFLAGS_KASAN += $(call cc-option, -fsanitize-address-use-after-scope)
+endif
CFLAGS_KASAN_NOSANITIZE := -fno-builtin
_
Patches currently in -mm which might be from arnd(a)arndb.de are
kbuild-always-define-endianess-in-kconfigh.patch
bugh-work-around-gcc-pr82365-in-bug.patch
The patch titled
Subject: kernel/relay.c: revert "kernel/relay.c: fix potential memory leak"
has been removed from the -mm tree. Its filename was
revert-kernel-relayc-fix-potential-memory-leak.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Eric Biggers <ebiggers(a)google.com>
Subject: kernel/relay.c: revert "kernel/relay.c: fix potential memory leak"
This reverts ba62bafe942b159a6 ("kernel/relay.c: fix potential memory leak").
This commit introduced a double free bug, because 'chan' is already
freed by the line:
kref_put(&chan->kref, relay_destroy_channel);
This bug was found by syzkaller, using the BLKTRACESETUP ioctl.
Link: http://lkml.kernel.org/r/20180127004759.101823-1-ebiggers3@gmail.com
Fixes: ba62bafe942b ("kernel/relay.c: fix potential memory leak")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Zhouyi Zhou <yizhouzhou(a)ict.ac.cn>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: <stable(a)vger.kernel.org> [4.7+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/relay.c | 1 -
1 file changed, 1 deletion(-)
diff -puN kernel/relay.c~revert-kernel-relayc-fix-potential-memory-leak kernel/relay.c
--- a/kernel/relay.c~revert-kernel-relayc-fix-potential-memory-leak
+++ a/kernel/relay.c
@@ -611,7 +611,6 @@ free_bufs:
kref_put(&chan->kref, relay_destroy_channel);
mutex_unlock(&relay_channels_mutex);
- kfree(chan);
return NULL;
}
EXPORT_SYMBOL_GPL(relay_open);
_
Patches currently in -mm which might be from ebiggers(a)google.com are
The patch titled
Subject: kernel/async.c: revert "async: simplify lowest_in_progress()"
has been removed from the -mm tree. Its filename was
revert-async-simplify-lowest_in_progress.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Subject: kernel/async.c: revert "async: simplify lowest_in_progress()"
This reverts 92266d6ef60c2381 ("async: simplify lowest_in_progress()"),
which was simply wrong: In the case where domain is NULL, we now use the
wrong offsetof() in the list_first_entry macro, so we don't actually fetch
the ->cookie value, but rather the eight bytes located sizeof(struct
list_head) further into the struct async_entry.
On 64 bit, that's the data member, while on 32 bit, that's a u64 built
from func and data in some order.
I think the bug happens to be harmless in practice: It obviously only
affects callers which pass a NULL domain, and AFAICT the only such caller
is
async_synchronize_full() ->
async_synchronize_full_domain(NULL) ->
async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, NULL)
and the ASYNC_COOKIE_MAX means that in practice we end up waiting for the
async_global_pending list to be empty - but it would break if somebody
happened to pass (void*)-1 as the data element to async_schedule, and of
course also if somebody ever does a async_synchronize_cookie_domain(,
NULL) with a "finite" cookie value.
Maybe the "harmless in practice" means this isn't -stable material. But
I'm not completely confident my quick git grep'ing is enough, and there
might be affected code in one of the earlier kernels that has since been
removed, so I'll leave the decision to the stable guys.
Link: http://lkml.kernel.org/r/20171128104938.3921-1-linux@rasmusvillemoes.dk
Fixes: 92266d6ef60c "async: simplify lowest_in_progress()"
Signed-off-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Acked-by: Tejun Heo <tj(a)kernel.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Adam Wallis <awallis(a)codeaurora.org>
Cc: Lai Jiangshan <laijs(a)cn.fujitsu.com>
Cc: <stable(a)vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/async.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff -puN kernel/async.c~revert-async-simplify-lowest_in_progress kernel/async.c
--- a/kernel/async.c~revert-async-simplify-lowest_in_progress
+++ a/kernel/async.c
@@ -84,20 +84,24 @@ static atomic_t entry_count;
static async_cookie_t lowest_in_progress(struct async_domain *domain)
{
- struct list_head *pending;
+ struct async_entry *first = NULL;
async_cookie_t ret = ASYNC_COOKIE_MAX;
unsigned long flags;
spin_lock_irqsave(&async_lock, flags);
- if (domain)
- pending = &domain->pending;
- else
- pending = &async_global_pending;
+ if (domain) {
+ if (!list_empty(&domain->pending))
+ first = list_first_entry(&domain->pending,
+ struct async_entry, domain_list);
+ } else {
+ if (!list_empty(&async_global_pending))
+ first = list_first_entry(&async_global_pending,
+ struct async_entry, global_list);
+ }
- if (!list_empty(pending))
- ret = list_first_entry(pending, struct async_entry,
- domain_list)->cookie;
+ if (first)
+ ret = first->cookie;
spin_unlock_irqrestore(&async_lock, flags);
return ret;
_
Patches currently in -mm which might be from linux(a)rasmusvillemoes.dk are
ida-do-zeroing-in-ida_pre_get.patch
The patch titled
Subject: fs/proc/kcore.c: use probe_kernel_read() instead of memcpy()
has been removed from the -mm tree. Its filename was
fs-proc-kcorec-use-probe_kernel_read-instead-of-memcpy.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Subject: fs/proc/kcore.c: use probe_kernel_read() instead of memcpy()
df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data") added a
bounce buffer to avoid hardened usercopy checks. Copying to the bounce
buffer was implemented with a simple memcpy() assuming that it is always
valid to read from kernel memory iff the kern_addr_valid() check passed.
A simple, but pointless, test case like "dd if=/proc/kcore of=/dev/null"
now can easily crash the kernel, since the former execption handling on
invalid kernel addresses now doesn't work anymore.
Also adding a kern_addr_valid() implementation wouldn't help here. Most
architectures simply return 1 here, while a couple implemented a page
table walk to figure out if something is mapped at the address in
question.
With DEBUG_PAGEALLOC active mappings are established and removed all the
time, so that relying on the result of kern_addr_valid() before executing
the memcpy() also doesn't work.
Therefore simply use probe_kernel_read() to copy to the bounce buffer.
This also allows to simplify read_kcore().
At least on s390 this fixes the observed crashes and doesn't introduce
warnings that were removed with df04abfd181a ("fs/proc/kcore.c: Add bounce
buffer for ktext data"), even though the generic probe_kernel_read()
implementation uses uaccess functions.
While looking into this I'm also wondering if kern_addr_valid() could be
completely removed...(?)
Link: http://lkml.kernel.org/r/20171202132739.99971-1-heiko.carstens@de.ibm.com
Fixes: df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Al Viro <viro(a)ZenIV.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/kcore.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff -puN fs/proc/kcore.c~fs-proc-kcorec-use-probe_kernel_read-instead-of-memcpy fs/proc/kcore.c
--- a/fs/proc/kcore.c~fs-proc-kcorec-use-probe_kernel_read-instead-of-memcpy
+++ a/fs/proc/kcore.c
@@ -512,23 +512,15 @@ read_kcore(struct file *file, char __use
return -EFAULT;
} else {
if (kern_addr_valid(start)) {
- unsigned long n;
-
/*
* Using bounce buffer to bypass the
* hardened user copy kernel text checks.
*/
- memcpy(buf, (char *) start, tsz);
- n = copy_to_user(buffer, buf, tsz);
- /*
- * We cannot distinguish between fault on source
- * and fault on destination. When this happens
- * we clear too and hope it will trigger the
- * EFAULT again.
- */
- if (n) {
- if (clear_user(buffer + tsz - n,
- n))
+ if (probe_kernel_read(buf, (void *) start, tsz)) {
+ if (clear_user(buffer, tsz))
+ return -EFAULT;
+ } else {
+ if (copy_to_user(buffer, buf, tsz))
return -EFAULT;
}
} else {
_
Patches currently in -mm which might be from heiko.carstens(a)de.ibm.com are
The patch titled
Subject: kasan: don't emit builtin calls when sanitization is off
has been removed from the -mm tree. Its filename was
kasan-dont-emit-builtin-calls-when-sanitization-is-off.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Andrey Konovalov <andreyknvl(a)google.com>
Subject: kasan: don't emit builtin calls when sanitization is off
With KASAN enabled the kernel has two different memset() functions, one
with KASAN checks (memset) and one without (__memset). KASAN uses some
macro tricks to use the proper version where required. For example
memset() calls in mm/slub.c are without KASAN checks, since they operate
on poisoned slab object metadata.
The issue is that clang emits memset() calls even when there is no
memset() in the source code. They get linked with improper memset()
implementation and the kernel fails to boot due to a huge amount of KASAN
reports during early boot stages.
The solution is to add -fno-builtin flag for files with KASAN_SANITIZE :=
n marker.
Link: http://lkml.kernel.org/r/8ffecfffe04088c52c42b92739c2bd8a0bcb3f5e.151638459…
Signed-off-by: Andrey Konovalov <andreyknvl(a)google.com>
Acked-by: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Cc: Michal Marek <michal.lkml(a)markovi.net>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
Makefile | 3 ++-
scripts/Makefile.kasan | 3 +++
scripts/Makefile.lib | 2 +-
3 files changed, 6 insertions(+), 2 deletions(-)
diff -puN Makefile~kasan-dont-emit-builtin-calls-when-sanitization-is-off Makefile
--- a/Makefile~kasan-dont-emit-builtin-calls-when-sanitization-is-off
+++ a/Makefile
@@ -434,7 +434,8 @@ export MAKE LEX YACC AWK GENKSYMS INSTAL
export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS
-export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_KASAN CFLAGS_UBSAN
+export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE
+export CFLAGS_KASAN CFLAGS_KASAN_NOSANITIZE CFLAGS_UBSAN
export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
diff -puN scripts/Makefile.kasan~kasan-dont-emit-builtin-calls-when-sanitization-is-off scripts/Makefile.kasan
--- a/scripts/Makefile.kasan~kasan-dont-emit-builtin-calls-when-sanitization-is-off
+++ a/scripts/Makefile.kasan
@@ -31,4 +31,7 @@ else
endif
CFLAGS_KASAN += $(call cc-option, -fsanitize-address-use-after-scope)
+
+CFLAGS_KASAN_NOSANITIZE := -fno-builtin
+
endif
diff -puN scripts/Makefile.lib~kasan-dont-emit-builtin-calls-when-sanitization-is-off scripts/Makefile.lib
--- a/scripts/Makefile.lib~kasan-dont-emit-builtin-calls-when-sanitization-is-off
+++ a/scripts/Makefile.lib
@@ -121,7 +121,7 @@ endif
ifeq ($(CONFIG_KASAN),y)
_c_flags += $(if $(patsubst n%,, \
$(KASAN_SANITIZE_$(basetarget).o)$(KASAN_SANITIZE)y), \
- $(CFLAGS_KASAN))
+ $(CFLAGS_KASAN), $(CFLAGS_KASAN_NOSANITIZE))
endif
ifeq ($(CONFIG_UBSAN),y)
_
Patches currently in -mm which might be from andreyknvl(a)google.com are
This is a note to let you know that I've just added the patch titled
tcp: release sk_frag.page in tcp_disconnect
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-release-sk_frag.page-in-tcp_disconnect.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:38:15 PST 2018
From: Li RongQing <lirongqing(a)baidu.com>
Date: Fri, 26 Jan 2018 16:40:41 +0800
Subject: tcp: release sk_frag.page in tcp_disconnect
From: Li RongQing <lirongqing(a)baidu.com>
[ Upstream commit 9b42d55a66d388e4dd5550107df051a9637564fc ]
socket can be disconnected and gets transformed back to a listening
socket, if sk_frag.page is not released, which will be cloned into
a new socket by sk_clone_lock, but the reference count of this page
is increased, lead to a use after free or double free issue
Signed-off-by: Li RongQing <lirongqing(a)baidu.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2280,6 +2280,12 @@ int tcp_disconnect(struct sock *sk, int
WARN_ON(inet->inet_num && !icsk->icsk_bind_hash);
+ if (sk->sk_frag.page) {
+ put_page(sk->sk_frag.page);
+ sk->sk_frag.page = NULL;
+ sk->sk_frag.offset = 0;
+ }
+
sk->sk_error_report(sk);
return err;
}
Patches currently in stable-queue which might be from lirongqing(a)baidu.com are
queue-3.18/tcp-release-sk_frag.page-in-tcp_disconnect.patch
This is a note to let you know that I've just added the patch titled
r8169: fix RTL8168EP take too long to complete driver initialization.
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:38:15 PST 2018
From: Chunhao Lin <hau(a)realtek.com>
Date: Wed, 31 Jan 2018 01:32:36 +0800
Subject: r8169: fix RTL8168EP take too long to complete driver initialization.
From: Chunhao Lin <hau(a)realtek.com>
[ Upstream commit 086ca23d03c0d2f4088f472386778d293e15c5f6 ]
Driver check the wrong register bit in rtl_ocp_tx_cond() that keep driver
waiting until timeout.
Fix this by waiting for the right register bit.
Signed-off-by: Chunhao Lin <hau(a)realtek.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/realtek/r8169.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1375,7 +1375,7 @@ DECLARE_RTL_COND(rtl_ocp_tx_cond)
{
void __iomem *ioaddr = tp->mmio_addr;
- return RTL_R8(IBISR0) & 0x02;
+ return RTL_R8(IBISR0) & 0x20;
}
static void rtl8168dp_driver_start(struct rtl8169_private *tp)
@@ -1421,7 +1421,7 @@ static void rtl8168ep_driver_stop(struct
void __iomem *ioaddr = tp->mmio_addr;
RTL_W8(IBCR2, RTL_R8(IBCR2) & ~0x01);
- rtl_msleep_loop_wait_low(tp, &rtl_ocp_tx_cond, 50, 2000);
+ rtl_msleep_loop_wait_high(tp, &rtl_ocp_tx_cond, 50, 2000);
RTL_W8(IBISR0, RTL_R8(IBISR0) | 0x20);
RTL_W8(IBCR0, RTL_R8(IBCR0) & ~0x01);
ocp_write(tp, 0x01, 0x180, OOB_CMD_DRIVER_STOP);
Patches currently in stable-queue which might be from hau(a)realtek.com are
queue-3.18/r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
This is a note to let you know that I've just added the patch titled
net: igmp: add a missing rcu locking section
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-igmp-add-a-missing-rcu-locking-section.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:38:15 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 1 Feb 2018 10:26:57 -0800
Subject: net: igmp: add a missing rcu locking section
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit e7aadb27a5415e8125834b84a74477bfbee4eff5 ]
Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
Timer callbacks do not ensure this locking.
=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/igmp.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -384,7 +384,11 @@ static struct sk_buff *igmpv3_newpack(st
pip->frag_off = htons(IP_DF);
pip->ttl = 1;
pip->daddr = fl4.daddr;
+
+ rcu_read_lock();
pip->saddr = igmpv3_get_srcaddr(dev, &fl4);
+ rcu_read_unlock();
+
pip->protocol = IPPROTO_IGMP;
pip->tot_len = 0; /* filled in later */
ip_select_ident(skb, NULL);
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-3.18/tcp-release-sk_frag.page-in-tcp_disconnect.patch
queue-3.18/net-igmp-add-a-missing-rcu-locking-section.patch
This is a note to let you know that I've just added the patch titled
ip6mr: fix stale iterator
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ip6mr-fix-stale-iterator.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:38:15 PST 2018
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Date: Wed, 31 Jan 2018 16:29:30 +0200
Subject: ip6mr: fix stale iterator
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
[ Upstream commit 4adfa79fc254efb7b0eb3cd58f62c2c3f805f1ba ]
When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.
Here's syzbot's trace:
WARNING: bad unlock balance detected!
4.15.0-rc3+ #128 Not tainted
syzkaller971460/3195 is trying to release lock (mrt_lock) at:
[<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syzkaller971460/3195:
#0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
fs/seq_file.c:165
stack backtrace:
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
__lock_release kernel/locking/lockdep.c:3775 [inline]
lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
__raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
_raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
traverse+0x3bc/0xa00 fs/seq_file.c:135
seq_read+0x96a/0x13d0 fs/seq_file.c:189
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at lib/usercopy.c:25
in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
INFO: lockdep is turned off.
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
__might_sleep+0x95/0x190 kernel/sched/core.c:6013
__might_fault+0xab/0x1d0 mm/memory.c:4525
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
seq_read+0xcb4/0x13d0 fs/seq_file.c:279
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
lib/usercopy.c:26
Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669(a)syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6mr.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -498,6 +498,7 @@ static void *ipmr_mfc_seq_start(struct s
return ERR_PTR(-ENOENT);
it->mrt = mrt;
+ it->cache = NULL;
return *pos ? ipmr_mfc_seq_idx(net, seq->private, *pos - 1)
: SEQ_START_TOKEN;
}
Patches currently in stable-queue which might be from nikolay(a)cumulusnetworks.com are
queue-3.18/ip6mr-fix-stale-iterator.patch
This is a note to let you know that I've just added the patch titled
vhost_net: stop device during reset owner
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vhost_net-stop-device-during-reset-owner.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:22:35 PST 2018
From: Jason Wang <jasowang(a)redhat.com>
Date: Thu, 25 Jan 2018 22:03:52 +0800
Subject: vhost_net: stop device during reset owner
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 4cd879515d686849eec5f718aeac62a70b067d82 ]
We don't stop device before reset owner, this means we could try to
serve any virtqueue kick before reset dev->worker. This will result a
warn since the work was pending at llist during owner resetting. Fix
this by stopping device during owner reset.
Reported-by: syzbot+eb17c6162478cc50632c(a)syzkaller.appspotmail.com
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/vhost/net.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -981,6 +981,7 @@ static long vhost_net_reset_owner(struct
}
vhost_net_stop(n, &tx_sock, &rx_sock);
vhost_net_flush(n);
+ vhost_dev_stop(&n->dev);
vhost_dev_reset_owner(&n->dev, memory);
vhost_net_vq_reset(n);
done:
Patches currently in stable-queue which might be from jasowang(a)redhat.com are
queue-4.4/vhost_net-stop-device-during-reset-owner.patch
This is a note to let you know that I've just added the patch titled
vhost_net: stop device during reset owner
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vhost_net-stop-device-during-reset-owner.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:22:35 PST 2018
From: Jason Wang <jasowang(a)redhat.com>
Date: Thu, 25 Jan 2018 22:03:52 +0800
Subject: vhost_net: stop device during reset owner
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 4cd879515d686849eec5f718aeac62a70b067d82 ]
We don't stop device before reset owner, this means we could try to
serve any virtqueue kick before reset dev->worker. This will result a
warn since the work was pending at llist during owner resetting. Fix
this by stopping device during owner reset.
Reported-by: syzbot+eb17c6162478cc50632c(a)syzkaller.appspotmail.com
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/vhost/net.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1009,6 +1009,7 @@ static long vhost_net_reset_owner(struct
}
vhost_net_stop(n, &tx_sock, &rx_sock);
vhost_net_flush(n);
+ vhost_dev_stop(&n->dev);
vhost_dev_reset_owner(&n->dev, memory);
vhost_net_vq_reset(n);
done:
Patches currently in stable-queue which might be from jasowang(a)redhat.com are
queue-3.18/vhost_net-stop-device-during-reset-owner.patch
This is a note to let you know that I've just added the patch titled
r8169: fix RTL8168EP take too long to complete driver initialization.
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:22:35 PST 2018
From: Chunhao Lin <hau(a)realtek.com>
Date: Wed, 31 Jan 2018 01:32:36 +0800
Subject: r8169: fix RTL8168EP take too long to complete driver initialization.
From: Chunhao Lin <hau(a)realtek.com>
[ Upstream commit 086ca23d03c0d2f4088f472386778d293e15c5f6 ]
Driver check the wrong register bit in rtl_ocp_tx_cond() that keep driver
waiting until timeout.
Fix this by waiting for the right register bit.
Signed-off-by: Chunhao Lin <hau(a)realtek.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/realtek/r8169.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1387,7 +1387,7 @@ DECLARE_RTL_COND(rtl_ocp_tx_cond)
{
void __iomem *ioaddr = tp->mmio_addr;
- return RTL_R8(IBISR0) & 0x02;
+ return RTL_R8(IBISR0) & 0x20;
}
static void rtl8168ep_stop_cmac(struct rtl8169_private *tp)
@@ -1395,7 +1395,7 @@ static void rtl8168ep_stop_cmac(struct r
void __iomem *ioaddr = tp->mmio_addr;
RTL_W8(IBCR2, RTL_R8(IBCR2) & ~0x01);
- rtl_msleep_loop_wait_low(tp, &rtl_ocp_tx_cond, 50, 2000);
+ rtl_msleep_loop_wait_high(tp, &rtl_ocp_tx_cond, 50, 2000);
RTL_W8(IBISR0, RTL_R8(IBISR0) | 0x20);
RTL_W8(IBCR0, RTL_R8(IBCR0) & ~0x01);
}
Patches currently in stable-queue which might be from hau(a)realtek.com are
queue-4.4/r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
This is a note to let you know that I've just added the patch titled
tcp: release sk_frag.page in tcp_disconnect
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-release-sk_frag.page-in-tcp_disconnect.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:22:35 PST 2018
From: Li RongQing <lirongqing(a)baidu.com>
Date: Fri, 26 Jan 2018 16:40:41 +0800
Subject: tcp: release sk_frag.page in tcp_disconnect
From: Li RongQing <lirongqing(a)baidu.com>
[ Upstream commit 9b42d55a66d388e4dd5550107df051a9637564fc ]
socket can be disconnected and gets transformed back to a listening
socket, if sk_frag.page is not released, which will be cloned into
a new socket by sk_clone_lock, but the reference count of this page
is increased, lead to a use after free or double free issue
Signed-off-by: Li RongQing <lirongqing(a)baidu.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2276,6 +2276,12 @@ int tcp_disconnect(struct sock *sk, int
WARN_ON(inet->inet_num && !icsk->icsk_bind_hash);
+ if (sk->sk_frag.page) {
+ put_page(sk->sk_frag.page);
+ sk->sk_frag.page = NULL;
+ sk->sk_frag.offset = 0;
+ }
+
sk->sk_error_report(sk);
return err;
}
Patches currently in stable-queue which might be from lirongqing(a)baidu.com are
queue-4.4/tcp-release-sk_frag.page-in-tcp_disconnect.patch
This is a note to let you know that I've just added the patch titled
ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-fix-so_reuseport-udp-socket-with-implicit-sk_ipv6only.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:22:35 PST 2018
From: Martin KaFai Lau <kafai(a)fb.com>
Date: Wed, 24 Jan 2018 23:15:27 -0800
Subject: ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
From: Martin KaFai Lau <kafai(a)fb.com>
[ Upstream commit 7ece54a60ee2ba7a386308cae73c790bd580589c ]
If a sk_v6_rcv_saddr is !IPV6_ADDR_ANY and !IPV6_ADDR_MAPPED, it
implicitly implies it is an ipv6only socket. However, in inet6_bind(),
this addr_type checking and setting sk->sk_ipv6only to 1 are only done
after sk->sk_prot->get_port(sk, snum) has been completed successfully.
This inconsistency between sk_v6_rcv_saddr and sk_ipv6only confuses
the 'get_port()'.
In particular, when binding SO_REUSEPORT UDP sockets,
udp_reuseport_add_sock(sk,...) is called. udp_reuseport_add_sock()
checks "ipv6_only_sock(sk2) == ipv6_only_sock(sk)" before adding sk to
sk2->sk_reuseport_cb. In this case, ipv6_only_sock(sk2) could be
1 while ipv6_only_sock(sk) is still 0 here. The end result is,
reuseport_alloc(sk) is called instead of adding sk to the existing
sk2->sk_reuseport_cb.
It can be reproduced by binding two SO_REUSEPORT UDP sockets on an
IPv6 address (!ANY and !MAPPED). Only one of the socket will
receive packet.
The fix is to set the implicit sk_ipv6only before calling get_port().
The original sk_ipv6only has to be saved such that it can be restored
in case get_port() failed. The situation is similar to the
inet_reset_saddr(sk) after get_port() has failed.
Thanks to Calvin Owens <calvinowens(a)fb.com> who created an easy
reproduction which leads to a fix.
Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
Signed-off-by: Martin KaFai Lau <kafai(a)fb.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/af_inet6.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -261,6 +261,7 @@ int inet6_bind(struct socket *sock, stru
struct net *net = sock_net(sk);
__be32 v4addr = 0;
unsigned short snum;
+ bool saved_ipv6only;
int addr_type = 0;
int err = 0;
@@ -365,19 +366,21 @@ int inet6_bind(struct socket *sock, stru
if (!(addr_type & IPV6_ADDR_MULTICAST))
np->saddr = addr->sin6_addr;
+ saved_ipv6only = sk->sk_ipv6only;
+ if (addr_type != IPV6_ADDR_ANY && addr_type != IPV6_ADDR_MAPPED)
+ sk->sk_ipv6only = 1;
+
/* Make sure we are allowed to bind here. */
if ((snum || !inet->bind_address_no_port) &&
sk->sk_prot->get_port(sk, snum)) {
+ sk->sk_ipv6only = saved_ipv6only;
inet_reset_saddr(sk);
err = -EADDRINUSE;
goto out;
}
- if (addr_type != IPV6_ADDR_ANY) {
+ if (addr_type != IPV6_ADDR_ANY)
sk->sk_userlocks |= SOCK_BINDADDR_LOCK;
- if (addr_type != IPV6_ADDR_MAPPED)
- sk->sk_ipv6only = 1;
- }
if (snum)
sk->sk_userlocks |= SOCK_BINDPORT_LOCK;
inet->inet_sport = htons(inet->inet_num);
Patches currently in stable-queue which might be from kafai(a)fb.com are
queue-4.4/ipv6-fix-so_reuseport-udp-socket-with-implicit-sk_ipv6only.patch
This is a note to let you know that I've just added the patch titled
net: igmp: add a missing rcu locking section
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-igmp-add-a-missing-rcu-locking-section.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:22:35 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 1 Feb 2018 10:26:57 -0800
Subject: net: igmp: add a missing rcu locking section
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit e7aadb27a5415e8125834b84a74477bfbee4eff5 ]
Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
Timer callbacks do not ensure this locking.
=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/igmp.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -392,7 +392,11 @@ static struct sk_buff *igmpv3_newpack(st
pip->frag_off = htons(IP_DF);
pip->ttl = 1;
pip->daddr = fl4.daddr;
+
+ rcu_read_lock();
pip->saddr = igmpv3_get_srcaddr(dev, &fl4);
+ rcu_read_unlock();
+
pip->protocol = IPPROTO_IGMP;
pip->tot_len = 0; /* filled in later */
ip_select_ident(net, skb, NULL);
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.4/tcp-release-sk_frag.page-in-tcp_disconnect.patch
queue-4.4/net-igmp-add-a-missing-rcu-locking-section.patch
This is a note to let you know that I've just added the patch titled
ip6mr: fix stale iterator
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ip6mr-fix-stale-iterator.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:22:35 PST 2018
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Date: Wed, 31 Jan 2018 16:29:30 +0200
Subject: ip6mr: fix stale iterator
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
[ Upstream commit 4adfa79fc254efb7b0eb3cd58f62c2c3f805f1ba ]
When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.
Here's syzbot's trace:
WARNING: bad unlock balance detected!
4.15.0-rc3+ #128 Not tainted
syzkaller971460/3195 is trying to release lock (mrt_lock) at:
[<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syzkaller971460/3195:
#0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
fs/seq_file.c:165
stack backtrace:
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
__lock_release kernel/locking/lockdep.c:3775 [inline]
lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
__raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
_raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
traverse+0x3bc/0xa00 fs/seq_file.c:135
seq_read+0x96a/0x13d0 fs/seq_file.c:189
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at lib/usercopy.c:25
in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
INFO: lockdep is turned off.
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
__might_sleep+0x95/0x190 kernel/sched/core.c:6013
__might_fault+0xab/0x1d0 mm/memory.c:4525
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
seq_read+0xcb4/0x13d0 fs/seq_file.c:279
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
lib/usercopy.c:26
Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669(a)syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6mr.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -495,6 +495,7 @@ static void *ipmr_mfc_seq_start(struct s
return ERR_PTR(-ENOENT);
it->mrt = mrt;
+ it->cache = NULL;
return *pos ? ipmr_mfc_seq_idx(net, seq->private, *pos - 1)
: SEQ_START_TOKEN;
}
Patches currently in stable-queue which might be from nikolay(a)cumulusnetworks.com are
queue-4.4/ip6mr-fix-stale-iterator.patch
This is a note to let you know that I've just added the patch titled
vhost_net: stop device during reset owner
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vhost_net-stop-device-during-reset-owner.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Jason Wang <jasowang(a)redhat.com>
Date: Thu, 25 Jan 2018 22:03:52 +0800
Subject: vhost_net: stop device during reset owner
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 4cd879515d686849eec5f718aeac62a70b067d82 ]
We don't stop device before reset owner, this means we could try to
serve any virtqueue kick before reset dev->worker. This will result a
warn since the work was pending at llist during owner resetting. Fix
this by stopping device during owner reset.
Reported-by: syzbot+eb17c6162478cc50632c(a)syzkaller.appspotmail.com
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/vhost/net.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1212,6 +1212,7 @@ static long vhost_net_reset_owner(struct
}
vhost_net_stop(n, &tx_sock, &rx_sock);
vhost_net_flush(n);
+ vhost_dev_stop(&n->dev);
vhost_dev_reset_owner(&n->dev, umem);
vhost_net_vq_reset(n);
done:
Patches currently in stable-queue which might be from jasowang(a)redhat.com are
queue-4.14/vhost_net-stop-device-during-reset-owner.patch
This is a note to let you know that I've just added the patch titled
tcp_bbr: fix pacing_gain to always be unity when using lt_bw
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp_bbr-fix-pacing_gain-to-always-be-unity-when-using-lt_bw.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Neal Cardwell <ncardwell(a)google.com>
Date: Wed, 31 Jan 2018 15:43:05 -0500
Subject: tcp_bbr: fix pacing_gain to always be unity when using lt_bw
From: Neal Cardwell <ncardwell(a)google.com>
[ Upstream commit 3aff3b4b986e51bcf4ab249e5d48d39596e0df6a ]
This commit fixes the pacing_gain to remain at BBR_UNIT (1.0) when
using lt_bw and returning from the PROBE_RTT state to PROBE_BW.
Previously, when using lt_bw, upon exiting PROBE_RTT and entering
PROBE_BW the bbr_reset_probe_bw_mode() code could sometimes randomly
end up with a cycle_idx of 0 and hence have bbr_advance_cycle_phase()
set a pacing gain above 1.0. In such cases this would result in a
pacing rate that is 1.25x higher than intended, potentially resulting
in a high loss rate for a little while until we stop using the lt_bw a
bit later.
This commit is a stable candidate for kernels back as far as 4.9.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell(a)google.com>
Signed-off-by: Yuchung Cheng <ycheng(a)google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Reported-by: Beyers Cronje <bcronje(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_bbr.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -481,7 +481,8 @@ static void bbr_advance_cycle_phase(stru
bbr->cycle_idx = (bbr->cycle_idx + 1) & (CYCLE_LEN - 1);
bbr->cycle_mstamp = tp->delivered_mstamp;
- bbr->pacing_gain = bbr_pacing_gain[bbr->cycle_idx];
+ bbr->pacing_gain = bbr->lt_use_bw ? BBR_UNIT :
+ bbr_pacing_gain[bbr->cycle_idx];
}
/* Gain cycling: cycle pacing gain to converge to fair share of available bw. */
@@ -490,8 +491,7 @@ static void bbr_update_cycle_phase(struc
{
struct bbr *bbr = inet_csk_ca(sk);
- if ((bbr->mode == BBR_PROBE_BW) && !bbr->lt_use_bw &&
- bbr_is_next_cycle_phase(sk, rs))
+ if (bbr->mode == BBR_PROBE_BW && bbr_is_next_cycle_phase(sk, rs))
bbr_advance_cycle_phase(sk);
}
Patches currently in stable-queue which might be from ncardwell(a)google.com are
queue-4.14/tcp_bbr-fix-pacing_gain-to-always-be-unity-when-using-lt_bw.patch
This is a note to let you know that I've just added the patch titled
soreuseport: fix mem leak in reuseport_add_sock()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Fri, 2 Feb 2018 10:27:27 -0800
Subject: soreuseport: fix mem leak in reuseport_add_sock()
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 4db428a7c9ab07e08783e0fcdc4ca0f555da0567 ]
reuseport_add_sock() needs to deal with attaching a socket having
its own sk_reuseport_cb, after a prior
setsockopt(SO_ATTACH_REUSEPORT_?BPF)
Without this fix, not only a WARN_ONCE() was issued, but we were also
leaking memory.
Thanks to sysbot and Eric Biggers for providing us nice C repros.
------------[ cut here ]------------
socket already in reuseport group
WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119
reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x211/0x2d0 lib/bug.c:184
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079
Fixes: ef456144da8e ("soreuseport: define reuseport groups")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot+c0ea2226f77a42936bf7(a)syzkaller.appspotmail.com
Acked-by: Craig Gallek <kraig(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/sock_reuseport.c | 35 ++++++++++++++++++++---------------
1 file changed, 20 insertions(+), 15 deletions(-)
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -94,6 +94,16 @@ static struct sock_reuseport *reuseport_
return more_reuse;
}
+static void reuseport_free_rcu(struct rcu_head *head)
+{
+ struct sock_reuseport *reuse;
+
+ reuse = container_of(head, struct sock_reuseport, rcu);
+ if (reuse->prog)
+ bpf_prog_destroy(reuse->prog);
+ kfree(reuse);
+}
+
/**
* reuseport_add_sock - Add a socket to the reuseport group of another.
* @sk: New socket to add to the group.
@@ -102,7 +112,7 @@ static struct sock_reuseport *reuseport_
*/
int reuseport_add_sock(struct sock *sk, struct sock *sk2)
{
- struct sock_reuseport *reuse;
+ struct sock_reuseport *old_reuse, *reuse;
if (!rcu_access_pointer(sk2->sk_reuseport_cb)) {
int err = reuseport_alloc(sk2);
@@ -113,10 +123,13 @@ int reuseport_add_sock(struct sock *sk,
spin_lock_bh(&reuseport_lock);
reuse = rcu_dereference_protected(sk2->sk_reuseport_cb,
- lockdep_is_held(&reuseport_lock)),
- WARN_ONCE(rcu_dereference_protected(sk->sk_reuseport_cb,
- lockdep_is_held(&reuseport_lock)),
- "socket already in reuseport group");
+ lockdep_is_held(&reuseport_lock));
+ old_reuse = rcu_dereference_protected(sk->sk_reuseport_cb,
+ lockdep_is_held(&reuseport_lock));
+ if (old_reuse && old_reuse->num_socks != 1) {
+ spin_unlock_bh(&reuseport_lock);
+ return -EBUSY;
+ }
if (reuse->num_socks == reuse->max_socks) {
reuse = reuseport_grow(reuse);
@@ -134,19 +147,11 @@ int reuseport_add_sock(struct sock *sk,
spin_unlock_bh(&reuseport_lock);
+ if (old_reuse)
+ call_rcu(&old_reuse->rcu, reuseport_free_rcu);
return 0;
}
-static void reuseport_free_rcu(struct rcu_head *head)
-{
- struct sock_reuseport *reuse;
-
- reuse = container_of(head, struct sock_reuseport, rcu);
- if (reuse->prog)
- bpf_prog_destroy(reuse->prog);
- kfree(reuse);
-}
-
void reuseport_detach_sock(struct sock *sk)
{
struct sock_reuseport *reuse;
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.14/tcp-release-sk_frag.page-in-tcp_disconnect.patch
queue-4.14/net-igmp-add-a-missing-rcu-locking-section.patch
queue-4.14/soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
queue-4.14/revert-defer-call-to-mem_cgroup_sk_alloc.patch
This is a note to let you know that I've just added the patch titled
tcp: release sk_frag.page in tcp_disconnect
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-release-sk_frag.page-in-tcp_disconnect.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Li RongQing <lirongqing(a)baidu.com>
Date: Fri, 26 Jan 2018 16:40:41 +0800
Subject: tcp: release sk_frag.page in tcp_disconnect
From: Li RongQing <lirongqing(a)baidu.com>
[ Upstream commit 9b42d55a66d388e4dd5550107df051a9637564fc ]
socket can be disconnected and gets transformed back to a listening
socket, if sk_frag.page is not released, which will be cloned into
a new socket by sk_clone_lock, but the reference count of this page
is increased, lead to a use after free or double free issue
Signed-off-by: Li RongQing <lirongqing(a)baidu.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2379,6 +2379,12 @@ int tcp_disconnect(struct sock *sk, int
WARN_ON(inet->inet_num && !icsk->icsk_bind_hash);
+ if (sk->sk_frag.page) {
+ put_page(sk->sk_frag.page);
+ sk->sk_frag.page = NULL;
+ sk->sk_frag.offset = 0;
+ }
+
sk->sk_error_report(sk);
return err;
}
Patches currently in stable-queue which might be from lirongqing(a)baidu.com are
queue-4.14/tcp-release-sk_frag.page-in-tcp_disconnect.patch
This is a note to let you know that I've just added the patch titled
Revert "defer call to mem_cgroup_sk_alloc()"
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
revert-defer-call-to-mem_cgroup_sk_alloc.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Roman Gushchin <guro(a)fb.com>
Date: Fri, 2 Feb 2018 15:26:57 +0000
Subject: Revert "defer call to mem_cgroup_sk_alloc()"
From: Roman Gushchin <guro(a)fb.com>
[ Upstream commit edbe69ef2c90fc86998a74b08319a01c508bd497 ]
This patch effectively reverts commit 9f1c2674b328 ("net: memcontrol:
defer call to mem_cgroup_sk_alloc()").
Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks
memcg socket memory accounting, as packets received before memcg
pointer initialization are not accounted and are causing refcounting
underflow on socket release.
Actually the free-after-use problem was fixed by
commit c0576e397508 ("net: call cgroup_sk_alloc() earlier in
sk_clone_lock()") for the cgroup pointer.
So, let's revert it and call mem_cgroup_sk_alloc() just before
cgroup_sk_alloc(). This is safe, as we hold a reference to the socket
we're cloning, and it holds a reference to the memcg.
Also, let's drop BUG_ON(mem_cgroup_is_root()) check from
mem_cgroup_sk_alloc(). I see no reasons why bumping the root
memcg counter is a good reason to panic, and there are no realistic
ways to hit it.
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Tejun Heo <tj(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/memcontrol.c | 14 ++++++++++++++
net/core/sock.c | 5 +----
net/ipv4/inet_connection_sock.c | 1 -
3 files changed, 15 insertions(+), 5 deletions(-)
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5828,6 +5828,20 @@ void mem_cgroup_sk_alloc(struct sock *sk
if (!mem_cgroup_sockets_enabled)
return;
+ /*
+ * Socket cloning can throw us here with sk_memcg already
+ * filled. It won't however, necessarily happen from
+ * process context. So the test for root memcg given
+ * the current task's memcg won't help us in this case.
+ *
+ * Respecting the original socket's memcg is a better
+ * decision in this case.
+ */
+ if (sk->sk_memcg) {
+ css_get(&sk->sk_memcg->css);
+ return;
+ }
+
rcu_read_lock();
memcg = mem_cgroup_from_task(current);
if (memcg == root_mem_cgroup)
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1677,16 +1677,13 @@ struct sock *sk_clone_lock(const struct
newsk->sk_dst_pending_confirm = 0;
newsk->sk_wmem_queued = 0;
newsk->sk_forward_alloc = 0;
-
- /* sk->sk_memcg will be populated at accept() time */
- newsk->sk_memcg = NULL;
-
atomic_set(&newsk->sk_drops, 0);
newsk->sk_send_head = NULL;
newsk->sk_userlocks = sk->sk_userlocks & ~SOCK_BINDPORT_LOCK;
atomic_set(&newsk->sk_zckey, 0);
sock_reset_flag(newsk, SOCK_DONE);
+ mem_cgroup_sk_alloc(newsk);
cgroup_sk_alloc(&newsk->sk_cgrp_data);
rcu_read_lock();
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -475,7 +475,6 @@ struct sock *inet_csk_accept(struct sock
}
spin_unlock_bh(&queue->fastopenq.lock);
}
- mem_cgroup_sk_alloc(newsk);
out:
release_sock(sk);
if (req)
Patches currently in stable-queue which might be from guro(a)fb.com are
queue-4.14/revert-defer-call-to-mem_cgroup_sk_alloc.patch
This is a note to let you know that I've just added the patch titled
rocker: fix possible null pointer dereference in rocker_router_fib_event_work
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rocker-fix-possible-null-pointer-dereference-in-rocker_router_fib_event_work.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Jiri Pirko <jiri(a)mellanox.com>
Date: Thu, 1 Feb 2018 12:21:15 +0100
Subject: rocker: fix possible null pointer dereference in rocker_router_fib_event_work
From: Jiri Pirko <jiri(a)mellanox.com>
[ Upstream commit a83165f00f16c0e0ef5b7cec3cbd0d4788699265 ]
Currently, rocker user may experience following null pointer
derefence bug:
[ 3.062141] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[ 3.065163] IP: rocker_router_fib_event_work+0x36/0x110 [rocker]
The problem is uninitialized rocker->wops pointer that is initialized
only with the first initialized port. So move the port initialization
before registering the fib events.
Fixes: 936bd486564a ("rocker: use FIB notifications instead of switchdev calls")
Signed-off-by: Jiri Pirko <jiri(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/rocker/rocker_main.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -2902,6 +2902,12 @@ static int rocker_probe(struct pci_dev *
goto err_alloc_ordered_workqueue;
}
+ err = rocker_probe_ports(rocker);
+ if (err) {
+ dev_err(&pdev->dev, "failed to probe ports\n");
+ goto err_probe_ports;
+ }
+
/* Only FIBs pointing to our own netdevs are programmed into
* the device, so no need to pass a callback.
*/
@@ -2918,22 +2924,16 @@ static int rocker_probe(struct pci_dev *
rocker->hw.id = rocker_read64(rocker, SWITCH_ID);
- err = rocker_probe_ports(rocker);
- if (err) {
- dev_err(&pdev->dev, "failed to probe ports\n");
- goto err_probe_ports;
- }
-
dev_info(&pdev->dev, "Rocker switch with id %*phN\n",
(int)sizeof(rocker->hw.id), &rocker->hw.id);
return 0;
-err_probe_ports:
- unregister_switchdev_notifier(&rocker_switchdev_notifier);
err_register_switchdev_notifier:
unregister_fib_notifier(&rocker->fib_nb);
err_register_fib_notifier:
+ rocker_remove_ports(rocker);
+err_probe_ports:
destroy_workqueue(rocker->rocker_owq);
err_alloc_ordered_workqueue:
free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_EVENT), rocker);
@@ -2961,9 +2961,9 @@ static void rocker_remove(struct pci_dev
{
struct rocker *rocker = pci_get_drvdata(pdev);
- rocker_remove_ports(rocker);
unregister_switchdev_notifier(&rocker_switchdev_notifier);
unregister_fib_notifier(&rocker->fib_nb);
+ rocker_remove_ports(rocker);
rocker_write32(rocker, CONTROL, ROCKER_CONTROL_RESET);
destroy_workqueue(rocker->rocker_owq);
free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_EVENT), rocker);
Patches currently in stable-queue which might be from jiri(a)mellanox.com are
queue-4.14/rocker-fix-possible-null-pointer-dereference-in-rocker_router_fib_event_work.patch
This is a note to let you know that I've just added the patch titled
r8169: fix RTL8168EP take too long to complete driver initialization.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Chunhao Lin <hau(a)realtek.com>
Date: Wed, 31 Jan 2018 01:32:36 +0800
Subject: r8169: fix RTL8168EP take too long to complete driver initialization.
From: Chunhao Lin <hau(a)realtek.com>
[ Upstream commit 086ca23d03c0d2f4088f472386778d293e15c5f6 ]
Driver check the wrong register bit in rtl_ocp_tx_cond() that keep driver
waiting until timeout.
Fix this by waiting for the right register bit.
Signed-off-by: Chunhao Lin <hau(a)realtek.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/realtek/r8169.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1388,7 +1388,7 @@ DECLARE_RTL_COND(rtl_ocp_tx_cond)
{
void __iomem *ioaddr = tp->mmio_addr;
- return RTL_R8(IBISR0) & 0x02;
+ return RTL_R8(IBISR0) & 0x20;
}
static void rtl8168ep_stop_cmac(struct rtl8169_private *tp)
@@ -1396,7 +1396,7 @@ static void rtl8168ep_stop_cmac(struct r
void __iomem *ioaddr = tp->mmio_addr;
RTL_W8(IBCR2, RTL_R8(IBCR2) & ~0x01);
- rtl_msleep_loop_wait_low(tp, &rtl_ocp_tx_cond, 50, 2000);
+ rtl_msleep_loop_wait_high(tp, &rtl_ocp_tx_cond, 50, 2000);
RTL_W8(IBISR0, RTL_R8(IBISR0) | 0x20);
RTL_W8(IBCR0, RTL_R8(IBCR0) & ~0x01);
}
Patches currently in stable-queue which might be from hau(a)realtek.com are
queue-4.14/r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
This is a note to let you know that I've just added the patch titled
qmi_wwan: Add support for Quectel EP06
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
qmi_wwan-add-support-for-quectel-ep06.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Kristian Evensen <kristian.evensen(a)gmail.com>
Date: Tue, 30 Jan 2018 14:12:55 +0100
Subject: qmi_wwan: Add support for Quectel EP06
From: Kristian Evensen <kristian.evensen(a)gmail.com>
[ Upstream commit c0b91a56a2e57a5a370655b25d677ae0ebf8a2d0 ]
The Quectel EP06 is a Cat. 6 LTE modem. It uses the same interface as
the EC20/EC25 for QMI, and requires the same "set DTR"-quirk to work.
Signed-off-by: Kristian Evensen <kristian.evensen(a)gmail.com>
Acked-by: Bjørn Mork <bjorn(a)mork.no>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/qmi_wwan.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1243,6 +1243,7 @@ static const struct usb_device_id produc
{QMI_QUIRK_SET_DTR(0x2c7c, 0x0125, 4)}, /* Quectel EC25, EC20 R2.0 Mini PCIe */
{QMI_QUIRK_SET_DTR(0x2c7c, 0x0121, 4)}, /* Quectel EC21 Mini PCIe */
{QMI_FIXED_INTF(0x2c7c, 0x0296, 4)}, /* Quectel BG96 */
+ {QMI_QUIRK_SET_DTR(0x2c7c, 0x0306, 4)}, /* Quectel EP06 Mini PCIe */
/* 4. Gobi 1000 devices */
{QMI_GOBI1K_DEVICE(0x05c6, 0x9212)}, /* Acer Gobi Modem Device */
Patches currently in stable-queue which might be from kristian.evensen(a)gmail.com are
queue-4.14/qmi_wwan-add-support-for-quectel-ep06.patch
This is a note to let you know that I've just added the patch titled
net: igmp: add a missing rcu locking section
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-igmp-add-a-missing-rcu-locking-section.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 1 Feb 2018 10:26:57 -0800
Subject: net: igmp: add a missing rcu locking section
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit e7aadb27a5415e8125834b84a74477bfbee4eff5 ]
Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
Timer callbacks do not ensure this locking.
=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/igmp.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -386,7 +386,11 @@ static struct sk_buff *igmpv3_newpack(st
pip->frag_off = htons(IP_DF);
pip->ttl = 1;
pip->daddr = fl4.daddr;
+
+ rcu_read_lock();
pip->saddr = igmpv3_get_srcaddr(dev, &fl4);
+ rcu_read_unlock();
+
pip->protocol = IPPROTO_IGMP;
pip->tot_len = 0; /* filled in later */
ip_select_ident(net, skb, NULL);
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.14/tcp-release-sk_frag.page-in-tcp_disconnect.patch
queue-4.14/net-igmp-add-a-missing-rcu-locking-section.patch
queue-4.14/soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
queue-4.14/revert-defer-call-to-mem_cgroup_sk_alloc.patch
This is a note to let you know that I've just added the patch titled
ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-fix-so_reuseport-udp-socket-with-implicit-sk_ipv6only.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Martin KaFai Lau <kafai(a)fb.com>
Date: Wed, 24 Jan 2018 23:15:27 -0800
Subject: ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
From: Martin KaFai Lau <kafai(a)fb.com>
[ Upstream commit 7ece54a60ee2ba7a386308cae73c790bd580589c ]
If a sk_v6_rcv_saddr is !IPV6_ADDR_ANY and !IPV6_ADDR_MAPPED, it
implicitly implies it is an ipv6only socket. However, in inet6_bind(),
this addr_type checking and setting sk->sk_ipv6only to 1 are only done
after sk->sk_prot->get_port(sk, snum) has been completed successfully.
This inconsistency between sk_v6_rcv_saddr and sk_ipv6only confuses
the 'get_port()'.
In particular, when binding SO_REUSEPORT UDP sockets,
udp_reuseport_add_sock(sk,...) is called. udp_reuseport_add_sock()
checks "ipv6_only_sock(sk2) == ipv6_only_sock(sk)" before adding sk to
sk2->sk_reuseport_cb. In this case, ipv6_only_sock(sk2) could be
1 while ipv6_only_sock(sk) is still 0 here. The end result is,
reuseport_alloc(sk) is called instead of adding sk to the existing
sk2->sk_reuseport_cb.
It can be reproduced by binding two SO_REUSEPORT UDP sockets on an
IPv6 address (!ANY and !MAPPED). Only one of the socket will
receive packet.
The fix is to set the implicit sk_ipv6only before calling get_port().
The original sk_ipv6only has to be saved such that it can be restored
in case get_port() failed. The situation is similar to the
inet_reset_saddr(sk) after get_port() has failed.
Thanks to Calvin Owens <calvinowens(a)fb.com> who created an easy
reproduction which leads to a fix.
Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
Signed-off-by: Martin KaFai Lau <kafai(a)fb.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/af_inet6.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -284,6 +284,7 @@ int inet6_bind(struct socket *sock, stru
struct net *net = sock_net(sk);
__be32 v4addr = 0;
unsigned short snum;
+ bool saved_ipv6only;
int addr_type = 0;
int err = 0;
@@ -389,19 +390,21 @@ int inet6_bind(struct socket *sock, stru
if (!(addr_type & IPV6_ADDR_MULTICAST))
np->saddr = addr->sin6_addr;
+ saved_ipv6only = sk->sk_ipv6only;
+ if (addr_type != IPV6_ADDR_ANY && addr_type != IPV6_ADDR_MAPPED)
+ sk->sk_ipv6only = 1;
+
/* Make sure we are allowed to bind here. */
if ((snum || !inet->bind_address_no_port) &&
sk->sk_prot->get_port(sk, snum)) {
+ sk->sk_ipv6only = saved_ipv6only;
inet_reset_saddr(sk);
err = -EADDRINUSE;
goto out;
}
- if (addr_type != IPV6_ADDR_ANY) {
+ if (addr_type != IPV6_ADDR_ANY)
sk->sk_userlocks |= SOCK_BINDADDR_LOCK;
- if (addr_type != IPV6_ADDR_MAPPED)
- sk->sk_ipv6only = 1;
- }
if (snum)
sk->sk_userlocks |= SOCK_BINDPORT_LOCK;
inet->inet_sport = htons(inet->inet_num);
Patches currently in stable-queue which might be from kafai(a)fb.com are
queue-4.14/ipv6-fix-so_reuseport-udp-socket-with-implicit-sk_ipv6only.patch
This is a note to let you know that I've just added the patch titled
ip6mr: fix stale iterator
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ip6mr-fix-stale-iterator.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:20 PST 2018
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Date: Wed, 31 Jan 2018 16:29:30 +0200
Subject: ip6mr: fix stale iterator
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
[ Upstream commit 4adfa79fc254efb7b0eb3cd58f62c2c3f805f1ba ]
When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.
Here's syzbot's trace:
WARNING: bad unlock balance detected!
4.15.0-rc3+ #128 Not tainted
syzkaller971460/3195 is trying to release lock (mrt_lock) at:
[<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syzkaller971460/3195:
#0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
fs/seq_file.c:165
stack backtrace:
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
__lock_release kernel/locking/lockdep.c:3775 [inline]
lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
__raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
_raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
traverse+0x3bc/0xa00 fs/seq_file.c:135
seq_read+0x96a/0x13d0 fs/seq_file.c:189
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at lib/usercopy.c:25
in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
INFO: lockdep is turned off.
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
__might_sleep+0x95/0x190 kernel/sched/core.c:6013
__might_fault+0xab/0x1d0 mm/memory.c:4525
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
seq_read+0xcb4/0x13d0 fs/seq_file.c:279
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
lib/usercopy.c:26
Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669(a)syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6mr.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -496,6 +496,7 @@ static void *ipmr_mfc_seq_start(struct s
return ERR_PTR(-ENOENT);
it->mrt = mrt;
+ it->cache = NULL;
return *pos ? ipmr_mfc_seq_idx(net, seq->private, *pos - 1)
: SEQ_START_TOKEN;
}
Patches currently in stable-queue which might be from nikolay(a)cumulusnetworks.com are
queue-4.14/ip6mr-fix-stale-iterator.patch
This is a note to let you know that I've just added the patch titled
tcp_bbr: fix pacing_gain to always be unity when using lt_bw
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp_bbr-fix-pacing_gain-to-always-be-unity-when-using-lt_bw.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Neal Cardwell <ncardwell(a)google.com>
Date: Wed, 31 Jan 2018 15:43:05 -0500
Subject: tcp_bbr: fix pacing_gain to always be unity when using lt_bw
From: Neal Cardwell <ncardwell(a)google.com>
[ Upstream commit 3aff3b4b986e51bcf4ab249e5d48d39596e0df6a ]
This commit fixes the pacing_gain to remain at BBR_UNIT (1.0) when
using lt_bw and returning from the PROBE_RTT state to PROBE_BW.
Previously, when using lt_bw, upon exiting PROBE_RTT and entering
PROBE_BW the bbr_reset_probe_bw_mode() code could sometimes randomly
end up with a cycle_idx of 0 and hence have bbr_advance_cycle_phase()
set a pacing gain above 1.0. In such cases this would result in a
pacing rate that is 1.25x higher than intended, potentially resulting
in a high loss rate for a little while until we stop using the lt_bw a
bit later.
This commit is a stable candidate for kernels back as far as 4.9.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell(a)google.com>
Signed-off-by: Yuchung Cheng <ycheng(a)google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Reported-by: Beyers Cronje <bcronje(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_bbr.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -481,7 +481,8 @@ static void bbr_advance_cycle_phase(stru
bbr->cycle_idx = (bbr->cycle_idx + 1) & (CYCLE_LEN - 1);
bbr->cycle_mstamp = tp->delivered_mstamp;
- bbr->pacing_gain = bbr_pacing_gain[bbr->cycle_idx];
+ bbr->pacing_gain = bbr->lt_use_bw ? BBR_UNIT :
+ bbr_pacing_gain[bbr->cycle_idx];
}
/* Gain cycling: cycle pacing gain to converge to fair share of available bw. */
@@ -490,8 +491,7 @@ static void bbr_update_cycle_phase(struc
{
struct bbr *bbr = inet_csk_ca(sk);
- if ((bbr->mode == BBR_PROBE_BW) && !bbr->lt_use_bw &&
- bbr_is_next_cycle_phase(sk, rs))
+ if (bbr->mode == BBR_PROBE_BW && bbr_is_next_cycle_phase(sk, rs))
bbr_advance_cycle_phase(sk);
}
Patches currently in stable-queue which might be from ncardwell(a)google.com are
queue-4.15/tcp_bbr-fix-pacing_gain-to-always-be-unity-when-using-lt_bw.patch
This is a note to let you know that I've just added the patch titled
vhost_net: stop device during reset owner
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vhost_net-stop-device-during-reset-owner.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Jason Wang <jasowang(a)redhat.com>
Date: Thu, 25 Jan 2018 22:03:52 +0800
Subject: vhost_net: stop device during reset owner
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 4cd879515d686849eec5f718aeac62a70b067d82 ]
We don't stop device before reset owner, this means we could try to
serve any virtqueue kick before reset dev->worker. This will result a
warn since the work was pending at llist during owner resetting. Fix
this by stopping device during owner reset.
Reported-by: syzbot+eb17c6162478cc50632c(a)syzkaller.appspotmail.com
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/vhost/net.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1208,6 +1208,7 @@ static long vhost_net_reset_owner(struct
}
vhost_net_stop(n, &tx_sock, &rx_sock);
vhost_net_flush(n);
+ vhost_dev_stop(&n->dev);
vhost_dev_reset_owner(&n->dev, umem);
vhost_net_vq_reset(n);
done:
Patches currently in stable-queue which might be from jasowang(a)redhat.com are
queue-4.15/vhost_net-stop-device-during-reset-owner.patch
This is a note to let you know that I've just added the patch titled
tcp: release sk_frag.page in tcp_disconnect
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-release-sk_frag.page-in-tcp_disconnect.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Li RongQing <lirongqing(a)baidu.com>
Date: Fri, 26 Jan 2018 16:40:41 +0800
Subject: tcp: release sk_frag.page in tcp_disconnect
From: Li RongQing <lirongqing(a)baidu.com>
[ Upstream commit 9b42d55a66d388e4dd5550107df051a9637564fc ]
socket can be disconnected and gets transformed back to a listening
socket, if sk_frag.page is not released, which will be cloned into
a new socket by sk_clone_lock, but the reference count of this page
is increased, lead to a use after free or double free issue
Signed-off-by: Li RongQing <lirongqing(a)baidu.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2434,6 +2434,12 @@ int tcp_disconnect(struct sock *sk, int
WARN_ON(inet->inet_num && !icsk->icsk_bind_hash);
+ if (sk->sk_frag.page) {
+ put_page(sk->sk_frag.page);
+ sk->sk_frag.page = NULL;
+ sk->sk_frag.offset = 0;
+ }
+
sk->sk_error_report(sk);
return err;
}
Patches currently in stable-queue which might be from lirongqing(a)baidu.com are
queue-4.15/tcp-release-sk_frag.page-in-tcp_disconnect.patch
This is a note to let you know that I've just added the patch titled
rocker: fix possible null pointer dereference in rocker_router_fib_event_work
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rocker-fix-possible-null-pointer-dereference-in-rocker_router_fib_event_work.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Jiri Pirko <jiri(a)mellanox.com>
Date: Thu, 1 Feb 2018 12:21:15 +0100
Subject: rocker: fix possible null pointer dereference in rocker_router_fib_event_work
From: Jiri Pirko <jiri(a)mellanox.com>
[ Upstream commit a83165f00f16c0e0ef5b7cec3cbd0d4788699265 ]
Currently, rocker user may experience following null pointer
derefence bug:
[ 3.062141] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[ 3.065163] IP: rocker_router_fib_event_work+0x36/0x110 [rocker]
The problem is uninitialized rocker->wops pointer that is initialized
only with the first initialized port. So move the port initialization
before registering the fib events.
Fixes: 936bd486564a ("rocker: use FIB notifications instead of switchdev calls")
Signed-off-by: Jiri Pirko <jiri(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/rocker/rocker_main.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -2902,6 +2902,12 @@ static int rocker_probe(struct pci_dev *
goto err_alloc_ordered_workqueue;
}
+ err = rocker_probe_ports(rocker);
+ if (err) {
+ dev_err(&pdev->dev, "failed to probe ports\n");
+ goto err_probe_ports;
+ }
+
/* Only FIBs pointing to our own netdevs are programmed into
* the device, so no need to pass a callback.
*/
@@ -2918,22 +2924,16 @@ static int rocker_probe(struct pci_dev *
rocker->hw.id = rocker_read64(rocker, SWITCH_ID);
- err = rocker_probe_ports(rocker);
- if (err) {
- dev_err(&pdev->dev, "failed to probe ports\n");
- goto err_probe_ports;
- }
-
dev_info(&pdev->dev, "Rocker switch with id %*phN\n",
(int)sizeof(rocker->hw.id), &rocker->hw.id);
return 0;
-err_probe_ports:
- unregister_switchdev_notifier(&rocker_switchdev_notifier);
err_register_switchdev_notifier:
unregister_fib_notifier(&rocker->fib_nb);
err_register_fib_notifier:
+ rocker_remove_ports(rocker);
+err_probe_ports:
destroy_workqueue(rocker->rocker_owq);
err_alloc_ordered_workqueue:
free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_EVENT), rocker);
@@ -2961,9 +2961,9 @@ static void rocker_remove(struct pci_dev
{
struct rocker *rocker = pci_get_drvdata(pdev);
- rocker_remove_ports(rocker);
unregister_switchdev_notifier(&rocker_switchdev_notifier);
unregister_fib_notifier(&rocker->fib_nb);
+ rocker_remove_ports(rocker);
rocker_write32(rocker, CONTROL, ROCKER_CONTROL_RESET);
destroy_workqueue(rocker->rocker_owq);
free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_EVENT), rocker);
Patches currently in stable-queue which might be from jiri(a)mellanox.com are
queue-4.15/net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
queue-4.15/rocker-fix-possible-null-pointer-dereference-in-rocker_router_fib_event_work.patch
queue-4.15/net-sched-fix-use-after-free-in-tcf_block_put_ext.patch
This is a note to let you know that I've just added the patch titled
soreuseport: fix mem leak in reuseport_add_sock()
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Fri, 2 Feb 2018 10:27:27 -0800
Subject: soreuseport: fix mem leak in reuseport_add_sock()
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 4db428a7c9ab07e08783e0fcdc4ca0f555da0567 ]
reuseport_add_sock() needs to deal with attaching a socket having
its own sk_reuseport_cb, after a prior
setsockopt(SO_ATTACH_REUSEPORT_?BPF)
Without this fix, not only a WARN_ONCE() was issued, but we were also
leaking memory.
Thanks to sysbot and Eric Biggers for providing us nice C repros.
------------[ cut here ]------------
socket already in reuseport group
WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119
reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x211/0x2d0 lib/bug.c:184
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079
Fixes: ef456144da8e ("soreuseport: define reuseport groups")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot+c0ea2226f77a42936bf7(a)syzkaller.appspotmail.com
Acked-by: Craig Gallek <kraig(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/sock_reuseport.c | 35 ++++++++++++++++++++---------------
1 file changed, 20 insertions(+), 15 deletions(-)
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -94,6 +94,16 @@ static struct sock_reuseport *reuseport_
return more_reuse;
}
+static void reuseport_free_rcu(struct rcu_head *head)
+{
+ struct sock_reuseport *reuse;
+
+ reuse = container_of(head, struct sock_reuseport, rcu);
+ if (reuse->prog)
+ bpf_prog_destroy(reuse->prog);
+ kfree(reuse);
+}
+
/**
* reuseport_add_sock - Add a socket to the reuseport group of another.
* @sk: New socket to add to the group.
@@ -102,7 +112,7 @@ static struct sock_reuseport *reuseport_
*/
int reuseport_add_sock(struct sock *sk, struct sock *sk2)
{
- struct sock_reuseport *reuse;
+ struct sock_reuseport *old_reuse, *reuse;
if (!rcu_access_pointer(sk2->sk_reuseport_cb)) {
int err = reuseport_alloc(sk2);
@@ -113,10 +123,13 @@ int reuseport_add_sock(struct sock *sk,
spin_lock_bh(&reuseport_lock);
reuse = rcu_dereference_protected(sk2->sk_reuseport_cb,
- lockdep_is_held(&reuseport_lock)),
- WARN_ONCE(rcu_dereference_protected(sk->sk_reuseport_cb,
- lockdep_is_held(&reuseport_lock)),
- "socket already in reuseport group");
+ lockdep_is_held(&reuseport_lock));
+ old_reuse = rcu_dereference_protected(sk->sk_reuseport_cb,
+ lockdep_is_held(&reuseport_lock));
+ if (old_reuse && old_reuse->num_socks != 1) {
+ spin_unlock_bh(&reuseport_lock);
+ return -EBUSY;
+ }
if (reuse->num_socks == reuse->max_socks) {
reuse = reuseport_grow(reuse);
@@ -134,19 +147,11 @@ int reuseport_add_sock(struct sock *sk,
spin_unlock_bh(&reuseport_lock);
+ if (old_reuse)
+ call_rcu(&old_reuse->rcu, reuseport_free_rcu);
return 0;
}
-static void reuseport_free_rcu(struct rcu_head *head)
-{
- struct sock_reuseport *reuse;
-
- reuse = container_of(head, struct sock_reuseport, rcu);
- if (reuse->prog)
- bpf_prog_destroy(reuse->prog);
- kfree(reuse);
-}
-
void reuseport_detach_sock(struct sock *sk)
{
struct sock_reuseport *reuse;
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.15/tcp-release-sk_frag.page-in-tcp_disconnect.patch
queue-4.15/net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
queue-4.15/net-igmp-add-a-missing-rcu-locking-section.patch
queue-4.15/ipv6-change-route-cache-aging-logic.patch
queue-4.15/soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
queue-4.15/revert-defer-call-to-mem_cgroup_sk_alloc.patch
queue-4.15/ipv6-addrconf-break-critical-section-in-addrconf_verify_rtnl.patch
This is a note to let you know that I've just added the patch titled
Revert "defer call to mem_cgroup_sk_alloc()"
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
revert-defer-call-to-mem_cgroup_sk_alloc.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Roman Gushchin <guro(a)fb.com>
Date: Fri, 2 Feb 2018 15:26:57 +0000
Subject: Revert "defer call to mem_cgroup_sk_alloc()"
From: Roman Gushchin <guro(a)fb.com>
[ Upstream commit edbe69ef2c90fc86998a74b08319a01c508bd497 ]
This patch effectively reverts commit 9f1c2674b328 ("net: memcontrol:
defer call to mem_cgroup_sk_alloc()").
Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks
memcg socket memory accounting, as packets received before memcg
pointer initialization are not accounted and are causing refcounting
underflow on socket release.
Actually the free-after-use problem was fixed by
commit c0576e397508 ("net: call cgroup_sk_alloc() earlier in
sk_clone_lock()") for the cgroup pointer.
So, let's revert it and call mem_cgroup_sk_alloc() just before
cgroup_sk_alloc(). This is safe, as we hold a reference to the socket
we're cloning, and it holds a reference to the memcg.
Also, let's drop BUG_ON(mem_cgroup_is_root()) check from
mem_cgroup_sk_alloc(). I see no reasons why bumping the root
memcg counter is a good reason to panic, and there are no realistic
ways to hit it.
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Tejun Heo <tj(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/memcontrol.c | 14 ++++++++++++++
net/core/sock.c | 5 +----
net/ipv4/inet_connection_sock.c | 1 -
3 files changed, 15 insertions(+), 5 deletions(-)
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5828,6 +5828,20 @@ void mem_cgroup_sk_alloc(struct sock *sk
if (!mem_cgroup_sockets_enabled)
return;
+ /*
+ * Socket cloning can throw us here with sk_memcg already
+ * filled. It won't however, necessarily happen from
+ * process context. So the test for root memcg given
+ * the current task's memcg won't help us in this case.
+ *
+ * Respecting the original socket's memcg is a better
+ * decision in this case.
+ */
+ if (sk->sk_memcg) {
+ css_get(&sk->sk_memcg->css);
+ return;
+ }
+
rcu_read_lock();
memcg = mem_cgroup_from_task(current);
if (memcg == root_mem_cgroup)
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1675,16 +1675,13 @@ struct sock *sk_clone_lock(const struct
newsk->sk_dst_pending_confirm = 0;
newsk->sk_wmem_queued = 0;
newsk->sk_forward_alloc = 0;
-
- /* sk->sk_memcg will be populated at accept() time */
- newsk->sk_memcg = NULL;
-
atomic_set(&newsk->sk_drops, 0);
newsk->sk_send_head = NULL;
newsk->sk_userlocks = sk->sk_userlocks & ~SOCK_BINDPORT_LOCK;
atomic_set(&newsk->sk_zckey, 0);
sock_reset_flag(newsk, SOCK_DONE);
+ mem_cgroup_sk_alloc(newsk);
cgroup_sk_alloc(&newsk->sk_cgrp_data);
rcu_read_lock();
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -475,7 +475,6 @@ struct sock *inet_csk_accept(struct sock
}
spin_unlock_bh(&queue->fastopenq.lock);
}
- mem_cgroup_sk_alloc(newsk);
out:
release_sock(sk);
if (req)
Patches currently in stable-queue which might be from guro(a)fb.com are
queue-4.15/revert-defer-call-to-mem_cgroup_sk_alloc.patch
This is a note to let you know that I've just added the patch titled
r8169: fix RTL8168EP take too long to complete driver initialization.
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Chunhao Lin <hau(a)realtek.com>
Date: Wed, 31 Jan 2018 01:32:36 +0800
Subject: r8169: fix RTL8168EP take too long to complete driver initialization.
From: Chunhao Lin <hau(a)realtek.com>
[ Upstream commit 086ca23d03c0d2f4088f472386778d293e15c5f6 ]
Driver check the wrong register bit in rtl_ocp_tx_cond() that keep driver
waiting until timeout.
Fix this by waiting for the right register bit.
Signed-off-by: Chunhao Lin <hau(a)realtek.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/realtek/r8169.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1395,7 +1395,7 @@ DECLARE_RTL_COND(rtl_ocp_tx_cond)
{
void __iomem *ioaddr = tp->mmio_addr;
- return RTL_R8(IBISR0) & 0x02;
+ return RTL_R8(IBISR0) & 0x20;
}
static void rtl8168ep_stop_cmac(struct rtl8169_private *tp)
@@ -1403,7 +1403,7 @@ static void rtl8168ep_stop_cmac(struct r
void __iomem *ioaddr = tp->mmio_addr;
RTL_W8(IBCR2, RTL_R8(IBCR2) & ~0x01);
- rtl_msleep_loop_wait_low(tp, &rtl_ocp_tx_cond, 50, 2000);
+ rtl_msleep_loop_wait_high(tp, &rtl_ocp_tx_cond, 50, 2000);
RTL_W8(IBISR0, RTL_R8(IBISR0) | 0x20);
RTL_W8(IBCR0, RTL_R8(IBCR0) & ~0x01);
}
Patches currently in stable-queue which might be from hau(a)realtek.com are
queue-4.15/r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
This is a note to let you know that I've just added the patch titled
qmi_wwan: Add support for Quectel EP06
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
qmi_wwan-add-support-for-quectel-ep06.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Kristian Evensen <kristian.evensen(a)gmail.com>
Date: Tue, 30 Jan 2018 14:12:55 +0100
Subject: qmi_wwan: Add support for Quectel EP06
From: Kristian Evensen <kristian.evensen(a)gmail.com>
[ Upstream commit c0b91a56a2e57a5a370655b25d677ae0ebf8a2d0 ]
The Quectel EP06 is a Cat. 6 LTE modem. It uses the same interface as
the EC20/EC25 for QMI, and requires the same "set DTR"-quirk to work.
Signed-off-by: Kristian Evensen <kristian.evensen(a)gmail.com>
Acked-by: Bjørn Mork <bjorn(a)mork.no>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/qmi_wwan.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1245,6 +1245,7 @@ static const struct usb_device_id produc
{QMI_QUIRK_SET_DTR(0x2c7c, 0x0125, 4)}, /* Quectel EC25, EC20 R2.0 Mini PCIe */
{QMI_QUIRK_SET_DTR(0x2c7c, 0x0121, 4)}, /* Quectel EC21 Mini PCIe */
{QMI_FIXED_INTF(0x2c7c, 0x0296, 4)}, /* Quectel BG96 */
+ {QMI_QUIRK_SET_DTR(0x2c7c, 0x0306, 4)}, /* Quectel EP06 Mini PCIe */
/* 4. Gobi 1000 devices */
{QMI_GOBI1K_DEVICE(0x05c6, 0x9212)}, /* Acer Gobi Modem Device */
Patches currently in stable-queue which might be from kristian.evensen(a)gmail.com are
queue-4.15/qmi_wwan-add-support-for-quectel-ep06.patch
This is a note to let you know that I've just added the patch titled
net_sched: get rid of rcu_barrier() in tcf_block_put_ext()
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Cong Wang <xiyou.wangcong(a)gmail.com>
Date: Mon, 4 Dec 2017 10:48:18 -0800
Subject: net_sched: get rid of rcu_barrier() in tcf_block_put_ext()
From: Cong Wang <xiyou.wangcong(a)gmail.com>
[ Upstream commit efbf78973978b0d25af59bc26c8013a942af6e64 ]
Both Eric and Paolo noticed the rcu_barrier() we use in
tcf_block_put_ext() could be a performance bottleneck when
we have a lot of tc classes.
Paolo provided the following to demonstrate the issue:
tc qdisc add dev lo root htb
for I in `seq 1 1000`; do
tc class add dev lo parent 1: classid 1:$I htb rate 100kbit
tc qdisc add dev lo parent 1:$I handle $((I + 1)): htb
for J in `seq 1 10`; do
tc filter add dev lo parent $((I + 1)): u32 match ip src 1.1.1.$J
done
done
time tc qdisc del dev root
real 0m54.764s
user 0m0.023s
sys 0m0.000s
The rcu_barrier() there is to ensure we free the block after all chains
are gone, that is, to queue tcf_block_put_final() at the tail of workqueue.
We can achieve this ordering requirement by refcnt'ing tcf block instead,
that is, the tcf block is freed only when the last chain in this block is
gone. This also simplifies the code.
Paolo reported after this patch we get:
real 0m0.017s
user 0m0.000s
sys 0m0.017s
Tested-by: Paolo Abeni <pabeni(a)redhat.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jiri Pirko <jiri(a)mellanox.com>
Cc: Jamal Hadi Salim <jhs(a)mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/sch_generic.h | 1 -
net/sched/cls_api.c | 30 +++++++++---------------------
2 files changed, 9 insertions(+), 22 deletions(-)
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -280,7 +280,6 @@ struct tcf_block {
struct net *net;
struct Qdisc *q;
struct list_head cb_list;
- struct work_struct work;
};
static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -217,8 +217,12 @@ static void tcf_chain_flush(struct tcf_c
static void tcf_chain_destroy(struct tcf_chain *chain)
{
+ struct tcf_block *block = chain->block;
+
list_del(&chain->list);
kfree(chain);
+ if (list_empty(&block->chain_list))
+ kfree(block);
}
static void tcf_chain_hold(struct tcf_chain *chain)
@@ -329,27 +333,13 @@ int tcf_block_get(struct tcf_block **p_b
}
EXPORT_SYMBOL(tcf_block_get);
-static void tcf_block_put_final(struct work_struct *work)
-{
- struct tcf_block *block = container_of(work, struct tcf_block, work);
- struct tcf_chain *chain, *tmp;
-
- rtnl_lock();
-
- /* At this point, all the chains should have refcnt == 1. */
- list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
- tcf_chain_put(chain);
- rtnl_unlock();
- kfree(block);
-}
-
/* XXX: Standalone actions are not allowed to jump to any chain, and bound
* actions should be all removed after flushing.
*/
void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
struct tcf_block_ext_info *ei)
{
- struct tcf_chain *chain;
+ struct tcf_chain *chain, *tmp;
if (!block)
return;
@@ -365,13 +355,11 @@ void tcf_block_put_ext(struct tcf_block
tcf_block_offload_unbind(block, q, ei);
- INIT_WORK(&block->work, tcf_block_put_final);
- /* Wait for existing RCU callbacks to cool down, make sure their works
- * have been queued before this. We can not flush pending works here
- * because we are holding the RTNL lock.
+ /* At this point, all the chains should have refcnt >= 1. Block will be
+ * freed after all chains are gone.
*/
- rcu_barrier();
- tcf_queue_work(&block->work);
+ list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+ tcf_chain_put(chain);
}
EXPORT_SYMBOL(tcf_block_put_ext);
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-4.15/net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
queue-4.15/cls_u32-add-missing-rcu-annotation.patch
This is a note to let you know that I've just added the patch titled
net: sched: fix use-after-free in tcf_block_put_ext
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-sched-fix-use-after-free-in-tcf_block_put_ext.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Jiri Pirko <jiri(a)mellanox.com>
Date: Fri, 8 Dec 2017 19:27:27 +0100
Subject: net: sched: fix use-after-free in tcf_block_put_ext
From: Jiri Pirko <jiri(a)mellanox.com>
[ Upstream commit df45bf84e4f5a48f23d4b1a07d21d566e8b587b2 ]
Since the block is freed with last chain being put, once we reach the
end of iteration of list_for_each_entry_safe, the block may be
already freed. I'm hitting this only by creating and deleting clsact:
[ 202.171952] ==================================================================
[ 202.180182] BUG: KASAN: use-after-free in tcf_block_put_ext+0x240/0x390
[ 202.187590] Read of size 8 at addr ffff880225539a80 by task tc/796
[ 202.194508]
[ 202.196185] CPU: 0 PID: 796 Comm: tc Not tainted 4.15.0-rc2jiri+ #5
[ 202.203200] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
[ 202.213613] Call Trace:
[ 202.216369] dump_stack+0xda/0x169
[ 202.220192] ? dma_virt_map_sg+0x147/0x147
[ 202.224790] ? show_regs_print_info+0x54/0x54
[ 202.229691] ? tcf_chain_destroy+0x1dc/0x250
[ 202.234494] print_address_description+0x83/0x3d0
[ 202.239781] ? tcf_block_put_ext+0x240/0x390
[ 202.244575] kasan_report+0x1ba/0x460
[ 202.248707] ? tcf_block_put_ext+0x240/0x390
[ 202.253518] tcf_block_put_ext+0x240/0x390
[ 202.258117] ? tcf_chain_flush+0x290/0x290
[ 202.262708] ? qdisc_hash_del+0x82/0x1a0
[ 202.267111] ? qdisc_hash_add+0x50/0x50
[ 202.271411] ? __lock_is_held+0x5f/0x1a0
[ 202.275843] clsact_destroy+0x3d/0x80 [sch_ingress]
[ 202.281323] qdisc_destroy+0xcb/0x240
[ 202.285445] qdisc_graft+0x216/0x7b0
[ 202.289497] tc_get_qdisc+0x260/0x560
Fix this by holding the block also by chain 0 and put chain 0
explicitly, out of the list_for_each_entry_safe loop at the very
end of tcf_block_put_ext.
Fixes: efbf78973978 ("net_sched: get rid of rcu_barrier() in tcf_block_put_ext()")
Signed-off-by: Jiri Pirko <jiri(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sched/cls_api.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -343,23 +343,24 @@ void tcf_block_put_ext(struct tcf_block
if (!block)
return;
- /* Hold a refcnt for all chains, except 0, so that they don't disappear
+ /* Hold a refcnt for all chains, so that they don't disappear
* while we are iterating.
*/
list_for_each_entry(chain, &block->chain_list, list)
- if (chain->index)
- tcf_chain_hold(chain);
+ tcf_chain_hold(chain);
list_for_each_entry(chain, &block->chain_list, list)
tcf_chain_flush(chain);
tcf_block_offload_unbind(block, q, ei);
- /* At this point, all the chains should have refcnt >= 1. Block will be
- * freed after all chains are gone.
- */
+ /* At this point, all the chains should have refcnt >= 1. */
list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
tcf_chain_put(chain);
+
+ /* Finally, put chain 0 and allow block to be freed. */
+ chain = list_first_entry(&block->chain_list, struct tcf_chain, list);
+ tcf_chain_put(chain);
}
EXPORT_SYMBOL(tcf_block_put_ext);
Patches currently in stable-queue which might be from jiri(a)mellanox.com are
queue-4.15/net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
queue-4.15/rocker-fix-possible-null-pointer-dereference-in-rocker_router_fib_event_work.patch
queue-4.15/net-sched-fix-use-after-free-in-tcf_block_put_ext.patch
This is a note to let you know that I've just added the patch titled
net: igmp: add a missing rcu locking section
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-igmp-add-a-missing-rcu-locking-section.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 1 Feb 2018 10:26:57 -0800
Subject: net: igmp: add a missing rcu locking section
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit e7aadb27a5415e8125834b84a74477bfbee4eff5 ]
Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
Timer callbacks do not ensure this locking.
=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/igmp.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -386,7 +386,11 @@ static struct sk_buff *igmpv3_newpack(st
pip->frag_off = htons(IP_DF);
pip->ttl = 1;
pip->daddr = fl4.daddr;
+
+ rcu_read_lock();
pip->saddr = igmpv3_get_srcaddr(dev, &fl4);
+ rcu_read_unlock();
+
pip->protocol = IPPROTO_IGMP;
pip->tot_len = 0; /* filled in later */
ip_select_ident(net, skb, NULL);
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.15/tcp-release-sk_frag.page-in-tcp_disconnect.patch
queue-4.15/net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
queue-4.15/net-igmp-add-a-missing-rcu-locking-section.patch
queue-4.15/ipv6-change-route-cache-aging-logic.patch
queue-4.15/soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
queue-4.15/revert-defer-call-to-mem_cgroup_sk_alloc.patch
queue-4.15/ipv6-addrconf-break-critical-section-in-addrconf_verify_rtnl.patch
This is a note to let you know that I've just added the patch titled
ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-fix-so_reuseport-udp-socket-with-implicit-sk_ipv6only.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Martin KaFai Lau <kafai(a)fb.com>
Date: Wed, 24 Jan 2018 23:15:27 -0800
Subject: ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
From: Martin KaFai Lau <kafai(a)fb.com>
[ Upstream commit 7ece54a60ee2ba7a386308cae73c790bd580589c ]
If a sk_v6_rcv_saddr is !IPV6_ADDR_ANY and !IPV6_ADDR_MAPPED, it
implicitly implies it is an ipv6only socket. However, in inet6_bind(),
this addr_type checking and setting sk->sk_ipv6only to 1 are only done
after sk->sk_prot->get_port(sk, snum) has been completed successfully.
This inconsistency between sk_v6_rcv_saddr and sk_ipv6only confuses
the 'get_port()'.
In particular, when binding SO_REUSEPORT UDP sockets,
udp_reuseport_add_sock(sk,...) is called. udp_reuseport_add_sock()
checks "ipv6_only_sock(sk2) == ipv6_only_sock(sk)" before adding sk to
sk2->sk_reuseport_cb. In this case, ipv6_only_sock(sk2) could be
1 while ipv6_only_sock(sk) is still 0 here. The end result is,
reuseport_alloc(sk) is called instead of adding sk to the existing
sk2->sk_reuseport_cb.
It can be reproduced by binding two SO_REUSEPORT UDP sockets on an
IPv6 address (!ANY and !MAPPED). Only one of the socket will
receive packet.
The fix is to set the implicit sk_ipv6only before calling get_port().
The original sk_ipv6only has to be saved such that it can be restored
in case get_port() failed. The situation is similar to the
inet_reset_saddr(sk) after get_port() has failed.
Thanks to Calvin Owens <calvinowens(a)fb.com> who created an easy
reproduction which leads to a fix.
Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
Signed-off-by: Martin KaFai Lau <kafai(a)fb.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/af_inet6.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -284,6 +284,7 @@ int inet6_bind(struct socket *sock, stru
struct net *net = sock_net(sk);
__be32 v4addr = 0;
unsigned short snum;
+ bool saved_ipv6only;
int addr_type = 0;
int err = 0;
@@ -389,19 +390,21 @@ int inet6_bind(struct socket *sock, stru
if (!(addr_type & IPV6_ADDR_MULTICAST))
np->saddr = addr->sin6_addr;
+ saved_ipv6only = sk->sk_ipv6only;
+ if (addr_type != IPV6_ADDR_ANY && addr_type != IPV6_ADDR_MAPPED)
+ sk->sk_ipv6only = 1;
+
/* Make sure we are allowed to bind here. */
if ((snum || !inet->bind_address_no_port) &&
sk->sk_prot->get_port(sk, snum)) {
+ sk->sk_ipv6only = saved_ipv6only;
inet_reset_saddr(sk);
err = -EADDRINUSE;
goto out;
}
- if (addr_type != IPV6_ADDR_ANY) {
+ if (addr_type != IPV6_ADDR_ANY)
sk->sk_userlocks |= SOCK_BINDADDR_LOCK;
- if (addr_type != IPV6_ADDR_MAPPED)
- sk->sk_ipv6only = 1;
- }
if (snum)
sk->sk_userlocks |= SOCK_BINDPORT_LOCK;
inet->inet_sport = htons(inet->inet_num);
Patches currently in stable-queue which might be from kafai(a)fb.com are
queue-4.15/ipv6-change-route-cache-aging-logic.patch
queue-4.15/ipv6-fix-so_reuseport-udp-socket-with-implicit-sk_ipv6only.patch
This is a note to let you know that I've just added the patch titled
ipv6: change route cache aging logic
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-change-route-cache-aging-logic.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Wei Wang <weiwan(a)google.com>
Date: Fri, 26 Jan 2018 11:40:17 -0800
Subject: ipv6: change route cache aging logic
From: Wei Wang <weiwan(a)google.com>
[ Upstream commit 31afeb425f7fad8bcf9561aeb0b8405479f97a98 ]
In current route cache aging logic, if a route has both RTF_EXPIRE and
RTF_GATEWAY set, the route will only be removed if the neighbor cache
has no NTF_ROUTER flag. Otherwise, even if the route has expired, it
won't get deleted.
Fix this logic to always check if the route has expired first and then
do the gateway neighbor cache check if previous check decide to not
remove the exception entry.
Fixes: 1859bac04fb6 ("ipv6: remove from fib tree aged out RTF_CACHE dst")
Signed-off-by: Wei Wang <weiwan(a)google.com>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Acked-by: Martin KaFai Lau <kafai(a)fb.com>
Acked-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/route.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1586,12 +1586,19 @@ static void rt6_age_examine_exception(st
* EXPIRES exceptions - e.g. pmtu-generated ones are pruned when
* expired, independently from their aging, as per RFC 8201 section 4
*/
- if (!(rt->rt6i_flags & RTF_EXPIRES) &&
- time_after_eq(now, rt->dst.lastuse + gc_args->timeout)) {
- RT6_TRACE("aging clone %p\n", rt);
+ if (!(rt->rt6i_flags & RTF_EXPIRES)) {
+ if (time_after_eq(now, rt->dst.lastuse + gc_args->timeout)) {
+ RT6_TRACE("aging clone %p\n", rt);
+ rt6_remove_exception(bucket, rt6_ex);
+ return;
+ }
+ } else if (time_after(jiffies, rt->dst.expires)) {
+ RT6_TRACE("purging expired route %p\n", rt);
rt6_remove_exception(bucket, rt6_ex);
return;
- } else if (rt->rt6i_flags & RTF_GATEWAY) {
+ }
+
+ if (rt->rt6i_flags & RTF_GATEWAY) {
struct neighbour *neigh;
__u8 neigh_flags = 0;
@@ -1606,11 +1613,8 @@ static void rt6_age_examine_exception(st
rt6_remove_exception(bucket, rt6_ex);
return;
}
- } else if (__rt6_check_expired(rt)) {
- RT6_TRACE("purging expired route %p\n", rt);
- rt6_remove_exception(bucket, rt6_ex);
- return;
}
+
gc_args->more++;
}
Patches currently in stable-queue which might be from weiwan(a)google.com are
queue-4.15/ipv6-change-route-cache-aging-logic.patch
This is a note to let you know that I've just added the patch titled
ip6mr: fix stale iterator
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ip6mr-fix-stale-iterator.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Date: Wed, 31 Jan 2018 16:29:30 +0200
Subject: ip6mr: fix stale iterator
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
[ Upstream commit 4adfa79fc254efb7b0eb3cd58f62c2c3f805f1ba ]
When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.
Here's syzbot's trace:
WARNING: bad unlock balance detected!
4.15.0-rc3+ #128 Not tainted
syzkaller971460/3195 is trying to release lock (mrt_lock) at:
[<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syzkaller971460/3195:
#0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
fs/seq_file.c:165
stack backtrace:
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
__lock_release kernel/locking/lockdep.c:3775 [inline]
lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
__raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
_raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
traverse+0x3bc/0xa00 fs/seq_file.c:135
seq_read+0x96a/0x13d0 fs/seq_file.c:189
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at lib/usercopy.c:25
in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
INFO: lockdep is turned off.
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
__might_sleep+0x95/0x190 kernel/sched/core.c:6013
__might_fault+0xab/0x1d0 mm/memory.c:4525
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
seq_read+0xcb4/0x13d0 fs/seq_file.c:279
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
lib/usercopy.c:26
Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669(a)syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6mr.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -495,6 +495,7 @@ static void *ipmr_mfc_seq_start(struct s
return ERR_PTR(-ENOENT);
it->mrt = mrt;
+ it->cache = NULL;
return *pos ? ipmr_mfc_seq_idx(net, seq->private, *pos - 1)
: SEQ_START_TOKEN;
}
Patches currently in stable-queue which might be from nikolay(a)cumulusnetworks.com are
queue-4.15/ip6mr-fix-stale-iterator.patch
This is a note to let you know that I've just added the patch titled
cls_u32: add missing RCU annotation.
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
cls_u32-add-missing-rcu-annotation.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Feb 7 11:29:33 PST 2018
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Fri, 2 Feb 2018 16:02:22 +0100
Subject: cls_u32: add missing RCU annotation.
From: Paolo Abeni <pabeni(a)redhat.com>
[ Upstream commit 058a6c033488494a6b1477b05fe8e1a16e344462 ]
In a couple of points of the control path, n->ht_down is currently
accessed without the required RCU annotation. The accesses are
safe, but sparse complaints. Since we already held the
rtnl lock, let use rtnl_dereference().
Fixes: a1b7c5fd7fe9 ("net: sched: add cls_u32 offload hooks for netdevs")
Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Acked-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sched/cls_u32.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -544,6 +544,7 @@ static void u32_remove_hw_knode(struct t
static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
u32 flags)
{
+ struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
struct tcf_block *block = tp->chain->block;
struct tc_cls_u32_offload cls_u32 = {};
bool skip_sw = tc_skip_sw(flags);
@@ -563,7 +564,7 @@ static int u32_replace_hw_knode(struct t
cls_u32.knode.sel = &n->sel;
cls_u32.knode.exts = &n->exts;
if (n->ht_down)
- cls_u32.knode.link_handle = n->ht_down->handle;
+ cls_u32.knode.link_handle = ht->handle;
err = tc_setup_cb_call(block, NULL, TC_SETUP_CLSU32, &cls_u32, skip_sw);
if (err < 0) {
@@ -840,8 +841,9 @@ static void u32_replace_knode(struct tcf
static struct tc_u_knode *u32_init_knode(struct tcf_proto *tp,
struct tc_u_knode *n)
{
- struct tc_u_knode *new;
+ struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
struct tc_u32_sel *s = &n->sel;
+ struct tc_u_knode *new;
new = kzalloc(sizeof(*n) + s->nkeys*sizeof(struct tc_u32_key),
GFP_KERNEL);
@@ -859,11 +861,11 @@ static struct tc_u_knode *u32_init_knode
new->fshift = n->fshift;
new->res = n->res;
new->flags = n->flags;
- RCU_INIT_POINTER(new->ht_down, n->ht_down);
+ RCU_INIT_POINTER(new->ht_down, ht);
/* bump reference count as long as we hold pointer to structure */
- if (new->ht_down)
- new->ht_down->refcnt++;
+ if (ht)
+ ht->refcnt++;
#ifdef CONFIG_CLS_U32_PERF
/* Statistics may be incremented by readers during update
Patches currently in stable-queue which might be from pabeni(a)redhat.com are
queue-4.15/net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
queue-4.15/ipv6-change-route-cache-aging-logic.patch
queue-4.15/cls_u32-add-missing-rcu-annotation.patch
This is a note to let you know that I've just added the patch titled
kbuild: rpm-pkg: keep spec file until make mrproper
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kbuild-rpm-pkg-keep-spec-file-until-make-mrproper.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From af60e207087975d069858741c44ed4f450330ac4 Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Date: Sat, 30 Sep 2017 10:10:10 +0900
Subject: kbuild: rpm-pkg: keep spec file until make mrproper
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
commit af60e207087975d069858741c44ed4f450330ac4 upstream.
If build fails during (bin)rpm-pkg, the spec file is not cleaned by
anyone until the next successful build of the package.
We do not have to immediately delete the spec file in case somebody
may want to take a look at it. Instead, make them ignored by git,
and cleaned up by make mrproper.
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.gitignore | 5 +++++
scripts/package/Makefile | 4 ++--
2 files changed, 7 insertions(+), 2 deletions(-)
--- a/.gitignore
+++ b/.gitignore
@@ -56,6 +56,11 @@ modules.builtin
/Module.markers
#
+# RPM spec file (make rpm-pkg)
+#
+/*.spec
+
+#
# Debian directory (make deb-pkg)
#
/debian/
--- a/scripts/package/Makefile
+++ b/scripts/package/Makefile
@@ -50,7 +50,6 @@ rpm-pkg rpm: FORCE
$(CONFIG_SHELL) $(MKSPEC) >$(objtree)/kernel.spec
$(call cmd,src_tar,$(KERNELPATH),kernel.spec)
+rpmbuild $(RPMOPTS) --target $(UTS_MACHINE) -ta $(KERNELPATH).tar.gz
- rm $(KERNELPATH).tar.gz kernel.spec
# binrpm-pkg
# ---------------------------------------------------------------------------
@@ -59,7 +58,8 @@ binrpm-pkg: FORCE
$(CONFIG_SHELL) $(MKSPEC) prebuilt > $(objtree)/binkernel.spec
+rpmbuild $(RPMOPTS) --define "_builddir $(objtree)" --target \
$(UTS_MACHINE) -bb $(objtree)/binkernel.spec
- rm binkernel.spec
+
+clean-files += $(objtree)/*.spec
# Deb target
# ---------------------------------------------------------------------------
Patches currently in stable-queue which might be from yamada.masahiro(a)socionext.com are
queue-4.14/gitignore-move-.dtb-and-.dtb.s-patterns-to-the-top-level-.gitignore.patch
queue-4.14/kbuild-rpm-pkg-keep-spec-file-until-make-mrproper.patch
queue-4.14/gitignore-sort-normal-pattern-rules-alphabetically.patch
This is a note to let you know that I've just added the patch titled
.gitignore: sort normal pattern rules alphabetically
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gitignore-sort-normal-pattern-rules-alphabetically.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1377dd3e29878b8f5d9f5c9000975f50a428a0cd Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Date: Tue, 31 Oct 2017 00:33:45 +0900
Subject: .gitignore: sort normal pattern rules alphabetically
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
commit 1377dd3e29878b8f5d9f5c9000975f50a428a0cd upstream.
We are having more and more ignore patterns. Sort the list
alphabetically. We will easily catch duplicated patterns if any.
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Signed-off-by: Rob Herring <robh(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.gitignore | 42 +++++++++++++++++++++---------------------
1 file changed, 21 insertions(+), 21 deletions(-)
--- a/.gitignore
+++ b/.gitignore
@@ -7,38 +7,38 @@
# command after changing this file, to see if there are
# any tracked files which get ignored after the change.
#
-# Normal rules
+# Normal rules (sorted alphabetically)
#
.*
+*.a
+*.bin
+*.bz2
+*.c.[012]*.*
+*.dwo
+*.elf
+*.gcno
+*.gz
+*.i
+*.ko
+*.ll
+*.lst
+*.lz4
+*.lzma
+*.lzo
+*.mod.c
*.o
*.o.*
-*.a
+*.order
+*.patch
*.s
-*.ko
*.so
*.so.dbg
-*.mod.c
-*.i
-*.lst
+*.su
*.symtypes
-*.order
-*.elf
-*.bin
*.tar
-*.gz
-*.bz2
-*.lzma
*.xz
-*.lz4
-*.lzo
-*.patch
-*.gcno
-*.ll
-modules.builtin
Module.symvers
-*.dwo
-*.su
-*.c.[012]*.*
+modules.builtin
#
# Top-level generic files
Patches currently in stable-queue which might be from yamada.masahiro(a)socionext.com are
queue-4.14/gitignore-move-.dtb-and-.dtb.s-patterns-to-the-top-level-.gitignore.patch
queue-4.14/kbuild-rpm-pkg-keep-spec-file-until-make-mrproper.patch
queue-4.14/gitignore-sort-normal-pattern-rules-alphabetically.patch
This is a note to let you know that I've just added the patch titled
.gitignore: move *.dtb and *.dtb.S patterns to the top-level .gitignore
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gitignore-move-.dtb-and-.dtb.s-patterns-to-the-top-level-.gitignore.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 10b62a2f785ab55857380f0c63d9fa468fd8c676 Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Date: Tue, 31 Oct 2017 00:33:46 +0900
Subject: .gitignore: move *.dtb and *.dtb.S patterns to the top-level .gitignore
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
commit 10b62a2f785ab55857380f0c63d9fa468fd8c676 upstream.
Most of DT files are compiled under arch/*/boot/dts/, but we have some
other directories, like drivers/of/unittest-data/. We often miss to
add gitignore patterns per directory. Since there are no source files
that end with .dtb or .dtb.S, we can ignore the patterns globally.
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Signed-off-by: Rob Herring <robh(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.gitignore | 2 ++
arch/arc/boot/.gitignore | 1 -
arch/arm/boot/.gitignore | 1 -
arch/arm64/boot/dts/.gitignore | 1 -
arch/metag/boot/.gitignore | 1 -
arch/microblaze/boot/.gitignore | 1 -
arch/mips/boot/.gitignore | 1 -
arch/nios2/boot/.gitignore | 1 -
arch/powerpc/boot/.gitignore | 1 -
arch/xtensa/boot/.gitignore | 1 -
drivers/of/unittest-data/.gitignore | 2 --
11 files changed, 2 insertions(+), 11 deletions(-)
--- a/.gitignore
+++ b/.gitignore
@@ -14,6 +14,8 @@
*.bin
*.bz2
*.c.[012]*.*
+*.dtb
+*.dtb.S
*.dwo
*.elf
*.gcno
--- a/arch/arc/boot/.gitignore
+++ b/arch/arc/boot/.gitignore
@@ -1,2 +1 @@
-*.dtb*
uImage
--- a/arch/arm/boot/.gitignore
+++ b/arch/arm/boot/.gitignore
@@ -3,4 +3,3 @@ zImage
xipImage
bootpImage
uImage
-*.dtb
--- a/arch/arm64/boot/dts/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-*.dtb
--- a/arch/metag/boot/.gitignore
+++ b/arch/metag/boot/.gitignore
@@ -1,4 +1,3 @@
vmlinux*
uImage*
ramdisk.*
-*.dtb*
--- a/arch/microblaze/boot/.gitignore
+++ b/arch/microblaze/boot/.gitignore
@@ -1,3 +1,2 @@
-*.dtb
linux.bin*
simpleImage.*
--- a/arch/mips/boot/.gitignore
+++ b/arch/mips/boot/.gitignore
@@ -5,4 +5,3 @@ zImage
zImage.tmp
calc_vmlinuz_load_addr
uImage
-*.dtb
--- a/arch/nios2/boot/.gitignore
+++ b/arch/nios2/boot/.gitignore
@@ -1,2 +1 @@
-*.dtb
vmImage
--- a/arch/powerpc/boot/.gitignore
+++ b/arch/powerpc/boot/.gitignore
@@ -18,7 +18,6 @@ otheros.bld
uImage
cuImage.*
dtbImage.*
-*.dtb
treeImage.*
vmlinux.strip
zImage
--- a/arch/xtensa/boot/.gitignore
+++ b/arch/xtensa/boot/.gitignore
@@ -1,3 +1,2 @@
uImage
zImage.redboot
-*.dtb
--- a/drivers/of/unittest-data/.gitignore
+++ /dev/null
@@ -1,2 +0,0 @@
-testcases.dtb
-testcases.dtb.S
Patches currently in stable-queue which might be from yamada.masahiro(a)socionext.com are
queue-4.14/gitignore-move-.dtb-and-.dtb.s-patterns-to-the-top-level-.gitignore.patch
queue-4.14/kbuild-rpm-pkg-keep-spec-file-until-make-mrproper.patch
queue-4.14/gitignore-sort-normal-pattern-rules-alphabetically.patch
On Wed, Feb 7, 2018 at 12:23 PM, Yves-Alexis Perez <corsac(a)debian.org> wrote:
> On Wed, 2018-02-07 at 18:05 +0100, Yves-Alexis Perez wrote:
>> I'll try to printk the mtu before returning EINVAL to see why it's lower than
>> 1280, but maybe the IP encapsulation is not correctly handled?
>
> I did:
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 3763dc01e374..d3c651158d35 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1215,7 +1215,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
> mtu = np->frag_size;
> }
> if (mtu < IPV6_MIN_MTU)
> - return -EINVAL;
> + printk("mtu: %d\n", mtu);
> cork->base.fragsize = mtu;
> if (dst_allfrag(rt->dst.path))
> cork->base.flags |= IPCORK_ALLFRAG;
>
> and I get:
>
> févr. 07 18:19:50 scapa kernel: mtu: 1218
>
> and it doesn't depend on the original packet size (same thing happens with
> ping -s 100). It also happens with UDP (DNS) traffic, but apparently not with
> TCP.
>
> Regards,
> --
> Yves-Alexis
Hi Yves-Alexis -
I apologize for the problem. It seems to me that tunneling with an
outer MTU that causes the inner MTU to be smaller than the min, is
potentially problematic in other ways as well.
But also it could seem unfortunate that the code with my fix does not
look at actual packet size, but instead only looks at the MTU and then
fails, even if no packet was going to be so large. The intention of
my patch was to prevent a negative number while calculating the
maxfraglen in __ip6_append_data(). An alternative fix maybe to
instead return an error only if the mtu is less than or equal to the
fragheaderlen. Something like:
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3763dc01e374..5d912a289b95 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1214,8 +1214,6 @@ static int ip6_setup_cork(struct sock *sk,
struct inet_cork_full *cork,
if (np->frag_size)
mtu = np->frag_size;
}
- if (mtu < IPV6_MIN_MTU)
- return -EINVAL;
cork->base.fragsize = mtu;
if (dst_allfrag(rt->dst.path))
cork->base.flags |= IPCORK_ALLFRAG;
@@ -1264,6 +1262,8 @@ static int __ip6_append_data(struct sock *sk,
fragheaderlen = sizeof(struct ipv6hdr) + rt->rt6i_nfheader_len +
(opt ? opt->opt_nflen : 0);
+ if (mtu < fragheaderlen + 8)
+ return -EINVAL;
maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
sizeof(struct frag_hdr);
(opt ? opt->opt_nflen : 0);
But then we also have to convince ourselves that maxfraglen can never
be <= 0. I'd have to think about that.
I am not sure if others have thoughts on supporting MTUs configured
below the min in the spec.
Thanks.
--
Mike Maloney
On Tue, Feb 06, 2018 at 11:42:15AM +0000, Harsh Shandilya wrote:
> On Tue 6 Feb, 2018, 4:04 PM Greg Kroah-Hartman, <gregkh(a)linuxfoundation.org>
> wrote:
>
> > On Tue, Feb 06, 2018 at 06:48:53AM +0000, Harsh Shandilya wrote:
> > > On Tue 6 Feb, 2018, 12:09 AM Greg Kroah-Hartman, <
> > gregkh(a)linuxfoundation.org>
> > > wrote:
> > >
> > > > This is the start of the stable review cycle for the 3.18.94 release.
> > > > There are 36 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Wed Feb 7 18:23:41 UTC 2018.
> > > > Anything received after that time might be too late.
> > > >
> > > > The whole patch series can be found in one patch at:
> > > >
> > > > kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.94-rc1.gz
> > > > or in the git tree and branch at:
> > > > git://
> > git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > > > linux-3.18.y
> > > > and the diffstat can be found below.
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > > >
> > >
> > > Builds and boots on the OnePlus 3T, no regressions noticed.
> >
> > Yeah! That device is the only reason I keep this tree alive :)
> >
> > thanks for testing and letting me know.
> >
> > greg k-h
> >
>
> Guess I should drop my kernel tree for the device and save you the pain of
> maintaining 3.18 :P
Heh, maybe :)
> Atleast CAF has been keeping up with upstream now thanks to your
> kernel-common merges so there's still hope for MSM platform users :)
That's good to see. Now if only those platform users would actually
update their kernels to these new versions :(
> P.S. common merge should go in cleanly, both the merge conflicts I had were
> from CAF changes.
Thanks for letting me know, that's great to hear as I just had a
question from some companies who are worried that taking stable patches
will cause tons of merge issues. It hasn't in my experience, and seeing
reports of this from others is great news.
thanks,
greg k-h
This is a note to let you know that I've just added the patch titled
x86/microcode/AMD: Do not load when running on a hypervisor
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-microcode-amd-do-not-load-when-running-on-a-hypervisor.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From a15a753539eca8ba243d576f02e7ca9c4b7d7042 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp(a)suse.de>
Date: Sun, 18 Dec 2016 17:44:13 +0100
Subject: x86/microcode/AMD: Do not load when running on a hypervisor
From: Borislav Petkov <bp(a)suse.de>
commit a15a753539eca8ba243d576f02e7ca9c4b7d7042 upstream.
Doing so is completely void of sense for multiple reasons so prevent
it. Set dis_ucode_ldr to true and thus disable the microcode loader by
default to address xen pv guests which execute the AP path but not the
BSP path.
By having it turned off by default, the APs won't run into the loader
either.
Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the
xen pv one. That one gets the aforementioned "fix".
Also, improve the detection method by caching the final decision whether
to continue loading in dis_ucode_ldr and do it once on the BSP. The APs
then simply test that value.
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Tested-by: Juergen Gross <jgross(a)suse.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Acked-by: Juergen Gross <jgross(a)suse.com>
Link: http://lkml.kernel.org/r/20161218164414.9649-4-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Rolf Neugebauer <rolf.neugebauer(a)docker.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/microcode/core.c | 28 +++++++++++++++++++---------
1 file changed, 19 insertions(+), 9 deletions(-)
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -43,7 +43,7 @@
#define MICROCODE_VERSION "2.01"
static struct microcode_ops *microcode_ops;
-static bool dis_ucode_ldr;
+static bool dis_ucode_ldr = true;
/*
* Synchronization.
@@ -73,6 +73,7 @@ struct cpu_info_ctx {
static bool __init check_loader_disabled_bsp(void)
{
static const char *__dis_opt_str = "dis_ucode_ldr";
+ u32 a, b, c, d;
#ifdef CONFIG_X86_32
const char *cmdline = (const char *)__pa_nodebug(boot_command_line);
@@ -85,8 +86,23 @@ static bool __init check_loader_disabled
bool *res = &dis_ucode_ldr;
#endif
- if (cmdline_find_option_bool(cmdline, option))
- *res = true;
+ if (!have_cpuid_p())
+ return *res;
+
+ a = 1;
+ c = 0;
+ native_cpuid(&a, &b, &c, &d);
+
+ /*
+ * CPUID(1).ECX[31]: reserved for hypervisor use. This is still not
+ * completely accurate as xen pv guests don't see that CPUID bit set but
+ * that's good enough as they don't land on the BSP path anyway.
+ */
+ if (c & BIT(31))
+ return *res;
+
+ if (cmdline_find_option_bool(cmdline, option) <= 0)
+ *res = false;
return *res;
}
@@ -118,9 +134,6 @@ void __init load_ucode_bsp(void)
if (check_loader_disabled_bsp())
return;
- if (!have_cpuid_p())
- return;
-
vendor = x86_cpuid_vendor();
family = x86_cpuid_family();
@@ -154,9 +167,6 @@ void load_ucode_ap(void)
if (check_loader_disabled_ap())
return;
- if (!have_cpuid_p())
- return;
-
vendor = x86_cpuid_vendor();
family = x86_cpuid_family();
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.9/x86-microcode-amd-do-not-load-when-running-on-a-hypervisor.patch
There has been a coding error in rtl8821ae since it was first introduced,
namely that an 8-bit register was read using a 16-bit read in
_rtl8821ae_dbi_read(). This error was fixed with commit 40b368af4b75
("rtlwifi: Fix alignment issues"); however, this change led to
instability in the connection. To restore stability, this change
was reverted in commit b8b8b16352cd ("rtlwifi: rtl8821ae: Fix connection
lost problem").
Unfortunately, the unaligned access causes machine checks in ARM
architecture, and we were finally forced to find the actual cause of the
problem on x86 platforms. Following a suggestion from Pkshih
<pkshih(a)realtek.com>, it was found that increasing the ASPM L1
latency from 0 to 7 fixed the instability. This parameter was varied to
see if a smaller value would work; however, it appears that 7 is the
safest value. A new symbol is defined for this quantity, thus it can be
easily changed if necessary.
Fixes: b8b8b16352cd ("rtlwifi: rtl8821ae: Fix connection lost problem")
Cc: Stable <stable(a)vger.kernel.org> # 4.14+
Fix-suggested-by: Pkshih <pkshih(a)realtek.com>
Signed-off-by: Larry Finger <Larry.Finger(a)lwfinger.net>
---
Kalle,
This patch should be submitted to 4.16.
Larry
---
drivers/net/wireless/realtek/rtlwifi/rtl8821ae/hw.c | 5 +++--
drivers/net/wireless/realtek/rtlwifi/wifi.h | 1 +
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/hw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/hw.c
index f20e77b4bb65..317c1b3101da 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/hw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/hw.c
@@ -1123,7 +1123,7 @@ static u8 _rtl8821ae_dbi_read(struct rtl_priv *rtlpriv, u16 addr)
}
if (0 == tmp) {
read_addr = REG_DBI_RDATA + addr % 4;
- ret = rtl_read_word(rtlpriv, read_addr);
+ ret = rtl_read_byte(rtlpriv, read_addr);
}
return ret;
}
@@ -1165,7 +1165,8 @@ static void _rtl8821ae_enable_aspm_back_door(struct ieee80211_hw *hw)
}
tmp = _rtl8821ae_dbi_read(rtlpriv, 0x70f);
- _rtl8821ae_dbi_write(rtlpriv, 0x70f, tmp | BIT(7));
+ _rtl8821ae_dbi_write(rtlpriv, 0x70f, tmp | BIT(7) |
+ ASPM_L1_LATENCY << 3);
tmp = _rtl8821ae_dbi_read(rtlpriv, 0x719);
_rtl8821ae_dbi_write(rtlpriv, 0x719, tmp | BIT(3) | BIT(4));
diff --git a/drivers/net/wireless/realtek/rtlwifi/wifi.h b/drivers/net/wireless/realtek/rtlwifi/wifi.h
index 1c9ed28b42da..4f48b934ec01 100644
--- a/drivers/net/wireless/realtek/rtlwifi/wifi.h
+++ b/drivers/net/wireless/realtek/rtlwifi/wifi.h
@@ -99,6 +99,7 @@
#define RTL_USB_MAX_RX_COUNT 100
#define QBSS_LOAD_SIZE 5
#define MAX_WMMELE_LENGTH 64
+#define ASPM_L1_LATENCY 7
#define TOTAL_CAM_ENTRY 32
--
2.16.1
When a request is preempted, it is unsubmitted from the HW queue and
removed from the active list of breadcrumbs. In the process, this
however triggers the signaler and it may see the clear rbtree with the
old, and still valid, seqno. This confuses the signaler into action and
signaling the fence.
Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v4.12+
---
drivers/gpu/drm/i915/intel_breadcrumbs.c | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index efbc627a2a25..b955f7d7bd0f 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -588,29 +588,16 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
spin_unlock_irq(&b->rb_lock);
}
-static bool signal_valid(const struct drm_i915_gem_request *request)
-{
- return intel_wait_check_request(&request->signaling.wait, request);
-}
-
static bool signal_complete(const struct drm_i915_gem_request *request)
{
if (!request)
return false;
- /* If another process served as the bottom-half it may have already
- * signalled that this wait is already completed.
- */
- if (intel_wait_complete(&request->signaling.wait))
- return signal_valid(request);
-
- /* Carefully check if the request is complete, giving time for the
+ /*
+ * Carefully check if the request is complete, giving time for the
* seqno to be visible or if the GPU hung.
*/
- if (__i915_request_irq_complete(request))
- return true;
-
- return false;
+ return __i915_request_irq_complete(request);
}
static struct drm_i915_gem_request *to_signaler(struct rb_node *rb)
@@ -712,6 +699,7 @@ static int intel_breadcrumbs_signaler(void *arg)
&request->fence.flags)) {
local_bh_disable();
dma_fence_signal(&request->fence);
+ GEM_BUG_ON(!i915_gem_request_completed(request));
local_bh_enable(); /* kick start the tasklets */
}
--
2.15.1
From: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
commit 641307df71fe77d7b38a477067495ede05d47295 upstream.
When stopping the CRTC the driver must disable all planes and wait for
the change to take effect at the next vblank. Merely calling
drm_crtc_wait_one_vblank() is not enough, as the function doesn't
include any mechanism to handle the race with vblank interrupts.
Replace the drm_crtc_wait_one_vblank() call with a manual mechanism that
handles the vblank interrupt race.
Cc: stable(a)vger.kernel.org
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Signed-off-by: thongsyho <thong.ho.px(a)rvc.renesas.com>
Signed-off-by: Nhan Nguyen <nhan.nguyen.yb(a)renesas.com>
---
drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 53 ++++++++++++++++++++++++++++++----
drivers/gpu/drm/rcar-du/rcar_du_crtc.h | 8 +++++
2 files changed, 55 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
index 848f7f2..3322b15 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -392,6 +392,31 @@ static void rcar_du_crtc_start(struct rcar_du_crtc *rcrtc)
rcrtc->started = true;
}
+static void rcar_du_crtc_disable_planes(struct rcar_du_crtc *rcrtc)
+{
+ struct rcar_du_device *rcdu = rcrtc->group->dev;
+ struct drm_crtc *crtc = &rcrtc->crtc;
+ u32 status;
+ /* Make sure vblank interrupts are enabled. */
+ drm_crtc_vblank_get(crtc);
+ /*
+ * Disable planes and calculate how many vertical blanking interrupts we
+ * have to wait for. If a vertical blanking interrupt has been triggered
+ * but not processed yet, we don't know whether it occurred before or
+ * after the planes got disabled. We thus have to wait for two vblank
+ * interrupts in that case.
+ */
+ spin_lock_irq(&rcrtc->vblank_lock);
+ rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
+ status = rcar_du_crtc_read(rcrtc, DSSR);
+ rcrtc->vblank_count = status & DSSR_VBK ? 2 : 1;
+ spin_unlock_irq(&rcrtc->vblank_lock);
+ if (!wait_event_timeout(rcrtc->vblank_wait, rcrtc->vblank_count == 0,
+ msecs_to_jiffies(100)))
+ dev_warn(rcdu->dev, "vertical blanking timeout\n");
+ drm_crtc_vblank_put(crtc);
+}
+
static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
{
struct drm_crtc *crtc = &rcrtc->crtc;
@@ -400,17 +425,16 @@ static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
return;
/* Disable all planes and wait for the change to take effect. This is
- * required as the DSnPR registers are updated on vblank, and no vblank
- * will occur once the CRTC is stopped. Disabling planes when starting
- * the CRTC thus wouldn't be enough as it would start scanning out
- * immediately from old frame buffers until the next vblank.
+ * required as the plane enable registers are updated on vblank, and no
+ * vblank will occur once the CRTC is stopped. Disabling planes when
+ * starting the CRTC thus wouldn't be enough as it would start scanning
+ * out immediately from old frame buffers until the next vblank.
*
* This increases the CRTC stop delay, especially when multiple CRTCs
* are stopped in one operation as we now wait for one vblank per CRTC.
* Whether this can be improved needs to be researched.
*/
- rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
- drm_crtc_wait_one_vblank(crtc);
+ rcar_du_crtc_disable_planes(rcrtc);
/* Disable vertical blanking interrupt reporting. We first need to wait
* for page flip completion before stopping the CRTC as userspace
@@ -548,10 +572,25 @@ static irqreturn_t rcar_du_crtc_irq(int irq, void *arg)
irqreturn_t ret = IRQ_NONE;
u32 status;
+ spin_lock(&rcrtc->vblank_lock);
+
status = rcar_du_crtc_read(rcrtc, DSSR);
rcar_du_crtc_write(rcrtc, DSRCR, status & DSRCR_MASK);
if (status & DSSR_VBK) {
+ /*
+ * Wake up the vblank wait if the counter reaches 0. This must
+ * be protected by the vblank_lock to avoid races in
+ * rcar_du_crtc_disable_planes().
+ */
+ if (rcrtc->vblank_count) {
+ if (--rcrtc->vblank_count == 0)
+ wake_up(&rcrtc->vblank_wait);
+ }
+ }
+ spin_unlock(&rcrtc->vblank_lock);
+
+ if (status & DSSR_VBK) {
drm_crtc_handle_vblank(&rcrtc->crtc);
rcar_du_crtc_finish_page_flip(rcrtc);
ret = IRQ_HANDLED;
@@ -606,6 +645,8 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, unsigned int index)
}
init_waitqueue_head(&rcrtc->flip_wait);
+ init_waitqueue_head(&rcrtc->vblank_wait);
+ spin_lock_init(&rcrtc->vblank_lock);
rcrtc->group = rgrp;
rcrtc->mmio_offset = mmio_offsets[index];
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
index 6f08b7e..48bef05 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -15,6 +15,7 @@
#define __RCAR_DU_CRTC_H__
#include <linux/mutex.h>
+#include <linux/spinlock.h>
#include <linux/wait.h>
#include <drm/drmP.h>
@@ -33,6 +34,9 @@
* @started: whether the CRTC has been started and is running
* @event: event to post when the pending page flip completes
* @flip_wait: wait queue used to signal page flip completion
+ * @vblank_lock: protects vblank_wait and vblank_count
+ * @vblank_wait: wait queue used to signal vertical blanking
+ * @vblank_count: number of vertical blanking interrupts to wait for
* @outputs: bitmask of the outputs (enum rcar_du_output) driven by this CRTC
* @group: CRTC group this CRTC belongs to
*/
@@ -48,6 +52,10 @@ struct rcar_du_crtc {
struct drm_pending_vblank_event *event;
wait_queue_head_t flip_wait;
+ spinlock_t vblank_lock;
+ wait_queue_head_t vblank_wait;
+ unsigned int vblank_count;
+
unsigned int outputs;
struct rcar_du_group *group;
--
1.9.1
From: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
commit 1f8754d4daea5f257370a52a30fcb22798c54516 upstream.
If SSI uses shared pin, some SSI will be used as parent SSI.
Then, normal SSI's remove and Parent SSI's remove
(these are same SSI) will be called when unbind or remove timing.
In this case, free_irq() will be called twice.
This patch solve this issue.
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx(a)renesas.com>
Reported-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx(a)renesas.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: thongsyho <thong.ho.px(a)rvc.renesas.com>
Signed-off-by: Nhan Nguyen <nhan.nguyen.yb(a)renesas.com>
---
sound/soc/sh/rcar/ssi.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/sound/soc/sh/rcar/ssi.c b/sound/soc/sh/rcar/ssi.c
index 6cb6db0..9472d99 100644
--- a/sound/soc/sh/rcar/ssi.c
+++ b/sound/soc/sh/rcar/ssi.c
@@ -694,9 +694,14 @@ static int rsnd_ssi_dma_remove(struct rsnd_mod *mod,
struct rsnd_priv *priv)
{
struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
+ struct rsnd_mod *ssi_parent_mod = rsnd_io_to_mod_ssip(io);
struct device *dev = rsnd_priv_to_dev(priv);
int irq = ssi->irq;
+ /* Do nothing for SSI parent mod */
+ if (ssi_parent_mod == mod)
+ return 0;
+
/* PIO will request IRQ again */
devm_free_irq(dev, irq, mod);
--
1.9.1
I see this was just applied to Linus's tree. This too probably should be
tagged for stable as well.
-- Steve
On Tue, 6 Feb 2018 03:54:42 -0800
"tip-bot for Steven Rostedt (VMware)" <tipbot(a)zytor.com> wrote:
> Commit-ID: 364f56653708ba8bcdefd4f0da2a42904baa8eeb
> Gitweb: https://git.kernel.org/tip/364f56653708ba8bcdefd4f0da2a42904baa8eeb
> Author: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
> AuthorDate: Tue, 23 Jan 2018 20:45:38 -0500
> Committer: Ingo Molnar <mingo(a)kernel.org>
> CommitDate: Tue, 6 Feb 2018 10:20:33 +0100
>
> sched/rt: Up the root domain ref count when passing it around via IPIs
>
> When issuing an IPI RT push, where an IPI is sent to each CPU that has more
> than one RT task scheduled on it, it references the root domain's rto_mask,
> that contains all the CPUs within the root domain that has more than one RT
> task in the runable state. The problem is, after the IPIs are initiated, the
> rq->lock is released. This means that the root domain that is associated to
> the run queue could be freed while the IPIs are going around.
>
> Add a sched_get_rd() and a sched_put_rd() that will increment and decrement
> the root domain's ref count respectively. This way when initiating the IPIs,
> the scheduler will up the root domain's ref count before releasing the
> rq->lock, ensuring that the root domain does not go away until the IPI round
> is complete.
>
> Reported-by: Pavan Kondeti <pkondeti(a)codeaurora.org>
> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
> Cc: Andrew Morton <akpm(a)linux-foundation.org>
> Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
> Cc: Mike Galbraith <efault(a)gmx.de>
> Cc: Peter Zijlstra <peterz(a)infradead.org>
> Cc: Thomas Gleixner <tglx(a)linutronix.de>
> Fixes: 4bdced5c9a292 ("sched/rt: Simplify the IPI based RT balancing logic")
> Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtN…
> Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
> ---
> kernel/sched/rt.c | 9 +++++++--
> kernel/sched/sched.h | 2 ++
> kernel/sched/topology.c | 13 +++++++++++++
> 3 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 2fb627d..89a086e 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1990,8 +1990,11 @@ static void tell_cpu_to_push(struct rq *rq)
>
> rto_start_unlock(&rq->rd->rto_loop_start);
>
> - if (cpu >= 0)
> + if (cpu >= 0) {
> + /* Make sure the rd does not get freed while pushing */
> + sched_get_rd(rq->rd);
> irq_work_queue_on(&rq->rd->rto_push_work, cpu);
> + }
> }
>
> /* Called from hardirq context */
> @@ -2021,8 +2024,10 @@ void rto_push_irq_work_func(struct irq_work *work)
>
> raw_spin_unlock(&rd->rto_lock);
>
> - if (cpu < 0)
> + if (cpu < 0) {
> + sched_put_rd(rd);
> return;
> + }
>
> /* Try the next RT overloaded CPU */
> irq_work_queue_on(&rd->rto_push_work, cpu);
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 2e95505..fb5fc45 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -691,6 +691,8 @@ extern struct mutex sched_domains_mutex;
> extern void init_defrootdomain(void);
> extern int sched_init_domains(const struct cpumask *cpu_map);
> extern void rq_attach_root(struct rq *rq, struct root_domain *rd);
> +extern void sched_get_rd(struct root_domain *rd);
> +extern void sched_put_rd(struct root_domain *rd);
>
> #ifdef HAVE_RT_PUSH_IPI
> extern void rto_push_irq_work_func(struct irq_work *work);
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 034cbed..519b024 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -259,6 +259,19 @@ void rq_attach_root(struct rq *rq, struct root_domain *rd)
> call_rcu_sched(&old_rd->rcu, free_rootdomain);
> }
>
> +void sched_get_rd(struct root_domain *rd)
> +{
> + atomic_inc(&rd->refcount);
> +}
> +
> +void sched_put_rd(struct root_domain *rd)
> +{
> + if (!atomic_dec_and_test(&rd->refcount))
> + return;
> +
> + call_rcu_sched(&rd->rcu, free_rootdomain);
> +}
> +
> static int init_rootdomain(struct root_domain *rd)
> {
> if (!zalloc_cpumask_var(&rd->span, GFP_KERNEL))
I see this was just applied to Linus's tree. It probably should be
tagged for stable as well.
-- Steve
On Tue, 6 Feb 2018 03:54:16 -0800
"tip-bot for Steven Rostedt (VMware)" <tipbot(a)zytor.com> wrote:
> Commit-ID: ad0f1d9d65938aec72a698116cd73a980916895e
> Gitweb: https://git.kernel.org/tip/ad0f1d9d65938aec72a698116cd73a980916895e
> Author: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
> AuthorDate: Tue, 23 Jan 2018 20:45:37 -0500
> Committer: Ingo Molnar <mingo(a)kernel.org>
> CommitDate: Tue, 6 Feb 2018 10:20:33 +0100
>
> sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()
>
> When the rto_push_irq_work_func() is called, it looks at the RT overloaded
> bitmask in the root domain via the runqueue (rq->rd). The problem is that
> during CPU up and down, nothing here stops rq->rd from changing between
> taking the rq->rd->rto_lock and releasing it. That means the lock that is
> released is not the same lock that was taken.
>
> Instead of using this_rq()->rd to get the root domain, as the irq work is
> part of the root domain, we can simply get the root domain from the irq work
> that is passed to the routine:
>
> container_of(work, struct root_domain, rto_push_work)
>
> This keeps the root domain consistent.
>
> Reported-by: Pavan Kondeti <pkondeti(a)codeaurora.org>
> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
> Cc: Andrew Morton <akpm(a)linux-foundation.org>
> Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
> Cc: Mike Galbraith <efault(a)gmx.de>
> Cc: Peter Zijlstra <peterz(a)infradead.org>
> Cc: Thomas Gleixner <tglx(a)linutronix.de>
> Fixes: 4bdced5c9a292 ("sched/rt: Simplify the IPI based RT balancing logic")
> Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtN…
> Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
> ---
> kernel/sched/rt.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 862a513..2fb627d 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1907,9 +1907,8 @@ static void push_rt_tasks(struct rq *rq)
> * the rt_loop_next will cause the iterator to perform another scan.
> *
> */
> -static int rto_next_cpu(struct rq *rq)
> +static int rto_next_cpu(struct root_domain *rd)
> {
> - struct root_domain *rd = rq->rd;
> int next;
> int cpu;
>
> @@ -1985,7 +1984,7 @@ static void tell_cpu_to_push(struct rq *rq)
> * Otherwise it is finishing up and an ipi needs to be sent.
> */
> if (rq->rd->rto_cpu < 0)
> - cpu = rto_next_cpu(rq);
> + cpu = rto_next_cpu(rq->rd);
>
> raw_spin_unlock(&rq->rd->rto_lock);
>
> @@ -1998,6 +1997,8 @@ static void tell_cpu_to_push(struct rq *rq)
> /* Called from hardirq context */
> void rto_push_irq_work_func(struct irq_work *work)
> {
> + struct root_domain *rd =
> + container_of(work, struct root_domain, rto_push_work);
> struct rq *rq;
> int cpu;
>
> @@ -2013,18 +2014,18 @@ void rto_push_irq_work_func(struct irq_work *work)
> raw_spin_unlock(&rq->lock);
> }
>
> - raw_spin_lock(&rq->rd->rto_lock);
> + raw_spin_lock(&rd->rto_lock);
>
> /* Pass the IPI to the next rt overloaded queue */
> - cpu = rto_next_cpu(rq);
> + cpu = rto_next_cpu(rd);
>
> - raw_spin_unlock(&rq->rd->rto_lock);
> + raw_spin_unlock(&rd->rto_lock);
>
> if (cpu < 0)
> return;
>
> /* Try the next RT overloaded CPU */
> - irq_work_queue_on(&rq->rd->rto_push_work, cpu);
> + irq_work_queue_on(&rd->rto_push_work, cpu);
> }
> #endif /* HAVE_RT_PUSH_IPI */
>
From: Huang Ying <ying.huang(a)intel.com>
It was reported by Sergey Senozhatsky that if THP (Transparent Huge
Page) and frontswap (via zswap) are both enabled, when memory goes low
so that swap is triggered, segfault and memory corruption will occur
in random user space applications as follow,
kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
#0 0x00007fc08889ae0d _int_malloc (libc.so.6)
#1 0x00007fc08889c2f3 malloc (libc.so.6)
#2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
#3 0x0000560e6005e75c n/a (urxvt)
#4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
#5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
#6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
#7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
#8 0x0000560e6005cb55 ev_run (urxvt)
#9 0x0000560e6003b9b9 main (urxvt)
#10 0x00007fc08883af4a __libc_start_main (libc.so.6)
#11 0x0000560e6003f9da _start (urxvt)
After bisection, it was found the first bad commit is
bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped
out").
The root cause is as follow.
When the pages are written to storage device during swapping out in
swap_writepage(), zswap (fontswap) is tried to compress the pages
instead to improve the performance. But zswap (frontswap) will treat
THP as normal page, so only the head page is saved. After swapping
in, tail pages will not be restored to its original contents, so cause
the memory corruption in the applications.
This is fixed via splitting THP at the begin of swapping out if
frontswap is enabled. To avoid frontswap to be enabled at runtime,
whether the page is THP is checked before using frontswap during
swapping out too.
Reported-and-tested-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Signed-off-by: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: Dan Streetman <ddstreet(a)ieee.org>
Cc: Seth Jennings <sjenning(a)redhat.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: Shaohua Li <shli(a)kernel.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: stable(a)vger.kernel.org # 4.14
Fixes: bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped out")
---
mm/page_io.c | 2 +-
mm/vmscan.c | 16 +++++++++++++---
2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/mm/page_io.c b/mm/page_io.c
index b41cf9644585..6dca817ae7a0 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -250,7 +250,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
unlock_page(page);
goto out;
}
- if (frontswap_store(page) == 0) {
+ if (!PageTransHuge(page) && frontswap_store(page) == 0) {
set_page_writeback(page);
unlock_page(page);
end_page_writeback(page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index bee53495a829..d1c1e00b08bb 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -55,6 +55,7 @@
#include <linux/swapops.h>
#include <linux/balloon_compaction.h>
+#include <linux/frontswap.h>
#include "internal.h"
@@ -1063,14 +1064,23 @@ static unsigned long shrink_page_list(struct list_head *page_list,
/* cannot split THP, skip it */
if (!can_split_huge_page(page, NULL))
goto activate_locked;
+ /*
+ * Split THP if frontswap enabled,
+ * because it cannot process THP
+ */
+ if (frontswap_enabled()) {
+ if (split_huge_page_to_list(
+ page, page_list))
+ goto activate_locked;
+ }
/*
* Split pages without a PMD map right
* away. Chances are some or all of the
* tail pages can be freed without IO.
*/
- if (!compound_mapcount(page) &&
- split_huge_page_to_list(page,
- page_list))
+ else if (!compound_mapcount(page) &&
+ split_huge_page_to_list(page,
+ page_list))
goto activate_locked;
}
if (!add_to_swap(page)) {
--
2.15.1
From: Eric Biggers <ebiggers(a)google.com>
The asymmetric key type allows an X.509 certificate to be added even if
its signature's hash algorithm is not available in the crypto API. In
that case 'payload.data[asym_auth]' will be NULL. But the key
restriction code failed to check for this case before trying to use the
signature, resulting in a NULL pointer dereference in
key_or_keyring_common() or in restrict_link_by_signature().
Fix this by returning -ENOPKG when the signature is unsupported.
Reproducer when all the CONFIG_CRYPTO_SHA512* options are disabled and
keyctl has support for the 'restrict_keyring' command:
keyctl new_session
keyctl restrict_keyring @s asymmetric builtin_trusted
openssl req -new -sha512 -x509 -batch -nodes -outform der \
| keyctl padd asymmetric desc @s
Fixes: a511e1af8b12 ("KEYS: Move the point of trust determination to __key_link()")
Cc: <stable(a)vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/restrict.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/crypto/asymmetric_keys/restrict.c b/crypto/asymmetric_keys/restrict.c
index 86fb68508952..7c93c7728454 100644
--- a/crypto/asymmetric_keys/restrict.c
+++ b/crypto/asymmetric_keys/restrict.c
@@ -67,8 +67,9 @@ __setup("ca_keys=", ca_keys_setup);
*
* Returns 0 if the new certificate was accepted, -ENOKEY if we couldn't find a
* matching parent certificate in the trusted list, -EKEYREJECTED if the
- * signature check fails or the key is blacklisted and some other error if
- * there is a matching certificate but the signature check cannot be performed.
+ * signature check fails or the key is blacklisted, -ENOPKG if the signature
+ * uses unsupported crypto, or some other error if there is a matching
+ * certificate but the signature check cannot be performed.
*/
int restrict_link_by_signature(struct key *dest_keyring,
const struct key_type *type,
@@ -88,6 +89,8 @@ int restrict_link_by_signature(struct key *dest_keyring,
return -EOPNOTSUPP;
sig = payload->data[asym_auth];
+ if (!sig)
+ return -ENOPKG;
if (!sig->auth_ids[0] && !sig->auth_ids[1])
return -ENOKEY;
@@ -139,6 +142,8 @@ static int key_or_keyring_common(struct key *dest_keyring,
return -EOPNOTSUPP;
sig = payload->data[asym_auth];
+ if (!sig)
+ return -ENOPKG;
if (!sig->auth_ids[0] && !sig->auth_ids[1])
return -ENOKEY;
@@ -222,9 +227,9 @@ static int key_or_keyring_common(struct key *dest_keyring,
*
* Returns 0 if the new certificate was accepted, -ENOKEY if we
* couldn't find a matching parent certificate in the trusted list,
- * -EKEYREJECTED if the signature check fails, and some other error if
- * there is a matching certificate but the signature check cannot be
- * performed.
+ * -EKEYREJECTED if the signature check fails, -ENOPKG if the signature uses
+ * unsupported crypto, or some other error if there is a matching certificate
+ * but the signature check cannot be performed.
*/
int restrict_link_by_key_or_keyring(struct key *dest_keyring,
const struct key_type *type,
@@ -249,9 +254,9 @@ int restrict_link_by_key_or_keyring(struct key *dest_keyring,
*
* Returns 0 if the new certificate was accepted, -ENOKEY if we
* couldn't find a matching parent certificate in the trusted list,
- * -EKEYREJECTED if the signature check fails, and some other error if
- * there is a matching certificate but the signature check cannot be
- * performed.
+ * -EKEYREJECTED if the signature check fails, -ENOPKG if the signature uses
+ * unsupported crypto, or some other error if there is a matching certificate
+ * but the signature check cannot be performed.
*/
int restrict_link_by_key_or_keyring_chain(struct key *dest_keyring,
const struct key_type *type,
--
2.16.0.rc1.238.g530d649a79-goog
From: Eric Biggers <ebiggers(a)google.com>
If there is a blacklisted certificate in a SignerInfo's certificate
chain, then pkcs7_verify_sig_chain() sets sinfo->blacklisted and returns
0. But, pkcs7_verify() fails to handle this case appropriately, as it
actually continues on to the line 'actual_ret = 0;', indicating that the
SignerInfo has passed verification. Consequently, PKCS#7 signature
verification ignores the certificate blacklist.
Fix this by not considering blacklisted SignerInfos to have passed
verification.
Also fix the function comment with regards to when 0 is returned.
Fixes: 03bb79315ddc ("PKCS#7: Handle blacklisted certificates")
Cc: <stable(a)vger.kernel.org> # v4.12+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/pkcs7_verify.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/crypto/asymmetric_keys/pkcs7_verify.c b/crypto/asymmetric_keys/pkcs7_verify.c
index 2f6a768b91d7..97c77f66b20d 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -366,8 +366,7 @@ static int pkcs7_verify_one(struct pkcs7_message *pkcs7,
*
* (*) -EBADMSG if some part of the message was invalid, or:
*
- * (*) 0 if no signature chains were found to be blacklisted or to contain
- * unsupported crypto, or:
+ * (*) 0 if a signature chain passed verification, or:
*
* (*) -EKEYREJECTED if a blacklisted key was encountered, or:
*
@@ -423,8 +422,11 @@ int pkcs7_verify(struct pkcs7_message *pkcs7,
for (sinfo = pkcs7->signed_infos; sinfo; sinfo = sinfo->next) {
ret = pkcs7_verify_one(pkcs7, sinfo);
- if (sinfo->blacklisted && actual_ret == -ENOPKG)
- actual_ret = -EKEYREJECTED;
+ if (sinfo->blacklisted) {
+ if (actual_ret == -ENOPKG)
+ actual_ret = -EKEYREJECTED;
+ continue;
+ }
if (ret < 0) {
if (ret == -ENOPKG) {
sinfo->unsupported_crypto = true;
--
2.16.0.rc1.238.g530d649a79-goog
From: Eric Biggers <ebiggers(a)google.com>
When pkcs7_verify_sig_chain() is building the certificate chain for a
SignerInfo using the certificates in the PKCS#7 message, it is passing
the wrong arguments to public_key_verify_signature(). Consequently,
when the next certificate is supposed to be used to verify the previous
certificate, the next certificate is actually used to verify itself.
An attacker can use this bug to create a bogus certificate chain that
has no cryptographic relationship between the beginning and end.
Fortunately I couldn't quite find a way to use this to bypass the
overall signature verification, though it comes very close. Here's the
reasoning: due to the bug, every certificate in the chain beyond the
first actually has to be self-signed (where "self-signed" here refers to
the actual key and signature; an attacker might still manipulate the
certificate fields such that the self_signed flag doesn't actually get
set, and thus the chain doesn't end immediately). But to pass trust
validation (pkcs7_validate_trust()), either the SignerInfo or one of the
certificates has to actually be signed by a trusted key. Since only
self-signed certificates can be added to the chain, the only way for an
attacker to introduce a trusted signature is to include a self-signed
trusted certificate.
But, when pkcs7_validate_trust_one() reaches that certificate, instead
of trying to verify the signature on that certificate, it will actually
look up the corresponding trusted key, which will succeed, and then try
to verify the *previous* certificate, which will fail. Thus, disaster
is narrowly averted (as far as I could tell).
Fixes: 6c2dc5ae4ab7 ("X.509: Extract signature digest and make self-signed cert checks earlier")
Cc: <stable(a)vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/pkcs7_verify.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/crypto/asymmetric_keys/pkcs7_verify.c b/crypto/asymmetric_keys/pkcs7_verify.c
index 39e6de0c2761..2f6a768b91d7 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -270,7 +270,7 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message *pkcs7,
sinfo->index);
return 0;
}
- ret = public_key_verify_signature(p->pub, p->sig);
+ ret = public_key_verify_signature(p->pub, x509->sig);
if (ret < 0)
return ret;
x509->signer = p;
--
2.16.0.rc1.238.g530d649a79-goog
From: Eric Biggers <ebiggers(a)google.com>
Subject: pipe: actually allow root to exceed the pipe buffer limits
pipe-user-pages-hard and pipe-user-pages-soft are only supposed to apply
to unprivileged users, as documented in both Documentation/sysctl/fs.txt
and the pipe(7) man page.
However, the capabilities are actually only checked when increasing a
pipe's size using F_SETPIPE_SZ, not when creating a new pipe. Therefore,
if pipe-user-pages-hard has been set, the root user can run into it and be
unable to create pipes. Similarly, if pipe-user-pages-soft has been set,
the root user can run into it and have their pipes limited to 1 page each.
Fix this by allowing the privileged override in both cases.
Link: http://lkml.kernel.org/r/20180111052902.14409-4-ebiggers3@gmail.com
Fixes: 759c01142a5d ("pipe: limit the per-user amount of pages allocated in pipes")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Joe Lawrence <joe.lawrence(a)redhat.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof(a)kernel.org>
Cc: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: Willy Tarreau <w(a)1wt.eu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/pipe.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff -puN fs/pipe.c~pipe-actually-allow-root-to-exceed-the-pipe-buffer-limits fs/pipe.c
--- a/fs/pipe.c~pipe-actually-allow-root-to-exceed-the-pipe-buffer-limits
+++ a/fs/pipe.c
@@ -613,6 +613,11 @@ static bool too_many_pipe_buffers_hard(u
return pipe_user_pages_hard && user_bufs >= pipe_user_pages_hard;
}
+static bool is_unprivileged_user(void)
+{
+ return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
+}
+
struct pipe_inode_info *alloc_pipe_info(void)
{
struct pipe_inode_info *pipe;
@@ -629,12 +634,12 @@ struct pipe_inode_info *alloc_pipe_info(
user_bufs = account_pipe_buffers(user, 0, pipe_bufs);
- if (too_many_pipe_buffers_soft(user_bufs)) {
+ if (too_many_pipe_buffers_soft(user_bufs) && is_unprivileged_user()) {
user_bufs = account_pipe_buffers(user, pipe_bufs, 1);
pipe_bufs = 1;
}
- if (too_many_pipe_buffers_hard(user_bufs))
+ if (too_many_pipe_buffers_hard(user_bufs) && is_unprivileged_user())
goto out_revert_acct;
pipe->bufs = kcalloc(pipe_bufs, sizeof(struct pipe_buffer),
@@ -1065,7 +1070,7 @@ static long pipe_set_size(struct pipe_in
if (nr_pages > pipe->buffers &&
(too_many_pipe_buffers_hard(user_bufs) ||
too_many_pipe_buffers_soft(user_bufs)) &&
- !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) {
+ is_unprivileged_user()) {
ret = -EPERM;
goto out_revert_acct;
}
_
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: kasan: rework Kconfig settings
We get a lot of very large stack frames using gcc-7.0.1 with the default
-fsanitize-address-use-after-scope --param asan-stack=1 options, which can
easily cause an overflow of the kernel stack, e.g.
drivers/gpu/drm/i915/gvt/handlers.c:2434:1: warning: the frame size of 46176 bytes is larger than 3072 bytes
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size of 23632 bytes is larger than 3072 bytes
lib/atomic64_test.c:250:1: warning: the frame size of 11200 bytes is larger than 3072 bytes
drivers/gpu/drm/i915/gvt/handlers.c:2621:1: warning: the frame size of 9208 bytes is larger than 3072 bytes
drivers/media/dvb-frontends/stv090x.c:3431:1: warning: the frame size of 6816 bytes is larger than 3072 bytes
fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than 3072 bytes
To reduce this risk, -fsanitize-address-use-after-scope is now split out
into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack frames
that are smaller than 2 kilobytes most of the time on x86_64. An earlier
version of this patch also prevented combining KASAN_EXTRA with
KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.
All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y and
CONFIG_KASAN_EXTRA=n have been merged by maintainers now, so we can bring
back that default now. KASAN_EXTRA=y still causes lots of warnings but
now defaults to !COMPILE_TEST to disable it in allmodconfig, and it
remains disabled in all other defconfigs since it is a new option. I
arbitrarily raise the warning limit for KASAN_EXTRA to 3072 to reduce the
noise, but an allmodconfig kernel still has around 50 warnings on gcc-7.
I experimented a bit more with smaller stack frames and have another
follow-up series that reduces the warning limit for 64-bit architectures
to 1280 bytes (without CONFIG_KASAN).
With earlier versions of this patch series, I also had patches to address
the warnings we get with KASAN and/or KASAN_EXTRA, using a
"noinline_if_stackbloat" annotation. That annotation now got replaced
with a gcc-8 bugfix (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715) and a workaround for
older compilers, which means that KASAN_EXTRA is now just as bad as before
and will lead to an instant stack overflow in a few extreme cases.
This reverts parts of commit commit 3f181b4 ("lib/Kconfig.debug: disable
-Wframe-larger-than warnings with KASAN=y"). Two patches in linux-next
should be merged first to avoid introducing warnings in an allmodconfig
build:
3cd890dbe2a4 ("media: dvb-frontends: fix i2c access helpers for KASAN")
16c3ada89cff ("media: r820t: fix r820t_write_reg for KASAN")
Do we really need to backport this?
I think we do: without this patch, enabling KASAN will lead to
unavoidable kernel stack overflow in certain device drivers when built
with gcc-7 or higher on linux-4.10+ or any version that contains a
backport of commit c5caf21ab0cf8. Most people are probably still on
older compilers, but it will get worse over time as they upgrade their
distros.
The warnings we get on kernels older than this should all be for code
that uses dangerously large stack frames, though most of them do not
cause an actual stack overflow by themselves.The asan-stack option was
added in linux-4.0, and commit 3f181b4d8652 ("lib/Kconfig.debug:
disable -Wframe-larger-than warnings with KASAN=y") effectively turned
off the warning for allmodconfig kernels, so I would like to see this
fix backported to any kernels later than 4.0.
I have done dozens of fixes for individual functions with stack frames
larger than 2048 bytes with asan-stack, and I plan to make sure that
all those fixes make it into the stable kernels as well (most are
already there).
Part of the complication here is that asan-stack (from 4.0) was
originally assumed to always require much larger stacks, but that
turned out to be a combination of multiple gcc bugs that we have now
worked around and fixed, but sanitize-address-use-after-scope (from
v4.10) has a much higher inherent stack usage and also suffers from at
least three other problems that we have analyzed but not yet fixed
upstream, each of them makes the stack usage more severe than it should
be.
Link: http://lkml.kernel.org/r/20171221134744.2295529-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Acked-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/Kconfig.debug | 2 +-
lib/Kconfig.kasan | 11 +++++++++++
scripts/Makefile.kasan | 2 ++
3 files changed, 14 insertions(+), 1 deletion(-)
diff -puN lib/Kconfig.debug~kasan-rework-kconfig-settings lib/Kconfig.debug
--- a/lib/Kconfig.debug~kasan-rework-kconfig-settings
+++ a/lib/Kconfig.debug
@@ -217,7 +217,7 @@ config ENABLE_MUST_CHECK
config FRAME_WARN
int "Warn for stack frames larger than (needs gcc 4.4)"
range 0 8192
- default 0 if KASAN
+ default 3072 if KASAN_EXTRA
default 2048 if GCC_PLUGIN_LATENT_ENTROPY
default 1280 if (!64BIT && PARISC)
default 1024 if (!64BIT && !PARISC)
diff -puN lib/Kconfig.kasan~kasan-rework-kconfig-settings lib/Kconfig.kasan
--- a/lib/Kconfig.kasan~kasan-rework-kconfig-settings
+++ a/lib/Kconfig.kasan
@@ -20,6 +20,17 @@ config KASAN
Currently CONFIG_KASAN doesn't work with CONFIG_DEBUG_SLAB
(the resulting kernel does not boot).
+config KASAN_EXTRA
+ bool "KAsan: extra checks"
+ depends on KASAN && DEBUG_KERNEL && !COMPILE_TEST
+ help
+ This enables further checks in the kernel address sanitizer, for now
+ it only includes the address-use-after-scope check that can lead
+ to excessive kernel stack usage, frame size warnings and longer
+ compile time.
+ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 has more
+
+
choice
prompt "Instrumentation type"
depends on KASAN
diff -puN scripts/Makefile.kasan~kasan-rework-kconfig-settings scripts/Makefile.kasan
--- a/scripts/Makefile.kasan~kasan-rework-kconfig-settings
+++ a/scripts/Makefile.kasan
@@ -38,7 +38,9 @@ else
endif
+ifdef CONFIG_KASAN_EXTRA
CFLAGS_KASAN += $(call cc-option, -fsanitize-address-use-after-scope)
+endif
CFLAGS_KASAN_NOSANITIZE := -fno-builtin
_
From: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Subject: kernel/async.c: revert "async: simplify lowest_in_progress()"
This reverts 92266d6ef60c2381 ("async: simplify lowest_in_progress()"),
which was simply wrong: In the case where domain is NULL, we now use the
wrong offsetof() in the list_first_entry macro, so we don't actually fetch
the ->cookie value, but rather the eight bytes located sizeof(struct
list_head) further into the struct async_entry.
On 64 bit, that's the data member, while on 32 bit, that's a u64 built
from func and data in some order.
I think the bug happens to be harmless in practice: It obviously only
affects callers which pass a NULL domain, and AFAICT the only such caller
is
async_synchronize_full() ->
async_synchronize_full_domain(NULL) ->
async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, NULL)
and the ASYNC_COOKIE_MAX means that in practice we end up waiting for the
async_global_pending list to be empty - but it would break if somebody
happened to pass (void*)-1 as the data element to async_schedule, and of
course also if somebody ever does a async_synchronize_cookie_domain(,
NULL) with a "finite" cookie value.
Maybe the "harmless in practice" means this isn't -stable material. But
I'm not completely confident my quick git grep'ing is enough, and there
might be affected code in one of the earlier kernels that has since been
removed, so I'll leave the decision to the stable guys.
Link: http://lkml.kernel.org/r/20171128104938.3921-1-linux@rasmusvillemoes.dk
Fixes: 92266d6ef60c "async: simplify lowest_in_progress()"
Signed-off-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Acked-by: Tejun Heo <tj(a)kernel.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Adam Wallis <awallis(a)codeaurora.org>
Cc: Lai Jiangshan <laijs(a)cn.fujitsu.com>
Cc: <stable(a)vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/async.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff -puN kernel/async.c~revert-async-simplify-lowest_in_progress kernel/async.c
--- a/kernel/async.c~revert-async-simplify-lowest_in_progress
+++ a/kernel/async.c
@@ -84,20 +84,24 @@ static atomic_t entry_count;
static async_cookie_t lowest_in_progress(struct async_domain *domain)
{
- struct list_head *pending;
+ struct async_entry *first = NULL;
async_cookie_t ret = ASYNC_COOKIE_MAX;
unsigned long flags;
spin_lock_irqsave(&async_lock, flags);
- if (domain)
- pending = &domain->pending;
- else
- pending = &async_global_pending;
+ if (domain) {
+ if (!list_empty(&domain->pending))
+ first = list_first_entry(&domain->pending,
+ struct async_entry, domain_list);
+ } else {
+ if (!list_empty(&async_global_pending))
+ first = list_first_entry(&async_global_pending,
+ struct async_entry, global_list);
+ }
- if (!list_empty(pending))
- ret = list_first_entry(pending, struct async_entry,
- domain_list)->cookie;
+ if (first)
+ ret = first->cookie;
spin_unlock_irqrestore(&async_lock, flags);
return ret;
_
From: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Subject: fs/proc/kcore.c: use probe_kernel_read() instead of memcpy()
df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data") added a
bounce buffer to avoid hardened usercopy checks. Copying to the bounce
buffer was implemented with a simple memcpy() assuming that it is always
valid to read from kernel memory iff the kern_addr_valid() check passed.
A simple, but pointless, test case like "dd if=/proc/kcore of=/dev/null"
now can easily crash the kernel, since the former execption handling on
invalid kernel addresses now doesn't work anymore.
Also adding a kern_addr_valid() implementation wouldn't help here. Most
architectures simply return 1 here, while a couple implemented a page
table walk to figure out if something is mapped at the address in
question.
With DEBUG_PAGEALLOC active mappings are established and removed all the
time, so that relying on the result of kern_addr_valid() before executing
the memcpy() also doesn't work.
Therefore simply use probe_kernel_read() to copy to the bounce buffer.
This also allows to simplify read_kcore().
At least on s390 this fixes the observed crashes and doesn't introduce
warnings that were removed with df04abfd181a ("fs/proc/kcore.c: Add bounce
buffer for ktext data"), even though the generic probe_kernel_read()
implementation uses uaccess functions.
While looking into this I'm also wondering if kern_addr_valid() could be
completely removed...(?)
Link: http://lkml.kernel.org/r/20171202132739.99971-1-heiko.carstens@de.ibm.com
Fixes: df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Al Viro <viro(a)ZenIV.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/kcore.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff -puN fs/proc/kcore.c~fs-proc-kcorec-use-probe_kernel_read-instead-of-memcpy fs/proc/kcore.c
--- a/fs/proc/kcore.c~fs-proc-kcorec-use-probe_kernel_read-instead-of-memcpy
+++ a/fs/proc/kcore.c
@@ -512,23 +512,15 @@ read_kcore(struct file *file, char __use
return -EFAULT;
} else {
if (kern_addr_valid(start)) {
- unsigned long n;
-
/*
* Using bounce buffer to bypass the
* hardened user copy kernel text checks.
*/
- memcpy(buf, (char *) start, tsz);
- n = copy_to_user(buffer, buf, tsz);
- /*
- * We cannot distinguish between fault on source
- * and fault on destination. When this happens
- * we clear too and hope it will trigger the
- * EFAULT again.
- */
- if (n) {
- if (clear_user(buffer + tsz - n,
- n))
+ if (probe_kernel_read(buf, (void *) start, tsz)) {
+ if (clear_user(buffer, tsz))
+ return -EFAULT;
+ } else {
+ if (copy_to_user(buffer, buf, tsz))
return -EFAULT;
}
} else {
_
From: Andrey Konovalov <andreyknvl(a)google.com>
Subject: kasan: don't emit builtin calls when sanitization is off
With KASAN enabled the kernel has two different memset() functions, one
with KASAN checks (memset) and one without (__memset). KASAN uses some
macro tricks to use the proper version where required. For example
memset() calls in mm/slub.c are without KASAN checks, since they operate
on poisoned slab object metadata.
The issue is that clang emits memset() calls even when there is no
memset() in the source code. They get linked with improper memset()
implementation and the kernel fails to boot due to a huge amount of KASAN
reports during early boot stages.
The solution is to add -fno-builtin flag for files with KASAN_SANITIZE :=
n marker.
Link: http://lkml.kernel.org/r/8ffecfffe04088c52c42b92739c2bd8a0bcb3f5e.151638459…
Signed-off-by: Andrey Konovalov <andreyknvl(a)google.com>
Acked-by: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Cc: Michal Marek <michal.lkml(a)markovi.net>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
Makefile | 3 ++-
scripts/Makefile.kasan | 3 +++
scripts/Makefile.lib | 2 +-
3 files changed, 6 insertions(+), 2 deletions(-)
diff -puN Makefile~kasan-dont-emit-builtin-calls-when-sanitization-is-off Makefile
--- a/Makefile~kasan-dont-emit-builtin-calls-when-sanitization-is-off
+++ a/Makefile
@@ -434,7 +434,8 @@ export MAKE LEX YACC AWK GENKSYMS INSTAL
export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS
-export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_KASAN CFLAGS_UBSAN
+export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE
+export CFLAGS_KASAN CFLAGS_KASAN_NOSANITIZE CFLAGS_UBSAN
export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
diff -puN scripts/Makefile.kasan~kasan-dont-emit-builtin-calls-when-sanitization-is-off scripts/Makefile.kasan
--- a/scripts/Makefile.kasan~kasan-dont-emit-builtin-calls-when-sanitization-is-off
+++ a/scripts/Makefile.kasan
@@ -31,4 +31,7 @@ else
endif
CFLAGS_KASAN += $(call cc-option, -fsanitize-address-use-after-scope)
+
+CFLAGS_KASAN_NOSANITIZE := -fno-builtin
+
endif
diff -puN scripts/Makefile.lib~kasan-dont-emit-builtin-calls-when-sanitization-is-off scripts/Makefile.lib
--- a/scripts/Makefile.lib~kasan-dont-emit-builtin-calls-when-sanitization-is-off
+++ a/scripts/Makefile.lib
@@ -121,7 +121,7 @@ endif
ifeq ($(CONFIG_KASAN),y)
_c_flags += $(if $(patsubst n%,, \
$(KASAN_SANITIZE_$(basetarget).o)$(KASAN_SANITIZE)y), \
- $(CFLAGS_KASAN))
+ $(CFLAGS_KASAN), $(CFLAGS_KASAN_NOSANITIZE))
endif
ifeq ($(CONFIG_UBSAN),y)
_
This is a note to let you know that I've just added the patch titled
x86/asm: Fix inline asm call constraints for GCC 4.4
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 520a13c530aeb5f63e011d668c42db1af19ed349 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
Date: Thu, 28 Sep 2017 16:58:26 -0500
Subject: x86/asm: Fix inline asm call constraints for GCC 4.4
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
commit 520a13c530aeb5f63e011d668c42db1af19ed349 upstream.
The kernel test bot (run by Xiaolong Ye) reported that the following commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
is causing double faults in a kernel compiled with GCC 4.4.
Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:
register unsigned int __asm_call_sp asm("esp");
#define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)
Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC
to produce the following bogus code:
ffffffff8147461d: 89 e0 mov %esp,%eax
ffffffff8147461f: 4c 89 f7 mov %r14,%rdi
ffffffff81474622: 4c 89 fe mov %r15,%rsi
ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx
ffffffff8147462a: 89 c4 mov %eax,%esp
ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled>
Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer. The upper 32 bits
are getting cleared out, corrupting the stack pointer.
So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.
This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.
Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye(a)intel.com>
Diagnosed-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: LKP <lkp(a)01.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/asm.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -11,10 +11,12 @@
# define __ASM_FORM_COMMA(x) " " #x ","
#endif
-#ifdef CONFIG_X86_32
+#ifndef __x86_64__
+/* 32 bit */
# define __ASM_SEL(a,b) __ASM_FORM(a)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
#else
+/* 64 bit */
# define __ASM_SEL(a,b) __ASM_FORM(b)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(b)
#endif
Patches currently in stable-queue which might be from jpoimboe(a)redhat.com are
queue-4.9/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
This is a note to let you know that I've just added the patch titled
x86/asm: Fix inline asm call constraints for GCC 4.4
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 520a13c530aeb5f63e011d668c42db1af19ed349 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
Date: Thu, 28 Sep 2017 16:58:26 -0500
Subject: x86/asm: Fix inline asm call constraints for GCC 4.4
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
commit 520a13c530aeb5f63e011d668c42db1af19ed349 upstream.
The kernel test bot (run by Xiaolong Ye) reported that the following commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
is causing double faults in a kernel compiled with GCC 4.4.
Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:
register unsigned int __asm_call_sp asm("esp");
#define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)
Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC
to produce the following bogus code:
ffffffff8147461d: 89 e0 mov %esp,%eax
ffffffff8147461f: 4c 89 f7 mov %r14,%rdi
ffffffff81474622: 4c 89 fe mov %r15,%rsi
ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx
ffffffff8147462a: 89 c4 mov %eax,%esp
ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled>
Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer. The upper 32 bits
are getting cleared out, corrupting the stack pointer.
So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.
This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.
Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye(a)intel.com>
Diagnosed-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: LKP <lkp(a)01.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/asm.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -11,10 +11,12 @@
# define __ASM_FORM_COMMA(x) " " #x ","
#endif
-#ifdef CONFIG_X86_32
+#ifndef __x86_64__
+/* 32 bit */
# define __ASM_SEL(a,b) __ASM_FORM(a)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
#else
+/* 64 bit */
# define __ASM_SEL(a,b) __ASM_FORM(b)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(b)
#endif
Patches currently in stable-queue which might be from jpoimboe(a)redhat.com are
queue-4.4/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
From: Eric Biggers <ebiggers(a)google.com>
commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: linux-stable <stable(a)vger.kernel.org> # 4.9.y
Cc: Mimi Zohar <zohar(a)linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: James Morris <james.l.morris(a)oracle.com>
Signed-off-by: Jin Qian <jinqian(a)google.com>
Signed-off-by: Jin Qian <jinqian(a)android.com>
---
security/keys/encrypted-keys/encrypted.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/security/keys/encrypted-keys/encrypted.c b/security/keys/encrypted-keys/encrypted.c
index a871159bf03c..ead2fd60244d 100644
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -141,23 +141,22 @@ static int valid_ecryptfs_desc(const char *ecryptfs_desc)
*/
static int valid_master_desc(const char *new_desc, const char *orig_desc)
{
- if (!memcmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_TRUSTED_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_TRUSTED_PREFIX_LEN))
- goto out;
- } else if (!memcmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_USER_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_USER_PREFIX_LEN))
- goto out;
- } else
- goto out;
+ int prefix_len;
+
+ if (!strncmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN))
+ prefix_len = KEY_TRUSTED_PREFIX_LEN;
+ else if (!strncmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN))
+ prefix_len = KEY_USER_PREFIX_LEN;
+ else
+ return -EINVAL;
+
+ if (!new_desc[prefix_len])
+ return -EINVAL;
+
+ if (orig_desc && strncmp(new_desc, orig_desc, prefix_len))
+ return -EINVAL;
+
return 0;
-out:
- return -EINVAL;
}
/*
--
2.16.0.rc1.238.g530d649a79-goog
From: Eric Biggers <ebiggers(a)google.com>
commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: linux-stable <stable(a)vger.kernel.org> # 4.4.y
Cc: Mimi Zohar <zohar(a)linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: James Morris <james.l.morris(a)oracle.com>
Signed-off-by: Jin Qian <jinqian(a)google.com>
Signed-off-by: Jin Qian <jinqian(a)android.com>
---
security/keys/encrypted-keys/encrypted.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/security/keys/encrypted-keys/encrypted.c b/security/keys/encrypted-keys/encrypted.c
index ce295c0c1da0..e44e844c8ec4 100644
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -141,23 +141,22 @@ static int valid_ecryptfs_desc(const char *ecryptfs_desc)
*/
static int valid_master_desc(const char *new_desc, const char *orig_desc)
{
- if (!memcmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_TRUSTED_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_TRUSTED_PREFIX_LEN))
- goto out;
- } else if (!memcmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_USER_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_USER_PREFIX_LEN))
- goto out;
- } else
- goto out;
+ int prefix_len;
+
+ if (!strncmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN))
+ prefix_len = KEY_TRUSTED_PREFIX_LEN;
+ else if (!strncmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN))
+ prefix_len = KEY_USER_PREFIX_LEN;
+ else
+ return -EINVAL;
+
+ if (!new_desc[prefix_len])
+ return -EINVAL;
+
+ if (orig_desc && strncmp(new_desc, orig_desc, prefix_len))
+ return -EINVAL;
+
return 0;
-out:
- return -EINVAL;
}
/*
--
2.16.0.rc1.238.g530d649a79-goog
From: Eric Biggers <ebiggers(a)google.com>
commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: linux-stable <stable(a)vger.kernel.org> # 3.18.y
Cc: Mimi Zohar <zohar(a)linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: James Morris <james.l.morris(a)oracle.com>
Signed-off-by: Jin Qian <jinqian(a)google.com>
---
security/keys/encrypted-keys/encrypted.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/security/keys/encrypted-keys/encrypted.c b/security/keys/encrypted-keys/encrypted.c
index 89d5695c51cd..20251ee5c491 100644
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -141,23 +141,22 @@ static int valid_ecryptfs_desc(const char *ecryptfs_desc)
*/
static int valid_master_desc(const char *new_desc, const char *orig_desc)
{
- if (!memcmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_TRUSTED_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_TRUSTED_PREFIX_LEN))
- goto out;
- } else if (!memcmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_USER_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_USER_PREFIX_LEN))
- goto out;
- } else
- goto out;
+ int prefix_len;
+
+ if (!strncmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN))
+ prefix_len = KEY_TRUSTED_PREFIX_LEN;
+ else if (!strncmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN))
+ prefix_len = KEY_USER_PREFIX_LEN;
+ else
+ return -EINVAL;
+
+ if (!new_desc[prefix_len])
+ return -EINVAL;
+
+ if (orig_desc && strncmp(new_desc, orig_desc, prefix_len))
+ return -EINVAL;
+
return 0;
-out:
- return -EINVAL;
}
/*
--
2.16.0.rc1.238.g530d649a79-goog
This is a note to let you know that I've just added the patch titled
vhost_net: stop device during reset owner
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vhost_net-stop-device-during-reset-owner.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Jason Wang <jasowang(a)redhat.com>
Date: Thu, 25 Jan 2018 22:03:52 +0800
Subject: vhost_net: stop device during reset owner
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 4cd879515d686849eec5f718aeac62a70b067d82 ]
We don't stop device before reset owner, this means we could try to
serve any virtqueue kick before reset dev->worker. This will result a
warn since the work was pending at llist during owner resetting. Fix
this by stopping device during owner reset.
Reported-by: syzbot+eb17c6162478cc50632c(a)syzkaller.appspotmail.com
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/vhost/net.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1078,6 +1078,7 @@ static long vhost_net_reset_owner(struct
}
vhost_net_stop(n, &tx_sock, &rx_sock);
vhost_net_flush(n);
+ vhost_dev_stop(&n->dev);
vhost_dev_reset_owner(&n->dev, umem);
vhost_net_vq_reset(n);
done:
Patches currently in stable-queue which might be from jasowang(a)redhat.com are
queue-4.9/vhost_net-stop-device-during-reset-owner.patch
This is a note to let you know that I've just added the patch titled
tcp_bbr: fix pacing_gain to always be unity when using lt_bw
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp_bbr-fix-pacing_gain-to-always-be-unity-when-using-lt_bw.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Neal Cardwell <ncardwell(a)google.com>
Date: Wed, 31 Jan 2018 15:43:05 -0500
Subject: tcp_bbr: fix pacing_gain to always be unity when using lt_bw
From: Neal Cardwell <ncardwell(a)google.com>
[ Upstream commit 3aff3b4b986e51bcf4ab249e5d48d39596e0df6a ]
This commit fixes the pacing_gain to remain at BBR_UNIT (1.0) when
using lt_bw and returning from the PROBE_RTT state to PROBE_BW.
Previously, when using lt_bw, upon exiting PROBE_RTT and entering
PROBE_BW the bbr_reset_probe_bw_mode() code could sometimes randomly
end up with a cycle_idx of 0 and hence have bbr_advance_cycle_phase()
set a pacing gain above 1.0. In such cases this would result in a
pacing rate that is 1.25x higher than intended, potentially resulting
in a high loss rate for a little while until we stop using the lt_bw a
bit later.
This commit is a stable candidate for kernels back as far as 4.9.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell(a)google.com>
Signed-off-by: Yuchung Cheng <ycheng(a)google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Reported-by: Beyers Cronje <bcronje(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_bbr.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -452,7 +452,8 @@ static void bbr_advance_cycle_phase(stru
bbr->cycle_idx = (bbr->cycle_idx + 1) & (CYCLE_LEN - 1);
bbr->cycle_mstamp = tp->delivered_mstamp;
- bbr->pacing_gain = bbr_pacing_gain[bbr->cycle_idx];
+ bbr->pacing_gain = bbr->lt_use_bw ? BBR_UNIT :
+ bbr_pacing_gain[bbr->cycle_idx];
}
/* Gain cycling: cycle pacing gain to converge to fair share of available bw. */
@@ -461,8 +462,7 @@ static void bbr_update_cycle_phase(struc
{
struct bbr *bbr = inet_csk_ca(sk);
- if ((bbr->mode == BBR_PROBE_BW) && !bbr->lt_use_bw &&
- bbr_is_next_cycle_phase(sk, rs))
+ if (bbr->mode == BBR_PROBE_BW && bbr_is_next_cycle_phase(sk, rs))
bbr_advance_cycle_phase(sk);
}
Patches currently in stable-queue which might be from ncardwell(a)google.com are
queue-4.9/tcp_bbr-fix-pacing_gain-to-always-be-unity-when-using-lt_bw.patch
This is a note to let you know that I've just added the patch titled
tcp: release sk_frag.page in tcp_disconnect
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tcp-release-sk_frag.page-in-tcp_disconnect.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Li RongQing <lirongqing(a)baidu.com>
Date: Fri, 26 Jan 2018 16:40:41 +0800
Subject: tcp: release sk_frag.page in tcp_disconnect
From: Li RongQing <lirongqing(a)baidu.com>
[ Upstream commit 9b42d55a66d388e4dd5550107df051a9637564fc ]
socket can be disconnected and gets transformed back to a listening
socket, if sk_frag.page is not released, which will be cloned into
a new socket by sk_clone_lock, but the reference count of this page
is increased, lead to a use after free or double free issue
Signed-off-by: Li RongQing <lirongqing(a)baidu.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2316,6 +2316,12 @@ int tcp_disconnect(struct sock *sk, int
WARN_ON(inet->inet_num && !icsk->icsk_bind_hash);
+ if (sk->sk_frag.page) {
+ put_page(sk->sk_frag.page);
+ sk->sk_frag.page = NULL;
+ sk->sk_frag.offset = 0;
+ }
+
sk->sk_error_report(sk);
return err;
}
Patches currently in stable-queue which might be from lirongqing(a)baidu.com are
queue-4.9/tcp-release-sk_frag.page-in-tcp_disconnect.patch
This is a note to let you know that I've just added the patch titled
r8169: fix RTL8168EP take too long to complete driver initialization.
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Chunhao Lin <hau(a)realtek.com>
Date: Wed, 31 Jan 2018 01:32:36 +0800
Subject: r8169: fix RTL8168EP take too long to complete driver initialization.
From: Chunhao Lin <hau(a)realtek.com>
[ Upstream commit 086ca23d03c0d2f4088f472386778d293e15c5f6 ]
Driver check the wrong register bit in rtl_ocp_tx_cond() that keep driver
waiting until timeout.
Fix this by waiting for the right register bit.
Signed-off-by: Chunhao Lin <hau(a)realtek.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/realtek/r8169.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1387,7 +1387,7 @@ DECLARE_RTL_COND(rtl_ocp_tx_cond)
{
void __iomem *ioaddr = tp->mmio_addr;
- return RTL_R8(IBISR0) & 0x02;
+ return RTL_R8(IBISR0) & 0x20;
}
static void rtl8168ep_stop_cmac(struct rtl8169_private *tp)
@@ -1395,7 +1395,7 @@ static void rtl8168ep_stop_cmac(struct r
void __iomem *ioaddr = tp->mmio_addr;
RTL_W8(IBCR2, RTL_R8(IBCR2) & ~0x01);
- rtl_msleep_loop_wait_low(tp, &rtl_ocp_tx_cond, 50, 2000);
+ rtl_msleep_loop_wait_high(tp, &rtl_ocp_tx_cond, 50, 2000);
RTL_W8(IBISR0, RTL_R8(IBISR0) | 0x20);
RTL_W8(IBCR0, RTL_R8(IBCR0) & ~0x01);
}
Patches currently in stable-queue which might be from hau(a)realtek.com are
queue-4.9/r8169-fix-rtl8168ep-take-too-long-to-complete-driver-initialization.patch
This is a note to let you know that I've just added the patch titled
qmi_wwan: Add support for Quectel EP06
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
qmi_wwan-add-support-for-quectel-ep06.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Kristian Evensen <kristian.evensen(a)gmail.com>
Date: Tue, 30 Jan 2018 14:12:55 +0100
Subject: qmi_wwan: Add support for Quectel EP06
From: Kristian Evensen <kristian.evensen(a)gmail.com>
[ Upstream commit c0b91a56a2e57a5a370655b25d677ae0ebf8a2d0 ]
The Quectel EP06 is a Cat. 6 LTE modem. It uses the same interface as
the EC20/EC25 for QMI, and requires the same "set DTR"-quirk to work.
Signed-off-by: Kristian Evensen <kristian.evensen(a)gmail.com>
Acked-by: Bjørn Mork <bjorn(a)mork.no>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/qmi_wwan.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -944,6 +944,7 @@ static const struct usb_device_id produc
{QMI_QUIRK_SET_DTR(0x2c7c, 0x0125, 4)}, /* Quectel EC25, EC20 R2.0 Mini PCIe */
{QMI_QUIRK_SET_DTR(0x2c7c, 0x0121, 4)}, /* Quectel EC21 Mini PCIe */
{QMI_FIXED_INTF(0x2c7c, 0x0296, 4)}, /* Quectel BG96 */
+ {QMI_QUIRK_SET_DTR(0x2c7c, 0x0306, 4)}, /* Quectel EP06 Mini PCIe */
/* 4. Gobi 1000 devices */
{QMI_GOBI1K_DEVICE(0x05c6, 0x9212)}, /* Acer Gobi Modem Device */
Patches currently in stable-queue which might be from kristian.evensen(a)gmail.com are
queue-4.9/qmi_wwan-add-support-for-quectel-ep06.patch
This is a note to let you know that I've just added the patch titled
soreuseport: fix mem leak in reuseport_add_sock()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Fri, 2 Feb 2018 10:27:27 -0800
Subject: soreuseport: fix mem leak in reuseport_add_sock()
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 4db428a7c9ab07e08783e0fcdc4ca0f555da0567 ]
reuseport_add_sock() needs to deal with attaching a socket having
its own sk_reuseport_cb, after a prior
setsockopt(SO_ATTACH_REUSEPORT_?BPF)
Without this fix, not only a WARN_ONCE() was issued, but we were also
leaking memory.
Thanks to sysbot and Eric Biggers for providing us nice C repros.
------------[ cut here ]------------
socket already in reuseport group
WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119
reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x211/0x2d0 lib/bug.c:184
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079
Fixes: ef456144da8e ("soreuseport: define reuseport groups")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot+c0ea2226f77a42936bf7(a)syzkaller.appspotmail.com
Acked-by: Craig Gallek <kraig(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/sock_reuseport.c | 35 ++++++++++++++++++++---------------
1 file changed, 20 insertions(+), 15 deletions(-)
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -93,6 +93,16 @@ static struct sock_reuseport *reuseport_
return more_reuse;
}
+static void reuseport_free_rcu(struct rcu_head *head)
+{
+ struct sock_reuseport *reuse;
+
+ reuse = container_of(head, struct sock_reuseport, rcu);
+ if (reuse->prog)
+ bpf_prog_destroy(reuse->prog);
+ kfree(reuse);
+}
+
/**
* reuseport_add_sock - Add a socket to the reuseport group of another.
* @sk: New socket to add to the group.
@@ -101,7 +111,7 @@ static struct sock_reuseport *reuseport_
*/
int reuseport_add_sock(struct sock *sk, struct sock *sk2)
{
- struct sock_reuseport *reuse;
+ struct sock_reuseport *old_reuse, *reuse;
if (!rcu_access_pointer(sk2->sk_reuseport_cb)) {
int err = reuseport_alloc(sk2);
@@ -112,10 +122,13 @@ int reuseport_add_sock(struct sock *sk,
spin_lock_bh(&reuseport_lock);
reuse = rcu_dereference_protected(sk2->sk_reuseport_cb,
- lockdep_is_held(&reuseport_lock)),
- WARN_ONCE(rcu_dereference_protected(sk->sk_reuseport_cb,
- lockdep_is_held(&reuseport_lock)),
- "socket already in reuseport group");
+ lockdep_is_held(&reuseport_lock));
+ old_reuse = rcu_dereference_protected(sk->sk_reuseport_cb,
+ lockdep_is_held(&reuseport_lock));
+ if (old_reuse && old_reuse->num_socks != 1) {
+ spin_unlock_bh(&reuseport_lock);
+ return -EBUSY;
+ }
if (reuse->num_socks == reuse->max_socks) {
reuse = reuseport_grow(reuse);
@@ -133,19 +146,11 @@ int reuseport_add_sock(struct sock *sk,
spin_unlock_bh(&reuseport_lock);
+ if (old_reuse)
+ call_rcu(&old_reuse->rcu, reuseport_free_rcu);
return 0;
}
-static void reuseport_free_rcu(struct rcu_head *head)
-{
- struct sock_reuseport *reuse;
-
- reuse = container_of(head, struct sock_reuseport, rcu);
- if (reuse->prog)
- bpf_prog_destroy(reuse->prog);
- kfree(reuse);
-}
-
void reuseport_detach_sock(struct sock *sk)
{
struct sock_reuseport *reuse;
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.9/tcp-release-sk_frag.page-in-tcp_disconnect.patch
queue-4.9/net-igmp-add-a-missing-rcu-locking-section.patch
queue-4.9/soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
This is a note to let you know that I've just added the patch titled
net: igmp: add a missing rcu locking section
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-igmp-add-a-missing-rcu-locking-section.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 1 Feb 2018 10:26:57 -0800
Subject: net: igmp: add a missing rcu locking section
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit e7aadb27a5415e8125834b84a74477bfbee4eff5 ]
Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
Timer callbacks do not ensure this locking.
=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/igmp.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -386,7 +386,11 @@ static struct sk_buff *igmpv3_newpack(st
pip->frag_off = htons(IP_DF);
pip->ttl = 1;
pip->daddr = fl4.daddr;
+
+ rcu_read_lock();
pip->saddr = igmpv3_get_srcaddr(dev, &fl4);
+ rcu_read_unlock();
+
pip->protocol = IPPROTO_IGMP;
pip->tot_len = 0; /* filled in later */
ip_select_ident(net, skb, NULL);
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.9/tcp-release-sk_frag.page-in-tcp_disconnect.patch
queue-4.9/net-igmp-add-a-missing-rcu-locking-section.patch
queue-4.9/soreuseport-fix-mem-leak-in-reuseport_add_sock.patch
This is a note to let you know that I've just added the patch titled
ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-fix-so_reuseport-udp-socket-with-implicit-sk_ipv6only.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Martin KaFai Lau <kafai(a)fb.com>
Date: Wed, 24 Jan 2018 23:15:27 -0800
Subject: ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
From: Martin KaFai Lau <kafai(a)fb.com>
[ Upstream commit 7ece54a60ee2ba7a386308cae73c790bd580589c ]
If a sk_v6_rcv_saddr is !IPV6_ADDR_ANY and !IPV6_ADDR_MAPPED, it
implicitly implies it is an ipv6only socket. However, in inet6_bind(),
this addr_type checking and setting sk->sk_ipv6only to 1 are only done
after sk->sk_prot->get_port(sk, snum) has been completed successfully.
This inconsistency between sk_v6_rcv_saddr and sk_ipv6only confuses
the 'get_port()'.
In particular, when binding SO_REUSEPORT UDP sockets,
udp_reuseport_add_sock(sk,...) is called. udp_reuseport_add_sock()
checks "ipv6_only_sock(sk2) == ipv6_only_sock(sk)" before adding sk to
sk2->sk_reuseport_cb. In this case, ipv6_only_sock(sk2) could be
1 while ipv6_only_sock(sk) is still 0 here. The end result is,
reuseport_alloc(sk) is called instead of adding sk to the existing
sk2->sk_reuseport_cb.
It can be reproduced by binding two SO_REUSEPORT UDP sockets on an
IPv6 address (!ANY and !MAPPED). Only one of the socket will
receive packet.
The fix is to set the implicit sk_ipv6only before calling get_port().
The original sk_ipv6only has to be saved such that it can be restored
in case get_port() failed. The situation is similar to the
inet_reset_saddr(sk) after get_port() has failed.
Thanks to Calvin Owens <calvinowens(a)fb.com> who created an easy
reproduction which leads to a fix.
Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
Signed-off-by: Martin KaFai Lau <kafai(a)fb.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/af_inet6.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -274,6 +274,7 @@ int inet6_bind(struct socket *sock, stru
struct net *net = sock_net(sk);
__be32 v4addr = 0;
unsigned short snum;
+ bool saved_ipv6only;
int addr_type = 0;
int err = 0;
@@ -378,19 +379,21 @@ int inet6_bind(struct socket *sock, stru
if (!(addr_type & IPV6_ADDR_MULTICAST))
np->saddr = addr->sin6_addr;
+ saved_ipv6only = sk->sk_ipv6only;
+ if (addr_type != IPV6_ADDR_ANY && addr_type != IPV6_ADDR_MAPPED)
+ sk->sk_ipv6only = 1;
+
/* Make sure we are allowed to bind here. */
if ((snum || !inet->bind_address_no_port) &&
sk->sk_prot->get_port(sk, snum)) {
+ sk->sk_ipv6only = saved_ipv6only;
inet_reset_saddr(sk);
err = -EADDRINUSE;
goto out;
}
- if (addr_type != IPV6_ADDR_ANY) {
+ if (addr_type != IPV6_ADDR_ANY)
sk->sk_userlocks |= SOCK_BINDADDR_LOCK;
- if (addr_type != IPV6_ADDR_MAPPED)
- sk->sk_ipv6only = 1;
- }
if (snum)
sk->sk_userlocks |= SOCK_BINDPORT_LOCK;
inet->inet_sport = htons(inet->inet_num);
Patches currently in stable-queue which might be from kafai(a)fb.com are
queue-4.9/ipv6-fix-so_reuseport-udp-socket-with-implicit-sk_ipv6only.patch
This is a note to let you know that I've just added the patch titled
ip6mr: fix stale iterator
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ip6mr-fix-stale-iterator.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Date: Wed, 31 Jan 2018 16:29:30 +0200
Subject: ip6mr: fix stale iterator
From: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
[ Upstream commit 4adfa79fc254efb7b0eb3cd58f62c2c3f805f1ba ]
When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.
Here's syzbot's trace:
WARNING: bad unlock balance detected!
4.15.0-rc3+ #128 Not tainted
syzkaller971460/3195 is trying to release lock (mrt_lock) at:
[<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syzkaller971460/3195:
#0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
fs/seq_file.c:165
stack backtrace:
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
__lock_release kernel/locking/lockdep.c:3775 [inline]
lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
__raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
_raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
traverse+0x3bc/0xa00 fs/seq_file.c:135
seq_read+0x96a/0x13d0 fs/seq_file.c:189
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at lib/usercopy.c:25
in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
INFO: lockdep is turned off.
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
__might_sleep+0x95/0x190 kernel/sched/core.c:6013
__might_fault+0xab/0x1d0 mm/memory.c:4525
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
seq_read+0xcb4/0x13d0 fs/seq_file.c:279
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
lib/usercopy.c:26
Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669(a)syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6mr.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -495,6 +495,7 @@ static void *ipmr_mfc_seq_start(struct s
return ERR_PTR(-ENOENT);
it->mrt = mrt;
+ it->cache = NULL;
return *pos ? ipmr_mfc_seq_idx(net, seq->private, *pos - 1)
: SEQ_START_TOKEN;
}
Patches currently in stable-queue which might be from nikolay(a)cumulusnetworks.com are
queue-4.9/ip6mr-fix-stale-iterator.patch
This is a note to let you know that I've just added the patch titled
cls_u32: add missing RCU annotation.
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
cls_u32-add-missing-rcu-annotation.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Feb 6 12:42:23 PST 2018
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Fri, 2 Feb 2018 16:02:22 +0100
Subject: cls_u32: add missing RCU annotation.
From: Paolo Abeni <pabeni(a)redhat.com>
[ Upstream commit 058a6c033488494a6b1477b05fe8e1a16e344462 ]
In a couple of points of the control path, n->ht_down is currently
accessed without the required RCU annotation. The accesses are
safe, but sparse complaints. Since we already held the
rtnl lock, let use rtnl_dereference().
Fixes: a1b7c5fd7fe9 ("net: sched: add cls_u32 offload hooks for netdevs")
Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Acked-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sched/cls_u32.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -496,6 +496,7 @@ static void u32_clear_hw_hnode(struct tc
static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
u32 flags)
{
+ struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
struct net_device *dev = tp->q->dev_queue->dev;
struct tc_cls_u32_offload u32_offload = {0};
struct tc_to_netdev offload;
@@ -520,7 +521,7 @@ static int u32_replace_hw_knode(struct t
offload.cls_u32->knode.sel = &n->sel;
offload.cls_u32->knode.exts = &n->exts;
if (n->ht_down)
- offload.cls_u32->knode.link_handle = n->ht_down->handle;
+ offload.cls_u32->knode.link_handle = ht->handle;
err = dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle,
tp->protocol, &offload);
@@ -788,8 +789,9 @@ static void u32_replace_knode(struct tcf
static struct tc_u_knode *u32_init_knode(struct tcf_proto *tp,
struct tc_u_knode *n)
{
- struct tc_u_knode *new;
+ struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
struct tc_u32_sel *s = &n->sel;
+ struct tc_u_knode *new;
new = kzalloc(sizeof(*n) + s->nkeys*sizeof(struct tc_u32_key),
GFP_KERNEL);
@@ -807,11 +809,11 @@ static struct tc_u_knode *u32_init_knode
new->fshift = n->fshift;
new->res = n->res;
new->flags = n->flags;
- RCU_INIT_POINTER(new->ht_down, n->ht_down);
+ RCU_INIT_POINTER(new->ht_down, ht);
/* bump reference count as long as we hold pointer to structure */
- if (new->ht_down)
- new->ht_down->refcnt++;
+ if (ht)
+ ht->refcnt++;
#ifdef CONFIG_CLS_U32_PERF
/* Statistics may be incremented by readers during update
Patches currently in stable-queue which might be from pabeni(a)redhat.com are
queue-4.9/cls_u32-add-missing-rcu-annotation.patch
On Sun, Feb 04, 2018 at 06:17:36PM +0100, Pavel Machek wrote:
>
>> > > >> *** if brightness=0, led off
>> > > >> *** else apply brightness if next timer <--- timer is stop, and will never apply new setting
>> > > >> ** otherwise set led_set_brightness_nosleep
>> > > >>
>> > > >> To fix that, when we delete the timer, we should clear LED_BLINK_SW.
>> > > >
>> > > >Can you run the tests on the affected stable kernels? I have feeling
>> > > >that the problem described might not be present there.
>> > >
>> > > Hm, I don't seem to have HW to test that out. Maybe someone else does?
>> >
>> > Why are you submitting patches you have no way to test?
>>
>> What? This is stable tree backporting, why are you trying to make a
>> requirement for something that we have never had before?
>
>I don't think random patches should be sent to stable just because
>they appeared in mainline. Plus, I don't think I'm making new rules:
>
>submit-checklist.rst:
>
>13) Has been build- and runtime tested with and without ``CONFIG_SMP``
>and
> ``CONFIG_PREEMPT.``
>
>stable-kernel-rules.rst:
>
>Rules on what kind of patches are accepted, and which ones are not,
>into the "-stable" tree:
>
> - It must be obviously correct and tested.
> - It must fix a real bug that bothers people (not a, "This could be a
> problem..." type thing).
So you're saying that this doesn't qualify as a bug?
>> This is a backport of a patch that is already upstream. If it doesn't
>> belong in a stable tree, great, let us know that, saying why it is so.
>
>See jacek.anaszewski(a)gmail.com 's explanation.
I might be missing something, but Jacek suggested I pull another patch
before this one?
--
Thanks,
Sasha
This is a note to let you know that I've just added the patch titled
KVM/x86: Add IBPB support
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-add-ibpb-support.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 15d45071523d89b3fb7372e2135fbd72f6af9506 Mon Sep 17 00:00:00 2001
From: Ashok Raj <ashok.raj(a)intel.com>
Date: Thu, 1 Feb 2018 22:59:43 +0100
Subject: KVM/x86: Add IBPB support
From: Ashok Raj <ashok.raj(a)intel.com>
commit 15d45071523d89b3fb7372e2135fbd72f6af9506 upstream.
The Indirect Branch Predictor Barrier (IBPB) is an indirect branch
control mechanism. It keeps earlier branches from influencing
later ones.
Unlike IBRS and STIBP, IBPB does not define a new mode of operation.
It's a command that ensures predicted branch targets aren't used after
the barrier. Although IBRS and IBPB are enumerated by the same CPUID
enumeration, IBPB is very different.
IBPB helps mitigate against three potential attacks:
* Mitigate guests from being attacked by other guests.
- This is addressed by issing IBPB when we do a guest switch.
* Mitigate attacks from guest/ring3->host/ring3.
These would require a IBPB during context switch in host, or after
VMEXIT. The host process has two ways to mitigate
- Either it can be compiled with retpoline
- If its going through context switch, and has set !dumpable then
there is a IBPB in that path.
(Tim's patch: https://patchwork.kernel.org/patch/10192871)
- The case where after a VMEXIT you return back to Qemu might make
Qemu attackable from guest when Qemu isn't compiled with retpoline.
There are issues reported when doing IBPB on every VMEXIT that resulted
in some tsc calibration woes in guest.
* Mitigate guest/ring0->host/ring0 attacks.
When host kernel is using retpoline it is safe against these attacks.
If host kernel isn't using retpoline we might need to do a IBPB flush on
every VMEXIT.
Even when using retpoline for indirect calls, in certain conditions 'ret'
can use the BTB on Skylake-era CPUs. There are other mitigations
available like RSB stuffing/clearing.
* IBPB is issued only for SVM during svm_free_vcpu().
VMX has a vmclear and SVM doesn't. Follow discussion here:
https://lkml.org/lkml/2018/1/15/146
Please refer to the following spec for more details on the enumeration
and control.
Refer here to get documentation about mitigations.
https://software.intel.com/en-us/side-channel-security-support
[peterz: rebase and changelog rewrite]
[karahmed: - rebase
- vmx: expose PRED_CMD if guest has it in CPUID
- svm: only pass through IBPB if guest has it in CPUID
- vmx: support !cpu_has_vmx_msr_bitmap()]
- vmx: support nested]
[dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS)
PRED_CMD is a write-only MSR]
Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: KarimAllah Ahmed <karahmed(a)amazon.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: kvm(a)vger.kernel.org
Cc: Asit Mallick <asit.k.mallick(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven(a)intel.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: Jun Nakajima <jun.nakajima(a)intel.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.…
Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/cpuid.c | 11 ++++++-
arch/x86/kvm/cpuid.h | 12 +++++++
arch/x86/kvm/svm.c | 28 ++++++++++++++++++
arch/x86/kvm/vmx.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++--
4 files changed, 127 insertions(+), 3 deletions(-)
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -355,6 +355,10 @@ static inline int __do_cpuid_ent(struct
F(3DNOWPREFETCH) | F(OSVW) | 0 /* IBS */ | F(XOP) |
0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM);
+ /* cpuid 0x80000008.ebx */
+ const u32 kvm_cpuid_8000_0008_ebx_x86_features =
+ F(IBPB);
+
/* cpuid 0xC0000001.edx */
const u32 kvm_cpuid_C000_0001_edx_x86_features =
F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
@@ -607,7 +611,12 @@ static inline int __do_cpuid_ent(struct
if (!g_phys_as)
g_phys_as = phys_as;
entry->eax = g_phys_as | (virt_as << 8);
- entry->ebx = entry->edx = 0;
+ entry->edx = 0;
+ /* IBPB isn't necessarily present in hardware cpuid */
+ if (boot_cpu_has(X86_FEATURE_IBPB))
+ entry->ebx |= F(IBPB);
+ entry->ebx &= kvm_cpuid_8000_0008_ebx_x86_features;
+ cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX);
break;
}
case 0x80000019:
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -160,6 +160,18 @@ static inline bool guest_cpuid_has_rdtsc
return best && (best->edx & bit(X86_FEATURE_RDTSCP));
}
+static inline bool guest_cpuid_has_ibpb(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
+ if (best && (best->ebx & bit(X86_FEATURE_IBPB)))
+ return true;
+ best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ return best && (best->edx & bit(X86_FEATURE_SPEC_CTRL));
+}
+
+
/*
* NRIPS is provided through cpuidfn 0x8000000a.edx bit 3
*/
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -248,6 +248,7 @@ static const struct svm_direct_access_ms
{ .index = MSR_CSTAR, .always = true },
{ .index = MSR_SYSCALL_MASK, .always = true },
#endif
+ { .index = MSR_IA32_PRED_CMD, .always = false },
{ .index = MSR_IA32_LASTBRANCHFROMIP, .always = false },
{ .index = MSR_IA32_LASTBRANCHTOIP, .always = false },
{ .index = MSR_IA32_LASTINTFROMIP, .always = false },
@@ -510,6 +511,7 @@ struct svm_cpu_data {
struct kvm_ldttss_desc *tss_desc;
struct page *save_area;
+ struct vmcb *current_vmcb;
};
static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
@@ -1644,11 +1646,17 @@ static void svm_free_vcpu(struct kvm_vcp
__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
kvm_vcpu_uninit(vcpu);
kmem_cache_free(kvm_vcpu_cache, svm);
+ /*
+ * The vmcb page can be recycled, causing a false negative in
+ * svm_vcpu_load(). So do a full IBPB now.
+ */
+ indirect_branch_prediction_barrier();
}
static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
int i;
if (unlikely(cpu != vcpu->cpu)) {
@@ -1677,6 +1685,10 @@ static void svm_vcpu_load(struct kvm_vcp
if (static_cpu_has(X86_FEATURE_RDTSCP))
wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
+ if (sd->current_vmcb != svm->vmcb) {
+ sd->current_vmcb = svm->vmcb;
+ indirect_branch_prediction_barrier();
+ }
avic_vcpu_load(vcpu, cpu);
}
@@ -3599,6 +3611,22 @@ static int svm_set_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr);
break;
+ case MSR_IA32_PRED_CMD:
+ if (!msr->host_initiated &&
+ !guest_cpuid_has_ibpb(vcpu))
+ return 1;
+
+ if (data & ~PRED_CMD_IBPB)
+ return 1;
+
+ if (!data)
+ break;
+
+ wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
+ if (is_guest_mode(vcpu))
+ break;
+ set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, 0, 1);
+ break;
case MSR_STAR:
svm->vmcb->save.star = data;
break;
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -549,6 +549,7 @@ struct vcpu_vmx {
u64 msr_host_kernel_gs_base;
u64 msr_guest_kernel_gs_base;
#endif
+
u32 vm_entry_controls_shadow;
u32 vm_exit_controls_shadow;
/*
@@ -913,6 +914,8 @@ static void copy_vmcs12_to_shadow(struct
static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
static int alloc_identity_pagetable(struct kvm *kvm);
static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
+static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+ u32 msr, int type);
static DEFINE_PER_CPU(struct vmcs *, vmxarea);
static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -1848,6 +1851,29 @@ static void update_exception_bitmap(stru
vmcs_write32(EXCEPTION_BITMAP, eb);
}
+/*
+ * Check if MSR is intercepted for L01 MSR bitmap.
+ */
+static bool msr_write_intercepted_l01(struct kvm_vcpu *vcpu, u32 msr)
+{
+ unsigned long *msr_bitmap;
+ int f = sizeof(unsigned long);
+
+ if (!cpu_has_vmx_msr_bitmap())
+ return true;
+
+ msr_bitmap = to_vmx(vcpu)->vmcs01.msr_bitmap;
+
+ if (msr <= 0x1fff) {
+ return !!test_bit(msr, msr_bitmap + 0x800 / f);
+ } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
+ msr &= 0x1fff;
+ return !!test_bit(msr, msr_bitmap + 0xc00 / f);
+ }
+
+ return true;
+}
+
static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
unsigned long entry, unsigned long exit)
{
@@ -2257,6 +2283,7 @@ static void vmx_vcpu_load(struct kvm_vcp
if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
vmcs_load(vmx->loaded_vmcs->vmcs);
+ indirect_branch_prediction_barrier();
}
if (!already_loaded) {
@@ -3058,6 +3085,33 @@ static int vmx_set_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr_info);
break;
+ case MSR_IA32_PRED_CMD:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_ibpb(vcpu))
+ return 1;
+
+ if (data & ~PRED_CMD_IBPB)
+ return 1;
+
+ if (!data)
+ break;
+
+ wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
+
+ /*
+ * For non-nested:
+ * When it's written (to non-zero) for the first time, pass
+ * it through.
+ *
+ * For nested:
+ * The handling of the MSR bitmap for L2 guests is done in
+ * nested_vmx_merge_msr_bitmap. We should not touch the
+ * vmcs02.msr_bitmap here since it gets completely overwritten
+ * in the merging.
+ */
+ vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
+ MSR_TYPE_W);
+ break;
case MSR_IA32_CR_PAT:
if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
@@ -9437,9 +9491,23 @@ static inline bool nested_vmx_merge_msr_
struct page *page;
unsigned long *msr_bitmap_l1;
unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;
+ /*
+ * pred_cmd is trying to verify two things:
+ *
+ * 1. L0 gave a permission to L1 to actually passthrough the MSR. This
+ * ensures that we do not accidentally generate an L02 MSR bitmap
+ * from the L12 MSR bitmap that is too permissive.
+ * 2. That L1 or L2s have actually used the MSR. This avoids
+ * unnecessarily merging of the bitmap if the MSR is unused. This
+ * works properly because we only update the L01 MSR bitmap lazily.
+ * So even if L0 should pass L1 these MSRs, the L01 bitmap is only
+ * updated to reflect this when L1 (or its L2s) actually write to
+ * the MSR.
+ */
+ bool pred_cmd = msr_write_intercepted_l01(vcpu, MSR_IA32_PRED_CMD);
- /* This shortcut is ok because we support only x2APIC MSRs so far. */
- if (!nested_cpu_has_virt_x2apic_mode(vmcs12))
+ if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
+ !pred_cmd)
return false;
page = nested_get_page(vcpu, vmcs12->msr_bitmap);
@@ -9477,6 +9545,13 @@ static inline bool nested_vmx_merge_msr_
MSR_TYPE_W);
}
}
+
+ if (pred_cmd)
+ nested_vmx_disable_intercept_for_msr(
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_PRED_CMD,
+ MSR_TYPE_W);
+
kunmap(page);
nested_release_page_clean(page);
Patches currently in stable-queue which might be from ashok.raj(a)intel.com are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
This is a note to let you know that I've just added the patch titled
KVM: VMX: introduce alloc_loaded_vmcs
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-introduce-alloc_loaded_vmcs.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f21f165ef922c2146cc5bdc620f542953c41714b Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 11 Jan 2018 12:16:15 +0100
Subject: KVM: VMX: introduce alloc_loaded_vmcs
From: Paolo Bonzini <pbonzini(a)redhat.com>
commit f21f165ef922c2146cc5bdc620f542953c41714b upstream.
Group together the calls to alloc_vmcs and loaded_vmcs_init. Soon we'll also
allocate an MSR bitmap there.
Cc: stable(a)vger.kernel.org # prereq for Spectre mitigation
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 38 +++++++++++++++++++++++---------------
1 file changed, 23 insertions(+), 15 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3524,11 +3524,6 @@ static struct vmcs *alloc_vmcs_cpu(int c
return vmcs;
}
-static struct vmcs *alloc_vmcs(void)
-{
- return alloc_vmcs_cpu(raw_smp_processor_id());
-}
-
static void free_vmcs(struct vmcs *vmcs)
{
free_pages((unsigned long)vmcs, vmcs_config.order);
@@ -3547,6 +3542,22 @@ static void free_loaded_vmcs(struct load
WARN_ON(loaded_vmcs->shadow_vmcs != NULL);
}
+static struct vmcs *alloc_vmcs(void)
+{
+ return alloc_vmcs_cpu(raw_smp_processor_id());
+}
+
+static int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
+{
+ loaded_vmcs->vmcs = alloc_vmcs();
+ if (!loaded_vmcs->vmcs)
+ return -ENOMEM;
+
+ loaded_vmcs->shadow_vmcs = NULL;
+ loaded_vmcs_init(loaded_vmcs);
+ return 0;
+}
+
static void free_kvm_area(void)
{
int cpu;
@@ -6949,6 +6960,7 @@ static int handle_vmon(struct kvm_vcpu *
struct vmcs *shadow_vmcs;
const u64 VMXON_NEEDED_FEATURES = FEATURE_CONTROL_LOCKED
| FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
+ int r;
/* The Intel VMX Instruction Reference lists a bunch of bits that
* are prerequisite to running VMXON, most notably cr4.VMXE must be
@@ -6988,11 +7000,9 @@ static int handle_vmon(struct kvm_vcpu *
return 1;
}
- vmx->nested.vmcs02.vmcs = alloc_vmcs();
- vmx->nested.vmcs02.shadow_vmcs = NULL;
- if (!vmx->nested.vmcs02.vmcs)
+ r = alloc_loaded_vmcs(&vmx->nested.vmcs02);
+ if (r < 0)
goto out_vmcs02;
- loaded_vmcs_init(&vmx->nested.vmcs02);
if (cpu_has_vmx_msr_bitmap()) {
vmx->nested.msr_bitmap =
@@ -9113,17 +9123,15 @@ static struct kvm_vcpu *vmx_create_vcpu(
if (!vmx->guest_msrs)
goto free_pml;
- vmx->loaded_vmcs = &vmx->vmcs01;
- vmx->loaded_vmcs->vmcs = alloc_vmcs();
- vmx->loaded_vmcs->shadow_vmcs = NULL;
- if (!vmx->loaded_vmcs->vmcs)
- goto free_msrs;
if (!vmm_exclusive)
kvm_cpu_vmxon(__pa(per_cpu(vmxarea, raw_smp_processor_id())));
- loaded_vmcs_init(vmx->loaded_vmcs);
+ err = alloc_loaded_vmcs(&vmx->vmcs01);
if (!vmm_exclusive)
kvm_cpu_vmxoff();
+ if (err < 0)
+ goto free_msrs;
+ vmx->loaded_vmcs = &vmx->vmcs01;
cpu = get_cpu();
vmx_vcpu_load(&vmx->vcpu, cpu);
vmx->vcpu.cpu = cpu;
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.9/kvm-vmx-introduce-alloc_loaded_vmcs.patch
queue-4.9/kvm-nvmx-eliminate-vmcs02-pool.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-nvmx-vmx_complete_nested_posted_interrupt-can-t-fail.patch
queue-4.9/x86-pti-make-unpoison-of-pgd-for-trusted-boot-work-for-real.patch
queue-4.9/kvm-vmx-make-msr-bitmaps-per-vcpu.patch
queue-4.9/kvm-nvmx-mark-vmcs12-pages-dirty-on-l2-exit.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
This is a note to let you know that I've just added the patch titled
KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 28c1c9fabf48d6ad596273a11c46e0d0da3e14cd Mon Sep 17 00:00:00 2001
From: KarimAllah Ahmed <karahmed(a)amazon.de>
Date: Thu, 1 Feb 2018 22:59:44 +0100
Subject: KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
From: KarimAllah Ahmed <karahmed(a)amazon.de>
commit 28c1c9fabf48d6ad596273a11c46e0d0da3e14cd upstream.
Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO
(bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the
contents will come directly from the hardware, but user-space can still
override it.
[dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional]
Signed-off-by: KarimAllah Ahmed <karahmed(a)amazon.de>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Darren Kenny <darren.kenny(a)oracle.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Jun Nakajima <jun.nakajima(a)intel.com>
Cc: kvm(a)vger.kernel.org
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Asit Mallick <asit.k.mallick(a)intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven(a)intel.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/cpuid.c | 8 +++++++-
arch/x86/kvm/cpuid.h | 8 ++++++++
arch/x86/kvm/vmx.c | 15 +++++++++++++++
arch/x86/kvm/x86.c | 1 +
4 files changed, 31 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -380,6 +380,10 @@ static inline int __do_cpuid_ent(struct
/* cpuid 7.0.ecx*/
const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;
+ /* cpuid 7.0.edx*/
+ const u32 kvm_cpuid_7_0_edx_x86_features =
+ F(ARCH_CAPABILITIES);
+
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
@@ -462,12 +466,14 @@ static inline int __do_cpuid_ent(struct
/* PKU is not yet implemented for shadow paging. */
if (!tdp_enabled || !boot_cpu_has(X86_FEATURE_OSPKE))
entry->ecx &= ~F(PKU);
+ entry->edx &= kvm_cpuid_7_0_edx_x86_features;
+ cpuid_mask(&entry->edx, CPUID_7_EDX);
} else {
entry->ebx = 0;
entry->ecx = 0;
+ entry->edx = 0;
}
entry->eax = 0;
- entry->edx = 0;
break;
}
case 9:
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -171,6 +171,14 @@ static inline bool guest_cpuid_has_ibpb(
return best && (best->edx & bit(X86_FEATURE_SPEC_CTRL));
}
+static inline bool guest_cpuid_has_arch_capabilities(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ return best && (best->edx & bit(X86_FEATURE_ARCH_CAPABILITIES));
+}
+
/*
* NRIPS is provided through cpuidfn 0x8000000a.edx bit 3
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -550,6 +550,8 @@ struct vcpu_vmx {
u64 msr_guest_kernel_gs_base;
#endif
+ u64 arch_capabilities;
+
u32 vm_entry_controls_shadow;
u32 vm_exit_controls_shadow;
/*
@@ -2981,6 +2983,12 @@ static int vmx_get_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
msr_info->data = guest_read_tsc(vcpu);
break;
+ case MSR_IA32_ARCH_CAPABILITIES:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_arch_capabilities(vcpu))
+ return 1;
+ msr_info->data = to_vmx(vcpu)->arch_capabilities;
+ break;
case MSR_IA32_SYSENTER_CS:
msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
break;
@@ -3112,6 +3120,11 @@ static int vmx_set_msr(struct kvm_vcpu *
vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
MSR_TYPE_W);
break;
+ case MSR_IA32_ARCH_CAPABILITIES:
+ if (!msr_info->host_initiated)
+ return 1;
+ vmx->arch_capabilities = data;
+ break;
case MSR_IA32_CR_PAT:
if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
@@ -5202,6 +5215,8 @@ static int vmx_vcpu_setup(struct vcpu_vm
++vmx->nmsrs;
}
+ if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, vmx->arch_capabilities);
vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -975,6 +975,7 @@ static u32 msrs_to_save[] = {
#endif
MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
+ MSR_IA32_ARCH_CAPABILITIES
};
static unsigned num_msrs_to_save;
Patches currently in stable-queue which might be from karahmed(a)amazon.de are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
This is a note to let you know that I've just added the patch titled
KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d28b387fb74da95d69d2615732f50cceb38e9a4d Mon Sep 17 00:00:00 2001
From: KarimAllah Ahmed <karahmed(a)amazon.de>
Date: Thu, 1 Feb 2018 22:59:45 +0100
Subject: KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
From: KarimAllah Ahmed <karahmed(a)amazon.de>
commit d28b387fb74da95d69d2615732f50cceb38e9a4d upstream.
[ Based on a patch from Ashok Raj <ashok.raj(a)intel.com> ]
Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
be using a retpoline+IBPB based approach.
To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for
guests that do not actually use the MSR, only start saving and restoring
when a non-zero is written to it.
No attempt is made to handle STIBP here, intentionally. Filtering STIBP
may be added in a future patch, which may require trapping all writes
if we don't want to pass it through directly to the guest.
[dwmw2: Clean up CPUID bits, save/restore manually, handle reset]
Signed-off-by: KarimAllah Ahmed <karahmed(a)amazon.de>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Darren Kenny <darren.kenny(a)oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Jun Nakajima <jun.nakajima(a)intel.com>
Cc: kvm(a)vger.kernel.org
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Asit Mallick <asit.k.mallick(a)intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven(a)intel.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/cpuid.c | 8 ++-
arch/x86/kvm/cpuid.h | 11 +++++
arch/x86/kvm/vmx.c | 103 ++++++++++++++++++++++++++++++++++++++++++++++++++-
arch/x86/kvm/x86.c | 2
4 files changed, 118 insertions(+), 6 deletions(-)
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -357,7 +357,7 @@ static inline int __do_cpuid_ent(struct
/* cpuid 0x80000008.ebx */
const u32 kvm_cpuid_8000_0008_ebx_x86_features =
- F(IBPB);
+ F(IBPB) | F(IBRS);
/* cpuid 0xC0000001.edx */
const u32 kvm_cpuid_C000_0001_edx_x86_features =
@@ -382,7 +382,7 @@ static inline int __do_cpuid_ent(struct
/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
- F(ARCH_CAPABILITIES);
+ F(SPEC_CTRL) | F(ARCH_CAPABILITIES);
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
@@ -618,9 +618,11 @@ static inline int __do_cpuid_ent(struct
g_phys_as = phys_as;
entry->eax = g_phys_as | (virt_as << 8);
entry->edx = 0;
- /* IBPB isn't necessarily present in hardware cpuid */
+ /* IBRS and IBPB aren't necessarily present in hardware cpuid */
if (boot_cpu_has(X86_FEATURE_IBPB))
entry->ebx |= F(IBPB);
+ if (boot_cpu_has(X86_FEATURE_IBRS))
+ entry->ebx |= F(IBRS);
entry->ebx &= kvm_cpuid_8000_0008_ebx_x86_features;
cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX);
break;
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -171,6 +171,17 @@ static inline bool guest_cpuid_has_ibpb(
return best && (best->edx & bit(X86_FEATURE_SPEC_CTRL));
}
+static inline bool guest_cpuid_has_ibrs(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
+ if (best && (best->ebx & bit(X86_FEATURE_IBRS)))
+ return true;
+ best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ return best && (best->edx & bit(X86_FEATURE_SPEC_CTRL));
+}
+
static inline bool guest_cpuid_has_arch_capabilities(struct kvm_vcpu *vcpu)
{
struct kvm_cpuid_entry2 *best;
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -551,6 +551,7 @@ struct vcpu_vmx {
#endif
u64 arch_capabilities;
+ u64 spec_ctrl;
u32 vm_entry_controls_shadow;
u32 vm_exit_controls_shadow;
@@ -1854,6 +1855,29 @@ static void update_exception_bitmap(stru
}
/*
+ * Check if MSR is intercepted for currently loaded MSR bitmap.
+ */
+static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
+{
+ unsigned long *msr_bitmap;
+ int f = sizeof(unsigned long);
+
+ if (!cpu_has_vmx_msr_bitmap())
+ return true;
+
+ msr_bitmap = to_vmx(vcpu)->loaded_vmcs->msr_bitmap;
+
+ if (msr <= 0x1fff) {
+ return !!test_bit(msr, msr_bitmap + 0x800 / f);
+ } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
+ msr &= 0x1fff;
+ return !!test_bit(msr, msr_bitmap + 0xc00 / f);
+ }
+
+ return true;
+}
+
+/*
* Check if MSR is intercepted for L01 MSR bitmap.
*/
static bool msr_write_intercepted_l01(struct kvm_vcpu *vcpu, u32 msr)
@@ -2983,6 +3007,13 @@ static int vmx_get_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
msr_info->data = guest_read_tsc(vcpu);
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_ibrs(vcpu))
+ return 1;
+
+ msr_info->data = to_vmx(vcpu)->spec_ctrl;
+ break;
case MSR_IA32_ARCH_CAPABILITIES:
if (!msr_info->host_initiated &&
!guest_cpuid_has_arch_capabilities(vcpu))
@@ -3093,6 +3124,36 @@ static int vmx_set_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr_info);
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_ibrs(vcpu))
+ return 1;
+
+ /* The STIBP bit doesn't fault even if it's not advertised */
+ if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP))
+ return 1;
+
+ vmx->spec_ctrl = data;
+
+ if (!data)
+ break;
+
+ /*
+ * For non-nested:
+ * When it's written (to non-zero) for the first time, pass
+ * it through.
+ *
+ * For nested:
+ * The handling of the MSR bitmap for L2 guests is done in
+ * nested_vmx_merge_msr_bitmap. We should not touch the
+ * vmcs02.msr_bitmap here since it gets completely overwritten
+ * in the merging. We update the vmcs01 here for L1 as well
+ * since it will end up touching the MSR anyway now.
+ */
+ vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
+ MSR_IA32_SPEC_CTRL,
+ MSR_TYPE_RW);
+ break;
case MSR_IA32_PRED_CMD:
if (!msr_info->host_initiated &&
!guest_cpuid_has_ibpb(vcpu))
@@ -5245,6 +5306,7 @@ static void vmx_vcpu_reset(struct kvm_vc
u64 cr0;
vmx->rmode.vm86_active = 0;
+ vmx->spec_ctrl = 0;
vmx->soft_vnmi_blocked = 0;
@@ -8830,6 +8892,15 @@ static void __noclone vmx_vcpu_run(struc
vmx_arm_hv_timer(vcpu);
+ /*
+ * If this vCPU has touched SPEC_CTRL, restore the guest's value if
+ * it's non-zero. Since vmentry is serialising on affected CPUs, there
+ * is no need to worry about the conditional branch over the wrmsr
+ * being speculatively taken.
+ */
+ if (vmx->spec_ctrl)
+ wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+
vmx->__launched = vmx->loaded_vmcs->launched;
asm(
/* Store host registers */
@@ -8948,6 +9019,27 @@ static void __noclone vmx_vcpu_run(struc
#endif
);
+ /*
+ * We do not use IBRS in the kernel. If this vCPU has used the
+ * SPEC_CTRL MSR it may have left it on; save the value and
+ * turn it off. This is much more efficient than blindly adding
+ * it to the atomic save/restore list. Especially as the former
+ * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
+ *
+ * For non-nested case:
+ * If the L01 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ *
+ * For nested case:
+ * If the L02 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ */
+ if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+
+ if (vmx->spec_ctrl)
+ wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
@@ -9507,7 +9599,7 @@ static inline bool nested_vmx_merge_msr_
unsigned long *msr_bitmap_l1;
unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;
/*
- * pred_cmd is trying to verify two things:
+ * pred_cmd & spec_ctrl are trying to verify two things:
*
* 1. L0 gave a permission to L1 to actually passthrough the MSR. This
* ensures that we do not accidentally generate an L02 MSR bitmap
@@ -9520,9 +9612,10 @@ static inline bool nested_vmx_merge_msr_
* the MSR.
*/
bool pred_cmd = msr_write_intercepted_l01(vcpu, MSR_IA32_PRED_CMD);
+ bool spec_ctrl = msr_write_intercepted_l01(vcpu, MSR_IA32_SPEC_CTRL);
if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
- !pred_cmd)
+ !pred_cmd && !spec_ctrl)
return false;
page = nested_get_page(vcpu, vmcs12->msr_bitmap);
@@ -9561,6 +9654,12 @@ static inline bool nested_vmx_merge_msr_
}
}
+ if (spec_ctrl)
+ nested_vmx_disable_intercept_for_msr(
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_SPEC_CTRL,
+ MSR_TYPE_R | MSR_TYPE_W);
+
if (pred_cmd)
nested_vmx_disable_intercept_for_msr(
msr_bitmap_l1, msr_bitmap_l0,
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -975,7 +975,7 @@ static u32 msrs_to_save[] = {
#endif
MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
- MSR_IA32_ARCH_CAPABILITIES
+ MSR_IA32_SPEC_CTRL, MSR_IA32_ARCH_CAPABILITIES
};
static unsigned num_msrs_to_save;
Patches currently in stable-queue which might be from karahmed(a)amazon.de are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
This is a note to let you know that I've just added the patch titled
KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b2ac58f90540e39324e7a29a7ad471407ae0bf48 Mon Sep 17 00:00:00 2001
From: KarimAllah Ahmed <karahmed(a)amazon.de>
Date: Sat, 3 Feb 2018 15:56:23 +0100
Subject: KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
From: KarimAllah Ahmed <karahmed(a)amazon.de>
commit b2ac58f90540e39324e7a29a7ad471407ae0bf48 upstream.
[ Based on a patch from Paolo Bonzini <pbonzini(a)redhat.com> ]
... basically doing exactly what we do for VMX:
- Passthrough SPEC_CTRL to guests (if enabled in guest CPUID)
- Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest
actually used it.
Signed-off-by: KarimAllah Ahmed <karahmed(a)amazon.de>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Darren Kenny <darren.kenny(a)oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Jun Nakajima <jun.nakajima(a)intel.com>
Cc: kvm(a)vger.kernel.org
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Asit Mallick <asit.k.mallick(a)intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven(a)intel.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -183,6 +183,8 @@ struct vcpu_svm {
u64 gs_base;
} host;
+ u64 spec_ctrl;
+
u32 *msrpm;
ulong nmi_iret_rip;
@@ -248,6 +250,7 @@ static const struct svm_direct_access_ms
{ .index = MSR_CSTAR, .always = true },
{ .index = MSR_SYSCALL_MASK, .always = true },
#endif
+ { .index = MSR_IA32_SPEC_CTRL, .always = false },
{ .index = MSR_IA32_PRED_CMD, .always = false },
{ .index = MSR_IA32_LASTBRANCHFROMIP, .always = false },
{ .index = MSR_IA32_LASTBRANCHTOIP, .always = false },
@@ -863,6 +866,25 @@ static bool valid_msr_intercept(u32 inde
return false;
}
+static bool msr_write_intercepted(struct kvm_vcpu *vcpu, unsigned msr)
+{
+ u8 bit_write;
+ unsigned long tmp;
+ u32 offset;
+ u32 *msrpm;
+
+ msrpm = is_guest_mode(vcpu) ? to_svm(vcpu)->nested.msrpm:
+ to_svm(vcpu)->msrpm;
+
+ offset = svm_msrpm_offset(msr);
+ bit_write = 2 * (msr & 0x0f) + 1;
+ tmp = msrpm[offset];
+
+ BUG_ON(offset == MSR_INVALID);
+
+ return !!test_bit(bit_write, &tmp);
+}
+
static void set_msr_interception(u32 *msrpm, unsigned msr,
int read, int write)
{
@@ -1537,6 +1559,8 @@ static void svm_vcpu_reset(struct kvm_vc
u32 dummy;
u32 eax = 1;
+ svm->spec_ctrl = 0;
+
if (!init_event) {
svm->vcpu.arch.apic_base = APIC_DEFAULT_PHYS_BASE |
MSR_IA32_APICBASE_ENABLE;
@@ -3520,6 +3544,13 @@ static int svm_get_msr(struct kvm_vcpu *
case MSR_VM_CR:
msr_info->data = svm->nested.vm_cr_msr;
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_ibrs(vcpu))
+ return 1;
+
+ msr_info->data = svm->spec_ctrl;
+ break;
case MSR_IA32_UCODE_REV:
msr_info->data = 0x01000065;
break;
@@ -3611,6 +3642,33 @@ static int svm_set_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr);
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr->host_initiated &&
+ !guest_cpuid_has_ibrs(vcpu))
+ return 1;
+
+ /* The STIBP bit doesn't fault even if it's not advertised */
+ if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP))
+ return 1;
+
+ svm->spec_ctrl = data;
+
+ if (!data)
+ break;
+
+ /*
+ * For non-nested:
+ * When it's written (to non-zero) for the first time, pass
+ * it through.
+ *
+ * For nested:
+ * The handling of the MSR bitmap for L2 guests is done in
+ * nested_svm_vmrun_msrpm.
+ * We update the L1 MSR bit as well since it will end up
+ * touching the MSR anyway now.
+ */
+ set_msr_interception(svm->msrpm, MSR_IA32_SPEC_CTRL, 1, 1);
+ break;
case MSR_IA32_PRED_CMD:
if (!msr->host_initiated &&
!guest_cpuid_has_ibpb(vcpu))
@@ -4854,6 +4912,15 @@ static void svm_vcpu_run(struct kvm_vcpu
local_irq_enable();
+ /*
+ * If this vCPU has touched SPEC_CTRL, restore the guest's value if
+ * it's non-zero. Since vmentry is serialising on affected CPUs, there
+ * is no need to worry about the conditional branch over the wrmsr
+ * being speculatively taken.
+ */
+ if (svm->spec_ctrl)
+ wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+
asm volatile (
"push %%" _ASM_BP "; \n\t"
"mov %c[rbx](%[svm]), %%" _ASM_BX " \n\t"
@@ -4946,6 +5013,27 @@ static void svm_vcpu_run(struct kvm_vcpu
#endif
);
+ /*
+ * We do not use IBRS in the kernel. If this vCPU has used the
+ * SPEC_CTRL MSR it may have left it on; save the value and
+ * turn it off. This is much more efficient than blindly adding
+ * it to the atomic save/restore list. Especially as the former
+ * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
+ *
+ * For non-nested case:
+ * If the L01 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ *
+ * For nested case:
+ * If the L02 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ */
+ if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+
+ if (svm->spec_ctrl)
+ wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
Patches currently in stable-queue which might be from karahmed(a)amazon.de are
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-x86-add-ibpb-support.patch
queue-4.9/kvm-svm-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
This is a note to let you know that I've just added the patch titled
KVM: nVMX: vmx_complete_nested_posted_interrupt() can't fail
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-nvmx-vmx_complete_nested_posted_interrupt-can-t-fail.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6342c50ad12e8ce0736e722184a7dbdea4a3477f Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david(a)redhat.com>
Date: Wed, 25 Jan 2017 11:58:58 +0100
Subject: KVM: nVMX: vmx_complete_nested_posted_interrupt() can't fail
From: David Hildenbrand <david(a)redhat.com>
commit 6342c50ad12e8ce0736e722184a7dbdea4a3477f upstream.
vmx_complete_nested_posted_interrupt() can't fail, let's turn it into
a void function.
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4738,7 +4738,7 @@ static bool vmx_get_enable_apicv(void)
return enable_apicv;
}
-static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
+static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
int max_irr;
@@ -4749,13 +4749,13 @@ static int vmx_complete_nested_posted_in
vmx->nested.pi_pending) {
vmx->nested.pi_pending = false;
if (!pi_test_and_clear_on(vmx->nested.pi_desc))
- return 0;
+ return;
max_irr = find_last_bit(
(unsigned long *)vmx->nested.pi_desc->pir, 256);
if (max_irr == 256)
- return 0;
+ return;
vapic_page = kmap(vmx->nested.virtual_apic_page);
if (!vapic_page) {
@@ -4772,7 +4772,6 @@ static int vmx_complete_nested_posted_in
vmcs_write16(GUEST_INTR_STATUS, status);
}
}
- return 0;
}
static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu)
@@ -10493,7 +10492,8 @@ static int vmx_check_nested_events(struc
return 0;
}
- return vmx_complete_nested_posted_interrupt(vcpu);
+ vmx_complete_nested_posted_interrupt(vcpu);
+ return 0;
}
static u32 vmx_get_preemption_timer_value(struct kvm_vcpu *vcpu)
Patches currently in stable-queue which might be from david(a)redhat.com are
queue-4.9/kvm-nvmx-eliminate-vmcs02-pool.patch
queue-4.9/kvm-nvmx-vmx_complete_nested_posted_interrupt-can-t-fail.patch
This is a note to let you know that I've just added the patch titled
KVM: nVMX: mark vmcs12 pages dirty on L2 exit
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-nvmx-mark-vmcs12-pages-dirty-on-l2-exit.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c9f04407f2e0b3fc9ff7913c65fcfcb0a4b61570 Mon Sep 17 00:00:00 2001
From: David Matlack <dmatlack(a)google.com>
Date: Tue, 1 Aug 2017 14:00:40 -0700
Subject: KVM: nVMX: mark vmcs12 pages dirty on L2 exit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: David Matlack <dmatlack(a)google.com>
commit c9f04407f2e0b3fc9ff7913c65fcfcb0a4b61570 upstream.
The host physical addresses of L1's Virtual APIC Page and Posted
Interrupt descriptor are loaded into the VMCS02. The CPU may write
to these pages via their host physical address while L2 is running,
bypassing address-translation-based dirty tracking (e.g. EPT write
protection). Mark them dirty on every exit from L2 to prevent them
from getting out of sync with dirty tracking.
Also mark the virtual APIC page and the posted interrupt descriptor
dirty when KVM is virtualizing posted interrupt processing.
Signed-off-by: David Matlack <dmatlack(a)google.com>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 53 +++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 43 insertions(+), 10 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4738,6 +4738,28 @@ static bool vmx_get_enable_apicv(void)
return enable_apicv;
}
+static void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu)
+{
+ struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+ gfn_t gfn;
+
+ /*
+ * Don't need to mark the APIC access page dirty; it is never
+ * written to by the CPU during APIC virtualization.
+ */
+
+ if (nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW)) {
+ gfn = vmcs12->virtual_apic_page_addr >> PAGE_SHIFT;
+ kvm_vcpu_mark_page_dirty(vcpu, gfn);
+ }
+
+ if (nested_cpu_has_posted_intr(vmcs12)) {
+ gfn = vmcs12->posted_intr_desc_addr >> PAGE_SHIFT;
+ kvm_vcpu_mark_page_dirty(vcpu, gfn);
+ }
+}
+
+
static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -4745,18 +4767,15 @@ static void vmx_complete_nested_posted_i
void *vapic_page;
u16 status;
- if (vmx->nested.pi_desc &&
- vmx->nested.pi_pending) {
- vmx->nested.pi_pending = false;
- if (!pi_test_and_clear_on(vmx->nested.pi_desc))
- return;
-
- max_irr = find_last_bit(
- (unsigned long *)vmx->nested.pi_desc->pir, 256);
+ if (!vmx->nested.pi_desc || !vmx->nested.pi_pending)
+ return;
- if (max_irr == 256)
- return;
+ vmx->nested.pi_pending = false;
+ if (!pi_test_and_clear_on(vmx->nested.pi_desc))
+ return;
+ max_irr = find_last_bit((unsigned long *)vmx->nested.pi_desc->pir, 256);
+ if (max_irr != 256) {
vapic_page = kmap(vmx->nested.virtual_apic_page);
if (!vapic_page) {
WARN_ON(1);
@@ -4772,6 +4791,8 @@ static void vmx_complete_nested_posted_i
vmcs_write16(GUEST_INTR_STATUS, status);
}
}
+
+ nested_mark_vmcs12_pages_dirty(vcpu);
}
static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu)
@@ -8028,6 +8049,18 @@ static bool nested_vmx_exit_handled(stru
vmcs_read32(VM_EXIT_INTR_ERROR_CODE),
KVM_ISA_VMX);
+ /*
+ * The host physical addresses of some pages of guest memory
+ * are loaded into VMCS02 (e.g. L1's Virtual APIC Page). The CPU
+ * may write to these pages via their host physical address while
+ * L2 is running, bypassing any address-translation-based dirty
+ * tracking (e.g. EPT write protection).
+ *
+ * Mark them dirty on every exit from L2 to prevent them from
+ * getting out of sync with dirty tracking.
+ */
+ nested_mark_vmcs12_pages_dirty(vcpu);
+
if (vmx->nested.nested_run_pending)
return false;
Patches currently in stable-queue which might be from dmatlack(a)google.com are
queue-4.9/kvm-nvmx-mark-vmcs12-pages-dirty-on-l2-exit.patch
This is a note to let you know that I've just added the patch titled
KVM: nVMX: Eliminate vmcs02 pool
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-nvmx-eliminate-vmcs02-pool.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From de3a0021a60635de96aa92713c1a31a96747d72c Mon Sep 17 00:00:00 2001
From: Jim Mattson <jmattson(a)google.com>
Date: Mon, 27 Nov 2017 17:22:25 -0600
Subject: KVM: nVMX: Eliminate vmcs02 pool
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Jim Mattson <jmattson(a)google.com>
commit de3a0021a60635de96aa92713c1a31a96747d72c upstream.
The potential performance advantages of a vmcs02 pool have never been
realized. To simplify the code, eliminate the pool. Instead, a single
vmcs02 is allocated per VCPU when the VCPU enters VMX operation.
Cc: stable(a)vger.kernel.org # prereq for Spectre mitigation
Signed-off-by: Jim Mattson <jmattson(a)google.com>
Signed-off-by: Mark Kanda <mark.kanda(a)oracle.com>
Reviewed-by: Ameya More <ameya.more(a)oracle.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 146 ++++++++---------------------------------------------
1 file changed, 23 insertions(+), 123 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -173,7 +173,6 @@ module_param(ple_window_max, int, S_IRUG
extern const ulong vmx_return;
#define NR_AUTOLOAD_MSRS 8
-#define VMCS02_POOL_SIZE 1
struct vmcs {
u32 revision_id;
@@ -207,7 +206,7 @@ struct shared_msr_entry {
* stored in guest memory specified by VMPTRLD, but is opaque to the guest,
* which must access it using VMREAD/VMWRITE/VMCLEAR instructions.
* More than one of these structures may exist, if L1 runs multiple L2 guests.
- * nested_vmx_run() will use the data here to build a vmcs02: a VMCS for the
+ * nested_vmx_run() will use the data here to build the vmcs02: a VMCS for the
* underlying hardware which will be used to run L2.
* This structure is packed to ensure that its layout is identical across
* machines (necessary for live migration).
@@ -386,13 +385,6 @@ struct __packed vmcs12 {
*/
#define VMCS12_SIZE 0x1000
-/* Used to remember the last vmcs02 used for some recently used vmcs12s */
-struct vmcs02_list {
- struct list_head list;
- gpa_t vmptr;
- struct loaded_vmcs vmcs02;
-};
-
/*
* The nested_vmx structure is part of vcpu_vmx, and holds information we need
* for correct emulation of VMX (i.e., nested VMX) on this vcpu.
@@ -419,15 +411,15 @@ struct nested_vmx {
*/
bool sync_shadow_vmcs;
- /* vmcs02_list cache of VMCSs recently used to run L2 guests */
- struct list_head vmcs02_pool;
- int vmcs02_num;
bool change_vmcs01_virtual_x2apic_mode;
/* L2 must run next, and mustn't decide to exit to L1. */
bool nested_run_pending;
+
+ struct loaded_vmcs vmcs02;
+
/*
- * Guest pages referred to in vmcs02 with host-physical pointers, so
- * we must keep them pinned while L2 runs.
+ * Guest pages referred to in the vmcs02 with host-physical
+ * pointers, so we must keep them pinned while L2 runs.
*/
struct page *apic_access_page;
struct page *virtual_apic_page;
@@ -6684,94 +6676,6 @@ static int handle_monitor(struct kvm_vcp
}
/*
- * To run an L2 guest, we need a vmcs02 based on the L1-specified vmcs12.
- * We could reuse a single VMCS for all the L2 guests, but we also want the
- * option to allocate a separate vmcs02 for each separate loaded vmcs12 - this
- * allows keeping them loaded on the processor, and in the future will allow
- * optimizations where prepare_vmcs02 doesn't need to set all the fields on
- * every entry if they never change.
- * So we keep, in vmx->nested.vmcs02_pool, a cache of size VMCS02_POOL_SIZE
- * (>=0) with a vmcs02 for each recently loaded vmcs12s, most recent first.
- *
- * The following functions allocate and free a vmcs02 in this pool.
- */
-
-/* Get a VMCS from the pool to use as vmcs02 for the current vmcs12. */
-static struct loaded_vmcs *nested_get_current_vmcs02(struct vcpu_vmx *vmx)
-{
- struct vmcs02_list *item;
- list_for_each_entry(item, &vmx->nested.vmcs02_pool, list)
- if (item->vmptr == vmx->nested.current_vmptr) {
- list_move(&item->list, &vmx->nested.vmcs02_pool);
- return &item->vmcs02;
- }
-
- if (vmx->nested.vmcs02_num >= max(VMCS02_POOL_SIZE, 1)) {
- /* Recycle the least recently used VMCS. */
- item = list_last_entry(&vmx->nested.vmcs02_pool,
- struct vmcs02_list, list);
- item->vmptr = vmx->nested.current_vmptr;
- list_move(&item->list, &vmx->nested.vmcs02_pool);
- return &item->vmcs02;
- }
-
- /* Create a new VMCS */
- item = kmalloc(sizeof(struct vmcs02_list), GFP_KERNEL);
- if (!item)
- return NULL;
- item->vmcs02.vmcs = alloc_vmcs();
- item->vmcs02.shadow_vmcs = NULL;
- if (!item->vmcs02.vmcs) {
- kfree(item);
- return NULL;
- }
- loaded_vmcs_init(&item->vmcs02);
- item->vmptr = vmx->nested.current_vmptr;
- list_add(&(item->list), &(vmx->nested.vmcs02_pool));
- vmx->nested.vmcs02_num++;
- return &item->vmcs02;
-}
-
-/* Free and remove from pool a vmcs02 saved for a vmcs12 (if there is one) */
-static void nested_free_vmcs02(struct vcpu_vmx *vmx, gpa_t vmptr)
-{
- struct vmcs02_list *item;
- list_for_each_entry(item, &vmx->nested.vmcs02_pool, list)
- if (item->vmptr == vmptr) {
- free_loaded_vmcs(&item->vmcs02);
- list_del(&item->list);
- kfree(item);
- vmx->nested.vmcs02_num--;
- return;
- }
-}
-
-/*
- * Free all VMCSs saved for this vcpu, except the one pointed by
- * vmx->loaded_vmcs. We must be running L1, so vmx->loaded_vmcs
- * must be &vmx->vmcs01.
- */
-static void nested_free_all_saved_vmcss(struct vcpu_vmx *vmx)
-{
- struct vmcs02_list *item, *n;
-
- WARN_ON(vmx->loaded_vmcs != &vmx->vmcs01);
- list_for_each_entry_safe(item, n, &vmx->nested.vmcs02_pool, list) {
- /*
- * Something will leak if the above WARN triggers. Better than
- * a use-after-free.
- */
- if (vmx->loaded_vmcs == &item->vmcs02)
- continue;
-
- free_loaded_vmcs(&item->vmcs02);
- list_del(&item->list);
- kfree(item);
- vmx->nested.vmcs02_num--;
- }
-}
-
-/*
* The following 3 functions, nested_vmx_succeed()/failValid()/failInvalid(),
* set the success or error code of an emulated VMX instruction, as specified
* by Vol 2B, VMX Instruction Reference, "Conventions".
@@ -7084,6 +6988,12 @@ static int handle_vmon(struct kvm_vcpu *
return 1;
}
+ vmx->nested.vmcs02.vmcs = alloc_vmcs();
+ vmx->nested.vmcs02.shadow_vmcs = NULL;
+ if (!vmx->nested.vmcs02.vmcs)
+ goto out_vmcs02;
+ loaded_vmcs_init(&vmx->nested.vmcs02);
+
if (cpu_has_vmx_msr_bitmap()) {
vmx->nested.msr_bitmap =
(unsigned long *)__get_free_page(GFP_KERNEL);
@@ -7106,9 +7016,6 @@ static int handle_vmon(struct kvm_vcpu *
vmx->vmcs01.shadow_vmcs = shadow_vmcs;
}
- INIT_LIST_HEAD(&(vmx->nested.vmcs02_pool));
- vmx->nested.vmcs02_num = 0;
-
hrtimer_init(&vmx->nested.preemption_timer, CLOCK_MONOTONIC,
HRTIMER_MODE_REL_PINNED);
vmx->nested.preemption_timer.function = vmx_preemption_timer_fn;
@@ -7126,6 +7033,9 @@ out_cached_vmcs12:
free_page((unsigned long)vmx->nested.msr_bitmap);
out_msr_bitmap:
+ free_loaded_vmcs(&vmx->nested.vmcs02);
+
+out_vmcs02:
return -ENOMEM;
}
@@ -7211,7 +7121,7 @@ static void free_nested(struct vcpu_vmx
vmx->vmcs01.shadow_vmcs = NULL;
}
kfree(vmx->nested.cached_vmcs12);
- /* Unpin physical memory we referred to in current vmcs02 */
+ /* Unpin physical memory we referred to in the vmcs02 */
if (vmx->nested.apic_access_page) {
nested_release_page(vmx->nested.apic_access_page);
vmx->nested.apic_access_page = NULL;
@@ -7227,7 +7137,7 @@ static void free_nested(struct vcpu_vmx
vmx->nested.pi_desc = NULL;
}
- nested_free_all_saved_vmcss(vmx);
+ free_loaded_vmcs(&vmx->nested.vmcs02);
}
/* Emulate the VMXOFF instruction */
@@ -7261,8 +7171,6 @@ static int handle_vmclear(struct kvm_vcp
vmptr + offsetof(struct vmcs12, launch_state),
&zero, sizeof(zero));
- nested_free_vmcs02(vmx, vmptr);
-
skip_emulated_instruction(vcpu);
nested_vmx_succeed(vcpu);
return 1;
@@ -8051,10 +7959,11 @@ static bool nested_vmx_exit_handled(stru
/*
* The host physical addresses of some pages of guest memory
- * are loaded into VMCS02 (e.g. L1's Virtual APIC Page). The CPU
- * may write to these pages via their host physical address while
- * L2 is running, bypassing any address-translation-based dirty
- * tracking (e.g. EPT write protection).
+ * are loaded into the vmcs02 (e.g. vmcs12's Virtual APIC
+ * Page). The CPU may write to these pages via their host
+ * physical address while L2 is running, bypassing any
+ * address-translation-based dirty tracking (e.g. EPT write
+ * protection).
*
* Mark them dirty on every exit from L2 to prevent them from
* getting out of sync with dirty tracking.
@@ -10223,7 +10132,6 @@ static int nested_vmx_run(struct kvm_vcp
struct vmcs12 *vmcs12;
struct vcpu_vmx *vmx = to_vmx(vcpu);
int cpu;
- struct loaded_vmcs *vmcs02;
bool ia32e;
u32 msr_entry_idx;
@@ -10363,17 +10271,13 @@ static int nested_vmx_run(struct kvm_vcp
* the nested entry.
*/
- vmcs02 = nested_get_current_vmcs02(vmx);
- if (!vmcs02)
- return -ENOMEM;
-
enter_guest_mode(vcpu);
if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
cpu = get_cpu();
- vmx->loaded_vmcs = vmcs02;
+ vmx->loaded_vmcs = &vmx->nested.vmcs02;
vmx_vcpu_put(vcpu);
vmx_vcpu_load(vcpu, cpu);
vcpu->cpu = cpu;
@@ -10888,10 +10792,6 @@ static void nested_vmx_vmexit(struct kvm
vm_exit_controls_reset_shadow(vmx);
vmx_segment_cache_clear(vmx);
- /* if no vmcs02 cache requested, remove the one we used */
- if (VMCS02_POOL_SIZE == 0)
- nested_free_vmcs02(vmx, vmx->nested.current_vmptr);
-
load_vmcs12_host_state(vcpu, vmcs12);
/* Update any VMCS fields that might have changed while L2 ran */
Patches currently in stable-queue which might be from jmattson(a)google.com are
queue-4.9/kvm-nvmx-eliminate-vmcs02-pool.patch
queue-4.9/kvm-vmx-allow-direct-access-to-msr_ia32_spec_ctrl.patch
queue-4.9/kvm-vmx-make-msr-bitmaps-per-vcpu.patch
queue-4.9/kvm-vmx-emulate-msr_ia32_arch_capabilities.patch
From: Matthias Hintzmann <matthias.dev(a)gmx.de>
ctx->drvflags is checked in the if clause before beeing initialized.
Move initialization before first usage.
Note, that the if clause was backported with commit 75f82a703b30
("cdc_ncm: Set NTB format again after altsetting switch for Huawei
devices") from mainline (upstream commit 2b02c20ce0c2 ("cdc_ncm: Set
NTB format again after altsetting switch for Huawei devices").
In mainline, the initialization is at the right place before the if
clause.
Fixes: 75f82a703b30 ("cdc_ncm: Set NTB format again after altsetting switch for Huawei devices")
Signed-off-by: Matthias Hintzmann <matthias.dev(a)gmx.de>
[mrkiko.rs(a)gmail.com: commit message tweaks]
---
drivers/net/usb/cdc_ncm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 1228d0da407..72cb30828a1 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -825,6 +825,9 @@ int cdc_ncm_bind_common(struct usbnet *dev, struct usb_interface *intf, u8 data_
goto error2;
}
+ /* Device-specific flags */
+ ctx->drvflags = drvflags;
+
/*
* Some Huawei devices have been observed to come out of reset in NDP32 mode.
* Let's check if this is the case, and set the device to NDP16 mode again if
@@ -873,9 +876,6 @@ int cdc_ncm_bind_common(struct usbnet *dev, struct usb_interface *intf, u8 data_
/* finish setting up the device specific data */
cdc_ncm_setup(dev);
- /* Device-specific flags */
- ctx->drvflags = drvflags;
-
/* Allocate the delayed NDP if needed. */
if (ctx->drvflags & CDC_NCM_FLAG_NDP_TO_END) {
ctx->delayed_ndp16 = kzalloc(ctx->max_ndp_size, GFP_KERNEL);
--
2.15.1
This is a note to let you know that I've just added the patch titled
drm: rcar-du: Fix race condition when disabling planes at CRTC stop
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-rcar-du-fix-race-condition-when-disabling-planes-at-crtc-stop.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 641307df71fe77d7b38a477067495ede05d47295 Mon Sep 17 00:00:00 2001
From: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Date: Sat, 29 Jul 2017 02:31:33 +0300
Subject: drm: rcar-du: Fix race condition when disabling planes at CRTC stop
From: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
commit 641307df71fe77d7b38a477067495ede05d47295 upstream.
When stopping the CRTC the driver must disable all planes and wait for
the change to take effect at the next vblank. Merely calling
drm_crtc_wait_one_vblank() is not enough, as the function doesn't
include any mechanism to handle the race with vblank interrupts.
Replace the drm_crtc_wait_one_vblank() call with a manual mechanism that
handles the vblank interrupt race.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Signed-off-by: thongsyho <thong.ho.px(a)rvc.renesas.com>
Signed-off-by: Nhan Nguyen <nhan.nguyen.yb(a)renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 54 +++++++++++++++++++++++++++++----
drivers/gpu/drm/rcar-du/rcar_du_crtc.h | 8 ++++
2 files changed, 56 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -371,6 +371,31 @@ static void rcar_du_crtc_start(struct rc
rcrtc->started = true;
}
+static void rcar_du_crtc_disable_planes(struct rcar_du_crtc *rcrtc)
+{
+ struct rcar_du_device *rcdu = rcrtc->group->dev;
+ struct drm_crtc *crtc = &rcrtc->crtc;
+ u32 status;
+ /* Make sure vblank interrupts are enabled. */
+ drm_crtc_vblank_get(crtc);
+ /*
+ * Disable planes and calculate how many vertical blanking interrupts we
+ * have to wait for. If a vertical blanking interrupt has been triggered
+ * but not processed yet, we don't know whether it occurred before or
+ * after the planes got disabled. We thus have to wait for two vblank
+ * interrupts in that case.
+ */
+ spin_lock_irq(&rcrtc->vblank_lock);
+ rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
+ status = rcar_du_crtc_read(rcrtc, DSSR);
+ rcrtc->vblank_count = status & DSSR_VBK ? 2 : 1;
+ spin_unlock_irq(&rcrtc->vblank_lock);
+ if (!wait_event_timeout(rcrtc->vblank_wait, rcrtc->vblank_count == 0,
+ msecs_to_jiffies(100)))
+ dev_warn(rcdu->dev, "vertical blanking timeout\n");
+ drm_crtc_vblank_put(crtc);
+}
+
static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
{
struct drm_crtc *crtc = &rcrtc->crtc;
@@ -379,17 +404,16 @@ static void rcar_du_crtc_stop(struct rca
return;
/* Disable all planes and wait for the change to take effect. This is
- * required as the DSnPR registers are updated on vblank, and no vblank
- * will occur once the CRTC is stopped. Disabling planes when starting
- * the CRTC thus wouldn't be enough as it would start scanning out
- * immediately from old frame buffers until the next vblank.
+ * required as the plane enable registers are updated on vblank, and no
+ * vblank will occur once the CRTC is stopped. Disabling planes when
+ * starting the CRTC thus wouldn't be enough as it would start scanning
+ * out immediately from old frame buffers until the next vblank.
*
* This increases the CRTC stop delay, especially when multiple CRTCs
* are stopped in one operation as we now wait for one vblank per CRTC.
* Whether this can be improved needs to be researched.
*/
- rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
- drm_crtc_wait_one_vblank(crtc);
+ rcar_du_crtc_disable_planes(rcrtc);
/* Disable vertical blanking interrupt reporting. We first need to wait
* for page flip completion before stopping the CRTC as userspace
@@ -528,10 +552,26 @@ static irqreturn_t rcar_du_crtc_irq(int
irqreturn_t ret = IRQ_NONE;
u32 status;
+ spin_lock(&rcrtc->vblank_lock);
+
status = rcar_du_crtc_read(rcrtc, DSSR);
rcar_du_crtc_write(rcrtc, DSRCR, status & DSRCR_MASK);
if (status & DSSR_VBK) {
+ /*
+ * Wake up the vblank wait if the counter reaches 0. This must
+ * be protected by the vblank_lock to avoid races in
+ * rcar_du_crtc_disable_planes().
+ */
+ if (rcrtc->vblank_count) {
+ if (--rcrtc->vblank_count == 0)
+ wake_up(&rcrtc->vblank_wait);
+ }
+ }
+
+ spin_unlock(&rcrtc->vblank_lock);
+
+ if (status & DSSR_VBK) {
drm_handle_vblank(rcrtc->crtc.dev, rcrtc->index);
rcar_du_crtc_finish_page_flip(rcrtc);
ret = IRQ_HANDLED;
@@ -585,6 +625,8 @@ int rcar_du_crtc_create(struct rcar_du_g
}
init_waitqueue_head(&rcrtc->flip_wait);
+ init_waitqueue_head(&rcrtc->vblank_wait);
+ spin_lock_init(&rcrtc->vblank_lock);
rcrtc->group = rgrp;
rcrtc->mmio_offset = mmio_offsets[index];
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -15,6 +15,7 @@
#define __RCAR_DU_CRTC_H__
#include <linux/mutex.h>
+#include <linux/spinlock.h>
#include <linux/wait.h>
#include <drm/drmP.h>
@@ -32,6 +33,9 @@ struct rcar_du_group;
* @started: whether the CRTC has been started and is running
* @event: event to post when the pending page flip completes
* @flip_wait: wait queue used to signal page flip completion
+ * @vblank_lock: protects vblank_wait and vblank_count
+ * @vblank_wait: wait queue used to signal vertical blanking
+ * @vblank_count: number of vertical blanking interrupts to wait for
* @outputs: bitmask of the outputs (enum rcar_du_output) driven by this CRTC
* @enabled: whether the CRTC is enabled, used to control system resume
* @group: CRTC group this CRTC belongs to
@@ -48,6 +52,10 @@ struct rcar_du_crtc {
struct drm_pending_vblank_event *event;
wait_queue_head_t flip_wait;
+ spinlock_t vblank_lock;
+ wait_queue_head_t vblank_wait;
+ unsigned int vblank_count;
+
unsigned int outputs;
bool enabled;
Patches currently in stable-queue which might be from laurent.pinchart+renesas(a)ideasonboard.com are
queue-4.4/drm-rcar-du-use-the-vbk-interrupt-for-vblank-events.patch
queue-4.4/drm-rcar-du-fix-race-condition-when-disabling-planes-at-crtc-stop.patch
This is a note to let you know that I've just added the patch titled
drm: rcar-du: Use the VBK interrupt for vblank events
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-rcar-du-use-the-vbk-interrupt-for-vblank-events.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From cbbb90b0c084d7dfb2ed8e3fecf8df200fbdd2a0 Mon Sep 17 00:00:00 2001
From: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Date: Mon, 10 Jul 2017 23:46:39 +0300
Subject: drm: rcar-du: Use the VBK interrupt for vblank events
From: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
commit cbbb90b0c084d7dfb2ed8e3fecf8df200fbdd2a0 upstream.
When implementing support for interlaced modes, the driver switched from
reporting vblank events on the vertical blanking (VBK) interrupt to the
frame end interrupt (FRM). This incorrectly divided the reported refresh
rate by two. Fix it by moving back to the VBK interrupt.
Fixes: 906eff7fcada ("drm: rcar-du: Implement support for interlaced modes")
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Signed-off-by: thongsyho <thong.ho.px(a)rvc.renesas.com>
Signed-off-by: Nhan Nguyen <nhan.nguyen.yb(a)renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -531,7 +531,7 @@ static irqreturn_t rcar_du_crtc_irq(int
status = rcar_du_crtc_read(rcrtc, DSSR);
rcar_du_crtc_write(rcrtc, DSRCR, status & DSRCR_MASK);
- if (status & DSSR_FRM) {
+ if (status & DSSR_VBK) {
drm_handle_vblank(rcrtc->crtc.dev, rcrtc->index);
rcar_du_crtc_finish_page_flip(rcrtc);
ret = IRQ_HANDLED;
Patches currently in stable-queue which might be from laurent.pinchart+renesas(a)ideasonboard.com are
queue-4.4/drm-rcar-du-use-the-vbk-interrupt-for-vblank-events.patch
queue-4.4/drm-rcar-du-fix-race-condition-when-disabling-planes-at-crtc-stop.patch
This is a note to let you know that I've just added the patch titled
ASoC: rsnd: don't call free_irq() on Parent SSI
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-rsnd-don-t-call-free_irq-on-parent-ssi.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1f8754d4daea5f257370a52a30fcb22798c54516 Mon Sep 17 00:00:00 2001
From: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Date: Tue, 16 May 2017 01:48:24 +0000
Subject: ASoC: rsnd: don't call free_irq() on Parent SSI
From: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
commit 1f8754d4daea5f257370a52a30fcb22798c54516 upstream.
If SSI uses shared pin, some SSI will be used as parent SSI.
Then, normal SSI's remove and Parent SSI's remove
(these are same SSI) will be called when unbind or remove timing.
In this case, free_irq() will be called twice.
This patch solve this issue.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx(a)renesas.com>
Reported-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx(a)renesas.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: thongsyho <thong.ho.px(a)rvc.renesas.com>
Signed-off-by: Nhan Nguyen <nhan.nguyen.yb(a)renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/sh/rcar/rsnd.h | 2 ++
sound/soc/sh/rcar/ssi.c | 5 +++++
2 files changed, 7 insertions(+)
--- a/sound/soc/sh/rcar/rsnd.h
+++ b/sound/soc/sh/rcar/rsnd.h
@@ -235,6 +235,7 @@ enum rsnd_mod_type {
RSND_MOD_MIX,
RSND_MOD_CTU,
RSND_MOD_SRC,
+ RSND_MOD_SSIP, /* SSI parent */
RSND_MOD_SSI,
RSND_MOD_MAX,
};
@@ -365,6 +366,7 @@ struct rsnd_dai_stream {
};
#define rsnd_io_to_mod(io, i) ((i) < RSND_MOD_MAX ? (io)->mod[(i)] : NULL)
#define rsnd_io_to_mod_ssi(io) rsnd_io_to_mod((io), RSND_MOD_SSI)
+#define rsnd_io_to_mod_ssip(io) rsnd_io_to_mod((io), RSND_MOD_SSIP)
#define rsnd_io_to_mod_src(io) rsnd_io_to_mod((io), RSND_MOD_SRC)
#define rsnd_io_to_mod_ctu(io) rsnd_io_to_mod((io), RSND_MOD_CTU)
#define rsnd_io_to_mod_mix(io) rsnd_io_to_mod((io), RSND_MOD_MIX)
--- a/sound/soc/sh/rcar/ssi.c
+++ b/sound/soc/sh/rcar/ssi.c
@@ -550,11 +550,16 @@ static int rsnd_ssi_dma_remove(struct rs
struct rsnd_priv *priv)
{
struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
+ struct rsnd_mod *ssi_parent_mod = rsnd_io_to_mod_ssip(io);
struct device *dev = rsnd_priv_to_dev(priv);
int irq = ssi->info->irq;
rsnd_dma_quit(io, rsnd_mod_to_dma(mod));
+ /* Do nothing for SSI parent mod */
+ if (ssi_parent_mod == mod)
+ return 0;
+
/* PIO will request IRQ again */
devm_free_irq(dev, irq, mod);
Patches currently in stable-queue which might be from kuninori.morimoto.gx(a)renesas.com are
queue-4.4/asoc-rsnd-don-t-call-free_irq-on-parent-ssi.patch
queue-4.4/asoc-rsnd-avoid-duplicate-free_irq.patch
This is a note to let you know that I've just added the patch titled
ASoC: simple-card: Fix misleading error message
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-simple-card-fix-misleading-error-message.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 7ac45d1635a4cd2e99a4b11903d4a2815ca1b27b Mon Sep 17 00:00:00 2001
From: Julian Scheel <julian(a)jusst.de>
Date: Wed, 24 May 2017 12:28:23 +0200
Subject: ASoC: simple-card: Fix misleading error message
From: Julian Scheel <julian(a)jusst.de>
commit 7ac45d1635a4cd2e99a4b11903d4a2815ca1b27b upstream.
In case cpu could not be found the error message would always refer to
/codec/ not being found in DT. Fix this by catching the cpu node not found
case explicitly.
Signed-off-by: Julian Scheel <julian(a)jusst.de>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: thongsyho <thong.ho.px(a)rvc.renesas.com>
Signed-off-by: Nhan Nguyen <nhan.nguyen.yb(a)renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/generic/simple-card.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/sound/soc/generic/simple-card.c
+++ b/sound/soc/generic/simple-card.c
@@ -343,13 +343,19 @@ static int asoc_simple_card_dai_link_of(
snprintf(prop, sizeof(prop), "%scpu", prefix);
cpu = of_get_child_by_name(node, prop);
+ if (!cpu) {
+ ret = -EINVAL;
+ dev_err(dev, "%s: Can't find %s DT node\n", __func__, prop);
+ goto dai_link_of_err;
+ }
+
snprintf(prop, sizeof(prop), "%splat", prefix);
plat = of_get_child_by_name(node, prop);
snprintf(prop, sizeof(prop), "%scodec", prefix);
codec = of_get_child_by_name(node, prop);
- if (!cpu || !codec) {
+ if (!codec) {
ret = -EINVAL;
dev_err(dev, "%s: Can't find %s DT node\n", __func__, prop);
goto dai_link_of_err;
Patches currently in stable-queue which might be from julian(a)jusst.de are
queue-4.4/asoc-simple-card-fix-misleading-error-message.patch
This is a note to let you know that I've just added the patch titled
ASoC: rsnd: avoid duplicate free_irq()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-rsnd-avoid-duplicate-free_irq.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e0936c3471a8411a5df327641fa3ffe12a2fb07b Mon Sep 17 00:00:00 2001
From: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Date: Wed, 9 Aug 2017 02:16:20 +0000
Subject: ASoC: rsnd: avoid duplicate free_irq()
From: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
commit e0936c3471a8411a5df327641fa3ffe12a2fb07b upstream.
commit 1f8754d4daea5f ("ASoC: rsnd: don't call free_irq() on
Parent SSI") fixed Parent SSI duplicate free_irq().
But on Renesas Sound, not only Parent SSI but also Multi SSI
have same issue.
This patch avoid duplicate free_irq() if it was not pure SSI.
Fixes: 1f8754d4daea5f ("ASoC: rsnd: don't call free_irq() on Parent SSI")
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx(a)renesas.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: thongsyho <thong.ho.px(a)rvc.renesas.com>
Signed-off-by: Nhan Nguyen <nhan.nguyen.yb(a)renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/sh/rcar/ssi.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/sound/soc/sh/rcar/ssi.c
+++ b/sound/soc/sh/rcar/ssi.c
@@ -550,14 +550,14 @@ static int rsnd_ssi_dma_remove(struct rs
struct rsnd_priv *priv)
{
struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
- struct rsnd_mod *ssi_parent_mod = rsnd_io_to_mod_ssip(io);
+ struct rsnd_mod *pure_ssi_mod = rsnd_io_to_mod_ssi(io);
struct device *dev = rsnd_priv_to_dev(priv);
int irq = ssi->info->irq;
rsnd_dma_quit(io, rsnd_mod_to_dma(mod));
- /* Do nothing for SSI parent mod */
- if (ssi_parent_mod == mod)
+ /* Do nothing if non SSI (= SSI parent, multi SSI) mod */
+ if (pure_ssi_mod != mod)
return 0;
/* PIO will request IRQ again */
Patches currently in stable-queue which might be from kuninori.morimoto.gx(a)renesas.com are
queue-4.4/asoc-rsnd-don-t-call-free_irq-on-parent-ssi.patch
queue-4.4/asoc-rsnd-avoid-duplicate-free_irq.patch
From: Matthias Hintzmann <matthias.dev(a)gmx.de>
ctx->drvflags is checked in the if clause before beeing initialized. Move initialization before first usage.
Note, that the if clause was backported from mainline at Nov. 15th 2017 (GetNtbFormat endian fix). In mainline, the initialization is at the right place before the if clause.
Signed-off-by: Matthias Hintzmann <matthias.dev(a)gmx.de>
---
drivers/net/usb/cdc_ncm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 1228d0da407..72cb30828a1 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -825,6 +825,9 @@ int cdc_ncm_bind_common(struct usbnet *dev, struct usb_interface *intf, u8 data_
goto error2;
}
+ /* Device-specific flags */
+ ctx->drvflags = drvflags;
+
/*
* Some Huawei devices have been observed to come out of reset in NDP32 mode.
* Let's check if this is the case, and set the device to NDP16 mode again if
@@ -873,9 +876,6 @@ int cdc_ncm_bind_common(struct usbnet *dev, struct usb_interface *intf, u8 data_
/* finish setting up the device specific data */
cdc_ncm_setup(dev);
- /* Device-specific flags */
- ctx->drvflags = drvflags;
-
/* Allocate the delayed NDP if needed. */
if (ctx->drvflags & CDC_NCM_FLAG_NDP_TO_END) {
ctx->delayed_ndp16 = kzalloc(ctx->max_ndp_size, GFP_KERNEL);
--
2.15.1
This is the start of the stable review cycle for the 4.15.2 release.
There are 60 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed Feb 7 18:21:57 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.15.2-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.15.2-rc1
Ian Abbott <abbotti(a)mev.co.uk>
fpga: region: release of_parse_phandle nodes after use
Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
serial: core: mark port as initialized after successful IRQ change
KarimAllah Ahmed <karahmed(a)amazon.de>
KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
KarimAllah Ahmed <karahmed(a)amazon.de>
KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
KarimAllah Ahmed <karahmed(a)amazon.de>
KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
Ashok Raj <ashok.raj(a)intel.com>
KVM/x86: Add IBPB support
KarimAllah Ahmed <karahmed(a)amazon.de>
KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX
Darren Kenny <darren.kenny(a)oracle.com>
x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL
Arnd Bergmann <arnd(a)arndb.de>
x86/pti: Mark constant arrays as __initconst
KarimAllah Ahmed <karahmed(a)amazon.de>
x86/spectre: Simplify spectre_v2 command line parsing
David Woodhouse <dwmw(a)amazon.co.uk>
x86/retpoline: Avoid retpolines for built-in __init functions
Dan Williams <dan.j.williams(a)intel.com>
x86/kvm: Update spectre-v1 mitigation
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: VMX: make MSR bitmaps per-VCPU
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/paravirt: Remove 'noreplace-paravirt' cmdline option
Tim Chen <tim.c.chen(a)linux.intel.com>
x86/speculation: Use Indirect Branch Prediction Barrier in context switch
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
Colin Ian King <colin.king(a)canonical.com>
x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable"
Dan Williams <dan.j.williams(a)intel.com>
x86/spectre: Report get_user mitigation for spectre_v1
Dan Williams <dan.j.williams(a)intel.com>
nl80211: Sanitize array index in parse_txq_params
Dan Williams <dan.j.williams(a)intel.com>
vfs, fdtable: Prevent bounds-check bypass via speculative execution
Dan Williams <dan.j.williams(a)intel.com>
x86/syscall: Sanitize syscall table de-references under speculation
Dan Williams <dan.j.williams(a)intel.com>
x86/get_user: Use pointer masking to limit speculation
Dan Williams <dan.j.williams(a)intel.com>
x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec
Dan Williams <dan.j.williams(a)intel.com>
x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end}
Dan Williams <dan.j.williams(a)intel.com>
x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec
Dan Williams <dan.j.williams(a)intel.com>
x86: Introduce barrier_nospec
Dan Williams <dan.j.williams(a)intel.com>
x86: Implement array_index_mask_nospec
Dan Williams <dan.j.williams(a)intel.com>
array_index_nospec: Sanitize speculative array de-references
Mark Rutland <mark.rutland(a)arm.com>
Documentation: Document array_index_nospec
Andy Lutomirski <luto(a)kernel.org>
x86/asm: Move 'status' from thread_struct to thread_info
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Push extra regs right away
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Remove the SYSCALL64 fast path
Dou Liyang <douly.fnst(a)cn.fujitsu.com>
x86/spectre: Check CONFIG_RETPOLINE in command line parser
William Grant <william.grant(a)canonical.com>
x86/mm: Fix overlap of i386 CPU_ENTRY_AREA with FIX_BTMAP
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Warn on stripped section symbol
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Add support for alternatives at the end of a section
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Improve retpoline alternative handling
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: VMX: introduce alloc_loaded_vmcs
Jim Mattson <jmattson(a)google.com>
KVM: nVMX: Eliminate vmcs02 pool
Jesse Chan <jc(a)linux.com>
ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Jesse Chan <jc(a)linux.com>
pinctrl: pxa: pxa2xx: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Linus Walleij <linus.walleij(a)linaro.org>
iio: adc/accel: Fix up module licenses
Jesse Chan <jc(a)linux.com>
auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Borislav Petkov <bp(a)suse.de>
x86/speculation: Simplify indirect_branch_prediction_barrier()
Borislav Petkov <bp(a)alien8.de>
x86/retpoline: Simplify vmexit_fill_RSB()
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeatures: Clean up Spectre v2 related CPUID flags
Thomas Gleixner <tglx(a)linutronix.de>
x86/cpu/bugs: Make retpoline module warning conditional
Borislav Petkov <bp(a)suse.de>
x86/bugs: Drop one "mitigation" from dmesg
Borislav Petkov <bp(a)suse.de>
x86/nospec: Fix header guards names
Borislav Petkov <bp(a)suse.de>
x86/alternative: Print unadorned pointers
David Woodhouse <dwmw(a)amazon.co.uk>
x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes
David Woodhouse <dwmw(a)amazon.co.uk>
x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
David Woodhouse <dwmw(a)amazon.co.uk>
x86/msr: Add definitions for new speculation control MSRs
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeatures: Add AMD feature bits for Speculation Control
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeatures: Add Intel feature bits for Speculation Control
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeatures: Add CPUID_7_EDX CPUID leaf
Andi Kleen <ak(a)linux.intel.com>
module/retpoline: Warn about missing retpoline in module
Peter Zijlstra <peterz(a)infradead.org>
KVM: VMX: Make indirect call speculation safe
Peter Zijlstra <peterz(a)infradead.org>
KVM: x86: Make indirect calls in emulator speculation safe
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 2 -
Documentation/speculation.txt | 90 ++++
Makefile | 4 +-
arch/x86/entry/common.c | 9 +-
arch/x86/entry/entry_32.S | 3 +-
arch/x86/entry/entry_64.S | 130 +----
arch/x86/entry/syscall_64.c | 7 +-
arch/x86/include/asm/asm-prototypes.h | 3 +
arch/x86/include/asm/barrier.h | 28 +
arch/x86/include/asm/cpufeature.h | 7 +-
arch/x86/include/asm/cpufeatures.h | 22 +-
arch/x86/include/asm/disabled-features.h | 3 +-
arch/x86/include/asm/fixmap.h | 6 +-
arch/x86/include/asm/msr-index.h | 12 +
arch/x86/include/asm/msr.h | 3 +-
arch/x86/include/asm/nospec-branch.h | 86 +--
arch/x86/include/asm/pgtable_32_types.h | 5 +-
arch/x86/include/asm/processor.h | 5 +-
arch/x86/include/asm/required-features.h | 3 +-
arch/x86/include/asm/syscall.h | 6 +-
arch/x86/include/asm/thread_info.h | 3 +-
arch/x86/include/asm/tlbflush.h | 2 +
arch/x86/include/asm/uaccess.h | 15 +-
arch/x86/include/asm/uaccess_32.h | 6 +-
arch/x86/include/asm/uaccess_64.h | 12 +-
arch/x86/kernel/alternative.c | 28 +-
arch/x86/kernel/cpu/bugs.c | 134 +++--
arch/x86/kernel/cpu/common.c | 70 ++-
arch/x86/kernel/cpu/intel.c | 66 +++
arch/x86/kernel/cpu/scattered.c | 2 -
arch/x86/kernel/process_64.c | 4 +-
arch/x86/kernel/ptrace.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
arch/x86/kvm/cpuid.c | 22 +-
arch/x86/kvm/cpuid.h | 1 +
arch/x86/kvm/emulate.c | 9 +-
arch/x86/kvm/svm.c | 116 +++++
arch/x86/kvm/vmx.c | 660 ++++++++++++++----------
arch/x86/kvm/x86.c | 1 +
arch/x86/lib/Makefile | 1 +
arch/x86/lib/getuser.S | 10 +
arch/x86/lib/retpoline.S | 56 ++
arch/x86/lib/usercopy_32.c | 8 +-
arch/x86/mm/tlb.c | 33 +-
drivers/auxdisplay/img-ascii-lcd.c | 4 +
drivers/fpga/fpga-region.c | 13 +-
drivers/iio/accel/kxsd9-i2c.c | 3 +
drivers/iio/adc/qcom-vadc-common.c | 4 +
drivers/pinctrl/pxa/pinctrl-pxa2xx.c | 4 +
drivers/tty/serial/serial_core.c | 2 +
include/linux/fdtable.h | 5 +-
include/linux/init.h | 9 +-
include/linux/module.h | 9 +
include/linux/nospec.h | 72 +++
kernel/module.c | 11 +
net/wireless/nl80211.c | 9 +-
scripts/mod/modpost.c | 9 +
sound/soc/codecs/pcm512x-spi.c | 4 +
tools/objtool/check.c | 89 ++--
tools/objtool/orc_gen.c | 5 +
60 files changed, 1312 insertions(+), 637 deletions(-)
This is the start of the stable review cycle for the 3.18.94 release.
There are 36 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed Feb 7 18:23:41 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.94-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 3.18.94-rc1
Jesse Chan <jc(a)linux.com>
ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Stefan Agner <stefan(a)agner.ch>
spi: imx: do not access registers while clocks disabled
Mark Salyzyn <salyzyn(a)android.com>
selinux: general protection fault in sock_has_perm
Oliver Neukum <oneukum(a)suse.com>
usb: uas: unconditionally bring back host after reset
Hemant Kumar <hemantk(a)codeaurora.org>
usb: f_fs: Prevent gadget unbind if it is already unbound
Johan Hovold <johan(a)kernel.org>
USB: serial: simple: add Motorola Tetra driver
Shuah Khan <shuahkh(a)osg.samsung.com>
usbip: list: don't list devices attached to vhci_hcd
Shuah Khan <shuahkh(a)osg.samsung.com>
usbip: prevent bind loops on devices attached to vhci_hcd
Jia-Ju Bai <baijiaju1990(a)gmail.com>
USB: serial: io_edgeport: fix possible sleep-in-atomic
Oliver Neukum <oneukum(a)suse.com>
CDC-ACM: apply quirk for card reader
Hans de Goede <hdegoede(a)redhat.com>
USB: cdc-acm: Do not log urb submission errors on disconnect
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
USB: serial: pl2303: new device id for Chilitag
Larry Finger <Larry.Finger(a)lwfinger.net>
staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID
Colin Ian King <colin.king(a)canonical.com>
usb: gadget: don't dereference g until after it has been null checked
Icenowy Zheng <icenowy(a)aosc.io>
media: usbtv: add a new usbid
Gustavo A. R. Silva <garsilva(a)embeddedor.com>
scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
quota: Check for register_shrinker() failure.
Geert Uytterhoeven <geert+renesas(a)glider.be>
net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit
Robert Lippert <roblip(a)gmail.com>
hwmon: (pmbus) Use 64bit math for DIRECT format values
Andrew Elble <aweits(a)rit.edu>
nfsd: check for use of the closed special stateid
Trond Myklebust <trond.myklebust(a)primarydata.com>
nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0)
Eduardo Otubo <otubo(a)redhat.com>
xen-netfront: remove warning when unloading module
Wanpeng Li <wanpeng.li(a)hotmail.com>
KVM: VMX: Fix rflags cache during vCPU reset
Chun-Yeow Yeoh <yeohchunyeow(a)gmail.com>
mac80211: fix the update of path metric for RANN frame
Michael Lyle <mlyle(a)lyle.org>
bcache: check return value of register_shrinker
Wanpeng Li <wanpeng.li(a)hotmail.com>
KVM: X86: Fix operand/address-size during instruction decoding
Liran Alon <liran.alon(a)oracle.com>
KVM: x86: Don't re-execute instruction when not passing CR2 value
Liran Alon <liran.alon(a)oracle.com>
KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
Lyude Paul <lyude(a)redhat.com>
igb: Free IRQs when device is hotplugged
Jesse Chan <jc(a)linux.com>
gpio: iop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Takashi Iwai <tiwai(a)suse.de>
ALSA: seq: Make ioctls race-free
Linus Torvalds <torvalds(a)linux-foundation.org>
loop: fix concurrent lo_open/lo_release
Richard Weinberger <richard(a)nod.at>
um: Remove copy&paste code from init.h
Richard Weinberger <richard(a)nod.at>
um: Stop abusing __KERNEL__
Thomas Meyer <thomas(a)m3y3r.de>
um: link vmlinux with -no-pie
Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Input: do not emit unneeded EV_SYN when suspending
-------------
Diffstat:
Makefile | 4 ++--
arch/um/Makefile | 9 +++++----
arch/um/drivers/mconsole.h | 2 +-
arch/um/include/shared/init.h | 24 ++----------------------
arch/um/include/shared/user.h | 2 +-
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/emulate.c | 7 +++++++
arch/x86/kvm/vmx.c | 4 ++--
arch/x86/kvm/x86.c | 2 +-
arch/x86/um/shared/sysdep/tls.h | 6 +++---
drivers/block/loop.c | 10 ++++++++--
drivers/gpio/gpio-iop.c | 4 ++++
drivers/hwmon/pmbus/pmbus_core.c | 21 ++++++++++++---------
drivers/input/input.c | 5 ++++-
drivers/md/bcache/btree.c | 5 ++++-
drivers/media/usb/usbtv/usbtv-core.c | 1 +
drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
drivers/net/ethernet/xilinx/Kconfig | 1 +
drivers/net/xen-netfront.c | 18 ++++++++++++++++++
drivers/scsi/ufs/ufshcd.c | 7 +++++--
drivers/spi/spi-imx.c | 15 +++++++++++++--
drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 14 ++++----------
drivers/usb/class/cdc-acm.c | 5 ++++-
drivers/usb/gadget/composite.c | 7 +++++--
drivers/usb/gadget/function/f_fs.c | 3 ++-
drivers/usb/serial/Kconfig | 1 +
drivers/usb/serial/io_edgeport.c | 1 -
drivers/usb/serial/pl2303.c | 1 +
drivers/usb/serial/pl2303.h | 1 +
drivers/usb/serial/usb-serial-simple.c | 7 +++++++
drivers/usb/storage/uas.c | 7 +++----
fs/nfsd/nfs4state.c | 15 +++++++++++++--
fs/quota/dquot.c | 3 ++-
net/mac80211/mesh_hwmp.c | 15 +++++++++------
security/selinux/hooks.c | 2 ++
sound/core/seq/seq_clientmgr.c | 10 ++++++++--
sound/core/seq/seq_clientmgr.h | 1 +
sound/soc/codecs/pcm512x-spi.c | 4 ++++
tools/usb/usbip/src/usbip_bind.c | 9 +++++++++
tools/usb/usbip/src/usbip_list.c | 9 +++++++++
40 files changed, 182 insertions(+), 85 deletions(-)
Alex, here is a change to vaddr_get_pfn() that we discussed in this
thread: https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg07117.html
Namely, drop support for passing Filesystem-DAX mappings through to
guests. Perhaps in the future we can create some para-virtualized
passthrough interface to coordinate guest-DMA vs host-filesystem
operations. For now, this needs to be disabled for data-integrity and
guaranteeing forward progress of filesystem operations.
If you want to take this through your tree please grab the other dax
fixups as well. Otherwise, let me know and I'll take the lot through the
nvdimm tree.
---
Dan Williams (3):
dax: fix S_DAX definition
dax: short circuit vma_is_fsdax() in the CONFIG_FS_DAX=n case
vfio: disable filesystem-dax page pinning
drivers/vfio/vfio_iommu_type1.c | 18 +++++++++++++++---
include/linux/fs.h | 4 +++-
2 files changed, 18 insertions(+), 4 deletions(-)
This is the start of the stable review cycle for the 4.14.18 release.
There are 64 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed Feb 7 18:21:22 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.18-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.18-rc1
Ian Abbott <abbotti(a)mev.co.uk>
fpga: region: release of_parse_phandle nodes after use
Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
serial: core: mark port as initialized after successful IRQ change
KarimAllah Ahmed <karahmed(a)amazon.de>
KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
KarimAllah Ahmed <karahmed(a)amazon.de>
KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
KarimAllah Ahmed <karahmed(a)amazon.de>
KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
Ashok Raj <ashok.raj(a)intel.com>
KVM/x86: Add IBPB support
KarimAllah Ahmed <karahmed(a)amazon.de>
KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX
Darren Kenny <darren.kenny(a)oracle.com>
x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL
Arnd Bergmann <arnd(a)arndb.de>
x86/pti: Mark constant arrays as __initconst
KarimAllah Ahmed <karahmed(a)amazon.de>
x86/spectre: Simplify spectre_v2 command line parsing
David Woodhouse <dwmw(a)amazon.co.uk>
x86/retpoline: Avoid retpolines for built-in __init functions
Dan Williams <dan.j.williams(a)intel.com>
x86/kvm: Update spectre-v1 mitigation
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: VMX: make MSR bitmaps per-VCPU
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/paravirt: Remove 'noreplace-paravirt' cmdline option
Tim Chen <tim.c.chen(a)linux.intel.com>
x86/speculation: Use Indirect Branch Prediction Barrier in context switch
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
Colin Ian King <colin.king(a)canonical.com>
x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable"
Dan Williams <dan.j.williams(a)intel.com>
x86/spectre: Report get_user mitigation for spectre_v1
Dan Williams <dan.j.williams(a)intel.com>
nl80211: Sanitize array index in parse_txq_params
Dan Williams <dan.j.williams(a)intel.com>
vfs, fdtable: Prevent bounds-check bypass via speculative execution
Dan Williams <dan.j.williams(a)intel.com>
x86/syscall: Sanitize syscall table de-references under speculation
Dan Williams <dan.j.williams(a)intel.com>
x86/get_user: Use pointer masking to limit speculation
Dan Williams <dan.j.williams(a)intel.com>
x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec
Dan Williams <dan.j.williams(a)intel.com>
x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end}
Dan Williams <dan.j.williams(a)intel.com>
x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec
Dan Williams <dan.j.williams(a)intel.com>
x86: Introduce barrier_nospec
Dan Williams <dan.j.williams(a)intel.com>
x86: Implement array_index_mask_nospec
Dan Williams <dan.j.williams(a)intel.com>
array_index_nospec: Sanitize speculative array de-references
Mark Rutland <mark.rutland(a)arm.com>
Documentation: Document array_index_nospec
Andy Lutomirski <luto(a)kernel.org>
x86/asm: Move 'status' from thread_struct to thread_info
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Push extra regs right away
Andy Lutomirski <luto(a)kernel.org>
x86/entry/64: Remove the SYSCALL64 fast path
Dou Liyang <douly.fnst(a)cn.fujitsu.com>
x86/spectre: Check CONFIG_RETPOLINE in command line parser
William Grant <william.grant(a)canonical.com>
x86/mm: Fix overlap of i386 CPU_ENTRY_AREA with FIX_BTMAP
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Warn on stripped section symbol
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Add support for alternatives at the end of a section
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Improve retpoline alternative handling
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: VMX: introduce alloc_loaded_vmcs
Jim Mattson <jmattson(a)google.com>
KVM: nVMX: Eliminate vmcs02 pool
Jesse Chan <jc(a)linux.com>
ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Jesse Chan <jc(a)linux.com>
pinctrl: pxa: pxa2xx: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Linus Walleij <linus.walleij(a)linaro.org>
iio: adc/accel: Fix up module licenses
Jesse Chan <jc(a)linux.com>
auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Borislav Petkov <bp(a)suse.de>
x86/speculation: Simplify indirect_branch_prediction_barrier()
Borislav Petkov <bp(a)alien8.de>
x86/retpoline: Simplify vmexit_fill_RSB()
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeatures: Clean up Spectre v2 related CPUID flags
Thomas Gleixner <tglx(a)linutronix.de>
x86/cpu/bugs: Make retpoline module warning conditional
Borislav Petkov <bp(a)suse.de>
x86/bugs: Drop one "mitigation" from dmesg
Borislav Petkov <bp(a)suse.de>
x86/nospec: Fix header guards names
Borislav Petkov <bp(a)suse.de>
x86/alternative: Print unadorned pointers
David Woodhouse <dwmw(a)amazon.co.uk>
x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes
David Woodhouse <dwmw(a)amazon.co.uk>
x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
David Woodhouse <dwmw(a)amazon.co.uk>
x86/msr: Add definitions for new speculation control MSRs
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeatures: Add AMD feature bits for Speculation Control
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeatures: Add Intel feature bits for Speculation Control
David Woodhouse <dwmw(a)amazon.co.uk>
x86/cpufeatures: Add CPUID_7_EDX CPUID leaf
Andi Kleen <ak(a)linux.intel.com>
module/retpoline: Warn about missing retpoline in module
Peter Zijlstra <peterz(a)infradead.org>
KVM: VMX: Make indirect call speculation safe
Peter Zijlstra <peterz(a)infradead.org>
KVM: x86: Make indirect calls in emulator speculation safe
Waiman Long <longman(a)redhat.com>
x86/retpoline: Remove the esp/rsp thunk
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc/64s: Allow control of RFI flush via debugfs
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc/64s: Wire up cpu_show_meltdown()
Liu, Changcheng <changcheng.liu(a)intel.com>
scripts/faddr2line: fix CROSS_COMPILE unset error
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 2 -
Documentation/speculation.txt | 90 ++++
Makefile | 4 +-
arch/powerpc/Kconfig | 1 +
arch/powerpc/kernel/setup_64.c | 38 ++
arch/x86/entry/common.c | 9 +-
arch/x86/entry/entry_32.S | 3 +-
arch/x86/entry/entry_64.S | 130 +----
arch/x86/entry/syscall_64.c | 7 +-
arch/x86/include/asm/asm-prototypes.h | 4 +-
arch/x86/include/asm/barrier.h | 28 +
arch/x86/include/asm/cpufeature.h | 7 +-
arch/x86/include/asm/cpufeatures.h | 22 +-
arch/x86/include/asm/disabled-features.h | 3 +-
arch/x86/include/asm/fixmap.h | 6 +-
arch/x86/include/asm/msr-index.h | 12 +
arch/x86/include/asm/msr.h | 3 +-
arch/x86/include/asm/nospec-branch.h | 86 +--
arch/x86/include/asm/pgtable_32_types.h | 5 +-
arch/x86/include/asm/processor.h | 5 +-
arch/x86/include/asm/required-features.h | 3 +-
arch/x86/include/asm/syscall.h | 6 +-
arch/x86/include/asm/thread_info.h | 3 +-
arch/x86/include/asm/tlbflush.h | 2 +
arch/x86/include/asm/uaccess.h | 15 +-
arch/x86/include/asm/uaccess_32.h | 6 +-
arch/x86/include/asm/uaccess_64.h | 12 +-
arch/x86/kernel/alternative.c | 28 +-
arch/x86/kernel/cpu/bugs.c | 134 +++--
arch/x86/kernel/cpu/common.c | 70 ++-
arch/x86/kernel/cpu/intel.c | 66 +++
arch/x86/kernel/cpu/scattered.c | 2 -
arch/x86/kernel/process_64.c | 4 +-
arch/x86/kernel/ptrace.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
arch/x86/kvm/cpuid.c | 22 +-
arch/x86/kvm/cpuid.h | 1 +
arch/x86/kvm/emulate.c | 9 +-
arch/x86/kvm/svm.c | 116 +++++
arch/x86/kvm/vmx.c | 660 ++++++++++++++----------
arch/x86/kvm/x86.c | 1 +
arch/x86/lib/Makefile | 1 +
arch/x86/lib/getuser.S | 10 +
arch/x86/lib/retpoline.S | 57 +-
arch/x86/lib/usercopy_32.c | 8 +-
arch/x86/mm/tlb.c | 33 +-
drivers/auxdisplay/img-ascii-lcd.c | 4 +
drivers/fpga/fpga-region.c | 13 +-
drivers/iio/accel/kxsd9-i2c.c | 3 +
drivers/iio/adc/qcom-vadc-common.c | 4 +
drivers/pinctrl/pxa/pinctrl-pxa2xx.c | 4 +
drivers/tty/serial/serial_core.c | 2 +
include/linux/fdtable.h | 5 +-
include/linux/init.h | 9 +-
include/linux/module.h | 9 +
include/linux/nospec.h | 72 +++
kernel/module.c | 11 +
net/wireless/nl80211.c | 9 +-
scripts/faddr2line | 8 +-
scripts/mod/modpost.c | 9 +
sound/soc/codecs/pcm512x-spi.c | 4 +
tools/objtool/check.c | 89 ++--
tools/objtool/orc_gen.c | 5 +
63 files changed, 1355 insertions(+), 643 deletions(-)
Hi,
On 02/08/2016 at 11:50:16 +1000, Stewart Smith wrote:
> According to the OPAL docs:
> https://github.com/open-power/skiboot/blob/skiboot-5.2.5/doc/opal-api/opal-…
> https://github.com/open-power/skiboot/blob/skiboot-5.2.5/doc/opal-api/opal-…
> OPAL_HARDWARE may be returned from OPAL_RTC_READ or OPAL_RTC_WRITE and this
> indicates either a transient or permanent error.
>
> Prior to this patch, Linux was not dealing with OPAL_HARDWARE being a
> permanent error particularly well, in that you could end up in a busy
> loop.
>
> This was not too hard to trigger on an AMI BMC based OpenPOWER machine
> doing a continuous "ipmitool mc reset cold" to the BMC, the result of
> that being that we'd get stuck in an infinite loop in opal_get_rtc_time.
>
> We now retry a few times before returning the error higher up the stack.
>
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Stewart Smith <stewart(a)linux.vnet.ibm.com>
> ---
> drivers/rtc/rtc-opal.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
Just a note to let you know that this patch should have gone through my
tree but it was not sent to linux-rtc or me.
I guess what happened is that Michael cleaned up the Linux PPC patchwork
queue.
--
Alexandre Belloni, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com
Commit-ID: 8e1eb3fa009aa7c0b944b3c8b26b07de0efb3200
Gitweb: https://git.kernel.org/tip/8e1eb3fa009aa7c0b944b3c8b26b07de0efb3200
Author: Dan Williams <dan.j.williams(a)intel.com>
AuthorDate: Mon, 5 Feb 2018 17:18:05 -0800
Committer: Ingo Molnar <mingo(a)kernel.org>
CommitDate: Tue, 6 Feb 2018 08:30:27 +0100
x86/entry/64: Clear extra registers beyond syscall arguments, to reduce speculation attack surface
At entry userspace may have (maliciously) populated the extra registers
outside the syscall calling convention with arbitrary values that could
be useful in a speculative execution (Spectre style) attack.
Clear these registers to minimize the kernel's attack surface.
Note, this only clears the extra registers and not the unused
registers for syscalls less than 6 arguments, since those registers are
likely to be clobbered well before their values could be put to use
under speculation.
Note, Linus found that the XOR instructions can be executed with
minimized cost if interleaved with the PUSH instructions, and Ingo's
analysis found that R10 and R11 should be included in the register
clearing beyond the typical 'extra' syscall calling convention
registers.
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Reported-by: Andi Kleen <ak(a)linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/151787988577.7847.16733592218894189003.stgit@dwill…
[ Made small improvements to the changelog and the code comments. ]
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
---
arch/x86/entry/entry_64.S | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index c752abe..065a71b 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -235,13 +235,26 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
pushq %r8 /* pt_regs->r8 */
pushq %r9 /* pt_regs->r9 */
pushq %r10 /* pt_regs->r10 */
+ /*
+ * Clear extra registers that a speculation attack might
+ * otherwise want to exploit. Interleave XOR with PUSH
+ * for better uop scheduling:
+ */
+ xorq %r10, %r10 /* nospec r10 */
pushq %r11 /* pt_regs->r11 */
+ xorq %r11, %r11 /* nospec r11 */
pushq %rbx /* pt_regs->rbx */
+ xorl %ebx, %ebx /* nospec rbx */
pushq %rbp /* pt_regs->rbp */
+ xorl %ebp, %ebp /* nospec rbp */
pushq %r12 /* pt_regs->r12 */
+ xorq %r12, %r12 /* nospec r12 */
pushq %r13 /* pt_regs->r13 */
+ xorq %r13, %r13 /* nospec r13 */
pushq %r14 /* pt_regs->r14 */
+ xorq %r14, %r14 /* nospec r14 */
pushq %r15 /* pt_regs->r15 */
+ xorq %r15, %r15 /* nospec r15 */
UNWIND_HINT_REGS
TRACE_IRQS_OFF
From: Russell King <rmk+kernel(a)armlinux.org.uk>
[ Upstream commit f5e64032a799d4f54decc7eb6aafcdffb67f9ad9 ]
When a PHY has the BMCR_PDOWN bit set, it may decide to ignore writes
to other registers, or reset the registers to power-on defaults.
Micrel PHYs do this for their interrupt registers.
The current structure of phylib tries to enable interrupts before
resuming (and releasing) the BMCR_PDOWN bit. This fails, causing
Micrel PHYs to stop working after a suspend/resume sequence if they
are using interrupts.
Fix this by ensuring that the PHY driver resume methods do not take
the phydev->lock mutex themselves, but the callers of phy_resume()
take that lock. This then allows us to move the call to phy_resume()
before we enable interrupts in phy_start().
Signed-off-by: Russell King <rmk+kernel(a)armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
drivers/net/phy/at803x.c | 4 ----
drivers/net/phy/phy.c | 9 +++------
drivers/net/phy/phy_device.c | 10 ++++++----
3 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index 5f93e6add563..e911e4990b20 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -239,14 +239,10 @@ static int at803x_resume(struct phy_device *phydev)
{
int value;
- mutex_lock(&phydev->lock);
-
value = phy_read(phydev, MII_BMCR);
value &= ~(BMCR_PDOWN | BMCR_ISOLATE);
phy_write(phydev, MII_BMCR, value);
- mutex_unlock(&phydev->lock);
-
return 0;
}
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 2b1e67bc1e73..ed10d1fc8f59 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -828,7 +828,6 @@ EXPORT_SYMBOL(phy_stop);
*/
void phy_start(struct phy_device *phydev)
{
- bool do_resume = false;
int err = 0;
mutex_lock(&phydev->lock);
@@ -841,6 +840,9 @@ void phy_start(struct phy_device *phydev)
phydev->state = PHY_UP;
break;
case PHY_HALTED:
+ /* if phy was suspended, bring the physical link up again */
+ phy_resume(phydev);
+
/* make sure interrupts are re-enabled for the PHY */
if (phydev->irq != PHY_POLL) {
err = phy_enable_interrupts(phydev);
@@ -849,17 +851,12 @@ void phy_start(struct phy_device *phydev)
}
phydev->state = PHY_RESUMING;
- do_resume = true;
break;
default:
break;
}
mutex_unlock(&phydev->lock);
- /* if phy was suspended, bring the physical link up again */
- if (do_resume)
- phy_resume(phydev);
-
phy_trigger_machine(phydev, true);
}
EXPORT_SYMBOL(phy_start);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 67f25ac29025..b15b31ca2618 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -135,7 +135,9 @@ static int mdio_bus_phy_resume(struct device *dev)
if (!mdio_bus_phy_may_suspend(phydev))
goto no_resume;
+ mutex_lock(&phydev->lock);
ret = phy_resume(phydev);
+ mutex_unlock(&phydev->lock);
if (ret < 0)
return ret;
@@ -1026,7 +1028,9 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
if (err)
goto error;
+ mutex_lock(&phydev->lock);
phy_resume(phydev);
+ mutex_unlock(&phydev->lock);
phy_led_triggers_register(phydev);
return err;
@@ -1157,6 +1161,8 @@ int phy_resume(struct phy_device *phydev)
struct phy_driver *phydrv = to_phy_driver(phydev->mdio.dev.driver);
int ret = 0;
+ WARN_ON(!mutex_is_locked(&phydev->lock));
+
if (phydev->drv && phydrv->resume)
ret = phydrv->resume(phydev);
@@ -1639,13 +1645,9 @@ int genphy_resume(struct phy_device *phydev)
{
int value;
- mutex_lock(&phydev->lock);
-
value = phy_read(phydev, MII_BMCR);
phy_write(phydev, MII_BMCR, value & ~BMCR_PDOWN);
- mutex_unlock(&phydev->lock);
-
return 0;
}
EXPORT_SYMBOL(genphy_resume);
--
2.11.0
On Tue, Feb 06, 2018 at 06:48:53AM +0000, Harsh Shandilya wrote:
> On Tue 6 Feb, 2018, 12:09 AM Greg Kroah-Hartman, <gregkh(a)linuxfoundation.org>
> wrote:
>
> > This is the start of the stable review cycle for the 3.18.94 release.
> > There are 36 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Wed Feb 7 18:23:41 UTC 2018.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >
> > kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.94-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > linux-3.18.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> >
>
> Builds and boots on the OnePlus 3T, no regressions noticed.
Yeah! That device is the only reason I keep this tree alive :)
thanks for testing and letting me know.
greg k-h
This is a note to let you know that I've just added the patch titled
um: Fix out-of-tree build
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
um-fix-out-of-tree-build.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0b5aedfe0e6654ec54f35109e1929a1cf7fc4cdd Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard(a)nod.at>
Date: Sun, 28 Jun 2015 22:55:26 +0200
Subject: um: Fix out-of-tree build
From: Richard Weinberger <richard(a)nod.at>
commit 0b5aedfe0e6654ec54f35109e1929a1cf7fc4cdd upstream.
Commit 30b11ee9a (um: Remove copy&paste code from init.h)
uncovered an issue wrt. out-of-tree builds.
For out-of-tree builds, we must not rely on relative paths.
Before 30b11ee9a it worked by chance as no host code included
generated header files.
Acked-by: Randy Dunlap <rdunlap(a)infradead.org>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/um/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -70,8 +70,8 @@ KBUILD_AFLAGS += $(ARCH_INCLUDE)
USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
$(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
- -D_FILE_OFFSET_BITS=64 -idirafter include \
- -D__KERNEL__ -D__UM_HOST__
+ -D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
+ -idirafter $(obj)/include -D__KERNEL__ -D__UM_HOST__
#This will adjust *FLAGS accordingly to the platform.
include $(srctree)/$(ARCH_DIR)/Makefile-os-$(OS)
Patches currently in stable-queue which might be from richard(a)nod.at are
queue-3.18/um-stop-abusing-__kernel__.patch
queue-3.18/um-link-vmlinux-with-no-pie.patch
queue-3.18/um-remove-copy-paste-code-from-init.h.patch
queue-3.18/um-fix-out-of-tree-build.patch
When a request is preempted, it is unsubmitted from the HW queue and
removed from the active list of breadcrumbs. In the process, this
however triggers the signaler and it may see the clear rbtree with the
old, and still valid, seqno. This confuses the signaler into action and
signaling the fence.
Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v4.12+
---
drivers/gpu/drm/i915/intel_breadcrumbs.c | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index efbc627a2a25..b955f7d7bd0f 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -588,29 +588,16 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
spin_unlock_irq(&b->rb_lock);
}
-static bool signal_valid(const struct drm_i915_gem_request *request)
-{
- return intel_wait_check_request(&request->signaling.wait, request);
-}
-
static bool signal_complete(const struct drm_i915_gem_request *request)
{
if (!request)
return false;
- /* If another process served as the bottom-half it may have already
- * signalled that this wait is already completed.
- */
- if (intel_wait_complete(&request->signaling.wait))
- return signal_valid(request);
-
- /* Carefully check if the request is complete, giving time for the
+ /*
+ * Carefully check if the request is complete, giving time for the
* seqno to be visible or if the GPU hung.
*/
- if (__i915_request_irq_complete(request))
- return true;
-
- return false;
+ return __i915_request_irq_complete(request);
}
static struct drm_i915_gem_request *to_signaler(struct rb_node *rb)
@@ -712,6 +699,7 @@ static int intel_breadcrumbs_signaler(void *arg)
&request->fence.flags)) {
local_bh_disable();
dma_fence_signal(&request->fence);
+ GEM_BUG_ON(!i915_gem_request_completed(request));
local_bh_enable(); /* kick start the tasklets */
}
--
2.15.1
On Mon, Feb 05, 2018 at 11:41:54AM +0000, Mark Brown wrote:
>On Sat, Feb 03, 2018 at 06:00:42PM +0000, Sasha Levin wrote:
>> From: Abhijeet Kumar <abhijeet.kumar(a)intel.com>
>>
>> [ Upstream commit d070f7c703ef26e3db613f24206823f916272fc6 ]
>>
>> In skylake platform, we hear a loud pop noise(0 dB) at start of
>> audio capture power up sequence. This patch removes the pop noise
>> from the recording by adding a delay before enabling ADC.
>
>> switch (event) {
>> case SND_SOC_DAPM_POST_PMU:
>> + msleep(125);
>> regmap_update_bits(nau8825->regmap, NAU8825_REG_ENA_CTRL,
>
>This is another one of these where I'm not convinced that backporting to
>stable isn't going to introduce regressions - it will fix problems on
>some systems but if someone really cares about a startup pop and has
>arranged to do something like discard the start of the audio where the
>pop occurs then this may cut off even more and cause them to miss
>things.
It might cause regressions for users who worked around the issue, but it
may also fix the issue for users who didn't. I think that in general
we'd rather have mainline and stable trees have the same bugs and
breakage - we don't want surprises when users jump around between
kernels.
--
Thanks,
Sasha
From: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
commit 641307df71fe77d7b38a477067495ede05d47295 upstream.
When stopping the CRTC the driver must disable all planes and wait for
the change to take effect at the next vblank. Merely calling
drm_crtc_wait_one_vblank() is not enough, as the function doesn't
include any mechanism to handle the race with vblank interrupts.
Replace the drm_crtc_wait_one_vblank() call with a manual mechanism that
handles the vblank interrupt race.
Cc: stable(a)vger.kernel.org
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Signed-off-by: thongsyho <thong.ho.px(a)rvc.renesas.com>
Signed-off-by: Nhan Nguyen <nhan.nguyen.yb(a)renesas.com>
---
drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 54 ++++++++++++++++++++++++++++++----
drivers/gpu/drm/rcar-du/rcar_du_crtc.h | 8 +++++
2 files changed, 56 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
index 25bdab6..6fab079 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -371,6 +371,31 @@ static void rcar_du_crtc_start(struct rcar_du_crtc *rcrtc)
rcrtc->started = true;
}
+static void rcar_du_crtc_disable_planes(struct rcar_du_crtc *rcrtc)
+{
+ struct rcar_du_device *rcdu = rcrtc->group->dev;
+ struct drm_crtc *crtc = &rcrtc->crtc;
+ u32 status;
+ /* Make sure vblank interrupts are enabled. */
+ drm_crtc_vblank_get(crtc);
+ /*
+ * Disable planes and calculate how many vertical blanking interrupts we
+ * have to wait for. If a vertical blanking interrupt has been triggered
+ * but not processed yet, we don't know whether it occurred before or
+ * after the planes got disabled. We thus have to wait for two vblank
+ * interrupts in that case.
+ */
+ spin_lock_irq(&rcrtc->vblank_lock);
+ rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
+ status = rcar_du_crtc_read(rcrtc, DSSR);
+ rcrtc->vblank_count = status & DSSR_VBK ? 2 : 1;
+ spin_unlock_irq(&rcrtc->vblank_lock);
+ if (!wait_event_timeout(rcrtc->vblank_wait, rcrtc->vblank_count == 0,
+ msecs_to_jiffies(100)))
+ dev_warn(rcdu->dev, "vertical blanking timeout\n");
+ drm_crtc_vblank_put(crtc);
+}
+
static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
{
struct drm_crtc *crtc = &rcrtc->crtc;
@@ -379,17 +404,16 @@ static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
return;
/* Disable all planes and wait for the change to take effect. This is
- * required as the DSnPR registers are updated on vblank, and no vblank
- * will occur once the CRTC is stopped. Disabling planes when starting
- * the CRTC thus wouldn't be enough as it would start scanning out
- * immediately from old frame buffers until the next vblank.
+ * required as the plane enable registers are updated on vblank, and no
+ * vblank will occur once the CRTC is stopped. Disabling planes when
+ * starting the CRTC thus wouldn't be enough as it would start scanning
+ * out immediately from old frame buffers until the next vblank.
*
* This increases the CRTC stop delay, especially when multiple CRTCs
* are stopped in one operation as we now wait for one vblank per CRTC.
* Whether this can be improved needs to be researched.
*/
- rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
- drm_crtc_wait_one_vblank(crtc);
+ rcar_du_crtc_disable_planes(rcrtc);
/* Disable vertical blanking interrupt reporting. We first need to wait
* for page flip completion before stopping the CRTC as userspace
@@ -528,10 +552,26 @@ static irqreturn_t rcar_du_crtc_irq(int irq, void *arg)
irqreturn_t ret = IRQ_NONE;
u32 status;
+ spin_lock(&rcrtc->vblank_lock);
+
status = rcar_du_crtc_read(rcrtc, DSSR);
rcar_du_crtc_write(rcrtc, DSRCR, status & DSRCR_MASK);
if (status & DSSR_VBK) {
+ /*
+ * Wake up the vblank wait if the counter reaches 0. This must
+ * be protected by the vblank_lock to avoid races in
+ * rcar_du_crtc_disable_planes().
+ */
+ if (rcrtc->vblank_count) {
+ if (--rcrtc->vblank_count == 0)
+ wake_up(&rcrtc->vblank_wait);
+ }
+ }
+
+ spin_unlock(&rcrtc->vblank_lock);
+
+ if (status & DSSR_VBK) {
drm_handle_vblank(rcrtc->crtc.dev, rcrtc->index);
rcar_du_crtc_finish_page_flip(rcrtc);
ret = IRQ_HANDLED;
@@ -585,6 +625,8 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, unsigned int index)
}
init_waitqueue_head(&rcrtc->flip_wait);
+ init_waitqueue_head(&rcrtc->vblank_wait);
+ spin_lock_init(&rcrtc->vblank_lock);
rcrtc->group = rgrp;
rcrtc->mmio_offset = mmio_offsets[index];
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
index 2bbe3f5..be22ce3 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -15,6 +15,7 @@
#define __RCAR_DU_CRTC_H__
#include <linux/mutex.h>
+#include <linux/spinlock.h>
#include <linux/wait.h>
#include <drm/drmP.h>
@@ -32,6 +33,9 @@ struct rcar_du_group;
* @started: whether the CRTC has been started and is running
* @event: event to post when the pending page flip completes
* @flip_wait: wait queue used to signal page flip completion
+ * @vblank_lock: protects vblank_wait and vblank_count
+ * @vblank_wait: wait queue used to signal vertical blanking
+ * @vblank_count: number of vertical blanking interrupts to wait for
* @outputs: bitmask of the outputs (enum rcar_du_output) driven by this CRTC
* @enabled: whether the CRTC is enabled, used to control system resume
* @group: CRTC group this CRTC belongs to
@@ -48,6 +52,10 @@ struct rcar_du_crtc {
struct drm_pending_vblank_event *event;
wait_queue_head_t flip_wait;
+ spinlock_t vblank_lock;
+ wait_queue_head_t vblank_wait;
+ unsigned int vblank_count;
+
unsigned int outputs;
bool enabled;
--
1.9.1
From: Eric Biggers <ebiggers(a)google.com>
commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: stable(a)vger.kernel.org
Cc: Mimi Zohar <zohar(a)linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: James Morris <james.l.morris(a)oracle.com>
Signed-off-by: Jin Qian <jinqian(a)android.com>
---
security/keys/encrypted-keys/encrypted.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/security/keys/encrypted-keys/encrypted.c b/security/keys/encrypted-keys/encrypted.c
index a871159bf03c..ead2fd60244d 100644
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -141,23 +141,22 @@ static int valid_ecryptfs_desc(const char *ecryptfs_desc)
*/
static int valid_master_desc(const char *new_desc, const char *orig_desc)
{
- if (!memcmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_TRUSTED_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_TRUSTED_PREFIX_LEN))
- goto out;
- } else if (!memcmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_USER_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_USER_PREFIX_LEN))
- goto out;
- } else
- goto out;
+ int prefix_len;
+
+ if (!strncmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN))
+ prefix_len = KEY_TRUSTED_PREFIX_LEN;
+ else if (!strncmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN))
+ prefix_len = KEY_USER_PREFIX_LEN;
+ else
+ return -EINVAL;
+
+ if (!new_desc[prefix_len])
+ return -EINVAL;
+
+ if (orig_desc && strncmp(new_desc, orig_desc, prefix_len))
+ return -EINVAL;
+
return 0;
-out:
- return -EINVAL;
}
/*
--
2.16.0.rc1.238.g530d649a79-goog
This is a note to let you know that I've just added the patch titled
serial: core: mark port as initialized after successful IRQ change
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
serial-core-mark-port-as-initialized-after-successful-irq-change.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 44117a1d1732c513875d5a163f10d9adbe866c08 Mon Sep 17 00:00:00 2001
From: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Date: Thu, 11 Jan 2018 18:57:26 +0100
Subject: serial: core: mark port as initialized after successful IRQ change
From: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
commit 44117a1d1732c513875d5a163f10d9adbe866c08 upstream.
setserial changes the IRQ via uart_set_info(). It invokes
uart_shutdown() which free the current used IRQ and clear
TTY_PORT_INITIALIZED. It will then update the IRQ number and invoke
uart_startup() before returning to the caller leaving
TTY_PORT_INITIALIZED cleared.
The next open will crash with
| list_add double add: new=ffffffff839fcc98, prev=ffffffff839fcc98, next=ffffffff839fcc98.
since the close from the IOCTL won't free the IRQ (and clean the list)
due to the TTY_PORT_INITIALIZED check in uart_shutdown().
There is same pattern in uart_do_autoconfig() and I *think* it also
needs to set TTY_PORT_INITIALIZED there.
Is there a reason why uart_startup() does not set the flag by itself
after the IRQ has been acquired (since it is cleared in uart_shutdown)?
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/serial_core.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -965,6 +965,8 @@ static int uart_set_info(struct tty_stru
}
} else {
retval = uart_startup(tty, state, 1);
+ if (retval == 0)
+ tty_port_set_initialized(port, true);
if (retval > 0)
retval = 0;
}
Patches currently in stable-queue which might be from bigeasy(a)linutronix.de are
queue-4.9/serial-core-mark-port-as-initialized-after-successful-irq-change.patch
The first patch in this series isolates and backports a fix to clear
just the USB_PORT_STAT_POWER. Without this fix, client can't use the
imported device.
The second patch is fix to back-ported commit 3eee23c3ec14. tcp_socket
address still present in the status file. This is my bad. The bug fixed
in the first patch in this series masked this bug. With these two fixes,
client can use the imported devices on 4.4
Eric Biggers also reported the tcp_socket address still in the status
file while I am getting the patch ready. I added him to Reported-by.
Shuah Khan (2):
usbip: vhci_hcd: clear just the USB_PORT_STAT_POWER bit
usbip: fix 3eee23c3ec14 tcp_socket address still in the status file
drivers/usb/usbip/vhci_hcd.c | 2 +-
drivers/usb/usbip/vhci_sysfs.c | 7 +++----
2 files changed, 4 insertions(+), 5 deletions(-)
--
2.14.1
This is a note to let you know that I've just added the patch titled
serial: core: mark port as initialized after successful IRQ change
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
serial-core-mark-port-as-initialized-after-successful-irq-change.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 44117a1d1732c513875d5a163f10d9adbe866c08 Mon Sep 17 00:00:00 2001
From: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Date: Thu, 11 Jan 2018 18:57:26 +0100
Subject: serial: core: mark port as initialized after successful IRQ change
From: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
commit 44117a1d1732c513875d5a163f10d9adbe866c08 upstream.
setserial changes the IRQ via uart_set_info(). It invokes
uart_shutdown() which free the current used IRQ and clear
TTY_PORT_INITIALIZED. It will then update the IRQ number and invoke
uart_startup() before returning to the caller leaving
TTY_PORT_INITIALIZED cleared.
The next open will crash with
| list_add double add: new=ffffffff839fcc98, prev=ffffffff839fcc98, next=ffffffff839fcc98.
since the close from the IOCTL won't free the IRQ (and clean the list)
due to the TTY_PORT_INITIALIZED check in uart_shutdown().
There is same pattern in uart_do_autoconfig() and I *think* it also
needs to set TTY_PORT_INITIALIZED there.
Is there a reason why uart_startup() does not set the flag by itself
after the IRQ has been acquired (since it is cleared in uart_shutdown)?
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/serial_core.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -974,6 +974,8 @@ static int uart_set_info(struct tty_stru
}
} else {
retval = uart_startup(tty, state, 1);
+ if (retval == 0)
+ tty_port_set_initialized(port, true);
if (retval > 0)
retval = 0;
}
Patches currently in stable-queue which might be from bigeasy(a)linutronix.de are
queue-4.15/serial-core-mark-port-as-initialized-after-successful-irq-change.patch