This is a note to let you know that I've just added the patch titled
staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 5a96b2d38dc054c0bbcbcd585b116566cbd877fe Mon Sep 17 00:00:00 2001
From: Ben Wolsieffer <benwolsieffer@gmail.com>
Date: Sat, 3 Nov 2018 19:32:20 -0400
Subject: staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION
The compatibility ioctl wrapper for VCHIQ_IOC_AWAIT_COMPLETION assumes that
the native ioctl always uses a message buffer and decrements msgbufcount.
Certain message types do not use a message buffer and in this case
msgbufcount is not decremented, and completion->header for the message is
NULL. Because the wrapper unconditionally decrements msgbufcount, the
calling process may assume that a message buffer has been used even when
it has not.
This results in a memory leak in the userspace code that interfaces with
this driver. When msgbufcount is decremented, the userspace code assumes
that the buffer can be freed through the reference in completion->header,
which cannot happen when the reference is NULL.
This patch causes the wrapper to only decrement msgbufcount when the
native ioctl decrements it. Note that we cannot simply copy the native
ioctl's value of msgbufcount, because the wrapper only retrieves messages
from the native ioctl one at a time, while userspace may request multiple
messages.
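To make the one-at-a-time interaction concrete, here is a hedged sketch
of the wrapper's loop shape (hypothetical and simplified; the helper
native_await_completion() is a stand-in, not the real function, and only
the count/msgbufcount fields come from the driver):

	/* Sketch: the wrapper requests exactly one message per native
	 * call, so msgbufcount == 1 going in.  A value of 0 on return
	 * means the native ioctl consumed a message buffer, and only
	 * then may the 32-bit count be decremented to match. */
	while (args32.count > 0) {
		unsigned int native_count = 1;	/* one message at a time */

		if (native_await_completion(file, &native_count))
			break;

		if (native_count == 0)		/* buffer actually used */
			args32.msgbufcount--;	/* mirror native ioctl */
		args32.count--;
	}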
See https://github.com/raspberrypi/linux/pull/2703 for more discussion of
this patch.
Fixes: 5569a1260933 ("staging: vchiq_arm: Add compatibility wrappers for ioctls")
Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
.../staging/vc04_services/interface/vchiq_arm/vchiq_arm.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
index ea789376de0f..45de21c210c1 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
@@ -1795,6 +1795,7 @@ vchiq_compat_ioctl_await_completion(struct file *file,
struct vchiq_await_completion32 args32;
struct vchiq_completion_data32 completion32;
unsigned int *msgbufcount32;
+ unsigned int msgbufcount_native;
compat_uptr_t msgbuf32;
void *msgbuf;
void **msgbufptr;
@@ -1906,7 +1907,11 @@ vchiq_compat_ioctl_await_completion(struct file *file,
sizeof(completion32)))
return -EFAULT;
- args32.msgbufcount--;
+ if (get_user(msgbufcount_native, &args->msgbufcount))
+ return -EFAULT;
+
+ if (!msgbufcount_native)
+ args32.msgbufcount--;
msgbufcount32 =
&((struct vchiq_await_completion32 __user *)arg)->msgbufcount;
--
2.19.1
This is a note to let you know that I've just added the patch titled
staging: mt7621-pinctrl: fix uninitialized variable ngroups
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From cd56a5141331abfe218d744a3d66e1788135d482 Mon Sep 17 00:00:00 2001
From: Colin Ian King <colin.king@canonical.com>
Date: Sat, 10 Nov 2018 23:28:06 +0000
Subject: staging: mt7621-pinctrl: fix uninitialized variable ngroups
Currently the for_each_node_with_property loop is incrementing the
variable ngroups; however, it was never initialized and hence contains
garbage. Fix this by initializing ngroups to zero.
Detected with static analysis with cppcheck:
[drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c:89]: (error) Uninitialized
variable: ngroups
Fixes: e12a1a6e087b ("staging: mt7621-pinctrl: refactor rt2880_pinctrl_dt_node_to_map function")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c b/drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c
index b8566ed898f1..aa98fbb17013 100644
--- a/drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c
+++ b/drivers/staging/mt7621-pinctrl/pinctrl-rt2880.c
@@ -82,7 +82,7 @@ static int rt2880_pinctrl_dt_node_to_map(struct pinctrl_dev *pctrldev,
struct property *prop;
const char *function_name, *group_name;
int ret;
- int ngroups;
+ int ngroups = 0;
unsigned int reserved_maps = 0;
for_each_node_with_property(np_config, "group")
--
2.19.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7703bdd8d23e6ef057af3253958a793ec6066b28 Mon Sep 17 00:00:00 2001
From: Chris Mason <clm@fb.com>
Date: Wed, 20 Jun 2018 07:56:11 -0700
Subject: [PATCH] Btrfs: don't clean dirty pages during buffered writes
During buffered writes, we follow this basic series of steps:

again:
	lock all the pages
	wait for writeback on all the pages
	Take the extent range lock
	wait for ordered extents on the whole range
	clean all the pages

	if (copy_from_user_in_atomic() hits a fault) {
		drop our locks
		goto again;
	}

	dirty all the pages
	release all the locks
The extra waiting, cleaning and locking are there to make sure we don't
modify pages in flight to the drive, after they've been crc'd.
If some of the pages in the range were already dirty when the write
began, and we need to goto again, we create a window where a dirty page
has been cleaned and unlocked. It may be reclaimed before we're able to
lock it again, which means we'll read the old contents off the drive and
lose any modifications that had been pending writeback.
We don't actually need to clean the pages. All of the other locking in
place makes sure we don't start IO on the pages, so we can just leave
them dirty for the duration of the write.
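As a hedged illustration (pseudocode in the same style as the sequence
above, not actual btrfs code), the window being closed looks like this:

again:
	lock_pages();			/* some may already be dirty */
	clean_pages();			/* old behaviour */
	if (copy_from_user_in_atomic() hits a fault) {
		unlock_pages();		/* pages now clean and unlocked */
		/* reclaim can drop a once-dirty page here without
		 * writing it; relocking rereads stale disk contents */
		goto again;
	}
	dirty_pages();
	unlock_pages();

With the fix, the clean_pages() step is simply skipped; the page and
extent locks already prevent IO from starting on the still-dirty pages.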
Fixes: 73d59314e6ed (the original btrfs merge)
CC: stable@vger.kernel.org # v4.4+
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index d254cf94545f..15b925142793 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -531,6 +531,14 @@ int btrfs_dirty_pages(struct inode *inode, struct page **pages,
end_of_last_block = start_pos + num_bytes - 1;
+ /*
+ * The pages may have already been dirty, clear out old accounting so
+ * we can set things up properly
+ */
+ clear_extent_bit(&BTRFS_I(inode)->io_tree, start_pos, end_of_last_block,
+ EXTENT_DIRTY | EXTENT_DELALLOC |
+ EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 0, 0, cached);
+
if (!btrfs_is_free_space_inode(BTRFS_I(inode))) {
if (start_pos >= isize &&
!(BTRFS_I(inode)->flags & BTRFS_INODE_PREALLOC)) {
@@ -1500,18 +1508,27 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages,
}
if (ordered)
btrfs_put_ordered_extent(ordered);
- clear_extent_bit(&inode->io_tree, start_pos, last_pos,
- EXTENT_DIRTY | EXTENT_DELALLOC |
- EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
- 0, 0, cached_state);
+
*lockstart = start_pos;
*lockend = last_pos;
ret = 1;
}
+ /*
+ * It's possible the pages are dirty right now, but we don't want
+ * to clean them yet because copy_from_user may catch a page fault
+ * and we might have to fall back to one page at a time. If that
+ * happens, we'll unlock these pages and we'd have a window where
+ * reclaim could sneak in and drop the once-dirty page on the floor
+ * without writing it.
+ *
+ * We have the pages locked and the extent range locked, so there's
+ * no way someone can start IO on any dirty pages in this range.
+ *
+ * We'll call btrfs_dirty_pages() later on, and that will flip around
+ * delalloc bits and dirty the pages as required.
+ */
for (i = 0; i < num_pages; i++) {
- if (clear_page_dirty_for_io(pages[i]))
- account_page_redirty(pages[i]);
set_page_extent_mapped(pages[i]);
WARN_ON(!PageLocked(pages[i]));
}
Commit f77084d96355 "x86/mm/pat: Disable preemption around
__flush_tlb_all()" addressed a case where __flush_tlb_all() is called
without preemption being disabled. It also left a warning to catch other
cases where preemption is not disabled. That warning triggers for the
memory hotplug path which is also used for persistent memory enabling:
WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460
RIP: 0010:__flush_tlb_all+0x1b/0x3a
[..]
Call Trace:
phys_pud_init+0x29c/0x2bb
kernel_physical_mapping_init+0xfc/0x219
init_memory_mapping+0x1a5/0x3b0
arch_add_memory+0x2c/0x50
devm_memremap_pages+0x3aa/0x610
pmem_attach_disk+0x585/0x700 [nd_pmem]
Rather than audit all __flush_tlb_all() callers and disable preemption
at each call site, just do it internally in __flush_tlb_all().
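A hedged before/after sketch of a call site (hypothetical caller,
mirroring the hunks below):

	/* before: each caller needed its own guard */
	preempt_disable();
	__flush_tlb_all();
	preempt_enable();

	/* after: safe to call from preemptible context */
	__flush_tlb_all();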
Fixes: f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()")
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
arch/x86/include/asm/tlbflush.h | 8 ++++----
arch/x86/mm/pageattr.c | 6 +-----
2 files changed, 5 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index d760611cfc35..049e0aca0fb5 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -454,11 +454,10 @@ static inline void __native_flush_tlb_one_user(unsigned long addr)
static inline void __flush_tlb_all(void)
{
/*
- * This is to catch users with enabled preemption and the PGE feature
- * and don't trigger the warning in __native_flush_tlb().
+ * Preemption needs to be disabled around __flush_tlb* calls
+ * due to CR3 reload in __native_flush_tlb().
*/
- VM_WARN_ON_ONCE(preemptible());
-
+ preempt_disable();
if (boot_cpu_has(X86_FEATURE_PGE)) {
__flush_tlb_global();
} else {
@@ -467,6 +466,7 @@ static inline void __flush_tlb_all(void)
*/
__flush_tlb();
}
+ preempt_enable();
}
/*
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index db7a10082238..f799076e3d57 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -2309,13 +2309,9 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
/*
* We should perform an IPI and flush all tlbs,
- * but that can deadlock->flush only current cpu.
- * Preemption needs to be disabled around __flush_tlb_all() due to
- * CR3 reload in __native_flush_tlb().
+ * but that can deadlock->flush only current cpu:
*/
- preempt_disable();
__flush_tlb_all();
- preempt_enable();
arch_flush_lazy_mmu_mode();
}
From: Sai Praneeth <sai.praneeth.prakhya@intel.com>
[ Upstream commit 706d51681d636a0c4a5ef53395ec3b803e45ed4d ]
Future Intel processors will support "Enhanced IBRS", an "always on"
mode: the IBRS bit in the SPEC_CTRL MSR is enabled once and never
disabled.
From the specification [1]:
"With enhanced IBRS, the predicted targets of indirect branches
executed cannot be controlled by software that was executed in a less
privileged predictor mode or on another logical processor. As a
result, software operating on a processor with enhanced IBRS need not
use WRMSR to set IA32_SPEC_CTRL.IBRS after every transition to a more
privileged predictor mode. Software can isolate predictor modes
effectively simply by setting the bit once. Software need not disable
enhanced IBRS prior to entering a sleep state such as MWAIT or HLT."
If Enhanced IBRS is supported by the processor then use it as the
preferred spectre v2 mitigation mechanism instead of Retpoline. Intel's
Retpoline white paper [2] states:
"Retpoline is known to be an effective branch target injection (Spectre
variant 2) mitigation on Intel processors belonging to family 6
(enumerated by the CPUID instruction) that do not have support for
enhanced IBRS. On processors that support enhanced IBRS, it should be
used for mitigation instead of retpoline."
The reason why Enhanced IBRS is the recommended mitigation on processors
which support it is that these processors also support CET which
provides a defense against ROP attacks. Retpoline is very similar to ROP
techniques and might trigger false positives in the CET defense.
If Enhanced IBRS is selected as the mitigation technique for spectre v2,
the IBRS bit in SPEC_CTRL MSR is set once at boot time and never
cleared. The kernel also has to make sure that the IBRS bit remains set after
VMEXIT because the guest might have cleared the bit. This is already
covered by the existing x86_spec_ctrl_set_guest() and
x86_spec_ctrl_restore_host() speculation control functions.
Enhanced IBRS still requires IBPB for full mitigation.
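For orientation, a hedged sketch of the set-once behaviour described
above (mirroring the bugs.c hunk below; not a complete implementation):

	/* Sketch: fold IBRS into the base value once at boot.  The
	 * existing x86_spec_ctrl_restore_host() path rewrites
	 * x86_spec_ctrl_base after VMEXIT, so the bit stays set even
	 * if a guest cleared it. */
	if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
		x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
		wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
	}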
[1] Speculative-Execution-Side-Channel-Mitigations.pdf
[2] Retpoline-A-Branch-Target-Injection-Mitigation.pdf
Both documents are available at:
https://bugzilla.kernel.org/show_bug.cgi?id=199511
Originally-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim C Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/nospec-branch.h | 1 +
arch/x86/kernel/cpu/bugs.c | 20 ++++++++++++++++++--
arch/x86/kernel/cpu/common.c | 3 +++
4 files changed, 23 insertions(+), 2 deletions(-)
Changes from upstream:
----------------------
1. Use bit 30 of word 7 in cpufeatures for X86_FEATURE_IBRS_ENHANCED as bit 29
is now used by L1TF.
2. Fix some trivial line fuzzing.
Note: Based on kernel version "Linux 4.18.17" and to be applied on both
"Linux 4.18.17" and "Linux 4.14.79". Please note that git am doesn't
apply this patch smoothly on 4.14.79 because of line fuzz, so please use
"patch -p1". I didn't want to spam the mailing list with a duplicate
patch, hence this single patch for both stable releases.
Changes from V1->V2:
--------------------
1. Fix the format of commit log, as suggested by Greg KH, to follow standard
procedure.
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 64aaa3f5f36c..c8ac84e90d0f 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -220,6 +220,7 @@
#define X86_FEATURE_STIBP ( 7*32+27) /* Single Thread Indirect Branch Predictors */
#define X86_FEATURE_ZEN ( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */
#define X86_FEATURE_L1TF_PTEINV ( 7*32+29) /* "" L1TF workaround PTE inversion */
+#define X86_FEATURE_IBRS_ENHANCED ( 7*32+30) /* Enhanced IBRS */
/* Virtualization flags: Linux defined, word 8 */
#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f6f6c63da62f..e7c8086e570e 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -215,6 +215,7 @@ enum spectre_v2_mitigation {
SPECTRE_V2_RETPOLINE_GENERIC,
SPECTRE_V2_RETPOLINE_AMD,
SPECTRE_V2_IBRS,
+ SPECTRE_V2_IBRS_ENHANCED,
};
/* The Speculative Store Bypass disable variants */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 4891a621a752..817e57e96d67 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -141,6 +141,7 @@ static const char *spectre_v2_strings[] = {
[SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline",
[SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline",
[SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline",
+ [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS",
};
#undef pr_fmt
@@ -343,6 +344,13 @@ static void __init spectre_v2_select_mitigation(void)
case SPECTRE_V2_CMD_FORCE:
case SPECTRE_V2_CMD_AUTO:
+ if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
+ mode = SPECTRE_V2_IBRS_ENHANCED;
+ /* Force it so VMEXIT will restore correctly */
+ x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
+ wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+ goto specv2_set_mode;
+ }
if (IS_ENABLED(CONFIG_RETPOLINE))
goto retpoline_auto;
break;
@@ -380,6 +388,7 @@ static void __init spectre_v2_select_mitigation(void)
setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
}
+specv2_set_mode:
spectre_v2_enabled = mode;
pr_info("%s\n", spectre_v2_strings[mode]);
@@ -402,9 +411,16 @@ static void __init spectre_v2_select_mitigation(void)
/*
* Retpoline means the kernel is safe because it has no indirect
- * branches. But firmware isn't, so use IBRS to protect that.
+ * branches. Enhanced IBRS protects firmware too, so, enable restricted
+ * speculation around firmware calls only when Enhanced IBRS isn't
+ * supported.
+ *
+ * Use "mode" to check Enhanced IBRS instead of boot_cpu_has(), because
+ * the user might select retpoline on the kernel command line and if
+ * the CPU supports Enhanced IBRS, kernel might un-intentionally not
+ * enable IBRS around firmware calls.
*/
- if (boot_cpu_has(X86_FEATURE_IBRS)) {
+ if (boot_cpu_has(X86_FEATURE_IBRS) && mode != SPECTRE_V2_IBRS_ENHANCED) {
setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
pr_info("Enabling Restricted Speculation for firmware calls\n");
}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 1ee8ea36af30..79561bfcfa87 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1015,6 +1015,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
!cpu_has(c, X86_FEATURE_AMD_SSB_NO))
setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
+ if (ia32_cap & ARCH_CAP_IBRS_ALL)
+ setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
+
if (x86_match_cpu(cpu_no_meltdown))
return;
--
2.7.4