The patch titled
Subject: mm/usercopy.c: no check page span for stack objects
has been removed from the -mm tree. Its filename was
usercopy-no-check-page-span-for-stack-objects.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Qian Cai <cai(a)lca.pw>
Subject: mm/usercopy.c: no check page span for stack objects
It is easy to trigger this with CONFIG_HARDENED_USERCOPY_PAGESPAN=y,
usercopy: Kernel memory overwrite attempt detected to spans multiple
pages (offset 0, size 23)!
kernel BUG at mm/usercopy.c:102!
For example,
print_worker_info
char name[WQ_NAME_LEN] = { };
char desc[WORKER_DESC_LEN] = { };
probe_kernel_read(name, wq->name, sizeof(name) - 1);
probe_kernel_read(desc, worker->desc, sizeof(desc) - 1);
__copy_from_user_inatomic
check_object_size
check_heap_object
check_page_span
This is because on-stack variables could cross PAGE_SIZE boundary, and
failed this check,
if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK) ==
((unsigned long)end & (unsigned long)PAGE_MASK)))
ptr = FFFF889007D7EFF8
end = FFFF889007D7F00E
Hence, fix it by checking if it is a stack object first.
[keescook(a)chromium.org: improve comments after reorder]
Link: http://lkml.kernel.org/r/20190103165151.GA32845@beast
Link: http://lkml.kernel.org/r/20181231030254.99441-1-cai@lca.pw
Signed-off-by: Qian Cai <cai(a)lca.pw>
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/usercopy.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
--- a/mm/usercopy.c~usercopy-no-check-page-span-for-stack-objects
+++ a/mm/usercopy.c
@@ -247,7 +247,8 @@ static DEFINE_STATIC_KEY_FALSE_RO(bypass
/*
* Validates that the given object is:
* - not bogus address
- * - known-safe heap or stack object
+ * - fully contained by stack (or stack frame, when available)
+ * - fully within SLAB object (or object whitelist area, when available)
* - not in kernel text
*/
void __check_object_size(const void *ptr, unsigned long n, bool to_user)
@@ -262,9 +263,6 @@ void __check_object_size(const void *ptr
/* Check for invalid addresses. */
check_bogus_address((const unsigned long)ptr, n, to_user);
- /* Check for bad heap object. */
- check_heap_object(ptr, n, to_user);
-
/* Check for bad stack object. */
switch (check_stack_object(ptr, n)) {
case NOT_STACK:
@@ -282,6 +280,9 @@ void __check_object_size(const void *ptr
usercopy_abort("process stack", NULL, to_user, 0, n);
}
+ /* Check for bad heap object. */
+ check_heap_object(ptr, n, to_user);
+
/* Check for object in kernel to avoid text exposure. */
check_kernel_text_object((const unsigned long)ptr, n, to_user);
}
_
Patches currently in -mm which might be from cai(a)lca.pw are
mm-page_owner-fix-for-deferred-struct-page-init.patch
page_poison-plays-nicely-with-kasan.patch
signal-allow-the-null-signal-in-rt_sigqueueinfo.patch
The patch titled
Subject: slab: alien caches must not be initialized if the allocation of the alien cache failed
has been removed from the -mm tree. Its filename was
slab-alien-caches-must-not-be-initialized-if-the-allocation-of-the-alien-cache-failed.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Christoph Lameter <cl(a)linux.com>
Subject: slab: alien caches must not be initialized if the allocation of the alien cache failed
Callers of __alloc_alien() check for NULL. We must do the same check in
__alloc_alien_cache to avoid NULL pointer dereferences on allocation
failures.
Link: http://lkml.kernel.org/r/010001680f42f192-82b4e12e-1565-4ee0-ae1f-1e9897490…
Fixes: 49dfc304ba241 ("slab: use the lock on alien_cache, instead of the lock on array_cache")
Fixes: c8522a3a5832b ("Slab: introduce alloc_alien")
Signed-off-by: Christoph Lameter <cl(a)linux.com>
Reported-by: syzbot+d6ed4ec679652b4fd4e4(a)syzkaller.appspotmail.com
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slab.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/slab.c~slab-alien-caches-must-not-be-initialized-if-the-allocation-of-the-alien-cache-failed
+++ a/mm/slab.c
@@ -666,8 +666,10 @@ static struct alien_cache *__alloc_alien
struct alien_cache *alc = NULL;
alc = kmalloc_node(memsize, gfp, node);
- init_arraycache(&alc->ac, entries, batch);
- spin_lock_init(&alc->lock);
+ if (alc) {
+ init_arraycache(&alc->ac, entries, batch);
+ spin_lock_init(&alc->lock);
+ }
return alc;
}
_
Patches currently in -mm which might be from cl(a)linux.com are
The patch titled
Subject: fork, memcg: fix cached_stacks case
has been removed from the -mm tree. Its filename was
fork-memcg-fix-cached_stacks-case.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Shakeel Butt <shakeelb(a)google.com>
Subject: fork, memcg: fix cached_stacks case
5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge
fail") fixes a crash caused due to failed memcg charge of the kernel
stack. However the fix misses the cached_stacks case which this patch
fixes. So, the same crash can happen if the memcg charge of a cached
stack is failed.
Link: http://lkml.kernel.org/r/20190102180145.57406-1-shakeelb@google.com
Fixes: 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail")
Signed-off-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Rik van Riel <riel(a)surriel.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/fork.c~fork-memcg-fix-cached_stacks-case
+++ a/kernel/fork.c
@@ -217,6 +217,7 @@ static unsigned long *alloc_thread_stack
memset(s->addr, 0, THREAD_SIZE);
tsk->stack_vm_area = s;
+ tsk->stack = s->addr;
return s->addr;
}
_
Patches currently in -mm which might be from shakeelb(a)google.com are
memcg-localize-memcg_kmem_enabled-check.patch
memcg-schedule-high-reclaim-for-remote-memcgs-on-high_work.patch
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e1c3743e1a20647c53b719dbf28b48f45d23f2cd Mon Sep 17 00:00:00 2001
From: Breno Leitao <leitao(a)debian.org>
Date: Wed, 21 Nov 2018 17:21:09 -0200
Subject: [PATCH] powerpc/tm: Set MSR[TS] just prior to recheckpoint
On a signal handler return, the user could set a context with MSR[TS] bits
set, and these bits would be copied to task regs->msr.
At restore_tm_sigcontexts(), after current task regs->msr[TS] bits are set,
several __get_user() are called and then a recheckpoint is executed.
This is a problem since a page fault (in kernel space) could happen when
calling __get_user(). If it happens, the process MSR[TS] bits were
already set, but recheckpoint was not executed, and SPRs are still invalid.
The page fault can cause the current process to be de-scheduled, with
MSR[TS] active and without tm_recheckpoint() being called. More
importantly, without TEXASR[FS] bit set also.
Since TEXASR might not have the FS bit set, and when the process is
scheduled back, it will try to reclaim, which will be aborted because of
the CPU is not in the suspended state, and, then, recheckpoint. This
recheckpoint will restore thread->texasr into TEXASR SPR, which might be
zero, hitting a BUG_ON().
kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434!
cpu 0xb: Vector: 700 (Program Check) at [c00000041f1576d0]
pc: c000000000054550: restore_gprs+0xb0/0x180
lr: 0000000000000000
sp: c00000041f157950
msr: 8000000100021033
current = 0xc00000041f143000
paca = 0xc00000000fb86300 softe: 0 irq_happened: 0x01
pid = 1021, comm = kworker/11:1
kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434!
Linux version 4.9.0-3-powerpc64le (debian-kernel(a)lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26)
enter ? for help
[c00000041f157b30] c00000000001bc3c tm_recheckpoint.part.11+0x6c/0xa0
[c00000041f157b70] c00000000001d184 __switch_to+0x1e4/0x4c0
[c00000041f157bd0] c00000000082eeb8 __schedule+0x2f8/0x990
[c00000041f157cb0] c00000000082f598 schedule+0x48/0xc0
[c00000041f157ce0] c0000000000f0d28 worker_thread+0x148/0x610
[c00000041f157d80] c0000000000f96b0 kthread+0x120/0x140
[c00000041f157e30] c00000000000c0e0 ret_from_kernel_thread+0x5c/0x7c
This patch simply delays the MSR[TS] set, so, if there is any page fault in
the __get_user() section, it does not have regs->msr[TS] set, since the TM
structures are still invalid, thus avoiding doing TM operations for
in-kernel exceptions and possible process reschedule.
With this patch, the MSR[TS] will only be set just before recheckpointing
and setting TEXASR[FS] = 1, thus avoiding an interrupt with TM registers in
invalid state.
Other than that, if CONFIG_PREEMPT is set, there might be a preemption just
after setting MSR[TS] and before tm_recheckpoint(), thus, this block must
be atomic from a preemption perspective, thus, calling
preempt_disable/enable() on this code.
It is not possible to move tm_recheckpoint to happen earlier, because it is
required to get the checkpointed registers from userspace, with
__get_user(), thus, the only way to avoid this undesired behavior is
delaying the MSR[TS] set.
The 32-bits signal handler seems to be safe this current issue, but, it
might be exposed to the preemption issue, thus, disabling preemption in
this chunk of code.
Changes from v2:
* Run the critical section with preempt_disable.
Fixes: 87b4e5393af7 ("powerpc/tm: Fix return of active 64bit signals")
Cc: stable(a)vger.kernel.org (v3.9+)
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 9d39e0eb03ff..7484f43493d3 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -848,7 +848,23 @@ static long restore_tm_user_regs(struct pt_regs *regs,
/* If TM bits are set to the reserved value, it's an invalid context */
if (MSR_TM_RESV(msr_hi))
return 1;
- /* Pull in the MSR TM bits from the user context */
+
+ /*
+ * Disabling preemption, since it is unsafe to be preempted
+ * with MSR[TS] set without recheckpointing.
+ */
+ preempt_disable();
+
+ /*
+ * CAUTION:
+ * After regs->MSR[TS] being updated, make sure that get_user(),
+ * put_user() or similar functions are *not* called. These
+ * functions can generate page faults which will cause the process
+ * to be de-scheduled with MSR[TS] set but without calling
+ * tm_recheckpoint(). This can cause a bug.
+ *
+ * Pull in the MSR TM bits from the user context
+ */
regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr_hi & MSR_TS_MASK);
/* Now, recheckpoint. This loads up all of the checkpointed (older)
* registers, including FP and V[S]Rs. After recheckpointing, the
@@ -873,6 +889,8 @@ static long restore_tm_user_regs(struct pt_regs *regs,
}
#endif
+ preempt_enable();
+
return 0;
}
#endif
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index e53ad11be385..ba093ec5a21f 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -467,20 +467,6 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
if (MSR_TM_RESV(msr))
return -EINVAL;
- /* pull in MSR TS bits from user context */
- regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr & MSR_TS_MASK);
-
- /*
- * Ensure that TM is enabled in regs->msr before we leave the signal
- * handler. It could be the case that (a) user disabled the TM bit
- * through the manipulation of the MSR bits in uc_mcontext or (b) the
- * TM bit was disabled because a sufficient number of context switches
- * happened whilst in the signal handler and load_tm overflowed,
- * disabling the TM bit. In either case we can end up with an illegal
- * TM state leading to a TM Bad Thing when we return to userspace.
- */
- regs->msr |= MSR_TM;
-
/* pull in MSR LE from user context */
regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
@@ -572,6 +558,34 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
tm_enable();
/* Make sure the transaction is marked as failed */
tsk->thread.tm_texasr |= TEXASR_FS;
+
+ /*
+ * Disabling preemption, since it is unsafe to be preempted
+ * with MSR[TS] set without recheckpointing.
+ */
+ preempt_disable();
+
+ /* pull in MSR TS bits from user context */
+ regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr & MSR_TS_MASK);
+
+ /*
+ * Ensure that TM is enabled in regs->msr before we leave the signal
+ * handler. It could be the case that (a) user disabled the TM bit
+ * through the manipulation of the MSR bits in uc_mcontext or (b) the
+ * TM bit was disabled because a sufficient number of context switches
+ * happened whilst in the signal handler and load_tm overflowed,
+ * disabling the TM bit. In either case we can end up with an illegal
+ * TM state leading to a TM Bad Thing when we return to userspace.
+ *
+ * CAUTION:
+ * After regs->MSR[TS] being updated, make sure that get_user(),
+ * put_user() or similar functions are *not* called. These
+ * functions can generate page faults which will cause the process
+ * to be de-scheduled with MSR[TS] set but without calling
+ * tm_recheckpoint(). This can cause a bug.
+ */
+ regs->msr |= MSR_TM;
+
/* This loads the checkpointed FP/VEC state, if used */
tm_recheckpoint(&tsk->thread);
@@ -585,6 +599,8 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
regs->msr |= MSR_VEC;
}
+ preempt_enable();
+
return err;
}
#endif