The patch titled
Subject: fork,memcg: fix crash in free_thread_stack on memcg charge fail
has been added to the -mm tree. Its filename is
forkmemcg-fix-crash-in-free_thread_stack-on-memcg-charge-fail.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/forkmemcg-fix-crash-in-free_thread…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/forkmemcg-fix-crash-in-free_thread…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Rik van Riel <riel(a)surriel.com>
Subject: fork,memcg: fix crash in free_thread_stack on memcg charge fail
Changeset 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
will result in fork failing if allocating a kernel stack for a task
in dup_task_struct exceeds the kernel memory allowance for that cgroup.
Unfortunately, it also results in a crash.
This is due to the code jumping to free_stack and calling free_thread_stack
when the memcg kernel stack charge fails, but without tsk->stack pointing
at the freshly allocated stack.
This in turn results in the vfree_atomic in free_thread_stack oopsing
with a backtrace like this:
#5 [ffffc900244efc88] die at ffffffff8101f0ab
#6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86
#7 [ffffc900244efce0] general_protection at ffffffff818ff082
[exception RIP: llist_add_batch+7]
RIP: ffffffff8150d487 RSP: ffffc900244efd98 RFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88085ef55980 RCX: 0000000000000000
RDX: ffff88085ef55980 RSI: 343834343531203a RDI: 343834343531203a
RBP: ffffc900244efd98 R8: 0000000000000001 R9: ffff8808578c3600
R10: 0000000000000000 R11: 0000000000000001 R12: ffff88029f6c21c0
R13: 0000000000000286 R14: ffff880147759b00 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7
#9 [ffffc900244efdb8] copy_process at ffffffff81086e37
#10 [ffffc900244efe98] _do_fork at ffffffff810884e0
#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff
#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43
RIP: 000000000049b948 RSP: 00007ffcdb307830 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000896030 RCX: 000000000049b948
RDX: 0000000000000000 RSI: 00007ffcdb307790 RDI: 00000000005d7421
RBP: 000000000067370f R8: 00007ffcdb3077b0 R9: 000000000001ed00
R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000040
R13: 000000000000000f R14: 0000000000000000 R15: 000000000088d018
ORIG_RAX: 000000000000003a CS: 0033 SS: 002b
The simplest fix is to assign tsk->stack right where it is allocated.
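As a rough userspace illustration of why assigning the pointer at allocation
time makes the error path safe, here is a minimal sketch. It is not the
kernel code: struct task, alloc_stack(), free_stack(), charge_stack() and
dup_task() are hypothetical stand-ins, and malloc()/free() replace the real
stack allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for task_struct; only the relevant field. */
struct task {
	void *stack;
};

/* Allocate a stack and record it in the task right away, as the fix does. */
static void *alloc_stack(struct task *tsk, size_t size)
{
	void *stack = malloc(size);

	tsk->stack = stack;	/* NULL on failure, so cleanup stays safe */
	return stack;
}

/* Cleanup reads the pointer from the task, like free_thread_stack(). */
static void free_stack(struct task *tsk)
{
	free(tsk->stack);
	tsk->stack = NULL;
}

/* Simulated charge step; a nonzero argument models hitting the limit. */
static int charge_stack(int fail)
{
	return fail ? -1 : 0;
}

/* Returns 0 on success, -1 if the allocation or the charge failed. */
static int dup_task(struct task *tsk, int fail_charge)
{
	if (!alloc_stack(tsk, 4096))
		return -1;
	if (charge_stack(fail_charge)) {
		/* The task already points at the fresh stack here. */
		assert(tsk->stack != NULL);
		free_stack(tsk);
		return -1;
	}
	return 0;
}
```

If alloc_stack() did not store the pointer, free_stack() in the failure
branch would operate on whatever stale value tsk->stack happened to hold,
which is the situation the backtrace above reflects.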
Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com
Fixes: 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/kernel/fork.c~forkmemcg-fix-crash-in-free_thread_stack-on-memcg-charge-fail
+++ a/kernel/fork.c
@@ -240,8 +240,10 @@ static unsigned long *alloc_thread_stack
* free_thread_stack() can be called in interrupt context,
* so cache the vm_struct.
*/
- if (stack)
+ if (stack) {
tsk->stack_vm_area = find_vm_area(stack);
+ tsk->stack = stack;
+ }
return stack;
#else
struct page *page = alloc_pages_node(node, THREADINFO_GFP,
@@ -288,7 +290,10 @@ static struct kmem_cache *thread_stack_c
static unsigned long *alloc_thread_stack_node(struct task_struct *tsk,
int node)
{
- return kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
+ unsigned long *stack;
+ stack = kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
+ tsk->stack = stack;
+ return stack;
}
static void free_thread_stack(struct task_struct *tsk)
_
Patches currently in -mm which might be from riel(a)surriel.com are
forkmemcg-fix-crash-in-free_thread_stack-on-memcg-charge-fail.patch
The patch titled
Subject: proc/sysctl: don't return ENOMEM on lookup when a table is unregistering
has been removed from the -mm tree. Its filename was
proc-sysctl-dont-return-enomem-on-lookup-when-a-table-is-unregistering.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ivan Delalande <colona(a)arista.com>
Subject: proc/sysctl: don't return ENOMEM on lookup when a table is unregistering
proc_sys_lookup can fail with ENOMEM instead of ENOENT when the
corresponding sysctl table is being unregistered. In our case we see this
upon opening /proc/sys/net/*/conf files while network interfaces are being
deleted, which confuses our configuration daemon.
The problem was successfully reproduced and this fix tested on v4.9.122
and v4.20-rc6.
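For readers unfamiliar with the idiom, the sketch below imitates in userspace
how an error code can travel inside a returned pointer, so that the caller can
tell "out of memory" from "table going away". The ERR_PTR/PTR_ERR/IS_ERR
helpers here are simplified reimplementations, and struct inode, make_inode()
and lookup() are hypothetical stand-ins, not the functions in
fs/proc/proc_sysctl.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical placeholder object. */
struct inode {
	int ino;
};

static struct inode the_inode = { 1 };

/* The two flags are hypothetical switches that force each failure. */
static struct inode *make_inode(int have_memory, int unregistering)
{
	if (!have_memory)
		return ERR_PTR(-ENOMEM);
	if (unregistering)
		return ERR_PTR(-ENOENT);
	return &the_inode;
}

/* Returns 0 on success or the negative errno carried by the pointer. */
static long lookup(int have_memory, int unregistering)
{
	struct inode *inode = make_inode(have_memory, unregistering);

	if (IS_ERR(inode))
		return PTR_ERR(inode);
	assert(inode != NULL);	/* failures never come back as NULL */
	return 0;
}
```

With a plain NULL return the caller has to guess a single error code for
every failure; with an error pointer each failure site chooses its own.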
Link: http://lkml.kernel.org/r/20181213232052.GA1513@visor
Fixes: ace0c791e6c3 ("proc/sysctl: Don't grab i_lock under sysctl_lock.")
Signed-off-by: Ivan Delalande <colona(a)arista.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm(a)xmission.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/proc_sysctl.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--- a/fs/proc/proc_sysctl.c~proc-sysctl-dont-return-enomem-on-lookup-when-a-table-is-unregistering
+++ a/fs/proc/proc_sysctl.c
@@ -464,7 +464,7 @@ static struct inode *proc_sys_make_inode
inode = new_inode(sb);
if (!inode)
- goto out;
+ return ERR_PTR(-ENOMEM);
inode->i_ino = get_next_ino();
@@ -474,7 +474,7 @@ static struct inode *proc_sys_make_inode
if (unlikely(head->unregistering)) {
spin_unlock(&sysctl_lock);
iput(inode);
- inode = NULL;
+ inode = ERR_PTR(-ENOENT);
goto out;
}
ei->sysctl = head;
@@ -549,10 +549,11 @@ static struct dentry *proc_sys_lookup(st
goto out;
}
- err = ERR_PTR(-ENOMEM);
inode = proc_sys_make_inode(dir->i_sb, h ? h : head, p);
- if (!inode)
+ if (IS_ERR(inode)) {
+ err = ERR_CAST(inode);
goto out;
+ }
d_set_d_op(dentry, &proc_sys_dentry_operations);
err = d_splice_alias(inode, dentry);
@@ -685,7 +686,7 @@ static bool proc_sys_fill_cache(struct f
if (d_in_lookup(child)) {
struct dentry *res;
inode = proc_sys_make_inode(dir->d_sb, head, table);
- if (!inode) {
+ if (IS_ERR(inode)) {
d_lookup_done(child);
dput(child);
return false;
_
Patches currently in -mm which might be from colona(a)arista.com are
The patch titled
Subject: scripts/spdxcheck.py: always open files in binary mode
has been removed from the -mm tree. Its filename was
scripts-spdxcheckpy-always-open-files-in-binary-mode.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Thierry Reding <treding(a)nvidia.com>
Subject: scripts/spdxcheck.py: always open files in binary mode
The spdxcheck script currently falls over when confronted with a binary
file (such as Documentation/logo.gif). To avoid that, always open files
in binary mode and decode line-by-line, ignoring encoding errors.
One tricky case is when piping data into the script and reading it from
standard input. By default, standard input will be opened in text mode,
so we need to reopen it in binary mode.
The breakage only happens with python3 and results in a
UnicodeDecodeError (according to Uwe).
Link: http://lkml.kernel.org/r/20181212131210.28024-1-thierry.reding@gmail.com
Fixes: 6f4d29df66ac ("scripts/spdxcheck.py: make python3 compliant")
Signed-off-by: Thierry Reding <treding(a)nvidia.com>
Reviewed-by: Jeremy Cline <jcline(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Joe Perches <joe(a)perches.com>
Cc: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
scripts/spdxcheck.py | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/scripts/spdxcheck.py~scripts-spdxcheckpy-always-open-files-in-binary-mode
+++ a/scripts/spdxcheck.py
@@ -168,6 +168,7 @@ class id_parser(object):
self.curline = 0
try:
for line in fd:
+ line = line.decode(locale.getpreferredencoding(False), errors='ignore')
self.curline += 1
if self.curline > maxlines:
break
@@ -249,12 +250,13 @@ if __name__ == '__main__':
try:
if len(args.path) and args.path[0] == '-':
- parser.parse_lines(sys.stdin, args.maxlines, '-')
+ stdin = os.fdopen(sys.stdin.fileno(), 'rb')
+ parser.parse_lines(stdin, args.maxlines, '-')
else:
if args.path:
for p in args.path:
if os.path.isfile(p):
- parser.parse_lines(open(p), args.maxlines, p)
+ parser.parse_lines(open(p, 'rb'), args.maxlines, p)
elif os.path.isdir(p):
scan_git_subtree(repo.head.reference.commit.tree, p)
else:
_
Patches currently in -mm which might be from treding(a)nvidia.com are
scripts-add-spdxcheckpy-self-test.patch
The patch titled
Subject: userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
has been removed from the -mm tree. Its filename was
userfaultfd-check-vm_maywrite-was-set-after-verifying-the-uffd-is-registered.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Andrea Arcangeli <aarcange(a)redhat.com>
Subject: userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd
could trigger a harmless false positive WARN_ON. Check the vma is
already registered before checking VM_MAYWRITE to shut off the false
positive warning.
Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com
Cc: <stable(a)vger.kernel.org>
Fixes: 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reported-by: syzbot+06c7092e7d71218a2c16(a)syzkaller.appspotmail.com
Acked-by: Mike Rapoport <rppt(a)linux.ibm.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/userfaultfd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/userfaultfd.c~userfaultfd-check-vm_maywrite-was-set-after-verifying-the-uffd-is-registered
+++ a/fs/userfaultfd.c
@@ -1566,7 +1566,6 @@ static int userfaultfd_unregister(struct
cond_resched();
BUG_ON(!vma_can_userfault(vma));
- WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
/*
* Nothing to do: this vma is already registered into this
@@ -1575,6 +1574,8 @@ static int userfaultfd_unregister(struct
if (!vma->vm_userfaultfd_ctx.ctx)
goto skip;
+ WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
+
if (vma->vm_start > start)
start = vma->vm_start;
vma_end = min(end, vma->vm_end);
_
Patches currently in -mm which might be from aarcange(a)redhat.com are
The patch titled
Subject: fs/iomap.c: get/put the page in iomap_page_create/release()
has been removed from the -mm tree. Its filename was
iomap-get-put-the-page-in-iomap_page_create-release.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Piotr Jaroszynski <pjaroszynski(a)nvidia.com>
Subject: fs/iomap.c: get/put the page in iomap_page_create/release()
migrate_page_move_mapping() expects pages with private data set to have a
page_count elevated by 1. This is what used to happen for xfs through the
buffer_heads code before the switch to iomap in commit 82cb14175e7d ("xfs:
add support for sub-pagesize writeback without buffer_heads"). Not having
the count elevated causes move_pages() to fail on memory mapped files
coming from xfs.
Make iomap compatible with the migrate_page_move_mapping() assumption by
elevating the page count as part of iomap_page_create() and lowering it in
iomap_page_release().
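The reference-count expectation can be pictured with a deliberately
simplified userspace model. Everything below is hypothetical (struct
model_page, attach_private(), detach_private(), can_migrate()); the real
check in migrate_page_move_mapping() takes more state into account, and the
only point of the sketch is the "+1 when private data is set" rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a page: a reference count and a private pointer. */
struct model_page {
	int count;
	void *priv;
};

static void page_get(struct model_page *p)
{
	p->count++;
}

static void page_put(struct model_page *p)
{
	assert(p->count > 0);
	p->count--;
}

/* Attach private data; take_ref selects whether a reference is taken. */
static void attach_private(struct model_page *p, void *data, int take_ref)
{
	if (take_ref)
		page_get(p);
	p->priv = data;
}

/* Detach private data; drop_ref must match what attach_private() did. */
static void detach_private(struct model_page *p, int drop_ref)
{
	p->priv = NULL;
	if (drop_ref)
		page_put(p);
}

/*
 * Simplified migration check: the count must equal the references the
 * caller knows about, plus one if private data is attached.
 */
static int can_migrate(const struct model_page *p, int base_refs)
{
	int expected = base_refs + (p->priv != NULL);

	return p->count == expected;
}
```

In this model a page whose private data was attached without taking a
reference fails the check, while one attached with a reference passes, both
before and after the data is detached again.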
The missing reference causes the move_pages() syscall to misbehave on memory
mapped files from xfs. It does not move any pages, which I suppose is "just" a
perf issue, but it also ends up returning a positive number which is
out of spec for the syscall. Talking to Michal Hocko, it sounds like
returning positive numbers might be a necessary update to move_pages()
anyway though
(https://lkml.kernel.org/r/20181116114955.GJ14706@dhcp22.suse.cz).
I only hit this in tests that verify that move_pages() actually moved
the pages. The test also got confused by the positive return from
move_pages() (it got treated as a success as positive numbers were not
expected and not handled) making it a bit harder to track down what's
going on.
Link: http://lkml.kernel.org/r/20181115184140.1388751-1-pjaroszynski@nvidia.com
Fixes: 82cb14175e7d ("xfs: add support for sub-pagesize writeback without buffer_heads")
Signed-off-by: Piotr Jaroszynski <pjaroszynski(a)nvidia.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Cc: William Kucharski <william.kucharski(a)oracle.com>
Cc: Darrick J. Wong <darrick.wong(a)oracle.com>
Cc: Brian Foster <bfoster(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/iomap.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/fs/iomap.c~iomap-get-put-the-page-in-iomap_page_create-release
+++ a/fs/iomap.c
@@ -116,6 +116,12 @@ iomap_page_create(struct inode *inode, s
atomic_set(&iop->read_count, 0);
atomic_set(&iop->write_count, 0);
bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE);
+
+ /*
+ * migrate_page_move_mapping() assumes that pages with private data have
+ * their count elevated by 1.
+ */
+ get_page(page);
set_page_private(page, (unsigned long)iop);
SetPagePrivate(page);
return iop;
@@ -132,6 +138,7 @@ iomap_page_release(struct page *page)
WARN_ON_ONCE(atomic_read(&iop->write_count));
ClearPagePrivate(page);
set_page_private(page, 0);
+ put_page(page);
kfree(iop);
}
_
Patches currently in -mm which might be from pjaroszynski(a)nvidia.com are
This is a note to let you know that I've just added the patch titled
staging: bcm2835-audio: double free in init error path
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 649496b603000135683ee76d7ea499456617bf17 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Mon, 17 Dec 2018 10:08:54 +0300
Subject: staging: bcm2835-audio: double free in init error path
We free instance here and in the caller. It should be only the caller
which handles it.
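The ownership rule can be sketched in userspace as follows. This is only an
illustration under assumed names: struct instance, instance_init(),
release_instance(), open_stream() and the free_calls counter are
hypothetical and do not correspond to the driver's functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's instance structure. */
struct instance {
	int opened;
};

/* Counts releases so a test can confirm each object is freed once. */
static int free_calls;

static void release_instance(struct instance *inst)
{
	assert(inst != NULL);
	free_calls++;
	free(inst);
}

/* Callee: reports failure but leaves freeing to the caller. */
static int instance_init(struct instance *inst, int fail_open)
{
	if (fail_open)
		return -1;	/* no free here: the caller owns inst */
	inst->opened = 1;
	return 0;
}

/* Caller: allocates the instance and is the only place that frees it. */
static int open_stream(int fail_open)
{
	struct instance *inst = calloc(1, sizeof(*inst));
	int err;

	if (!inst)
		return -1;
	err = instance_init(inst, fail_open);
	/* A real caller would keep inst on success; freed here for brevity. */
	release_instance(inst);
	return err;
}
```

Had instance_init() also freed inst on failure, the release in open_stream()
would free the same memory a second time.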
Fixes: d7ca3a71545b ("staging: bcm2835-audio: Operate non-atomic PCM ops")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Takashi Iwai <tiwai(a)suse.de>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
index 0db412fd7c55..c0debdbce26c 100644
--- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
+++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
@@ -138,7 +138,6 @@ vc_vchi_audio_init(VCHI_INSTANCE_T vchi_instance,
dev_err(instance->dev,
"failed to open VCHI service connection (status=%d)\n",
status);
- kfree(instance);
return -EPERM;
}
--
2.20.1