Commit 7fafcfdf6377 ("USB: gadget: f_midi: fixing a possible double-free in f_midi")
fixes CVE-2018-20961. Commit f0f1b8cac4d8 ("usb: gadget: f_midi: fail if set_alt fails
to allocate requests") avoids a context conflict when applying 7fafcfdf6377,
and fixes another minor problem.
Commit 7fafcfdf6377 is present in v4.9.y and v4.14.y.
Thanks,
Guenter
From: Prasad Sodagudi <psodagud(a)codeaurora.org>
commit 951531691c4bcaa59f56a316e018bc2ff1ddf855 upstream.
Currently, when checking to see if accessing n bytes starting at address
"ptr" will cause a wraparound in the memory addresses, the check in
check_bogus_address() adds an extra byte, which is incorrect, as the
range of addresses that will be accessed is [ptr, ptr + (n - 1)].
This can lead to incorrectly detecting a wraparound in the memory
address, when trying to read 4 KB from memory that is mapped to the the
last possible page in the virtual address space, when in fact, accessing
that range of memory would not cause a wraparound to occur.
Use the memory range that will actually be accessed when considering if
accessing a certain amount of bytes will cause the memory address to
wrap around.
Link: http://lkml.kernel.org/r/1564509253-23287-1-git-send-email-isaacm@codeauror…
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Co-developed-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Reviewed-by: William Kucharski <william.kucharski(a)oracle.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Trilok Soni <tsoni(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
[kees: backport to v4.14]
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
mm/usercopy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/usercopy.c b/mm/usercopy.c
index a9852b24715d..975f7dff8059 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -121,7 +121,7 @@ static inline const char *check_kernel_text_object(const void *ptr,
static inline const char *check_bogus_address(const void *ptr, unsigned long n)
{
/* Reject if object wraps past end of memory. */
- if ((unsigned long)ptr + n < (unsigned long)ptr)
+ if ((unsigned long)ptr + (n - 1) < (unsigned long)ptr)
return "<wrapped address>";
/* Reject if NULL or ZERO-allocation. */
--
2.17.1
--
Kees Cook
From: Prasad Sodagudi <psodagud(a)codeaurora.org>
commit 951531691c4bcaa59f56a316e018bc2ff1ddf855 upstream.
Currently, when checking to see if accessing n bytes starting at address
"ptr" will cause a wraparound in the memory addresses, the check in
check_bogus_address() adds an extra byte, which is incorrect, as the
range of addresses that will be accessed is [ptr, ptr + (n - 1)].
This can lead to incorrectly detecting a wraparound in the memory
address, when trying to read 4 KB from memory that is mapped to the the
last possible page in the virtual address space, when in fact, accessing
that range of memory would not cause a wraparound to occur.
Use the memory range that will actually be accessed when considering if
accessing a certain amount of bytes will cause the memory address to
wrap around.
Link: http://lkml.kernel.org/r/1564509253-23287-1-git-send-email-isaacm@codeauror…
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Co-developed-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Reviewed-by: William Kucharski <william.kucharski(a)oracle.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Trilok Soni <tsoni(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
[kees: backport to v4.9]
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
mm/usercopy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/usercopy.c b/mm/usercopy.c
index 3c8da0af9695..7683c22551ff 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -124,7 +124,7 @@ static inline const char *check_kernel_text_object(const void *ptr,
static inline const char *check_bogus_address(const void *ptr, unsigned long n)
{
/* Reject if object wraps past end of memory. */
- if ((unsigned long)ptr + n < (unsigned long)ptr)
+ if ((unsigned long)ptr + (n - 1) < (unsigned long)ptr)
return "<wrapped address>";
/* Reject if NULL or ZERO-allocation. */
--
2.17.1
--
Kees Cook
Hi,
Please consider backporting commit c289d6625237 ("Revert "pwm: Set class
for exported channels in sysfs"") from 4.20 to 4.19 stable (the original
buggy commit was introduced in 4.16).
This one-line revert fixes an oops triggered by writing to sysfs.
Thanks,
John
The initial support for dynamic ftrace trampolines in modules made use
of an indirect branch which loaded its target from the beginning of
a special section (e71a4e1bebaf7 ("arm64: ftrace: add support for far
branches to dynamic ftrace")). Since no instructions were being patched,
no cache maintenance was needed. However, later in be0f272bfc83 ("arm64:
ftrace: emit ftrace-mod.o contents through code") this code was reworked
to output the trampoline instructions directly into the PLT entry but,
unfortunately, the necessary cache maintenance was overlooked.
Add a call to __flush_icache_range() after writing the new trampoline
instructions but before patching in the branch to the trampoline.
Cc: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: James Morse <james.morse(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Fixes: be0f272bfc83 ("arm64: ftrace: emit ftrace-mod.o contents through code")
Signed-off-by: Will Deacon <will(a)kernel.org>
---
arch/arm64/kernel/ftrace.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 1285c7b2947f..171773257974 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -73,7 +73,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
if (offset < -SZ_128M || offset >= SZ_128M) {
#ifdef CONFIG_ARM64_MODULE_PLTS
- struct plt_entry trampoline;
+ struct plt_entry trampoline, *dst;
struct module *mod;
/*
@@ -106,23 +106,27 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
* to check if the actual opcodes are in fact identical,
* regardless of the offset in memory so use memcmp() instead.
*/
- trampoline = get_plt_entry(addr, mod->arch.ftrace_trampoline);
- if (memcmp(mod->arch.ftrace_trampoline, &trampoline,
- sizeof(trampoline))) {
- if (plt_entry_is_initialized(mod->arch.ftrace_trampoline)) {
+ dst = mod->arch.ftrace_trampoline;
+ trampoline = get_plt_entry(addr, dst);
+ if (memcmp(dst, &trampoline, sizeof(trampoline))) {
+ if (plt_entry_is_initialized(dst)) {
pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
return -EINVAL;
}
/* point the trampoline to our ftrace entry point */
module_disable_ro(mod);
- *mod->arch.ftrace_trampoline = trampoline;
+ *dst = trampoline;
module_enable_ro(mod, true);
- /* update trampoline before patching in the branch */
- smp_wmb();
+ /*
+ * Ensure updated trampoline is visible to instruction
+ * fetch before we patch in the branch.
+ */
+ __flush_icache_range((unsigned long)&dst[0],
+ (unsigned long)&dst[1]);
}
- addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
+ addr = (unsigned long)dst;
#else /* CONFIG_ARM64_MODULE_PLTS */
return -EINVAL;
#endif /* CONFIG_ARM64_MODULE_PLTS */
--
2.11.0
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 951531691c4bcaa59f56a316e018bc2ff1ddf855 Mon Sep 17 00:00:00 2001
From: "Isaac J. Manjarres" <isaacm(a)codeaurora.org>
Date: Tue, 13 Aug 2019 15:37:37 -0700
Subject: [PATCH] mm/usercopy: use memory range to be accessed for wraparound
check
Currently, when checking to see if accessing n bytes starting at address
"ptr" will cause a wraparound in the memory addresses, the check in
check_bogus_address() adds an extra byte, which is incorrect, as the
range of addresses that will be accessed is [ptr, ptr + (n - 1)].
This can lead to incorrectly detecting a wraparound in the memory
address, when trying to read 4 KB from memory that is mapped to the the
last possible page in the virtual address space, when in fact, accessing
that range of memory would not cause a wraparound to occur.
Use the memory range that will actually be accessed when considering if
accessing a certain amount of bytes will cause the memory address to
wrap around.
Link: http://lkml.kernel.org/r/1564509253-23287-1-git-send-email-isaacm@codeauror…
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Co-developed-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Reviewed-by: William Kucharski <william.kucharski(a)oracle.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Trilok Soni <tsoni(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/usercopy.c b/mm/usercopy.c
index 2a09796edef8..98e924864554 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -147,7 +147,7 @@ static inline void check_bogus_address(const unsigned long ptr, unsigned long n,
bool to_user)
{
/* Reject if object wraps past end of memory. */
- if (ptr + n < ptr)
+ if (ptr + (n - 1) < ptr)
usercopy_abort("wrapped address", NULL, to_user, 0, ptr + n);
/* Reject if NULL or ZERO-allocation. */
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 951531691c4bcaa59f56a316e018bc2ff1ddf855 Mon Sep 17 00:00:00 2001
From: "Isaac J. Manjarres" <isaacm(a)codeaurora.org>
Date: Tue, 13 Aug 2019 15:37:37 -0700
Subject: [PATCH] mm/usercopy: use memory range to be accessed for wraparound
check
Currently, when checking to see if accessing n bytes starting at address
"ptr" will cause a wraparound in the memory addresses, the check in
check_bogus_address() adds an extra byte, which is incorrect, as the
range of addresses that will be accessed is [ptr, ptr + (n - 1)].
This can lead to incorrectly detecting a wraparound in the memory
address, when trying to read 4 KB from memory that is mapped to the the
last possible page in the virtual address space, when in fact, accessing
that range of memory would not cause a wraparound to occur.
Use the memory range that will actually be accessed when considering if
accessing a certain amount of bytes will cause the memory address to
wrap around.
Link: http://lkml.kernel.org/r/1564509253-23287-1-git-send-email-isaacm@codeauror…
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Co-developed-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Reviewed-by: William Kucharski <william.kucharski(a)oracle.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Trilok Soni <tsoni(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/usercopy.c b/mm/usercopy.c
index 2a09796edef8..98e924864554 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -147,7 +147,7 @@ static inline void check_bogus_address(const unsigned long ptr, unsigned long n,
bool to_user)
{
/* Reject if object wraps past end of memory. */
- if (ptr + n < ptr)
+ if (ptr + (n - 1) < ptr)
usercopy_abort("wrapped address", NULL, to_user, 0, ptr + n);
/* Reject if NULL or ZERO-allocation. */
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6051d3bd3b91e96c59e62b8be2dba1cc2b19ee40 Mon Sep 17 00:00:00 2001
From: Henry Burns <henryburns(a)google.com>
Date: Tue, 13 Aug 2019 15:37:21 -0700
Subject: [PATCH] mm/z3fold.c: fix z3fold_destroy_pool() ordering
The constraint from the zpool use of z3fold_destroy_pool() is there are
no outstanding handles to memory (so no active allocations), but it is
possible for there to be outstanding work on either of the two wqs in
the pool.
If there is work queued on pool->compact_workqueue when it is called,
z3fold_destroy_pool() will do:
z3fold_destroy_pool()
destroy_workqueue(pool->release_wq)
destroy_workqueue(pool->compact_wq)
drain_workqueue(pool->compact_wq)
do_compact_page(zhdr)
kref_put(&zhdr->refcount)
__release_z3fold_page(zhdr, ...)
queue_work_on(pool->release_wq, &pool->work) *BOOM*
So compact_wq needs to be destroyed before release_wq.
Link: http://lkml.kernel.org/r/20190726224810.79660-1-henryburns@google.com
Fixes: 5d03a6613957 ("mm/z3fold.c: use kref to prevent page free/compact race")
Signed-off-by: Henry Burns <henryburns(a)google.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Jonathan Adams <jwadams(a)google.com>
Cc: Vitaly Vul <vitaly.vul(a)sony.com>
Cc: Vitaly Wool <vitalywool(a)gmail.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Al Viro <viro(a)zeniv.linux.org.uk
Cc: Henry Burns <henrywolfeburns(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/z3fold.c b/mm/z3fold.c
index 1a029a7432ee..43de92f52961 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -818,8 +818,15 @@ static void z3fold_destroy_pool(struct z3fold_pool *pool)
{
kmem_cache_destroy(pool->c_handle);
z3fold_unregister_migration(pool);
- destroy_workqueue(pool->release_wq);
+
+ /*
+ * We need to destroy pool->compact_wq before pool->release_wq,
+ * as any pending work on pool->compact_wq will call
+ * queue_work(pool->release_wq, &pool->work).
+ */
+
destroy_workqueue(pool->compact_wq);
+ destroy_workqueue(pool->release_wq);
kfree(pool);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6051d3bd3b91e96c59e62b8be2dba1cc2b19ee40 Mon Sep 17 00:00:00 2001
From: Henry Burns <henryburns(a)google.com>
Date: Tue, 13 Aug 2019 15:37:21 -0700
Subject: [PATCH] mm/z3fold.c: fix z3fold_destroy_pool() ordering
The constraint from the zpool use of z3fold_destroy_pool() is there are
no outstanding handles to memory (so no active allocations), but it is
possible for there to be outstanding work on either of the two wqs in
the pool.
If there is work queued on pool->compact_workqueue when it is called,
z3fold_destroy_pool() will do:
z3fold_destroy_pool()
destroy_workqueue(pool->release_wq)
destroy_workqueue(pool->compact_wq)
drain_workqueue(pool->compact_wq)
do_compact_page(zhdr)
kref_put(&zhdr->refcount)
__release_z3fold_page(zhdr, ...)
queue_work_on(pool->release_wq, &pool->work) *BOOM*
So compact_wq needs to be destroyed before release_wq.
Link: http://lkml.kernel.org/r/20190726224810.79660-1-henryburns@google.com
Fixes: 5d03a6613957 ("mm/z3fold.c: use kref to prevent page free/compact race")
Signed-off-by: Henry Burns <henryburns(a)google.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Jonathan Adams <jwadams(a)google.com>
Cc: Vitaly Vul <vitaly.vul(a)sony.com>
Cc: Vitaly Wool <vitalywool(a)gmail.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Al Viro <viro(a)zeniv.linux.org.uk
Cc: Henry Burns <henrywolfeburns(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/z3fold.c b/mm/z3fold.c
index 1a029a7432ee..43de92f52961 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -818,8 +818,15 @@ static void z3fold_destroy_pool(struct z3fold_pool *pool)
{
kmem_cache_destroy(pool->c_handle);
z3fold_unregister_migration(pool);
- destroy_workqueue(pool->release_wq);
+
+ /*
+ * We need to destroy pool->compact_wq before pool->release_wq,
+ * as any pending work on pool->compact_wq will call
+ * queue_work(pool->release_wq, &pool->work).
+ */
+
destroy_workqueue(pool->compact_wq);
+ destroy_workqueue(pool->release_wq);
kfree(pool);
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a53190a4aaa36494f4d7209fd1fcc6f2ee08e0e0 Mon Sep 17 00:00:00 2001
From: Yang Shi <yang.shi(a)linux.alibaba.com>
Date: Tue, 13 Aug 2019 15:37:18 -0700
Subject: [PATCH] mm: mempolicy: handle vma with unmovable pages mapped
correctly in mbind
When running syzkaller internally, we ran into the below bug on 4.9.x
kernel:
kernel BUG at mm/huge_memory.c:2124!
invalid opcode: 0000 [#1] SMP KASAN
CPU: 0 PID: 1518 Comm: syz-executor107 Not tainted 4.9.168+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
task: ffff880067b34900 task.stack: ffff880068998000
RIP: split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
Call Trace:
split_huge_page include/linux/huge_mm.h:100 [inline]
queue_pages_pte_range+0x7e1/0x1480 mm/mempolicy.c:538
walk_pmd_range mm/pagewalk.c:50 [inline]
walk_pud_range mm/pagewalk.c:90 [inline]
walk_pgd_range mm/pagewalk.c:116 [inline]
__walk_page_range+0x44a/0xdb0 mm/pagewalk.c:208
walk_page_range+0x154/0x370 mm/pagewalk.c:285
queue_pages_range+0x115/0x150 mm/mempolicy.c:694
do_mbind mm/mempolicy.c:1241 [inline]
SYSC_mbind+0x3c3/0x1030 mm/mempolicy.c:1370
SyS_mbind+0x46/0x60 mm/mempolicy.c:1352
do_syscall_64+0x1d2/0x600 arch/x86/entry/common.c:282
entry_SYSCALL_64_after_swapgs+0x5d/0xdb
Code: c7 80 1c 02 00 e8 26 0a 76 01 <0f> 0b 48 c7 c7 40 46 45 84 e8 4c
RIP [<ffffffff81895d6b>] split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
RSP <ffff88006899f980>
with the below test:
uint64_t r[1] = {0xffffffffffffffff};
int main(void)
{
syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
intptr_t res = 0;
res = syscall(__NR_socket, 0x11, 3, 0x300);
if (res != -1)
r[0] = res;
*(uint32_t*)0x20000040 = 0x10000;
*(uint32_t*)0x20000044 = 1;
*(uint32_t*)0x20000048 = 0xc520;
*(uint32_t*)0x2000004c = 1;
syscall(__NR_setsockopt, r[0], 0x107, 0xd, 0x20000040, 0x10);
syscall(__NR_mmap, 0x20fed000, 0x10000, 0, 0x8811, r[0], 0);
*(uint64_t*)0x20000340 = 2;
syscall(__NR_mbind, 0x20ff9000, 0x4000, 0x4002, 0x20000340, 0x45d4, 3);
return 0;
}
Actually the test does:
mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000
socket(AF_PACKET, SOCK_RAW, 768) = 3
setsockopt(3, SOL_PACKET, PACKET_TX_RING, {block_size=65536, block_nr=1, frame_size=50464, frame_nr=1}, 16) = 0
mmap(0x20fed000, 65536, PROT_NONE, MAP_SHARED|MAP_FIXED|MAP_POPULATE|MAP_DENYWRITE, 3, 0) = 0x20fed000
mbind(..., MPOL_MF_STRICT|MPOL_MF_MOVE) = 0
The setsockopt() would allocate compound pages (16 pages in this test)
for packet tx ring, then the mmap() would call packet_mmap() to map the
pages into the user address space specified by the mmap() call.
When calling mbind(), it would scan the vma to queue the pages for
migration to the new node. It would split any huge page since 4.9
doesn't support THP migration, however, the packet tx ring compound
pages are not THP and even not movable. So, the above bug is triggered.
However, the later kernel is not hit by this issue due to commit
d44d363f6578 ("mm: don't assume anonymous pages have SwapBacked flag"),
which just removes the PageSwapBacked check for a different reason.
But, there is a deeper issue. According to the semantic of mbind(), it
should return -EIO if MPOL_MF_MOVE or MPOL_MF_MOVE_ALL was specified and
MPOL_MF_STRICT was also specified, but the kernel was unable to move all
existing pages in the range. The tx ring of the packet socket is
definitely not movable, however, mbind() returns success for this case.
Although the most socket file associates with non-movable pages, but XDP
may have movable pages from gup. So, it sounds not fine to just check
the underlying file type of vma in vma_migratable().
Change migrate_page_add() to check if the page is movable or not, if it
is unmovable, just return -EIO. But do not abort pte walk immediately,
since there may be pages off LRU temporarily. We should migrate other
pages if MPOL_MF_MOVE* is specified. Set has_unmovable flag if some
paged could not be not moved, then return -EIO for mbind() eventually.
With this change the above test would return -EIO as expected.
[yang.shi(a)linux.alibaba.com: fix review comments from Vlastimil]
Link: http://lkml.kernel.org/r/1563556862-54056-3-git-send-email-yang.shi@linux.a…
Link: http://lkml.kernel.org/r/1561162809-59140-3-git-send-email-yang.shi@linux.a…
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 932c26845e3e..547cd403ed02 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -403,7 +403,7 @@ static const struct mempolicy_operations mpol_ops[MPOL_MAX] = {
},
};
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags);
struct queue_pages {
@@ -463,12 +463,11 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
flags = qp->flags;
/* go to thp migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
- if (!vma_migratable(walk->vma)) {
+ if (!vma_migratable(walk->vma) ||
+ migrate_page_add(page, qp->pagelist, flags)) {
ret = 1;
goto unlock;
}
-
- migrate_page_add(page, qp->pagelist, flags);
} else
ret = -EIO;
unlock:
@@ -532,7 +531,14 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
has_unmovable = true;
break;
}
- migrate_page_add(page, qp->pagelist, flags);
+
+ /*
+ * Do not abort immediately since there may be
+ * temporary off LRU pages in the range. Still
+ * need migrate other LRU pages.
+ */
+ if (migrate_page_add(page, qp->pagelist, flags))
+ has_unmovable = true;
} else
break;
}
@@ -961,7 +967,7 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
/*
* page migration, thp tail pages can be passed.
*/
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags)
{
struct page *head = compound_head(page);
@@ -974,8 +980,19 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
mod_node_page_state(page_pgdat(head),
NR_ISOLATED_ANON + page_is_file_cache(head),
hpage_nr_pages(head));
+ } else if (flags & MPOL_MF_STRICT) {
+ /*
+ * Non-movable page may reach here. And, there may be
+ * temporary off LRU pages or non-LRU movable pages.
+ * Treat them as unmovable pages since they can't be
+ * isolated, so they can't be moved at the moment. It
+ * should return -EIO for this case too.
+ */
+ return -EIO;
}
}
+
+ return 0;
}
/* page allocation callback for NUMA node migration */
@@ -1178,9 +1195,10 @@ static struct page *new_page(struct page *page, unsigned long start)
}
#else
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags)
{
+ return -EIO;
}
int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a53190a4aaa36494f4d7209fd1fcc6f2ee08e0e0 Mon Sep 17 00:00:00 2001
From: Yang Shi <yang.shi(a)linux.alibaba.com>
Date: Tue, 13 Aug 2019 15:37:18 -0700
Subject: [PATCH] mm: mempolicy: handle vma with unmovable pages mapped
correctly in mbind
When running syzkaller internally, we ran into the below bug on 4.9.x
kernel:
kernel BUG at mm/huge_memory.c:2124!
invalid opcode: 0000 [#1] SMP KASAN
CPU: 0 PID: 1518 Comm: syz-executor107 Not tainted 4.9.168+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
task: ffff880067b34900 task.stack: ffff880068998000
RIP: split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
Call Trace:
split_huge_page include/linux/huge_mm.h:100 [inline]
queue_pages_pte_range+0x7e1/0x1480 mm/mempolicy.c:538
walk_pmd_range mm/pagewalk.c:50 [inline]
walk_pud_range mm/pagewalk.c:90 [inline]
walk_pgd_range mm/pagewalk.c:116 [inline]
__walk_page_range+0x44a/0xdb0 mm/pagewalk.c:208
walk_page_range+0x154/0x370 mm/pagewalk.c:285
queue_pages_range+0x115/0x150 mm/mempolicy.c:694
do_mbind mm/mempolicy.c:1241 [inline]
SYSC_mbind+0x3c3/0x1030 mm/mempolicy.c:1370
SyS_mbind+0x46/0x60 mm/mempolicy.c:1352
do_syscall_64+0x1d2/0x600 arch/x86/entry/common.c:282
entry_SYSCALL_64_after_swapgs+0x5d/0xdb
Code: c7 80 1c 02 00 e8 26 0a 76 01 <0f> 0b 48 c7 c7 40 46 45 84 e8 4c
RIP [<ffffffff81895d6b>] split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
RSP <ffff88006899f980>
with the below test:
uint64_t r[1] = {0xffffffffffffffff};
int main(void)
{
syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
intptr_t res = 0;
res = syscall(__NR_socket, 0x11, 3, 0x300);
if (res != -1)
r[0] = res;
*(uint32_t*)0x20000040 = 0x10000;
*(uint32_t*)0x20000044 = 1;
*(uint32_t*)0x20000048 = 0xc520;
*(uint32_t*)0x2000004c = 1;
syscall(__NR_setsockopt, r[0], 0x107, 0xd, 0x20000040, 0x10);
syscall(__NR_mmap, 0x20fed000, 0x10000, 0, 0x8811, r[0], 0);
*(uint64_t*)0x20000340 = 2;
syscall(__NR_mbind, 0x20ff9000, 0x4000, 0x4002, 0x20000340, 0x45d4, 3);
return 0;
}
Actually the test does:
mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000
socket(AF_PACKET, SOCK_RAW, 768) = 3
setsockopt(3, SOL_PACKET, PACKET_TX_RING, {block_size=65536, block_nr=1, frame_size=50464, frame_nr=1}, 16) = 0
mmap(0x20fed000, 65536, PROT_NONE, MAP_SHARED|MAP_FIXED|MAP_POPULATE|MAP_DENYWRITE, 3, 0) = 0x20fed000
mbind(..., MPOL_MF_STRICT|MPOL_MF_MOVE) = 0
The setsockopt() would allocate compound pages (16 pages in this test)
for packet tx ring, then the mmap() would call packet_mmap() to map the
pages into the user address space specified by the mmap() call.
When calling mbind(), it would scan the vma to queue the pages for
migration to the new node. It would split any huge page since 4.9
doesn't support THP migration, however, the packet tx ring compound
pages are not THP and even not movable. So, the above bug is triggered.
However, the later kernel is not hit by this issue due to commit
d44d363f6578 ("mm: don't assume anonymous pages have SwapBacked flag"),
which just removes the PageSwapBacked check for a different reason.
But, there is a deeper issue. According to the semantic of mbind(), it
should return -EIO if MPOL_MF_MOVE or MPOL_MF_MOVE_ALL was specified and
MPOL_MF_STRICT was also specified, but the kernel was unable to move all
existing pages in the range. The tx ring of the packet socket is
definitely not movable, however, mbind() returns success for this case.
Although the most socket file associates with non-movable pages, but XDP
may have movable pages from gup. So, it sounds not fine to just check
the underlying file type of vma in vma_migratable().
Change migrate_page_add() to check if the page is movable or not, if it
is unmovable, just return -EIO. But do not abort pte walk immediately,
since there may be pages off LRU temporarily. We should migrate other
pages if MPOL_MF_MOVE* is specified. Set has_unmovable flag if some
paged could not be not moved, then return -EIO for mbind() eventually.
With this change the above test would return -EIO as expected.
[yang.shi(a)linux.alibaba.com: fix review comments from Vlastimil]
Link: http://lkml.kernel.org/r/1563556862-54056-3-git-send-email-yang.shi@linux.a…
Link: http://lkml.kernel.org/r/1561162809-59140-3-git-send-email-yang.shi@linux.a…
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 932c26845e3e..547cd403ed02 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -403,7 +403,7 @@ static const struct mempolicy_operations mpol_ops[MPOL_MAX] = {
},
};
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags);
struct queue_pages {
@@ -463,12 +463,11 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
flags = qp->flags;
/* go to thp migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
- if (!vma_migratable(walk->vma)) {
+ if (!vma_migratable(walk->vma) ||
+ migrate_page_add(page, qp->pagelist, flags)) {
ret = 1;
goto unlock;
}
-
- migrate_page_add(page, qp->pagelist, flags);
} else
ret = -EIO;
unlock:
@@ -532,7 +531,14 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
has_unmovable = true;
break;
}
- migrate_page_add(page, qp->pagelist, flags);
+
+ /*
+ * Do not abort immediately since there may be
+ * temporary off LRU pages in the range. Still
+ * need migrate other LRU pages.
+ */
+ if (migrate_page_add(page, qp->pagelist, flags))
+ has_unmovable = true;
} else
break;
}
@@ -961,7 +967,7 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
/*
* page migration, thp tail pages can be passed.
*/
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags)
{
struct page *head = compound_head(page);
@@ -974,8 +980,19 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,
mod_node_page_state(page_pgdat(head),
NR_ISOLATED_ANON + page_is_file_cache(head),
hpage_nr_pages(head));
+ } else if (flags & MPOL_MF_STRICT) {
+ /*
+ * Non-movable page may reach here. And, there may be
+ * temporary off LRU pages or non-LRU movable pages.
+ * Treat them as unmovable pages since they can't be
+ * isolated, so they can't be moved at the moment. It
+ * should return -EIO for this case too.
+ */
+ return -EIO;
}
}
+
+ return 0;
}
/* page allocation callback for NUMA node migration */
@@ -1178,9 +1195,10 @@ static struct page *new_page(struct page *page, unsigned long start)
}
#else
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags)
{
+ return -EIO;
}
int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d883544515aae54842c21730b880172e7894fde9 Mon Sep 17 00:00:00 2001
From: Yang Shi <yang.shi(a)linux.alibaba.com>
Date: Tue, 13 Aug 2019 15:37:15 -0700
Subject: [PATCH] mm: mempolicy: make the behavior consistent when
MPOL_MF_MOVE* and MPOL_MF_STRICT were specified
When both MPOL_MF_MOVE* and MPOL_MF_STRICT was specified, mbind() should
try best to migrate misplaced pages, if some of the pages could not be
migrated, then return -EIO.
There are three different sub-cases:
1. vma is not migratable
2. vma is migratable, but there are unmovable pages
3. vma is migratable, pages are movable, but migrate_pages() fails
If #1 happens, kernel would just abort immediately, then return -EIO,
after a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when
MPOL_MF_STRICT is specified").
If #3 happens, kernel would set policy and migrate pages with
best-effort, but won't rollback the migrated pages and reset the policy
back.
Before that commit, they behaves in the same way. It'd better to keep
their behavior consistent. But, rolling back the migrated pages and
resetting the policy back sounds not feasible, so just make #1 behave as
same as #3.
Userspace will know that not everything was successfully migrated (via
-EIO), and can take whatever steps it deems necessary - attempt
rollback, determine which exact page(s) are violating the policy, etc.
Make queue_pages_range() return 1 to indicate there are unmovable pages
or vma is not migratable.
The #2 is not handled correctly in the current kernel, the following
patch will fix it.
[yang.shi(a)linux.alibaba.com: fix review comments from Vlastimil]
Link: http://lkml.kernel.org/r/1563556862-54056-2-git-send-email-yang.shi@linux.a…
Link: http://lkml.kernel.org/r/1561162809-59140-2-git-send-email-yang.shi@linux.a…
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index f48693f75b37..932c26845e3e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -429,11 +429,14 @@ static inline bool queue_pages_required(struct page *page,
}
/*
- * queue_pages_pmd() has three possible return values:
- * 1 - pages are placed on the right node or queued successfully.
- * 0 - THP was split.
- * -EIO - is migration entry or MPOL_MF_STRICT was specified and an existing
- * page was already on a node that does not follow the policy.
+ * queue_pages_pmd() has four possible return values:
+ * 0 - pages are placed on the right node or queued successfully.
+ * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * 2 - THP was split.
+ * -EIO - is migration entry or only MPOL_MF_STRICT was specified and an
+ * existing page was already on a node that does not follow the
+ * policy.
*/
static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
unsigned long end, struct mm_walk *walk)
@@ -451,19 +454,17 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
if (is_huge_zero_page(page)) {
spin_unlock(ptl);
__split_huge_pmd(walk->vma, pmd, addr, false, NULL);
+ ret = 2;
goto out;
}
- if (!queue_pages_required(page, qp)) {
- ret = 1;
+ if (!queue_pages_required(page, qp))
goto unlock;
- }
- ret = 1;
flags = qp->flags;
/* go to thp migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
if (!vma_migratable(walk->vma)) {
- ret = -EIO;
+ ret = 1;
goto unlock;
}
@@ -479,6 +480,13 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
/*
* Scan through pages checking if pages follow certain conditions,
* and move them to the pagelist if they do.
+ *
+ * queue_pages_pte_range() has three possible return values:
+ * 0 - pages are placed on the right node or queued successfully.
+ * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * -EIO - only MPOL_MF_STRICT was specified and an existing page was already
+ * on a node that does not follow the policy.
*/
static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, struct mm_walk *walk)
@@ -488,17 +496,17 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
struct queue_pages *qp = walk->private;
unsigned long flags = qp->flags;
int ret;
+ bool has_unmovable = false;
pte_t *pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
if (ptl) {
ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
- if (ret > 0)
- return 0;
- else if (ret < 0)
+ if (ret != 2)
return ret;
}
+ /* THP was split, fall through to pte walk */
if (pmd_trans_unstable(pmd))
return 0;
@@ -519,14 +527,21 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (!queue_pages_required(page, qp))
continue;
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
- if (!vma_migratable(vma))
+ /* MPOL_MF_STRICT must be specified if we get here */
+ if (!vma_migratable(vma)) {
+ has_unmovable = true;
break;
+ }
migrate_page_add(page, qp->pagelist, flags);
} else
break;
}
pte_unmap_unlock(pte - 1, ptl);
cond_resched();
+
+ if (has_unmovable)
+ return 1;
+
return addr != end ? -EIO : 0;
}
@@ -639,7 +654,13 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
*
* If pages found in a given range are on a set of nodes (determined by
* @nodes and @flags,) it's isolated and queued to the pagelist which is
- * passed via @private.)
+ * passed via @private.
+ *
+ * queue_pages_range() has three possible return values:
+ * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * 0 - queue pages successfully or no misplaced page.
+ * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
*/
static int
queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
@@ -1182,6 +1203,7 @@ static long do_mbind(unsigned long start, unsigned long len,
struct mempolicy *new;
unsigned long end;
int err;
+ int ret;
LIST_HEAD(pagelist);
if (flags & ~(unsigned long)MPOL_MF_VALID)
@@ -1243,10 +1265,15 @@ static long do_mbind(unsigned long start, unsigned long len,
if (err)
goto mpol_out;
- err = queue_pages_range(mm, start, end, nmask,
+ ret = queue_pages_range(mm, start, end, nmask,
flags | MPOL_MF_INVERT, &pagelist);
- if (!err)
- err = mbind_range(mm, start, end, new);
+
+ if (ret < 0) {
+ err = -EIO;
+ goto up_out;
+ }
+
+ err = mbind_range(mm, start, end, new);
if (!err) {
int nr_failed = 0;
@@ -1259,13 +1286,14 @@ static long do_mbind(unsigned long start, unsigned long len,
putback_movable_pages(&pagelist);
}
- if (nr_failed && (flags & MPOL_MF_STRICT))
+ if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
err = -EIO;
} else
putback_movable_pages(&pagelist);
+up_out:
up_write(&mm->mmap_sem);
- mpol_out:
+mpol_out:
mpol_put(new);
return err;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d883544515aae54842c21730b880172e7894fde9 Mon Sep 17 00:00:00 2001
From: Yang Shi <yang.shi(a)linux.alibaba.com>
Date: Tue, 13 Aug 2019 15:37:15 -0700
Subject: [PATCH] mm: mempolicy: make the behavior consistent when
MPOL_MF_MOVE* and MPOL_MF_STRICT were specified
When both MPOL_MF_MOVE* and MPOL_MF_STRICT was specified, mbind() should
try best to migrate misplaced pages, if some of the pages could not be
migrated, then return -EIO.
There are three different sub-cases:
1. vma is not migratable
2. vma is migratable, but there are unmovable pages
3. vma is migratable, pages are movable, but migrate_pages() fails
If #1 happens, kernel would just abort immediately, then return -EIO,
after a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when
MPOL_MF_STRICT is specified").
If #3 happens, kernel would set policy and migrate pages with
best-effort, but won't rollback the migrated pages and reset the policy
back.
Before that commit, they behaves in the same way. It'd better to keep
their behavior consistent. But, rolling back the migrated pages and
resetting the policy back sounds not feasible, so just make #1 behave as
same as #3.
Userspace will know that not everything was successfully migrated (via
-EIO), and can take whatever steps it deems necessary - attempt
rollback, determine which exact page(s) are violating the policy, etc.
Make queue_pages_range() return 1 to indicate there are unmovable pages
or vma is not migratable.
The #2 is not handled correctly in the current kernel, the following
patch will fix it.
[yang.shi(a)linux.alibaba.com: fix review comments from Vlastimil]
Link: http://lkml.kernel.org/r/1563556862-54056-2-git-send-email-yang.shi@linux.a…
Link: http://lkml.kernel.org/r/1561162809-59140-2-git-send-email-yang.shi@linux.a…
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index f48693f75b37..932c26845e3e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -429,11 +429,14 @@ static inline bool queue_pages_required(struct page *page,
}
/*
- * queue_pages_pmd() has three possible return values:
- * 1 - pages are placed on the right node or queued successfully.
- * 0 - THP was split.
- * -EIO - is migration entry or MPOL_MF_STRICT was specified and an existing
- * page was already on a node that does not follow the policy.
+ * queue_pages_pmd() has four possible return values:
+ * 0 - pages are placed on the right node or queued successfully.
+ * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * 2 - THP was split.
+ * -EIO - is migration entry or only MPOL_MF_STRICT was specified and an
+ * existing page was already on a node that does not follow the
+ * policy.
*/
static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
unsigned long end, struct mm_walk *walk)
@@ -451,19 +454,17 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
if (is_huge_zero_page(page)) {
spin_unlock(ptl);
__split_huge_pmd(walk->vma, pmd, addr, false, NULL);
+ ret = 2;
goto out;
}
- if (!queue_pages_required(page, qp)) {
- ret = 1;
+ if (!queue_pages_required(page, qp))
goto unlock;
- }
- ret = 1;
flags = qp->flags;
/* go to thp migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
if (!vma_migratable(walk->vma)) {
- ret = -EIO;
+ ret = 1;
goto unlock;
}
@@ -479,6 +480,13 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
/*
* Scan through pages checking if pages follow certain conditions,
* and move them to the pagelist if they do.
+ *
+ * queue_pages_pte_range() has three possible return values:
+ * 0 - pages are placed on the right node or queued successfully.
+ * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * -EIO - only MPOL_MF_STRICT was specified and an existing page was already
+ * on a node that does not follow the policy.
*/
static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, struct mm_walk *walk)
@@ -488,17 +496,17 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
struct queue_pages *qp = walk->private;
unsigned long flags = qp->flags;
int ret;
+ bool has_unmovable = false;
pte_t *pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
if (ptl) {
ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
- if (ret > 0)
- return 0;
- else if (ret < 0)
+ if (ret != 2)
return ret;
}
+ /* THP was split, fall through to pte walk */
if (pmd_trans_unstable(pmd))
return 0;
@@ -519,14 +527,21 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (!queue_pages_required(page, qp))
continue;
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
- if (!vma_migratable(vma))
+ /* MPOL_MF_STRICT must be specified if we get here */
+ if (!vma_migratable(vma)) {
+ has_unmovable = true;
break;
+ }
migrate_page_add(page, qp->pagelist, flags);
} else
break;
}
pte_unmap_unlock(pte - 1, ptl);
cond_resched();
+
+ if (has_unmovable)
+ return 1;
+
return addr != end ? -EIO : 0;
}
@@ -639,7 +654,13 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
*
* If pages found in a given range are on a set of nodes (determined by
* @nodes and @flags,) it's isolated and queued to the pagelist which is
- * passed via @private.)
+ * passed via @private.
+ *
+ * queue_pages_range() has three possible return values:
+ * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * 0 - queue pages successfully or no misplaced page.
+ * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
*/
static int
queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
@@ -1182,6 +1203,7 @@ static long do_mbind(unsigned long start, unsigned long len,
struct mempolicy *new;
unsigned long end;
int err;
+ int ret;
LIST_HEAD(pagelist);
if (flags & ~(unsigned long)MPOL_MF_VALID)
@@ -1243,10 +1265,15 @@ static long do_mbind(unsigned long start, unsigned long len,
if (err)
goto mpol_out;
- err = queue_pages_range(mm, start, end, nmask,
+ ret = queue_pages_range(mm, start, end, nmask,
flags | MPOL_MF_INVERT, &pagelist);
- if (!err)
- err = mbind_range(mm, start, end, new);
+
+ if (ret < 0) {
+ err = -EIO;
+ goto up_out;
+ }
+
+ err = mbind_range(mm, start, end, new);
if (!err) {
int nr_failed = 0;
@@ -1259,13 +1286,14 @@ static long do_mbind(unsigned long start, unsigned long len,
putback_movable_pages(&pagelist);
}
- if (nr_failed && (flags & MPOL_MF_STRICT))
+ if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
err = -EIO;
} else
putback_movable_pages(&pagelist);
+up_out:
up_write(&mm->mmap_sem);
- mpol_out:
+mpol_out:
mpol_put(new);
return err;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1de13ee59225dfc98d483f8cce7d83f97c0b31de Mon Sep 17 00:00:00 2001
From: Ralph Campbell <rcampbell(a)nvidia.com>
Date: Tue, 13 Aug 2019 15:37:11 -0700
Subject: [PATCH] mm/hmm: fix bad subpage pointer in try_to_unmap_one
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When migrating an anonymous private page to a ZONE_DEVICE private page,
the source page->mapping and page->index fields are copied to the
destination ZONE_DEVICE struct page and the page_mapcount() is
increased. This is so rmap_walk() can be used to unmap and migrate the
page back to system memory.
However, try_to_unmap_one() computes the subpage pointer from a swap pte
which computes an invalid page pointer and a kernel panic results such
as:
BUG: unable to handle page fault for address: ffffea1fffffffc8
Currently, only single pages can be migrated to device private memory so
no subpage computation is needed and it can be set to "page".
[rcampbell(a)nvidia.com: add comment]
Link: http://lkml.kernel.org/r/20190724232700.23327-4-rcampbell@nvidia.com
Link: http://lkml.kernel.org/r/20190719192955.30462-4-rcampbell@nvidia.com
Fixes: a5430dda8a3a1c ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: "Jérôme Glisse" <jglisse(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Lai Jiangshan <jiangshanlai(a)gmail.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/rmap.c b/mm/rmap.c
index e5dfe2ae6b0d..003377e24232 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1475,7 +1475,15 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
/*
* No need to invalidate here it will synchronize on
* against the special swap migration pte.
+ *
+ * The assignment to subpage above was computed from a
+ * swap PTE which results in an invalid pointer.
+ * Since only PAGE_SIZE pages can currently be
+ * migrated, just set it to page. This will need to be
+ * changed when hugepage migrations to device private
+ * memory are supported.
*/
+ subpage = page;
goto discard;
}
From: Suganath Prabu S <suganath-prabu.subramani(a)broadcom.com>
Although SAS3 & SAS3.5 IT HBA controllers support
64-bit DMA addressing, as per hardware design,
if DMA able range contains all 64-bits set (0xFFFFFFFF-FFFFFFFF) then
it results in a firmware fault.
e.g. SGE's start address is 0xFFFFFFFF-FFFF000 and
data length is 0x1000 bytes. when HBA tries to DMA the data
at 0xFFFFFFFF-FFFFFFFF location then HBA will
fault the firmware.
Fix:
Driver will set 63-bit DMA mask to ensure the above address
will not be used.
Cc: <stable(a)vger.kernel.org> # 4.4.186, # 4.9.186 # 4.14.136
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani(a)broadcom.com>
---
RESEND: Resending this patch to include 4.14.136 stable kernel.
This Patch is for stable kernels 4.4.186, 4.9.186 and 4.14.136.
Original patch is applied to 5.3/scsi-fixes.
commit ID: df9a606184bfdb5ae3ca9d226184e9489f5c24f7
drivers/scsi/mpt3sas/mpt3sas_base.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 9b53672..7af7a08 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1686,9 +1686,11 @@ _base_config_dma_addressing(struct MPT3SAS_ADAPTER *ioc, struct pci_dev *pdev)
{
struct sysinfo s;
u64 consistent_dma_mask;
+ /* Set 63 bit DMA mask for all SAS3 and SAS35 controllers */
+ int dma_mask = (ioc->hba_mpi_version_belonged > MPI2_VERSION) ? 63 : 64;
if (ioc->dma_mask)
- consistent_dma_mask = DMA_BIT_MASK(64);
+ consistent_dma_mask = DMA_BIT_MASK(dma_mask);
else
consistent_dma_mask = DMA_BIT_MASK(32);
@@ -1696,11 +1698,11 @@ _base_config_dma_addressing(struct MPT3SAS_ADAPTER *ioc, struct pci_dev *pdev)
const uint64_t required_mask =
dma_get_required_mask(&pdev->dev);
if ((required_mask > DMA_BIT_MASK(32)) &&
- !pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
+ !pci_set_dma_mask(pdev, DMA_BIT_MASK(dma_mask)) &&
!pci_set_consistent_dma_mask(pdev, consistent_dma_mask)) {
ioc->base_add_sg_single = &_base_add_sg_single_64;
ioc->sge_size = sizeof(Mpi2SGESimple64_t);
- ioc->dma_mask = 64;
+ ioc->dma_mask = dma_mask;
goto out;
}
}
@@ -1726,7 +1728,7 @@ static int
_base_change_consistent_dma_mask(struct MPT3SAS_ADAPTER *ioc,
struct pci_dev *pdev)
{
- if (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64))) {
+ if (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(ioc->dma_mask))) {
if (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)))
return -ENODEV;
}
@@ -3325,7 +3327,7 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc, int sleep_flag)
total_sz += sz;
} while (ioc->rdpq_array_enable && (++i < ioc->reply_queue_count));
- if (ioc->dma_mask == 64) {
+ if (ioc->dma_mask > 32) {
if (_base_change_consistent_dma_mask(ioc, ioc->pdev) != 0) {
pr_warn(MPT3SAS_FMT
"no suitable consistent DMA mask for %s\n",
--
2.16.3
This is the start of the stable review cycle for the 4.14.139 release.
There are 69 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri 16 Aug 2019 04:55:34 PM UTC.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.139-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.139-rc1
Luca Coelho <luciano.coelho(a)intel.com>
iwlwifi: mvm: fix version check for GEO_TX_POWER_LIMIT support
Luca Coelho <luciano.coelho(a)intel.com>
iwlwifi: mvm: don't send GEO_TX_POWER_LIMIT on version < 41
Emmanuel Grumbach <emmanuel.grumbach(a)intel.com>
iwlwifi: mvm: fix an out-of-bound access
Emmanuel Grumbach <emmanuel.grumbach(a)intel.com>
iwlwifi: don't unmap as page memory that was mapped as single
Brian Norris <briannorris(a)chromium.org>
mwifiex: fix 802.11n/WPA detection
Wanpeng Li <wanpengli(a)tencent.com>
KVM: Fix leak vCPU's VMCS value into other pCPU
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFSv4: Fix an Oops in nfs4_do_setattr
Trond Myklebust <trond.myklebust(a)primarydata.com>
NFSv4: Only pass the delegation to setattr if we're sending a truncate
Steve French <stfrench(a)microsoft.com>
smb3: send CAP_DFS capability during session setup
Pavel Shilovsky <pshilov(a)microsoft.com>
SMB3: Fix deadlock in validate negotiate hits reconnect
Brian Norris <briannorris(a)chromium.org>
mac80211: don't WARN on short WMM parameters from AP
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda - Workaround for crackled sound on AMD controller (1022:1457)
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda - Don't override global PCM hw info flag
Wenwen Wang <wenwen(a)cs.uga.edu>
ALSA: firewire: fix a memory leak bug
Stanislav Lisovskiy <stanislav.lisovskiy(a)intel.com>
drm/i915: Fix wrong escape clock divisor init for GLK
Guenter Roeck <linux(a)roeck-us.net>
hwmon: (nct7802) Fix wrong detection of in4 presence
Tomas Bortoli <tomasbortoli(a)gmail.com>
can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices
Tomas Bortoli <tomasbortoli(a)gmail.com>
can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices
Roderick Colenbrander <roderick(a)gaikai.com>
HID: sony: Fix race condition between rumble and device remove.
Leonard Crestez <leonard.crestez(a)nxp.com>
perf/core: Fix creating kernel counters for PMUs that override event->cpu
Peter Zijlstra <peterz(a)infradead.org>
tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
Wenwen Wang <wenwen(a)cs.uga.edu>
test_firmware: fix a memory leak bug
Hannes Reinecke <hare(a)suse.de>
scsi: scsi_dh_alua: always use a 2 second delay before retrying RTPG
Tyrel Datwyler <tyreld(a)linux.vnet.ibm.com>
scsi: ibmvfc: fix WARN_ON during event pool release
Junxiao Bi <junxiao.bi(a)oracle.com>
scsi: megaraid_sas: fix panic on loading firmware crashdump
Arnd Bergmann <arnd(a)arndb.de>
ARM: davinci: fix sleep.S build error on ARMv4
Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
ACPI/IORT: Fix off-by-one check in iort_dev_find_its_id()
Arnd Bergmann <arnd(a)arndb.de>
drbd: dynamically allocate shash descriptor
Arnaldo Carvalho de Melo <acme(a)redhat.com>
perf probe: Avoid calling freeing routine multiple times for same pointer
Jiri Olsa <jolsa(a)kernel.org>
perf tools: Fix proper buffer size for feature processing
Charles Keepax <ckeepax(a)opensource.cirrus.com>
ALSA: compress: Be more restrictive about when a drain is allowed
Charles Keepax <ckeepax(a)opensource.cirrus.com>
ALSA: compress: Don't allow paritial drain operations on capture streams
Charles Keepax <ckeepax(a)opensource.cirrus.com>
ALSA: compress: Prevent bypasses of set_params
Charles Keepax <ckeepax(a)opensource.cirrus.com>
ALSA: compress: Fix regression on compressed capture streams
Julian Wiedmann <jwi(a)linux.ibm.com>
s390/qdio: add sanity checks to the fast-requeue path
Wen Yang <wen.yang99(a)zte.com.cn>
cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
Qian Cai <cai(a)lca.pw>
drm: silence variable 'conn' set but not used
Björn Gerhart <gerhart(a)posteo.de>
hwmon: (nct6775) Fix register address and added missed tolerance for nct6106
Brian Norris <briannorris(a)chromium.org>
mac80211: don't warn about CW params when not using them
Thomas Tai <thomas.tai(a)oracle.com>
iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND
Mauro Carvalho Chehab <mchehab+samsung(a)kernel.org>
scripts/sphinx-pre-install: fix script for RHEL/CentOS
Laura Garcia Liebana <nevola(a)gmail.com>
netfilter: nft_hash: fix symhash with modulus one
Miaohe Lin <linmiaohe(a)huawei.com>
netfilter: Fix rpfilter dropping vrf packets by mistake
Farhan Ali <alifm(a)linux.ibm.com>
vfio-ccw: Set pa_nr to 0 if memory allocation fails for pa_iova_pfn
Florian Westphal <fw(a)strlen.de>
netfilter: nfnetlink: avoid deadlock due to synchronous request_module
Stephane Grosjean <s.grosjean(a)peak-system.com>
can: peak_usb: fix potential double kfree_skb()
Nikita Yushchenko <nikita.yoush(a)cogentembedded.com>
can: rcar_canfd: fix possible IRQ storm on high load
Suzuki K Poulose <suzuki.poulose(a)arm.com>
usb: yurex: Fix use-after-free in yurex_delete
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
usb: host: xhci-rcar: Fix timeout in xhci_suspend()
Thomas Richter <tmricht(a)linux.ibm.com>
perf record: Fix module size on s390
Adrian Hunter <adrian.hunter(a)intel.com>
perf db-export: Fix thread__exec_comm()
Thomas Richter <tmricht(a)linux.ibm.com>
perf annotate: Fix s390 gap between kernel end and module start
Joerg Roedel <jroedel(a)suse.de>
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
Joerg Roedel <jroedel(a)suse.de>
x86/mm: Sync also unmappings in vmalloc_sync_all()
Joerg Roedel <jroedel(a)suse.de>
x86/mm: Check for pfn instead of page in vmalloc_sync_one()
Ben Hutchings <ben(a)decadent.org.uk>
tcp: Clear sk_send_head after purging the write queue
Gary R Hook <gary.hook(a)amd.com>
crypto: ccp - Add support for valid authsize values less than 16
Gary R Hook <gary.hook(a)amd.com>
crypto: ccp - Validate buffer lengths for copy operations
Nick Desaulniers <ndesaulniers(a)google.com>
lkdtm: support llvm-objcopy
Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Input: synaptics - enable RMI mode for HP Spectre X360
Mikulas Patocka <mpatocka(a)redhat.com>
loop: set PF_MEMALLOC_NOIO for the worker thread
Kevin Hao <haokexin(a)gmail.com>
mmc: cavium: Add the missing dma unmap when the dma has finished.
Kevin Hao <haokexin(a)gmail.com>
mmc: cavium: Set the correct dma max segment size for mmc_host
Wenwen Wang <wenwen(a)cs.uga.edu>
sound: fix a memory leak bug
Oliver Neukum <oneukum(a)suse.com>
usb: iowarrior: fix deadlock on disconnect
Gavin Li <git(a)thegavinli.com>
usb: usbfs: fix double-free of usb memory upon submiturb error
Gary R Hook <gary.hook(a)amd.com>
crypto: ccp - Ignore tag length when decrypting GCM ciphertext
Gary R Hook <gary.hook(a)amd.com>
crypto: ccp - Fix oops by properly managing allocated structures
Joe Perches <joe(a)perches.com>
iio: adc: max9611: Fix misuse of GENMASK macro
-------------
Diffstat:
Makefile | 4 +-
arch/arm/mach-davinci/sleep.S | 1 +
arch/powerpc/kvm/powerpc.c | 5 +
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm.c | 6 ++
arch/x86/kvm/vmx.c | 6 ++
arch/x86/kvm/x86.c | 16 +++
arch/x86/mm/fault.c | 15 ++-
drivers/acpi/arm64/iort.c | 4 +-
drivers/block/drbd/drbd_receiver.c | 14 ++-
drivers/block/loop.c | 2 +-
drivers/cpufreq/pasemi-cpufreq.c | 23 ++---
drivers/crypto/ccp/ccp-crypto-aes-galois.c | 14 +++
drivers/crypto/ccp/ccp-ops.c | 139 +++++++++++++++++++--------
drivers/firmware/Kconfig | 5 +-
drivers/firmware/iscsi_ibft.c | 4 +
drivers/gpu/drm/drm_framebuffer.c | 2 +-
drivers/gpu/drm/i915/intel_dsi_pll.c | 4 +-
drivers/hid/hid-sony.c | 15 ++-
drivers/hwmon/nct6775.c | 3 +-
drivers/hwmon/nct7802.c | 6 +-
drivers/iio/adc/max9611.c | 2 +-
drivers/input/mouse/synaptics.c | 1 +
drivers/misc/Makefile | 3 +-
drivers/mmc/host/cavium.c | 4 +-
drivers/net/can/rcar/rcar_canfd.c | 9 +-
drivers/net/can/usb/peak_usb/pcan_usb_core.c | 8 +-
drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 2 +-
drivers/net/can/usb/peak_usb/pcan_usb_pro.c | 2 +-
drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 29 ++++--
drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 2 +
drivers/net/wireless/marvell/mwifiex/main.h | 1 +
drivers/net/wireless/marvell/mwifiex/scan.c | 3 +-
drivers/s390/cio/qdio_main.c | 12 +--
drivers/s390/cio/vfio_ccw_cp.c | 4 +-
drivers/scsi/device_handler/scsi_dh_alua.c | 7 +-
drivers/scsi/ibmvscsi/ibmvfc.c | 2 +-
drivers/scsi/megaraid/megaraid_sas_base.c | 3 +
drivers/tty/tty_ldsem.c | 5 +-
drivers/usb/core/devio.c | 2 -
drivers/usb/host/xhci-rcar.c | 9 +-
drivers/usb/misc/iowarrior.c | 7 +-
drivers/usb/misc/yurex.c | 2 +-
fs/cifs/smb2pdu.c | 7 +-
fs/nfs/nfs4proc.c | 12 ++-
include/linux/ccp.h | 2 +
include/linux/kvm_host.h | 1 +
include/net/tcp.h | 3 +
include/sound/compress_driver.h | 5 +-
kernel/events/core.c | 2 +-
lib/test_firmware.c | 5 +-
mm/vmalloc.c | 9 ++
net/ipv4/netfilter/ipt_rpfilter.c | 1 +
net/ipv6/netfilter/ip6t_rpfilter.c | 8 +-
net/mac80211/driver-ops.c | 13 ++-
net/mac80211/mlme.c | 10 ++
net/netfilter/nfnetlink.c | 2 +-
net/netfilter/nft_hash.c | 2 +-
scripts/sphinx-pre-install | 2 +-
sound/core/compress_offload.c | 60 +++++++++---
sound/firewire/packets-buffer.c | 2 +-
sound/pci/hda/hda_controller.c | 13 ++-
sound/pci/hda/hda_controller.h | 2 +-
sound/pci/hda/hda_intel.c | 63 +++++++++++-
sound/sound_core.c | 3 +-
tools/perf/arch/s390/util/machine.c | 31 +++++-
tools/perf/builtin-probe.c | 10 ++
tools/perf/util/header.c | 2 +-
tools/perf/util/machine.c | 3 +-
tools/perf/util/machine.h | 2 +-
tools/perf/util/symbol.c | 7 +-
tools/perf/util/symbol.h | 1 +
tools/perf/util/thread.c | 12 ++-
virt/kvm/kvm_main.c | 25 ++++-
74 files changed, 558 insertions(+), 170 deletions(-)
After _gadget_stop_activity is executed, we can consider the hardware
operation for gadget has finished, and the udc can be stopped and enter
low power mode. So, any later hardware operations (from usb_ep_ops APIs
or usb_gadget_ops APIs) should be considered invalid, any deinitializatons
has been covered at _gadget_stop_activity.
I meet this problem when I plug out usb cable from PC using mass_storage
gadget, my callstack like: vbus interrupt->.vbus_session->
composite_disconnect ->pm_runtime_put_sync(&_gadget->dev),
the composite_disconnect will call fsg_disable, but fsg_disable calls
usb_ep_disable using async way, there are register accesses for
usb_ep_disable. So sometimes, I get system hang due to visit register
without clock, sometimes not.
The Linux Kernel USB maintainer Alan Stern suggests this kinds of solution.
See: http://marc.info/?l=linux-usb&m=138541769810983&w=2.
Cc: <stable(a)vger.kernel.org> #v4.9+
Signed-off-by: Peter Chen <peter.chen(a)nxp.com>
---
This patch is at NXP internal tree long time, and no issues have found.
Submit to mainline kenrel.
drivers/usb/chipidea/udc.c | 32 ++++++++++++++++++++++++--------
1 file changed, 24 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
index 053432d79bf7..8f18e7b6cadf 100644
--- a/drivers/usb/chipidea/udc.c
+++ b/drivers/usb/chipidea/udc.c
@@ -709,12 +709,6 @@ static int _gadget_stop_activity(struct usb_gadget *gadget)
struct ci_hdrc *ci = container_of(gadget, struct ci_hdrc, gadget);
unsigned long flags;
- spin_lock_irqsave(&ci->lock, flags);
- ci->gadget.speed = USB_SPEED_UNKNOWN;
- ci->remote_wakeup = 0;
- ci->suspended = 0;
- spin_unlock_irqrestore(&ci->lock, flags);
-
/* flush all endpoints */
gadget_for_each_ep(ep, gadget) {
usb_ep_fifo_flush(ep);
@@ -732,6 +726,12 @@ static int _gadget_stop_activity(struct usb_gadget *gadget)
ci->status = NULL;
}
+ spin_lock_irqsave(&ci->lock, flags);
+ ci->gadget.speed = USB_SPEED_UNKNOWN;
+ ci->remote_wakeup = 0;
+ ci->suspended = 0;
+ spin_unlock_irqrestore(&ci->lock, flags);
+
return 0;
}
@@ -1303,6 +1303,10 @@ static int ep_disable(struct usb_ep *ep)
return -EBUSY;
spin_lock_irqsave(hwep->lock, flags);
+ if (hwep->ci->gadget.speed == USB_SPEED_UNKNOWN) {
+ spin_unlock_irqrestore(hwep->lock, flags);
+ return 0;
+ }
/* only internal SW should disable ctrl endpts */
@@ -1392,6 +1396,10 @@ static int ep_queue(struct usb_ep *ep, struct usb_request *req,
return -EINVAL;
spin_lock_irqsave(hwep->lock, flags);
+ if (hwep->ci->gadget.speed == USB_SPEED_UNKNOWN) {
+ spin_unlock_irqrestore(hwep->lock, flags);
+ return 0;
+ }
retval = _ep_queue(ep, req, gfp_flags);
spin_unlock_irqrestore(hwep->lock, flags);
return retval;
@@ -1415,8 +1423,8 @@ static int ep_dequeue(struct usb_ep *ep, struct usb_request *req)
return -EINVAL;
spin_lock_irqsave(hwep->lock, flags);
-
- hw_ep_flush(hwep->ci, hwep->num, hwep->dir);
+ if (hwep->ci->gadget.speed != USB_SPEED_UNKNOWN)
+ hw_ep_flush(hwep->ci, hwep->num, hwep->dir);
list_for_each_entry_safe(node, tmpnode, &hwreq->tds, td) {
dma_pool_free(hwep->td_pool, node->ptr, node->dma);
@@ -1487,6 +1495,10 @@ static void ep_fifo_flush(struct usb_ep *ep)
}
spin_lock_irqsave(hwep->lock, flags);
+ if (hwep->ci->gadget.speed == USB_SPEED_UNKNOWN) {
+ spin_unlock_irqrestore(hwep->lock, flags);
+ return;
+ }
hw_ep_flush(hwep->ci, hwep->num, hwep->dir);
@@ -1559,6 +1571,10 @@ static int ci_udc_wakeup(struct usb_gadget *_gadget)
int ret = 0;
spin_lock_irqsave(&ci->lock, flags);
+ if (ci->gadget.speed == USB_SPEED_UNKNOWN) {
+ spin_unlock_irqrestore(&ci->lock, flags);
+ return 0;
+ }
if (!ci->remote_wakeup) {
ret = -EOPNOTSUPP;
goto out;
--
2.17.1
As reported in [1], the hisi-lpc driver has certain issues in handling
logical PIO regions, specifically unregistering regions.
This series add a method to unregister a logical PIO region, and fixes up
the driver to use them.
RCU usage in logical PIO code looks to always have been broken, so that
is fixed also. This is not a major fix as the list which RCU protects
would be rarely modified.
At this point, there are still separate ongoing discussions about how to
stop the logical PIO and PCI host bridge code leaking ranges, as in [2],
which I haven't had a chance to look at recently.
Hopefully this series can go through the arm soc tree and the maintainers
have no issue with that. I'm talking specifically about the logical PIO
code, which went through PCI tree on only previous upstreaming.
[1] https://lore.kernel.org/lkml/1560770148-57960-1-git-send-email-john.garry@h…
[2] https://lore.kernel.org/lkml/4b24fd36-e716-7c5e-31cc-13da727802e7@huawei.co…
Changes since v3:
https://lore.kernel.org/lkml/1561566418-22714-1-git-send-email-john.garry@h…
- drop optimisation patch (lib: logic_pio: Enforce LOGIC_PIO_INDIRECT
region ops are set at registration)
Not a fix
- rebase to v5.3-rc2
- cc stable
Change since v2:
- Add hisi_lpc_acpi_remove() stub to fix build for !CONFIG_ACPI
Changes since v1:
- Add more reasoning in RCU fix patch
- Create separate patch to change LOGIC_PIO_CPU_MMIO registration to
accomodate unregistration
John Garry (5):
lib: logic_pio: Fix RCU usage
lib: logic_pio: Avoid possible overlap for unregistering regions
lib: logic_pio: Add logic_pio_unregister_range()
bus: hisi_lpc: Unregister logical PIO range to avoid potential
use-after-free
bus: hisi_lpc: Add .remove method to avoid driver unbind crash
drivers/bus/hisi_lpc.c | 47 +++++++++++++++++++++----
include/linux/logic_pio.h | 1 +
lib/logic_pio.c | 73 +++++++++++++++++++++++++++++----------
3 files changed, 96 insertions(+), 25 deletions(-)
--
2.17.1
This is a note to let you know that I've just added the patch titled
tty/serial: atmel: reschedule TX after RX was started
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 1bc102260d278de0af89c58536f4bbabd2ef28be Mon Sep 17 00:00:00 2001
From: Razvan Stefanescu <razvan.stefanescu(a)microchip.com>
Date: Tue, 13 Aug 2019 10:40:25 +0300
Subject: tty/serial: atmel: reschedule TX after RX was started
When half-duplex RS485 communication is used, after RX is started, TX
tasklet still needs to be scheduled tasklet. This avoids console freezing
when more data is to be transmitted, if the serial communication is not
closed.
Fixes: 69646d7a3689 ("tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped")
Signed-off-by: Razvan Stefanescu <razvan.stefanescu(a)microchip.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20190813074025.16218-1-razvan.stefanescu@microchi…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/atmel_serial.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c
index 19a85d6fe3d2..9a54c9e6d36e 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1400,7 +1400,6 @@ atmel_handle_transmit(struct uart_port *port, unsigned int pending)
atmel_port->hd_start_rx = false;
atmel_start_rx(port);
- return;
}
atmel_tasklet_schedule(atmel_port, &atmel_port->tasklet_tx);
--
2.22.1
Testing with RTL8822BE hardware, when available memory is low, we
frequently see a kernel panic and system freeze.
First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
rx routine starvation
WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
[ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
Then we see a variety of different error conditions and kernel panics,
such as this one (trimmed):
rtw_pci 0000:02:00.0: pci bus timeout, check dma status
skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:105!
invalid opcode: 0000 [#1] SMP NOPTI
RIP: 0010:skb_panic+0x43/0x45
When skb allocation fails and the "rx routine starvation" is hit, the
function returns immediately without updating the RX ring. At this
point, the RX ring may continue referencing an old skb which was already
handed off to ieee80211_rx_irqsafe(). When it comes to be used again,
bad things happen.
This patch allocates a new, data-sized skb first in RX ISR. After
copying the data in, we pass it to the upper layers. However, if skb
allocation fails, we effectively drop the frame. In both cases, the
original, full size ring skb is reused.
In addition, by fixing the kernel crash, the RX routine should now
generally behave better under low memory conditions.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
Signed-off-by: Jian-Hong Pan <jian-hong(a)endlessm.com>
Cc: <stable(a)vger.kernel.org>
---
v2:
- Allocate new data-sized skb and put data into it, then pass it to
mac80211. Reuse the original skb in RX ring by DMA sync.
- Modify the commit message.
- Introduce following [PATCH v3 2/2] rtw88: pci: Use DMA sync instead
of remapping in RX ISR.
v3:
- Same as v2.
drivers/net/wireless/realtek/rtw88/pci.c | 49 +++++++++++-------------
1 file changed, 22 insertions(+), 27 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c
index cfe05ba7280d..e9fe3ad896c8 100644
--- a/drivers/net/wireless/realtek/rtw88/pci.c
+++ b/drivers/net/wireless/realtek/rtw88/pci.c
@@ -763,6 +763,7 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci,
u32 pkt_offset;
u32 pkt_desc_sz = chip->rx_pkt_desc_sz;
u32 buf_desc_sz = chip->rx_buf_desc_sz;
+ u32 new_len;
u8 *rx_desc;
dma_addr_t dma;
@@ -790,40 +791,34 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci,
pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz +
pkt_stat.shift;
- if (pkt_stat.is_c2h) {
- /* keep rx_desc, halmac needs it */
- skb_put(skb, pkt_stat.pkt_len + pkt_offset);
+ /* discard current skb if the new skb cannot be allocated as a
+ * new one in rx ring later
+ */
+ new_len = pkt_stat.pkt_len + pkt_offset;
+ new = dev_alloc_skb(new_len);
+ if (WARN_ONCE(!new, "rx routine starvation\n"))
+ goto next_rp;
+
+ /* put the DMA data including rx_desc from phy to new skb */
+ skb_put_data(new, skb->data, new_len);
- /* pass offset for further operation */
- *((u32 *)skb->cb) = pkt_offset;
- skb_queue_tail(&rtwdev->c2h_queue, skb);
+ if (pkt_stat.is_c2h) {
+ /* pass rx_desc & offset for further operation */
+ *((u32 *)new->cb) = pkt_offset;
+ skb_queue_tail(&rtwdev->c2h_queue, new);
ieee80211_queue_work(rtwdev->hw, &rtwdev->c2h_work);
} else {
- /* remove rx_desc, maybe use skb_pull? */
- skb_put(skb, pkt_stat.pkt_len);
- skb_reserve(skb, pkt_offset);
-
- /* alloc a smaller skb to mac80211 */
- new = dev_alloc_skb(pkt_stat.pkt_len);
- if (!new) {
- new = skb;
- } else {
- skb_put_data(new, skb->data, skb->len);
- dev_kfree_skb_any(skb);
- }
- /* TODO: merge into rx.c */
- rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
+ /* remove rx_desc */
+ skb_pull(new, pkt_offset);
+
+ rtw_rx_stats(rtwdev, pkt_stat.vif, new);
memcpy(new->cb, &rx_status, sizeof(rx_status));
ieee80211_rx_irqsafe(rtwdev->hw, new);
}
- /* skb delivered to mac80211, alloc a new one in rx ring */
- new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
- if (WARN(!new, "rx routine starvation\n"))
- return;
-
- ring->buf[cur_rp] = new;
- rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
+next_rp:
+ /* new skb delivered to mac80211, re-enable original skb DMA */
+ rtw_pci_reset_rx_desc(rtwdev, skb, ring, cur_rp, buf_desc_sz);
/* host read next element in ring */
if (++cur_rp >= ring->r.len)
--
2.22.0
The patch titled
Subject: mm, vmscan: do not special-case slab reclaim when watermarks are boosted
has been removed from the -mm tree. Its filename was
mm-vmscan-do-not-special-case-slab-reclaim-when-watermarks-are-boosted.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mel Gorman <mgorman(a)techsingularity.net>
Subject: mm, vmscan: do not special-case slab reclaim when watermarks are boosted
Dave Chinner reported a problem pointing a finger at commit 1c30844d2dfe
("mm: reclaim small amounts of memory when an external fragmentation event
occurs"). The report is extensive (see
https://lore.kernel.org/linux-mm/20190807091858.2857-1-david@fromorbit.com/)
and it's worth recording the most relevant parts (colorful language and
typos included).
When running a simple, steady state 4kB file creation test to
simulate extracting tarballs larger than memory full of small
files into the filesystem, I noticed that once memory fills up
the cache balance goes to hell.
The workload is creating one dirty cached inode for every dirty
page, both of which should require a single IO each to clean and
reclaim, and creation of inodes is throttled by the rate at which
dirty writeback runs at (via balance dirty pages). Hence the ingest
rate of new cached inodes and page cache pages is identical and
steady. As a result, memory reclaim should quickly find a steady
balance between page cache and inode caches.
The moment memory fills, the page cache is reclaimed at a much
faster rate than the inode cache, and evidence suggests that
the inode cache shrinker is not being called when large batches
of pages are being reclaimed. In roughly the same time period
that it takes to fill memory with 50% pages and 50% slab caches,
memory reclaim reduces the page cache down to just dirty pages
and slab caches fill the entirety of memory.
The LRU is largely full of dirty pages, and we're getting spikes
of random writeback from memory reclaim so it's all going to shit.
Behaviour never recovers, the page cache remains pinned at just
dirty pages, and nothing I could tune would make any difference.
vfs_cache_pressure makes no difference - I would set it so high
it should trim the entire inode caches in a single pass, yet it
didn't do anything. It was clear from tracing and live telemetry
that the shrinkers were pretty much not running except when
there was absolutely no memory free at all, and then they did
the minimum necessary to free memory to make progress.
So I went looking at the code, trying to find places where pages
got reclaimed and the shrinkers weren't called. There's only one
- kswapd doing boosted reclaim as per commit 1c30844d2dfe ("mm:
reclaim small amounts of memory when an external fragmentation
event occurs").
The watermark boosting introduced by the commit is triggered in response
to an allocation "fragmentation event". The boosting was not intended to
target THP specifically and triggers even if THP is disabled. However,
with Dave's perfectly reasonable workload, fragmentation events can be
very common given the ratio of slab to page cache allocations so boosting
remains active for long periods of time.
As high-order allocations might use compaction and compaction cannot move
slab pages the decision was made in the commit to special-case kswapd when
watermarks are boosted -- kswapd avoids reclaiming slab as reclaiming slab
does not directly help compaction.
As Dave notes, this decision means that slab can be artificially protected
for long periods of time and messes up the balance with slab and page
caches.
Removing the special casing can still indirectly help avoid
fragmentation by avoiding fragmentation-causing events due to slab
allocation as pages from a slab pageblock will have some slab objects
freed. Furthermore, with the special casing, reclaim behaviour is
unpredictable as kswapd sometimes examines slab and sometimes does not
in a manner that is tricky to tune or analyse.
This patch removes the special casing. The downside is that this is not a
universal performance win. Some benchmarks that depend on the residency
of data when rereading metadata may see a regression when slab reclaim is
restored to its original behaviour. Similarly, some benchmarks that only
read-once or write-once may perform better when page reclaim is too
aggressive. The primary upside is that slab shrinker is less surprising
(arguably more sane but that's a matter of opinion), behaves consistently
regardless of the fragmentation state of the system and properly obeys VM
sysctls.
A fsmark benchmark configuration was constructed similar to what Dave
reported and is codified by the mmtest configuration
config-io-fsmark-small-file-stream. It was evaluated on a 1-socket
machine to avoid dealing with NUMA-related issues and the timing of
reclaim. The storage was an SSD Samsung Evo and a fresh trimmed XFS
filesystem was used for the test data.
This is not an exact replication of Dave's setup. The configuration
scales its parameters depending on the memory size of the SUT to behave
similarly across machines. The parameters mean the first sample reported
by fs_mark is using 50% of RAM which will barely be throttled and look
like a big outlier. Dave used fake NUMA to have multiple kswapd instances
which I didn't replicate. Finally, the number of iterations differ from
Dave's test as the target disk was not large enough. While not identical,
it should be representative.
fsmark
5.3.0-rc3 5.3.0-rc3
vanilla shrinker-v1r1
Min 1-files/sec 4444.80 ( 0.00%) 4765.60 ( 7.22%)
1st-qrtle 1-files/sec 5005.10 ( 0.00%) 5091.70 ( 1.73%)
2nd-qrtle 1-files/sec 4917.80 ( 0.00%) 4855.60 ( -1.26%)
3rd-qrtle 1-files/sec 4667.40 ( 0.00%) 4831.20 ( 3.51%)
Max-1 1-files/sec 11421.50 ( 0.00%) 9999.30 ( -12.45%)
Max-5 1-files/sec 11421.50 ( 0.00%) 9999.30 ( -12.45%)
Max-10 1-files/sec 11421.50 ( 0.00%) 9999.30 ( -12.45%)
Max-90 1-files/sec 4649.60 ( 0.00%) 4780.70 ( 2.82%)
Max-95 1-files/sec 4491.00 ( 0.00%) 4768.20 ( 6.17%)
Max-99 1-files/sec 4491.00 ( 0.00%) 4768.20 ( 6.17%)
Max 1-files/sec 11421.50 ( 0.00%) 9999.30 ( -12.45%)
Hmean 1-files/sec 5004.75 ( 0.00%) 5075.96 ( 1.42%)
Stddev 1-files/sec 1778.70 ( 0.00%) 1369.66 ( 23.00%)
CoeffVar 1-files/sec 33.70 ( 0.00%) 26.05 ( 22.71%)
BHmean-99 1-files/sec 5053.72 ( 0.00%) 5101.52 ( 0.95%)
BHmean-95 1-files/sec 5053.72 ( 0.00%) 5101.52 ( 0.95%)
BHmean-90 1-files/sec 5107.05 ( 0.00%) 5131.41 ( 0.48%)
BHmean-75 1-files/sec 5208.45 ( 0.00%) 5206.68 ( -0.03%)
BHmean-50 1-files/sec 5405.53 ( 0.00%) 5381.62 ( -0.44%)
BHmean-25 1-files/sec 6179.75 ( 0.00%) 6095.14 ( -1.37%)
5.3.0-rc3 5.3.0-rc3
vanillashrinker-v1r1
Duration User 501.82 497.29
Duration System 4401.44 4424.08
Duration Elapsed 8124.76 8358.05
This is showing a slight skew for the max result representing a large
outlier for the 1st, 2nd and 3rd quartile are similar indicating that the
bulk of the results show little difference. Note that an earlier version
of the fsmark configuration showed a regression but that included more
samples taken while memory was still filling.
Note that the elapsed time is higher. Part of this is that the
configuration included time to delete all the test files when the test
completes -- the test automation handles the possibility of testing fsmark
with multiple thread counts. Without the patch, many of these objects
would be memory resident which is part of what the patch is addressing.
There are other important observations that justify the patch.
1. With the vanilla kernel, the number of dirty pages in the system
is very low for much of the test. With this patch, dirty pages
is generally kept at 10% which matches vm.dirty_background_ratio
which is normal expected historical behaviour.
2. With the vanilla kernel, the ratio of Slab/Pagecache is close to
0.95 for much of the test i.e. Slab is being left alone and dominating
memory consumption. With the patch applied, the ratio varies between
0.35 and 0.45 with the bulk of the measured ratios roughly half way
between those values. This is a different balance to what Dave reported
but it was at least consistent.
3. Slabs are scanned throughout the entire test with the patch applied.
The vanille kernel has periods with no scan activity and then relatively
massive spikes.
4. Without the patch, kswapd scan rates are very variable. With the patch,
the scan rates remain quite stead.
4. Overall vmstats are closer to normal expectations
5.3.0-rc3 5.3.0-rc3
vanilla shrinker-v1r1
Ops Direct pages scanned 99388.00 328410.00
Ops Kswapd pages scanned 45382917.00 33451026.00
Ops Kswapd pages reclaimed 30869570.00 25239655.00
Ops Direct pages reclaimed 74131.00 5830.00
Ops Kswapd efficiency % 68.02 75.45
Ops Kswapd velocity 5585.75 4002.25
Ops Page reclaim immediate 1179721.00 430927.00
Ops Slabs scanned 62367361.00 73581394.00
Ops Direct inode steals 2103.00 1002.00
Ops Kswapd inode steals 570180.00 5183206.00
o Vanilla kernel is hitting direct reclaim more frequently,
not very much in absolute terms but the fact the patch
reduces it is interesting
o "Page reclaim immediate" in the vanilla kernel indicates
dirty pages are being encountered at the tail of the LRU.
This is generally bad and means in this case that the LRU
is not long enough for dirty pages to be cleaned by the
background flush in time. This is much reduced by the
patch.
o With the patch, kswapd is reclaiming 10 times more slab
pages than with the vanilla kernel. This is indicative
of the watermark boosting over-protecting slab
A more complete set of tests were run that were part of the basis
for introducing boosting and while there are some differences, they
are well within tolerances.
Bottom line, the special casing kswapd to avoid slab behaviour is
unpredictable and can lead to abnormal results for normal workloads. This
patch restores the expected behaviour that slab and page cache is balanced
consistently for a workload with a steady allocation ratio of
slab/pagecache pages. It also means that if there are workloads that
favour the preservation of slab over pagecache that it can be tuned via
vm.vfs_cache_pressure where as the vanilla kernel effectively ignores the
parameter when boosting is active.
Link: http://lkml.kernel.org/r/20190808182946.GM2739@techsingularity.net
Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
Signed-off-by: Mel Gorman <mgorman(a)techsingularity.net>
Reviewed-by: Dave Chinner <dchinner(a)redhat.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [5.0+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-do-not-special-case-slab-reclaim-when-watermarks-are-boosted
+++ a/mm/vmscan.c
@@ -88,9 +88,6 @@ struct scan_control {
/* Can pages be swapped as part of reclaim? */
unsigned int may_swap:1;
- /* e.g. boosted watermark reclaim leaves slabs alone */
- unsigned int may_shrinkslab:1;
-
/*
* Cgroups are not reclaimed below their configured memory.low,
* unless we threaten to OOM. If any cgroups are skipped due to
@@ -2714,10 +2711,8 @@ static bool shrink_node(pg_data_t *pgdat
shrink_node_memcg(pgdat, memcg, sc, &lru_pages);
node_lru_pages += lru_pages;
- if (sc->may_shrinkslab) {
- shrink_slab(sc->gfp_mask, pgdat->node_id,
- memcg, sc->priority);
- }
+ shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
+ sc->priority);
/* Record the group's reclaim efficiency */
vmpressure(sc->gfp_mask, memcg, false,
@@ -3194,7 +3189,6 @@ unsigned long try_to_free_pages(struct z
.may_writepage = !laptop_mode,
.may_unmap = 1,
.may_swap = 1,
- .may_shrinkslab = 1,
};
/*
@@ -3238,7 +3232,6 @@ unsigned long mem_cgroup_shrink_node(str
.may_unmap = 1,
.reclaim_idx = MAX_NR_ZONES - 1,
.may_swap = !noswap,
- .may_shrinkslab = 1,
};
unsigned long lru_pages;
@@ -3286,7 +3279,6 @@ unsigned long try_to_free_mem_cgroup_pag
.may_writepage = !laptop_mode,
.may_unmap = 1,
.may_swap = may_swap,
- .may_shrinkslab = 1,
};
set_task_reclaim_state(current, &sc.reclaim_state);
@@ -3598,7 +3590,6 @@ restart:
*/
sc.may_writepage = !laptop_mode && !nr_boost_reclaim;
sc.may_swap = !nr_boost_reclaim;
- sc.may_shrinkslab = !nr_boost_reclaim;
/*
* Do some background aging of the anon list, to give
_
Patches currently in -mm which might be from mgorman(a)techsingularity.net are
The patch titled
Subject: seq_file: fix problem when seeking mid-record
has been removed from the -mm tree. Its filename was
seq_file-fix-problem-when-seeking-mid-record.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: NeilBrown <neilb(a)suse.com>
Subject: seq_file: fix problem when seeking mid-record
If you use lseek or similar (e.g. pread) to access a location in a
seq_file file that is within a record, rather than at a record boundary,
then the first read will return the remainder of the record, and the
second read will return the whole of that same record (instead of the next
record). When seeking to a record boundary, the next record is correctly
returned.
This bug was introduced by a recent patch (identified below). Before that
patch, seq_read() would increment m->index when the last of the buffer was
returned (m->count == 0). After that patch, we rely on ->next to
increment m->index after filling the buffer - but there was one place
where that didn't happen.
Link: https://lkml.kernel.org/lkml/877e7xl029.fsf@notabene.neil.brown.name/
Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface")
Signed-off-by: NeilBrown <neilb(a)suse.com>
Reported-by: Sergei Turchanov <turchanov(a)farpost.com>
Tested-by: Sergei Turchanov <turchanov(a)farpost.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Markus Elfring <Markus.Elfring(a)web.de>
Cc: <stable(a)vger.kernel.org> [4.19+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/seq_file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/seq_file.c~seq_file-fix-problem-when-seeking-mid-record
+++ a/fs/seq_file.c
@@ -119,6 +119,7 @@ static int traverse(struct seq_file *m,
}
if (seq_has_overflowed(m))
goto Eoverflow;
+ p = m->op->next(m, p, &m->index);
if (pos + m->count > offset) {
m->from = offset - pos;
m->count -= m->from;
@@ -126,7 +127,6 @@ static int traverse(struct seq_file *m,
}
pos += m->count;
m->count = 0;
- p = m->op->next(m, p, &m->index);
if (pos == offset)
break;
}
_
Patches currently in -mm which might be from neilb(a)suse.com are
The patch titled
Subject: mm/usercopy: use memory range to be accessed for wraparound check
has been removed from the -mm tree. Its filename was
mm-usercopy-use-memory-range-to-be-accessed-for-wraparound-check.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: "Isaac J. Manjarres" <isaacm(a)codeaurora.org>
Subject: mm/usercopy: use memory range to be accessed for wraparound check
Currently, when checking to see if accessing n bytes starting at address
"ptr" will cause a wraparound in the memory addresses, the check in
check_bogus_address() adds an extra byte, which is incorrect, as the range
of addresses that will be accessed is [ptr, ptr + (n - 1)].
This can lead to incorrectly detecting a wraparound in the memory address,
when trying to read 4 KB from memory that is mapped to the the last
possible page in the virtual address space, when in fact, accessing that
range of memory would not cause a wraparound to occur.
Use the memory range that will actually be accessed when considering if
accessing a certain amount of bytes will cause the memory address to wrap
around.
Link: http://lkml.kernel.org/r/1564509253-23287-1-git-send-email-isaacm@codeauror…
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Co-developed-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Reviewed-by: William Kucharski <william.kucharski(a)oracle.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Trilok Soni <tsoni(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/usercopy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/usercopy.c~mm-usercopy-use-memory-range-to-be-accessed-for-wraparound-check
+++ a/mm/usercopy.c
@@ -147,7 +147,7 @@ static inline void check_bogus_address(c
bool to_user)
{
/* Reject if object wraps past end of memory. */
- if (ptr + n < ptr)
+ if (ptr + (n - 1) < ptr)
usercopy_abort("wrapped address", NULL, to_user, 0, ptr + n);
/* Reject if NULL or ZERO-allocation. */
_
Patches currently in -mm which might be from isaacm(a)codeaurora.org are
The patch titled
Subject: mm/memcontrol.c: fix use after free in mem_cgroup_iter()
has been removed from the -mm tree. Its filename was
mm-memcontrol-fix-use-after-free-in-mem_cgroup_iter.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Miles Chen <miles.chen(a)mediatek.com>
Subject: mm/memcontrol.c: fix use after free in mem_cgroup_iter()
This patch is sent to report an use after free in mem_cgroup_iter() after
merging commit be2657752e9e ("mm: memcg: fix use after free in
mem_cgroup_iter()").
I work with android kernel tree (4.9 & 4.14), and commit be2657752e9e
("mm: memcg: fix use after free in mem_cgroup_iter()") has been merged to
the trees. However, I can still observe use after free issues addressed
in the commit be2657752e9e. (on low-end devices, a few times this month)
backtrace:
css_tryget <- crash here
mem_cgroup_iter
shrink_node
shrink_zones
do_try_to_free_pages
try_to_free_pages
__perform_reclaim
__alloc_pages_direct_reclaim
__alloc_pages_slowpath
__alloc_pages_nodemask
To debug, I poisoned mem_cgroup before freeing it:
static void __mem_cgroup_free(struct mem_cgroup *memcg)
for_each_node(node)
free_mem_cgroup_per_node_info(memcg, node);
free_percpu(memcg->stat);
+ /* poison memcg before freeing it */
+ memset(memcg, 0x78, sizeof(struct mem_cgroup));
kfree(memcg);
}
The coredump shows the position=0xdbbc2a00 is freed.
(gdb) p/x ((struct mem_cgroup_per_node *)0xe5009e00)->iter[8]
$13 = {position = 0xdbbc2a00, generation = 0x2efd}
0xdbbc2a00: 0xdbbc2e00 0x00000000 0xdbbc2800 0x00000100
0xdbbc2a10: 0x00000200 0x78787878 0x00026218 0x00000000
0xdbbc2a20: 0xdcad6000 0x00000001 0x78787800 0x00000000
0xdbbc2a30: 0x78780000 0x00000000 0x0068fb84 0x78787878
0xdbbc2a40: 0x78787878 0x78787878 0x78787878 0xe3fa5cc0
0xdbbc2a50: 0x78787878 0x78787878 0x00000000 0x00000000
0xdbbc2a60: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a70: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a80: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a90: 0x00000001 0x00000000 0x00000000 0x00100000
0xdbbc2aa0: 0x00000001 0xdbbc2ac8 0x00000000 0x00000000
0xdbbc2ab0: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2ac0: 0x00000000 0x00000000 0xe5b02618 0x00001000
0xdbbc2ad0: 0x00000000 0x78787878 0x78787878 0x78787878
0xdbbc2ae0: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2af0: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b00: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b10: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b20: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b30: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b40: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b50: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b60: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b70: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b80: 0x78787878 0x78787878 0x00000000 0x78787878
0xdbbc2b90: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2ba0: 0x78787878 0x78787878 0x78787878 0x78787878
In the reclaim path, try_to_free_pages() does not setup
sc.target_mem_cgroup and sc is passed to do_try_to_free_pages(), ...,
shrink_node().
In mem_cgroup_iter(), root is set to root_mem_cgroup because
sc->target_mem_cgroup is NULL. It is possible to assign a memcg to
root_mem_cgroup.nodeinfo.iter in mem_cgroup_iter().
try_to_free_pages
struct scan_control sc = {...}, target_mem_cgroup is 0x0;
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup *root = sc->target_mem_cgroup;
memcg = mem_cgroup_iter(root, NULL, &reclaim);
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
css = css_next_descendant_pre(css, &root->css);
memcg = mem_cgroup_from_css(css);
cmpxchg(&iter->position, pos, memcg);
My device uses memcg non-hierarchical mode. When we release a memcg:
invalidate_reclaim_iterators() reaches only dead_memcg and its parents.
If non-hierarchical mode is used, invalidate_reclaim_iterators() never
reaches root_mem_cgroup.
static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
{
struct mem_cgroup *memcg = dead_memcg;
for (; memcg; memcg = parent_mem_cgroup(memcg)
...
}
So the use after free scenario looks like:
CPU1 CPU2
try_to_free_pages
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
css = css_next_descendant_pre(css, &root->css);
memcg = mem_cgroup_from_css(css);
cmpxchg(&iter->position, pos, memcg);
invalidate_reclaim_iterators(memcg);
...
__mem_cgroup_free()
kfree(memcg);
try_to_free_pages
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
iter = &mz->iter[reclaim->priority];
pos = READ_ONCE(iter->position);
css_tryget(&pos->css) <- use after free
To avoid this, we should also invalidate root_mem_cgroup.nodeinfo.iter in
invalidate_reclaim_iterators().
[cai(a)lca.pw: fix -Wparentheses compilation warning]
Link: http://lkml.kernel.org/r/1564580753-17531-1-git-send-email-cai@lca.pw
Link: http://lkml.kernel.org/r/20190730015729.4406-1-miles.chen@mediatek.com
Fixes: 5ac8fb31ad2e ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
Signed-off-by: Miles Chen <miles.chen(a)mediatek.com>
Signed-off-by: Qian Cai <cai(a)lca.pw>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 39 +++++++++++++++++++++++++++++----------
1 file changed, 29 insertions(+), 10 deletions(-)
--- a/mm/memcontrol.c~mm-memcontrol-fix-use-after-free-in-mem_cgroup_iter
+++ a/mm/memcontrol.c
@@ -1130,26 +1130,45 @@ void mem_cgroup_iter_break(struct mem_cg
css_put(&prev->css);
}
-static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
+static void __invalidate_reclaim_iterators(struct mem_cgroup *from,
+ struct mem_cgroup *dead_memcg)
{
- struct mem_cgroup *memcg = dead_memcg;
struct mem_cgroup_reclaim_iter *iter;
struct mem_cgroup_per_node *mz;
int nid;
int i;
- for (; memcg; memcg = parent_mem_cgroup(memcg)) {
- for_each_node(nid) {
- mz = mem_cgroup_nodeinfo(memcg, nid);
- for (i = 0; i <= DEF_PRIORITY; i++) {
- iter = &mz->iter[i];
- cmpxchg(&iter->position,
- dead_memcg, NULL);
- }
+ for_each_node(nid) {
+ mz = mem_cgroup_nodeinfo(from, nid);
+ for (i = 0; i <= DEF_PRIORITY; i++) {
+ iter = &mz->iter[i];
+ cmpxchg(&iter->position,
+ dead_memcg, NULL);
}
}
}
+static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
+{
+ struct mem_cgroup *memcg = dead_memcg;
+ struct mem_cgroup *last;
+
+ do {
+ __invalidate_reclaim_iterators(memcg, dead_memcg);
+ last = memcg;
+ } while ((memcg = parent_mem_cgroup(memcg)));
+
+ /*
+ * When cgruop1 non-hierarchy mode is used,
+ * parent_mem_cgroup() does not walk all the way up to the
+ * cgroup root (root_mem_cgroup). So we have to handle
+ * dead_memcg from cgroup root separately.
+ */
+ if (last != root_mem_cgroup)
+ __invalidate_reclaim_iterators(root_mem_cgroup,
+ dead_memcg);
+}
+
/**
* mem_cgroup_scan_tasks - iterate over tasks of a memory cgroup hierarchy
* @memcg: hierarchy root
_
Patches currently in -mm which might be from miles.chen(a)mediatek.com are
The patch titled
Subject: mm/z3fold.c: fix z3fold_destroy_pool() race condition
has been removed from the -mm tree. Its filename was
mm-z3foldc-fix-z3fold_destroy_pool-race-condition.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Henry Burns <henryburns(a)google.com>
Subject: mm/z3fold.c: fix z3fold_destroy_pool() race condition
The constraint from the zpool use of z3fold_destroy_pool() is there are
no outstanding handles to memory (so no active allocations), but it is
possible for there to be outstanding work on either of the two wqs in
the pool.
Calling z3fold_deregister_migration() before the workqueues are drained
means that there can be allocated pages referencing a freed inode,
causing any thread in compaction to be able to trip over the bad
pointer in PageMovable().
Link: http://lkml.kernel.org/r/20190726224810.79660-2-henryburns@google.com
Fixes: 1f862989b04a ("mm/z3fold.c: support page migration")
Signed-off-by: Henry Burns <henryburns(a)google.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Jonathan Adams <jwadams(a)google.com>
Cc: Vitaly Vul <vitaly.vul(a)sony.com>
Cc: Vitaly Wool <vitalywool(a)gmail.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Henry Burns <henrywolfeburns(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/z3fold.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/z3fold.c~mm-z3foldc-fix-z3fold_destroy_pool-race-condition
+++ a/mm/z3fold.c
@@ -817,16 +817,19 @@ out:
static void z3fold_destroy_pool(struct z3fold_pool *pool)
{
kmem_cache_destroy(pool->c_handle);
- z3fold_unregister_migration(pool);
/*
* We need to destroy pool->compact_wq before pool->release_wq,
* as any pending work on pool->compact_wq will call
* queue_work(pool->release_wq, &pool->work).
+ *
+ * There are still outstanding pages until both workqueues are drained,
+ * so we cannot unregister migration until then.
*/
destroy_workqueue(pool->compact_wq);
destroy_workqueue(pool->release_wq);
+ z3fold_unregister_migration(pool);
kfree(pool);
}
_
Patches currently in -mm which might be from henryburns(a)google.com are
mm-z3foldc-fix-race-between-migration-and-destruction.patch
The patch titled
Subject: mm/z3fold.c: fix z3fold_destroy_pool() ordering
has been removed from the -mm tree. Its filename was
mm-z3foldc-fix-z3fold_destroy_pool-ordering.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Henry Burns <henryburns(a)google.com>
Subject: mm/z3fold.c: fix z3fold_destroy_pool() ordering
The constraint from the zpool use of z3fold_destroy_pool() is there are
no outstanding handles to memory (so no active allocations), but it is
possible for there to be outstanding work on either of the two wqs in
the pool.
If there is work queued on pool->compact_workqueue when it is called,
z3fold_destroy_pool() will do:
z3fold_destroy_pool()
destroy_workqueue(pool->release_wq)
destroy_workqueue(pool->compact_wq)
drain_workqueue(pool->compact_wq)
do_compact_page(zhdr)
kref_put(&zhdr->refcount)
__release_z3fold_page(zhdr, ...)
queue_work_on(pool->release_wq, &pool->work) *BOOM*
So compact_wq needs to be destroyed before release_wq.
Link: http://lkml.kernel.org/r/20190726224810.79660-1-henryburns@google.com
Fixes: 5d03a6613957 ("mm/z3fold.c: use kref to prevent page free/compact race")
Signed-off-by: Henry Burns <henryburns(a)google.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Jonathan Adams <jwadams(a)google.com>
Cc: Vitaly Vul <vitaly.vul(a)sony.com>
Cc: Vitaly Wool <vitalywool(a)gmail.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Al Viro <viro(a)zeniv.linux.org.uk
Cc: Henry Burns <henrywolfeburns(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/z3fold.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/z3fold.c~mm-z3foldc-fix-z3fold_destroy_pool-ordering
+++ a/mm/z3fold.c
@@ -818,8 +818,15 @@ static void z3fold_destroy_pool(struct z
{
kmem_cache_destroy(pool->c_handle);
z3fold_unregister_migration(pool);
- destroy_workqueue(pool->release_wq);
+
+ /*
+ * We need to destroy pool->compact_wq before pool->release_wq,
+ * as any pending work on pool->compact_wq will call
+ * queue_work(pool->release_wq, &pool->work).
+ */
+
destroy_workqueue(pool->compact_wq);
+ destroy_workqueue(pool->release_wq);
kfree(pool);
}
_
Patches currently in -mm which might be from henryburns(a)google.com are
mm-z3foldc-fix-race-between-migration-and-destruction.patch
The patch titled
Subject: mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind
has been removed from the -mm tree. Its filename was
mm-mempolicy-handle-vma-with-unmovable-pages-mapped-correctly-in-mbind.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Yang Shi <yang.shi(a)linux.alibaba.com>
Subject: mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind
When running syzkaller internally, we ran into the below bug on 4.9.x
kernel:
kernel BUG at mm/huge_memory.c:2124!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 1518 Comm: syz-executor107 Not tainted 4.9.168+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
task: ffff880067b34900 task.stack: ffff880068998000
RIP: 0010:[<ffffffff81895d6b>] [<ffffffff81895d6b>] split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
RSP: 0018:ffff88006899f980 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffffea00018f1700 RCX: 0000000000000000
RDX: 1ffffd400031e2e7 RSI: 0000000000000001 RDI: ffffea00018f1738
RBP: ffff88006899f9e8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: fffffbfff0d8b13e R12: ffffea00018f1400
R13: ffffea00018f1400 R14: ffffea00018f1720 R15: ffffea00018f1401
FS: 00007fa333996740(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000040 CR3: 0000000066b9c000 CR4: 00000000000606f0
Stack:
0000000000000246 ffff880067b34900 0000000000000000 ffff88007ffdc000
0000000000000000 ffff88006899f9e8 ffffffff812b4015 ffff880064c64e18
ffffea00018f1401 dffffc0000000000 ffffea00018f1700 0000000020ffd000
Call Trace:
[<ffffffff818490f1>] split_huge_page include/linux/huge_mm.h:100 [inline]
[<ffffffff818490f1>] queue_pages_pte_range+0x7e1/0x1480 mm/mempolicy.c:538
[<ffffffff817ed0da>] walk_pmd_range mm/pagewalk.c:50 [inline]
[<ffffffff817ed0da>] walk_pud_range mm/pagewalk.c:90 [inline]
[<ffffffff817ed0da>] walk_pgd_range mm/pagewalk.c:116 [inline]
[<ffffffff817ed0da>] __walk_page_range+0x44a/0xdb0 mm/pagewalk.c:208
[<ffffffff817edb94>] walk_page_range+0x154/0x370 mm/pagewalk.c:285
[<ffffffff81844515>] queue_pages_range+0x115/0x150 mm/mempolicy.c:694
[<ffffffff8184f493>] do_mbind mm/mempolicy.c:1241 [inline]
[<ffffffff8184f493>] SYSC_mbind+0x3c3/0x1030 mm/mempolicy.c:1370
[<ffffffff81850146>] SyS_mbind+0x46/0x60 mm/mempolicy.c:1352
[<ffffffff810097e2>] do_syscall_64+0x1d2/0x600 arch/x86/entry/common.c:282
[<ffffffff82ff6f93>] entry_SYSCALL_64_after_swapgs+0x5d/0xdb
Code: c7 80 1c 02 00 e8 26 0a 76 01 <0f> 0b 48 c7 c7 40 46 45 84 e8 4c
RIP [<ffffffff81895d6b>] split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
RSP <ffff88006899f980>
with the below test:
---8<---
uint64_t r[1] = {0xffffffffffffffff};
int main(void)
{
syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
intptr_t res = 0;
res = syscall(__NR_socket, 0x11, 3, 0x300);
if (res != -1)
r[0] = res;
*(uint32_t*)0x20000040 = 0x10000;
*(uint32_t*)0x20000044 = 1;
*(uint32_t*)0x20000048 = 0xc520;
*(uint32_t*)0x2000004c = 1;
syscall(__NR_setsockopt, r[0], 0x107, 0xd, 0x20000040, 0x10);
syscall(__NR_mmap, 0x20fed000, 0x10000, 0, 0x8811, r[0], 0);
*(uint64_t*)0x20000340 = 2;
syscall(__NR_mbind, 0x20ff9000, 0x4000, 0x4002, 0x20000340,
0x45d4, 3);
return 0;
}
---8<---
Actually the test does:
mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000
socket(AF_PACKET, SOCK_RAW, 768) = 3
setsockopt(3, SOL_PACKET, PACKET_TX_RING, {block_size=65536, block_nr=1, frame_size=50464, frame_nr=1}, 16) = 0
mmap(0x20fed000, 65536, PROT_NONE, MAP_SHARED|MAP_FIXED|MAP_POPULATE|MAP_DENYWRITE, 3, 0) = 0x20fed000
mbind(..., MPOL_MF_STRICT|MPOL_MF_MOVE) = 0
The setsockopt() would allocate compound pages (16 pages in this test) for
packet tx ring, then the mmap() would call packet_mmap() to map the pages
into the user address space specified by the mmap() call.
When calling mbind(), it would scan the vma to queue the pages for
migration to the new node. It would split any huge page since 4.9 doesn't
support THP migration, however, the packet tx ring compound pages are not
THP and even not movable. So, the above bug is triggered.
However, the later kernel is not hit by this issue due to
d44d363f65780f2ac2 ("mm: don't assume anonymous pages have SwapBacked
flag"), which just removes the PageSwapBacked check for a different
reason.
But, there is a deeper issue. According to the semantic of mbind(), it
should return -EIO if MPOL_MF_MOVE or MPOL_MF_MOVE_ALL was specified and
MPOL_MF_STRICT was also specified, but the kernel was unable to move all
existing pages in the range. The tx ring of the packet socket is
definitely not movable, however, mbind() returns success for this case.
Although the most socket file associates with non-movable pages, but XDP
may have movable pages from gup. So, it sounds not fine to just check the
underlying file type of vma in vma_migratable().
Change migrate_page_add() to check if the page is movable or not, if it is
unmovable, just return -EIO. But do not abort pte walk immediately, since
there may be pages off LRU temporarily. We should migrate other pages if
MPOL_MF_MOVE* is specified. Set has_unmovable flag if some paged could
not be not moved, then return -EIO for mbind() eventually.
With this change the above test would return -EIO as expected.
[yang.shi(a)linux.alibaba.com: fix review comments from Vlastimil]
Link: http://lkml.kernel.org/r/1563556862-54056-3-git-send-email-yang.shi@linux.a…
Link: http://lkml.kernel.org/r/1561162809-59140-3-git-send-email-yang.shi@linux.a…
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
--- a/mm/mempolicy.c~mm-mempolicy-handle-vma-with-unmovable-pages-mapped-correctly-in-mbind
+++ a/mm/mempolicy.c
@@ -403,7 +403,7 @@ static const struct mempolicy_operations
},
};
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags);
struct queue_pages {
@@ -463,12 +463,11 @@ static int queue_pages_pmd(pmd_t *pmd, s
flags = qp->flags;
/* go to thp migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
- if (!vma_migratable(walk->vma)) {
+ if (!vma_migratable(walk->vma) ||
+ migrate_page_add(page, qp->pagelist, flags)) {
ret = 1;
goto unlock;
}
-
- migrate_page_add(page, qp->pagelist, flags);
} else
ret = -EIO;
unlock:
@@ -532,7 +531,14 @@ static int queue_pages_pte_range(pmd_t *
has_unmovable = true;
break;
}
- migrate_page_add(page, qp->pagelist, flags);
+
+ /*
+ * Do not abort immediately since there may be
+ * temporary off LRU pages in the range. Still
+ * need migrate other LRU pages.
+ */
+ if (migrate_page_add(page, qp->pagelist, flags))
+ has_unmovable = true;
} else
break;
}
@@ -961,7 +967,7 @@ static long do_get_mempolicy(int *policy
/*
* page migration, thp tail pages can be passed.
*/
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags)
{
struct page *head = compound_head(page);
@@ -974,8 +980,19 @@ static void migrate_page_add(struct page
mod_node_page_state(page_pgdat(head),
NR_ISOLATED_ANON + page_is_file_cache(head),
hpage_nr_pages(head));
+ } else if (flags & MPOL_MF_STRICT) {
+ /*
+ * Non-movable page may reach here. And, there may be
+ * temporary off LRU pages or non-LRU movable pages.
+ * Treat them as unmovable pages since they can't be
+ * isolated, so they can't be moved at the moment. It
+ * should return -EIO for this case too.
+ */
+ return -EIO;
}
}
+
+ return 0;
}
/* page allocation callback for NUMA node migration */
@@ -1178,9 +1195,10 @@ static struct page *new_page(struct page
}
#else
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags)
{
+ return -EIO;
}
int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
_
Patches currently in -mm which might be from yang.shi(a)linux.alibaba.com are
mm-thp-extract-split_queue_-into-a-struct.patch
mm-move-mem_cgroup_uncharge-out-of-__page_cache_release.patch
mm-shrinker-make-shrinker-not-depend-on-memcg-kmem.patch
mm-thp-make-deferred-split-shrinker-memcg-aware.patch
The patch titled
Subject: mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified
has been removed from the -mm tree. Its filename was
mm-mempolicy-make-the-behavior-consistent-when-mpol_mf_move-and-mpol_mf_strict-were-specified.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Yang Shi <yang.shi(a)linux.alibaba.com>
Subject: mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified
When both MPOL_MF_MOVE* and MPOL_MF_STRICT was specified, mbind() should
try best to migrate misplaced pages, if some of the pages could not be
migrated, then return -EIO.
There are three different sub-cases:
1. vma is not migratable
2. vma is migratable, but there are unmovable pages
3. vma is migratable, pages are movable, but migrate_pages() fails
If #1 happens, kernel would just abort immediately, then return -EIO,
after a7f40cfe3b7ada ("mm: mempolicy: make mbind() return -EIO when
MPOL_MF_STRICT is specified").
If #3 happens, kernel would set policy and migrate pages with best-effort,
but won't rollback the migrated pages and reset the policy back.
Before that commit, they behaves in the same way. It'd better to keep
their behavior consistent. But, rolling back the migrated pages and
resetting the policy back sounds not feasible, so just make #1 behave as
same as #3.
Userspace will know that not everything was successfully migrated (via
-EIO), and can take whatever steps it deems necessary - attempt rollback,
determine which exact page(s) are violating the policy, etc.
Make queue_pages_range() return 1 to indicate there are unmovable pages or
vma is not migratable.
The #2 is not handled correctly in the current kernel, the following patch
will fix it.
[yang.shi(a)linux.alibaba.com: fix review comments from Vlastimil]
Link: http://lkml.kernel.org/r/1563556862-54056-2-git-send-email-yang.shi@linux.a…
Link: http://lkml.kernel.org/r/1561162809-59140-2-git-send-email-yang.shi@linux.a…
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 68 +++++++++++++++++++++++++++++++++--------------
1 file changed, 48 insertions(+), 20 deletions(-)
--- a/mm/mempolicy.c~mm-mempolicy-make-the-behavior-consistent-when-mpol_mf_move-and-mpol_mf_strict-were-specified
+++ a/mm/mempolicy.c
@@ -429,11 +429,14 @@ static inline bool queue_pages_required(
}
/*
- * queue_pages_pmd() has three possible return values:
- * 1 - pages are placed on the right node or queued successfully.
- * 0 - THP was split.
- * -EIO - is migration entry or MPOL_MF_STRICT was specified and an existing
- * page was already on a node that does not follow the policy.
+ * queue_pages_pmd() has four possible return values:
+ * 0 - pages are placed on the right node or queued successfully.
+ * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * 2 - THP was split.
+ * -EIO - is migration entry or only MPOL_MF_STRICT was specified and an
+ * existing page was already on a node that does not follow the
+ * policy.
*/
static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
unsigned long end, struct mm_walk *walk)
@@ -451,19 +454,17 @@ static int queue_pages_pmd(pmd_t *pmd, s
if (is_huge_zero_page(page)) {
spin_unlock(ptl);
__split_huge_pmd(walk->vma, pmd, addr, false, NULL);
+ ret = 2;
goto out;
}
- if (!queue_pages_required(page, qp)) {
- ret = 1;
+ if (!queue_pages_required(page, qp))
goto unlock;
- }
- ret = 1;
flags = qp->flags;
/* go to thp migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
if (!vma_migratable(walk->vma)) {
- ret = -EIO;
+ ret = 1;
goto unlock;
}
@@ -479,6 +480,13 @@ out:
/*
* Scan through pages checking if pages follow certain conditions,
* and move them to the pagelist if they do.
+ *
+ * queue_pages_pte_range() has three possible return values:
+ * 0 - pages are placed on the right node or queued successfully.
+ * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * -EIO - only MPOL_MF_STRICT was specified and an existing page was already
+ * on a node that does not follow the policy.
*/
static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, struct mm_walk *walk)
@@ -488,17 +496,17 @@ static int queue_pages_pte_range(pmd_t *
struct queue_pages *qp = walk->private;
unsigned long flags = qp->flags;
int ret;
+ bool has_unmovable = false;
pte_t *pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
if (ptl) {
ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
- if (ret > 0)
- return 0;
- else if (ret < 0)
+ if (ret != 2)
return ret;
}
+ /* THP was split, fall through to pte walk */
if (pmd_trans_unstable(pmd))
return 0;
@@ -519,14 +527,21 @@ static int queue_pages_pte_range(pmd_t *
if (!queue_pages_required(page, qp))
continue;
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
- if (!vma_migratable(vma))
+ /* MPOL_MF_STRICT must be specified if we get here */
+ if (!vma_migratable(vma)) {
+ has_unmovable = true;
break;
+ }
migrate_page_add(page, qp->pagelist, flags);
} else
break;
}
pte_unmap_unlock(pte - 1, ptl);
cond_resched();
+
+ if (has_unmovable)
+ return 1;
+
return addr != end ? -EIO : 0;
}
@@ -639,7 +654,13 @@ static int queue_pages_test_walk(unsigne
*
* If pages found in a given range are on a set of nodes (determined by
* @nodes and @flags,) it's isolated and queued to the pagelist which is
- * passed via @private.)
+ * passed via @private.
+ *
+ * queue_pages_range() has three possible return values:
+ * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * 0 - queue pages successfully or no misplaced page.
+ * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
*/
static int
queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
@@ -1182,6 +1203,7 @@ static long do_mbind(unsigned long start
struct mempolicy *new;
unsigned long end;
int err;
+ int ret;
LIST_HEAD(pagelist);
if (flags & ~(unsigned long)MPOL_MF_VALID)
@@ -1243,10 +1265,15 @@ static long do_mbind(unsigned long start
if (err)
goto mpol_out;
- err = queue_pages_range(mm, start, end, nmask,
+ ret = queue_pages_range(mm, start, end, nmask,
flags | MPOL_MF_INVERT, &pagelist);
- if (!err)
- err = mbind_range(mm, start, end, new);
+
+ if (ret < 0) {
+ err = -EIO;
+ goto up_out;
+ }
+
+ err = mbind_range(mm, start, end, new);
if (!err) {
int nr_failed = 0;
@@ -1259,13 +1286,14 @@ static long do_mbind(unsigned long start
putback_movable_pages(&pagelist);
}
- if (nr_failed && (flags & MPOL_MF_STRICT))
+ if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
err = -EIO;
} else
putback_movable_pages(&pagelist);
+up_out:
up_write(&mm->mmap_sem);
- mpol_out:
+mpol_out:
mpol_put(new);
return err;
}
_
Patches currently in -mm which might be from yang.shi(a)linux.alibaba.com are
mm-thp-extract-split_queue_-into-a-struct.patch
mm-move-mem_cgroup_uncharge-out-of-__page_cache_release.patch
mm-shrinker-make-shrinker-not-depend-on-memcg-kmem.patch
mm-thp-make-deferred-split-shrinker-memcg-aware.patch
The patch titled
Subject: mm/hmm: fix bad subpage pointer in try_to_unmap_one
has been removed from the -mm tree. Its filename was
mm-hmm-fix-bad-subpage-pointer-in-try_to_unmap_one.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ralph Campbell <rcampbell(a)nvidia.com>
Subject: mm/hmm: fix bad subpage pointer in try_to_unmap_one
When migrating an anonymous private page to a ZONE_DEVICE private page,
the source page->mapping and page->index fields are copied to the
destination ZONE_DEVICE struct page and the page_mapcount() is increased.
This is so rmap_walk() can be used to unmap and migrate the page back to
system memory. However, try_to_unmap_one() computes the subpage pointer
from a swap pte which computes an invalid page pointer and a kernel panic
results such as:
BUG: unable to handle page fault for address: ffffea1fffffffc8
Currently, only single pages can be migrated to device private memory so
no subpage computation is needed and it can be set to "page".
[rcampbell(a)nvidia.com: add comment]
Link: http://lkml.kernel.org/r/20190724232700.23327-4-rcampbell@nvidia.com
Link: http://lkml.kernel.org/r/20190719192955.30462-4-rcampbell@nvidia.com
Fixes: a5430dda8a3a1c ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: "Jérôme Glisse" <jglisse(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Lai Jiangshan <jiangshanlai(a)gmail.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/rmap.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/mm/rmap.c~mm-hmm-fix-bad-subpage-pointer-in-try_to_unmap_one
+++ a/mm/rmap.c
@@ -1475,7 +1475,15 @@ static bool try_to_unmap_one(struct page
/*
* No need to invalidate here it will synchronize on
* against the special swap migration pte.
+ *
+ * The assignment to subpage above was computed from a
+ * swap PTE which results in an invalid pointer.
+ * Since only PAGE_SIZE pages can currently be
+ * migrated, just set it to page. This will need to be
+ * changed when hugepage migrations to device private
+ * memory are supported.
*/
+ subpage = page;
goto discard;
}
_
Patches currently in -mm which might be from rcampbell(a)nvidia.com are
The patch titled
Subject: mm/hmm: fix ZONE_DEVICE anon page mapping reuse
has been removed from the -mm tree. Its filename was
mm-hmm-fix-zone_device-anon-page-mapping-reuse.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ralph Campbell <rcampbell(a)nvidia.com>
Subject: mm/hmm: fix ZONE_DEVICE anon page mapping reuse
When a ZONE_DEVICE private page is freed, the page->mapping field can be
set. If this page is reused as an anonymous page, the previous value can
prevent the page from being inserted into the CPU's anon rmap table. For
example, when migrating a pte_none() page to device memory:
migrate_vma(ops, vma, start, end, src, dst, private)
migrate_vma_collect()
src[] = MIGRATE_PFN_MIGRATE
migrate_vma_prepare()
/* no page to lock or isolate so OK */
migrate_vma_unmap()
/* no page to unmap so OK */
ops->alloc_and_copy()
/* driver allocates ZONE_DEVICE page for dst[] */
migrate_vma_pages()
migrate_vma_insert_page()
page_add_new_anon_rmap()
__page_set_anon_rmap()
/* This check sees the page's stale mapping field */
if (PageAnon(page))
return
/* page->mapping is not updated */
The result is that the migration appears to succeed but a subsequent CPU
fault will be unable to migrate the page back to system memory or worse.
Clear the page->mapping field when freeing the ZONE_DEVICE page so stale
pointer data doesn't affect future page use.
Link: http://lkml.kernel.org/r/20190719192955.30462-3-rcampbell@nvidia.com
Fixes: b7a523109fb5c9d2d6dd ("mm: don't clear ->mapping in hmm_devmem_free")
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Jan Kara <jack(a)suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: "Jérôme Glisse" <jglisse(a)redhat.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai(a)gmail.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memremap.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
--- a/mm/memremap.c~mm-hmm-fix-zone_device-anon-page-mapping-reuse
+++ a/mm/memremap.c
@@ -403,6 +403,30 @@ void __put_devmap_managed_page(struct pa
mem_cgroup_uncharge(page);
+ /*
+ * When a device_private page is freed, the page->mapping field
+ * may still contain a (stale) mapping value. For example, the
+ * lower bits of page->mapping may still identify the page as
+ * an anonymous page. Ultimately, this entire field is just
+ * stale and wrong, and it will cause errors if not cleared.
+ * One example is:
+ *
+ * migrate_vma_pages()
+ * migrate_vma_insert_page()
+ * page_add_new_anon_rmap()
+ * __page_set_anon_rmap()
+ * ...checks page->mapping, via PageAnon(page) call,
+ * and incorrectly concludes that the page is an
+ * anonymous page. Therefore, it incorrectly,
+ * silently fails to set up the new anon rmap.
+ *
+ * For other types of ZONE_DEVICE pages, migration is either
+ * handled differently or not done at all, so there is no need
+ * to clear page->mapping.
+ */
+ if (is_device_private_page(page))
+ page->mapping = NULL;
+
page->pgmap->ops->page_free(page);
} else if (!count)
__put_page(page);
_
Patches currently in -mm which might be from rcampbell(a)nvidia.com are
This is a note to let you know that I've just added the patch titled
tty/serial: atmel: reschedule TX after RX was started
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From a8441389f0daef2d74ecd79ab27067953f0b2483 Mon Sep 17 00:00:00 2001
From: Razvan Stefanescu <razvan.stefanescu(a)microchip.com>
Date: Tue, 13 Aug 2019 10:40:25 +0300
Subject: tty/serial: atmel: reschedule TX after RX was started
When half-duplex RS485 communication is used, after RX is started, TX
tasklet still needs to be scheduled tasklet. This avoids console freezing
when more data is to be transmitted, if the serial communication is not
closed.
Fixes: 69646d7a3689 ("tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped")
Signed-off-by: Razvan Stefanescu <razvan.stefanescu(a)microchip.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20190813074025.16218-1-razvan.stefanescu@microchi…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/atmel_serial.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c
index 19a85d6fe3d2..9a54c9e6d36e 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1400,7 +1400,6 @@ atmel_handle_transmit(struct uart_port *port, unsigned int pending)
atmel_port->hd_start_rx = false;
atmel_start_rx(port);
- return;
}
atmel_tasklet_schedule(atmel_port, &atmel_port->tasklet_tx);
--
2.22.1
This is a note to let you know that I've just added the patch titled
usb: chipidea: imx: fix EPROBE_DEFER support during driver probe
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 141822aa3f79efc8a2ec3ed464f2fd2c93ccd803 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Andr=C3=A9=20Draszik?= <git(a)andred.net>
Date: Sat, 10 Aug 2019 16:07:58 +0100
Subject: usb: chipidea: imx: fix EPROBE_DEFER support during driver probe
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If driver probe needs to be deferred, e.g. because ci_hdrc_add_device()
isn't ready yet, this driver currently misbehaves badly:
a) success is still reported to the driver core (meaning a 2nd
probe attempt will never be done), leaving the driver in
a dysfunctional state and the hardware unusable
b) driver remove / shutdown OOPSes:
[ 206.786916] Unable to handle kernel paging request at virtual address fffffdff
[ 206.794148] pgd = 880b9f82
[ 206.796890] [fffffdff] *pgd=abf5e861, *pte=00000000, *ppte=00000000
[ 206.803179] Internal error: Oops: 37 [#1] PREEMPT SMP ARM
[ 206.808581] Modules linked in: wl18xx evbug
[ 206.813308] CPU: 1 PID: 1 Comm: systemd-shutdow Not tainted 4.19.35+gf345c93b4195 #1
[ 206.821053] Hardware name: Freescale i.MX7 Dual (Device Tree)
[ 206.826813] PC is at ci_hdrc_remove_device+0x4/0x20
[ 206.831699] LR is at ci_hdrc_imx_remove+0x20/0xe8
[ 206.836407] pc : [<805cd4b0>] lr : [<805d62cc>] psr: 20000013
[ 206.842678] sp : a806be40 ip : 00000001 fp : 80adbd3c
[ 206.847906] r10: 80b1b794 r9 : 80d5dfe0 r8 : a8192c44
[ 206.853136] r7 : 80db93a0 r6 : a8192c10 r5 : a8192c00 r4 : a93a4a00
[ 206.859668] r3 : 00000000 r2 : a8192ce4 r1 : ffffffff r0 : fffffdfb
[ 206.866201] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 206.873341] Control: 10c5387d Table: a9e0c06a DAC: 00000051
[ 206.879092] Process systemd-shutdow (pid: 1, stack limit = 0xb271353c)
[ 206.885624] Stack: (0xa806be40 to 0xa806c000)
[ 206.889992] be40: a93a4a00 805d62cc a8192c1c a8170e10 a8192c10 8049a490 80d04d08 00000000
[ 206.898179] be60: 00000000 80d0da2c fee1dead 00000000 a806a000 00000058 00000000 80148b08
[ 206.906366] be80: 01234567 80148d8c a9858600 00000000 00000000 00000000 00000000 80d04d08
[ 206.914553] bea0: 00000000 00000000 a82741e0 a9858600 00000024 00000002 a9858608 00000005
[ 206.922740] bec0: 0000001e 8022c058 00000000 00000000 a806bf14 a9858600 00000000 a806befc
[ 206.930927] bee0: a806bf78 00000000 7ee12c30 8022c18c a806bef8 a806befc 00000000 00000001
[ 206.939115] bf00: 00000000 00000024 a806bf14 00000005 7ee13b34 7ee12c68 00000004 7ee13f20
[ 206.947302] bf20: 00000010 7ee12c7c 00000005 7ee12d04 0000000a 76e7dc00 00000001 80d0f140
[ 206.955490] bf40: ab637880 a974de40 60000013 80d0f140 ab6378a0 80d04d08 a8080470 a9858600
[ 206.963677] bf60: a9858600 00000000 00000000 8022c24c 00000000 80144310 00000000 00000000
[ 206.971864] bf80: 80101204 80d04d08 00000000 80d04d08 00000000 00000000 00000003 00000058
[ 206.980051] bfa0: 80101204 80101000 00000000 00000000 fee1dead 28121969 01234567 00000000
[ 206.988237] bfc0: 00000000 00000000 00000003 00000058 00000000 00000000 00000000 00000000
[ 206.996425] bfe0: 0049ffb0 7ee13d58 0048a84b 76f245a6 60000030 fee1dead 00000000 00000000
[ 207.004622] [<805cd4b0>] (ci_hdrc_remove_device) from [<805d62cc>] (ci_hdrc_imx_remove+0x20/0xe8)
[ 207.013509] [<805d62cc>] (ci_hdrc_imx_remove) from [<8049a490>] (device_shutdown+0x16c/0x218)
[ 207.022050] [<8049a490>] (device_shutdown) from [<80148b08>] (kernel_restart+0xc/0x50)
[ 207.029980] [<80148b08>] (kernel_restart) from [<80148d8c>] (sys_reboot+0xf4/0x1f0)
[ 207.037648] [<80148d8c>] (sys_reboot) from [<80101000>] (ret_fast_syscall+0x0/0x54)
[ 207.045308] Exception stack(0xa806bfa8 to 0xa806bff0)
[ 207.050368] bfa0: 00000000 00000000 fee1dead 28121969 01234567 00000000
[ 207.058554] bfc0: 00000000 00000000 00000003 00000058 00000000 00000000 00000000 00000000
[ 207.066737] bfe0: 0049ffb0 7ee13d58 0048a84b 76f245a6
[ 207.071799] Code: ebffffa8 e3a00000 e8bd8010 e92d4010 (e5904004)
[ 207.078021] ---[ end trace be47424e3fd46e9f ]---
[ 207.082647] Kernel panic - not syncing: Fatal exception
[ 207.087894] ---[ end Kernel panic - not syncing: Fatal exception ]---
c) the error path in combination with driver removal causes
imbalanced calls to the clk_*() and pm_()* APIs
a) happens because the original intended return value is
overwritten (with 0) by the return code of
regulator_disable() in ci_hdrc_imx_probe()'s error path
b) happens because ci_pdev is -EPROBE_DEFER, which causes
ci_hdrc_remove_device() to OOPS
Fix a) by being more careful in ci_hdrc_imx_probe()'s error
path and not overwriting the real error code
Fix b) by calling the respective cleanup functions during
remove only when needed (when ci_pdev != NULL, i.e. when
everything was initialised correctly). This also has the
side effect of not causing imbalanced clk_*() and pm_*()
API calls as part of the error code path.
Fixes: 7c8e8909417e ("usb: chipidea: imx: add HSIC support")
Signed-off-by: André Draszik <git(a)andred.net>
Cc: stable <stable(a)vger.kernel.org>
CC: Peter Chen <Peter.Chen(a)nxp.com>
CC: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
CC: Shawn Guo <shawnguo(a)kernel.org>
CC: Sascha Hauer <s.hauer(a)pengutronix.de>
CC: Pengutronix Kernel Team <kernel(a)pengutronix.de>
CC: Fabio Estevam <festevam(a)gmail.com>
CC: NXP Linux Team <linux-imx(a)nxp.com>
CC: linux-usb(a)vger.kernel.org
CC: linux-arm-kernel(a)lists.infradead.org
CC: linux-kernel(a)vger.kernel.org
Link: https://lore.kernel.org/r/20190810150758.17694-1-git@andred.net
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/chipidea/ci_hdrc_imx.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/chipidea/ci_hdrc_imx.c b/drivers/usb/chipidea/ci_hdrc_imx.c
index b5abfe89190c..df8812c30640 100644
--- a/drivers/usb/chipidea/ci_hdrc_imx.c
+++ b/drivers/usb/chipidea/ci_hdrc_imx.c
@@ -454,9 +454,11 @@ static int ci_hdrc_imx_probe(struct platform_device *pdev)
imx_disable_unprepare_clks(dev);
disable_hsic_regulator:
if (data->hsic_pad_regulator)
- ret = regulator_disable(data->hsic_pad_regulator);
+ /* don't overwrite original ret (cf. EPROBE_DEFER) */
+ regulator_disable(data->hsic_pad_regulator);
if (pdata.flags & CI_HDRC_PMQOS)
pm_qos_remove_request(&data->pm_qos_req);
+ data->ci_pdev = NULL;
return ret;
}
@@ -469,14 +471,17 @@ static int ci_hdrc_imx_remove(struct platform_device *pdev)
pm_runtime_disable(&pdev->dev);
pm_runtime_put_noidle(&pdev->dev);
}
- ci_hdrc_remove_device(data->ci_pdev);
+ if (data->ci_pdev)
+ ci_hdrc_remove_device(data->ci_pdev);
if (data->override_phy_control)
usb_phy_shutdown(data->phy);
- imx_disable_unprepare_clks(&pdev->dev);
- if (data->plat_data->flags & CI_HDRC_PMQOS)
- pm_qos_remove_request(&data->pm_qos_req);
- if (data->hsic_pad_regulator)
- regulator_disable(data->hsic_pad_regulator);
+ if (data->ci_pdev) {
+ imx_disable_unprepare_clks(&pdev->dev);
+ if (data->plat_data->flags & CI_HDRC_PMQOS)
+ pm_qos_remove_request(&data->pm_qos_req);
+ if (data->hsic_pad_regulator)
+ regulator_disable(data->hsic_pad_regulator);
+ }
return 0;
}
--
2.22.1
This is a note to let you know that I've just added the patch titled
usb: cdc-acm: make sure a refcount is taken early enough
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From c52873e5a1ef72f845526d9f6a50704433f9c625 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Thu, 8 Aug 2019 16:21:19 +0200
Subject: usb: cdc-acm: make sure a refcount is taken early enough
destroy() will decrement the refcount on the interface, so that
it needs to be taken so early that it never undercounts.
Fixes: 7fb57a019f94e ("USB: cdc-acm: Fix potential deadlock (lockdep warning)")
Cc: stable <stable(a)vger.kernel.org>
Reported-and-tested-by: syzbot+1b2449b7b5dc240d107a(a)syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Link: https://lore.kernel.org/r/20190808142119.7998-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/class/cdc-acm.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 183b41753c98..62f4fb9b362f 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -1301,10 +1301,6 @@ static int acm_probe(struct usb_interface *intf,
tty_port_init(&acm->port);
acm->port.ops = &acm_port_ops;
- minor = acm_alloc_minor(acm);
- if (minor < 0)
- goto alloc_fail1;
-
ctrlsize = usb_endpoint_maxp(epctrl);
readsize = usb_endpoint_maxp(epread) *
(quirks == SINGLE_RX_URB ? 1 : 2);
@@ -1312,6 +1308,13 @@ static int acm_probe(struct usb_interface *intf,
acm->writesize = usb_endpoint_maxp(epwrite) * 20;
acm->control = control_interface;
acm->data = data_interface;
+
+ usb_get_intf(acm->control); /* undone in destruct() */
+
+ minor = acm_alloc_minor(acm);
+ if (minor < 0)
+ goto alloc_fail1;
+
acm->minor = minor;
acm->dev = usb_dev;
if (h.usb_cdc_acm_descriptor)
@@ -1458,7 +1461,6 @@ static int acm_probe(struct usb_interface *intf,
usb_driver_claim_interface(&acm_driver, data_interface, acm);
usb_set_intfdata(data_interface, acm);
- usb_get_intf(control_interface);
tty_dev = tty_port_register_device(&acm->port, acm_tty_driver, minor,
&control_interface->dev);
if (IS_ERR(tty_dev)) {
--
2.22.1
This is a note to let you know that I've just added the patch titled
USB: CDC: fix sanity checks in CDC union parser
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 54364278fb3cabdea51d6398b07c87415065b3fc Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Tue, 13 Aug 2019 11:35:41 +0200
Subject: USB: CDC: fix sanity checks in CDC union parser
A few checks checked for the size of the pointer to a structure
instead of the structure itself. Copy & paste issue presumably.
Fixes: e4c6fb7794982 ("usbnet: move the CDC parser into USB core")
Cc: stable <stable(a)vger.kernel.org>
Reported-by: syzbot+45a53506b65321c1fe91(a)syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Link: https://lore.kernel.org/r/20190813093541.18889-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/message.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index e844bb7b5676..5adf489428aa 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -2218,14 +2218,14 @@ int cdc_parse_cdc_header(struct usb_cdc_parsed_header *hdr,
(struct usb_cdc_dmm_desc *)buffer;
break;
case USB_CDC_MDLM_TYPE:
- if (elength < sizeof(struct usb_cdc_mdlm_desc *))
+ if (elength < sizeof(struct usb_cdc_mdlm_desc))
goto next_desc;
if (desc)
return -EINVAL;
desc = (struct usb_cdc_mdlm_desc *)buffer;
break;
case USB_CDC_MDLM_DETAIL_TYPE:
- if (elength < sizeof(struct usb_cdc_mdlm_detail_desc *))
+ if (elength < sizeof(struct usb_cdc_mdlm_detail_desc))
goto next_desc;
if (detail)
return -EINVAL;
--
2.22.1
Hello stable(a)vger.kernel.org
I am Karen Ngui by name got your details based on recommendations
from LinkedIn 5 star marketers and its feels good to be part of
your professional network.
I sent you a proposal via email on the 10th of this month did
you get it ? Do reply to my message so I can elaborate more on my
proposal.
Thank You For Your Time.
On Sun, Aug 11, 2019 at 10:08 AM kbuild test robot <lkp(a)intel.com> wrote:
>
> CC: kbuild-all(a)01.org
> TO: Daniel Borkmann <daniel(a)iogearbox.net>
> CC: "Greg Kroah-Hartman" <gregkh(a)linuxfoundation.org>
> CC: Thomas Gleixner <tglx(a)linutronix.de>
>
> tree: https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stabl… linux-4.14.y
> head: 3ffe1e79c174b2093f7ee3df589a7705572c9620
> commit: e28951100515c9fd8f8d4b06ed96576e3527ad82 [8386/9999] x86/retpolines: Disable switch jump tables when retpolines are enabled
> config: x86_64-rhel-7.6 (attached as .config)
> compiler: clang version 10.0.0 (git://gitmirror/llvm_project 45a3fd206fb06f77a08968c99a8172cbf2ccdd0f)
> reproduce:
> git checkout e28951100515c9fd8f8d4b06ed96576e3527ad82
> # save the attached .config to linux build tree
> make ARCH=x86_64
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <lkp(a)intel.com>
>
> All warnings (new ones prefixed by >>):
>
> In file included from drivers/gpu/drm/i915/gvt/opregion.c:25:
> In file included from drivers/gpu/drm/i915/i915_drv.h:61:
> In file included from drivers/gpu/drm/i915/intel_uc.h:31:
> In file included from drivers/gpu/drm/i915/i915_vma.h:34:
> drivers/gpu/drm/i915/i915_gem_object.h:290:1: warning: attribute declaration must precede definition [-Wignored-attributes]
> __deprecated
> ^
Was there another patch that fixes this that should have been
backported to stable (4.14) along with this?
> include/linux/compiler-gcc.h:106:37: note: expanded from macro '__deprecated'
> #define __deprecated __attribute__((deprecated))
> ^
> include/drm/drm_gem.h:247:20: note: previous definition is here
> static inline void drm_gem_object_reference(struct drm_gem_object *obj)
> ^
> In file included from drivers/gpu/drm/i915/gvt/opregion.c:25:
> In file included from drivers/gpu/drm/i915/i915_drv.h:61:
> In file included from drivers/gpu/drm/i915/intel_uc.h:31:
> In file included from drivers/gpu/drm/i915/i915_vma.h:34:
> drivers/gpu/drm/i915/i915_gem_object.h:300:1: warning: attribute declaration must precede definition [-Wignored-attributes]
> __deprecated
> ^
> include/linux/compiler-gcc.h:106:37: note: expanded from macro '__deprecated'
> #define __deprecated __attribute__((deprecated))
> ^
> include/drm/drm_gem.h:285:20: note: previous definition is here
> static inline void drm_gem_object_unreference(struct drm_gem_object *obj)
> ^
> In file included from drivers/gpu/drm/i915/gvt/opregion.c:25:
> In file included from drivers/gpu/drm/i915/i915_drv.h:61:
> In file included from drivers/gpu/drm/i915/intel_uc.h:31:
> In file included from drivers/gpu/drm/i915/i915_vma.h:34:
> drivers/gpu/drm/i915/i915_gem_object.h:303:1: warning: attribute declaration must precede definition [-Wignored-attributes]
> __deprecated
> ^
> include/linux/compiler-gcc.h:106:37: note: expanded from macro '__deprecated'
> #define __deprecated __attribute__((deprecated))
> ^
> include/drm/drm_gem.h:273:1: note: previous definition is here
> drm_gem_object_unreference_unlocked(struct drm_gem_object *obj)
> ^
> 3 warnings generated.
> >> drivers/gpu/drm/i915/gvt/opregion.o: warning: objtool: intel_vgpu_emulate_opregion_request()+0xbe: can't find jump dest instruction at .text+0x6dd
>
> ---
> 0-DAY kernel test infrastructure Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all Intel Corporation
--
Thanks,
~Nick Desaulniers
With the existing pintbl, we already have many entries in it. it is
better to figure out a new match to reduce the size of the pintbl.
For example, there are over 10 entries in the pintbl for:
0x10ec0255, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE
If we define a new tbl like below, and with the new adding match
function, we can remove those over 10 entries:
SND_HDA_PIN_QUIRK(0x10ec0255, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
{0x19, 0x40000000},
{0x1a, 0x40000000},),
Here we put 0x19 and 0x1a in the tbl just because these two pins are
undefined on Dell laptops with the codec alc255, and these two pins
will be overwritten by ALC255_FIXUP_DELL1_MIC_NO_PRESENCE.
In summary: the new match will check vendor id and codec id first,
then check the pin_cfg defined in the tbl, only all pin_cfgs in the
tbl are undef and the corresponding pin_cfgs on the laptop are undef
too, this match returns true.
This new match function has lower priority than existing match
functions, so the existing tbls still work as before after applying this
patch.
My plan is to change the existing tbl to undef tbl for MIC_NO_PRESENCE
fixups gradually.
Signed-off-by: Hui Wang <hui.wang(a)canonical.com>
---
sound/pci/hda/hda_auto_parser.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/sound/pci/hda/hda_auto_parser.c b/sound/pci/hda/hda_auto_parser.c
index 92390d457567..cfada7401b86 100644
--- a/sound/pci/hda/hda_auto_parser.c
+++ b/sound/pci/hda/hda_auto_parser.c
@@ -915,6 +915,36 @@ static bool pin_config_match(struct hda_codec *codec,
return true;
}
+/* match the pintbl which only contains specific pins with undef configuration */
+static bool pin_config_match_undef(struct hda_codec *codec,
+ const struct hda_pintbl *pins)
+{
+ bool match = false;
+
+ for (; pins->nid; pins++) {
+ const struct hda_pincfg *pin;
+ int i;
+
+ if ((pins->val & 0xf0000000) != 0x40000000)
+ return false;
+
+ match = false;
+ snd_array_for_each(&codec->init_pins, i, pin) {
+ if (pin->nid != pins->nid)
+ continue;
+ if ((pin->cfg & 0xf0000000) != 0x40000000)
+ return false;
+ match = true;
+ break;
+ }
+
+ if (match == false)
+ return false;
+ }
+
+ return match;
+}
+
/**
* snd_hda_pick_pin_fixup - Pick up a fixup matching with the pin quirk list
* @codec: the HDA codec
@@ -935,7 +965,7 @@ void snd_hda_pick_pin_fixup(struct hda_codec *codec,
continue;
if (codec->core.vendor_id != pq->codec)
continue;
- if (pin_config_match(codec, pq->pins)) {
+ if (pin_config_match(codec, pq->pins) || pin_config_match_undef(codec, pq->pins)) {
codec->fixup_id = pq->value;
#ifdef CONFIG_SND_DEBUG_VERBOSE
codec->fixup_name = pq->name;
--
2.17.1
Hi Sasha,
On Mon, Aug 12, 2019 at 07:05:47PM +0000, Sasha Levin wrote:
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: d3862e44daa7 dma-buf/sw-sync: Fix locking around sync_timeline lists.
>
> The bot has tested the following trees: v5.2.8, v4.19.66, v4.14.138, v4.9.189.
>
> v5.2.8: Build OK!
> v4.19.66: Build OK!
> v4.14.138: Build OK!
> v4.9.189: Failed to apply! Possible dependencies:
> Unable to calculate
>
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
The backporting instruction has an explicit # v4.14+ in there, so failure
to apply to older kernels is expected.
Can you perhaps teach this trick to your script perhaps? Iirc we're using
the official format even.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Hello,
We ran automated tests on a patchset that was proposed for merging into this
kernel tree. The patches were applied to:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: d36a8d2fb62c - Linux 5.2.8
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/100903
We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Merge testing
-------------
We cloned this repository and checked out the following commit:
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: d36a8d2fb62c - Linux 5.2.8
We grabbed the d6535b968bdc commit of the stable queue repository.
We then merged the patchset with `git am`:
revert-pci-add-missing-link-delays-required-by-the-pcie-spec.patch
iio-ingenic-jz47xx-set-clock-divider-on-probe.patch
iio-cros_ec_accel_legacy-fix-incorrect-channel-setting.patch
iio-imu-mpu6050-add-missing-available-scan-masks.patch
iio-adc-gyroadc-fix-uninitialized-return-code.patch
iio-adc-max9611-fix-misuse-of-genmask-macro.patch
staging-gasket-apex-fix-copy-paste-typo.patch
staging-wilc1000-flush-the-workqueue-before-deinit-the-host.patch
staging-android-ion-bail-out-upon-sigkill-when-allocating-memory.patch
staging-fbtft-fix-probing-of-gpio-descriptor.patch
staging-fbtft-fix-reset-assertion-when-using-gpio-descriptor.patch
crypto-ccp-fix-oops-by-properly-managing-allocated-structures.patch
crypto-ccp-add-support-for-valid-authsize-values-less-than-16.patch
crypto-ccp-ignore-tag-length-when-decrypting-gcm-ciphertext.patch
driver-core-platform-return-enxio-for-missing-gpioint.patch
usb-usbfs-fix-double-free-of-usb-memory-upon-submiturb-error.patch
revert-usb-rio500-simplify-locking.patch
usb-iowarrior-fix-deadlock-on-disconnect.patch
sound-fix-a-memory-leak-bug.patch
mmc-cavium-set-the-correct-dma-max-segment-size-for-mmc_host.patch
mmc-cavium-add-the-missing-dma-unmap-when-the-dma-has-finished.patch
loop-set-pf_memalloc_noio-for-the-worker-thread.patch
bdev-fixup-error-handling-in-blkdev_get.patch
input-usbtouchscreen-initialize-pm-mutex-before-using-it.patch
input-elantech-enable-smbus-on-new-2018-systems.patch
input-synaptics-enable-rmi-mode-for-hp-spectre-x360.patch
x86-mm-check-for-pfn-instead-of-page-in-vmalloc_sync_one.patch
x86-mm-sync-also-unmappings-in-vmalloc_sync_all.patch
mm-vmalloc-sync-unmappings-in-__purge_vmap_area_lazy.patch
coresight-fix-debug_locks_warn_on-for-uninitialized-attribute.patch
perf-annotate-fix-s390-gap-between-kernel-end-and-module-start.patch
perf-db-export-fix-thread__exec_comm.patch
perf-record-fix-module-size-on-s390.patch
x86-purgatory-do-not-use-__builtin_memcpy-and-__builtin_memset.patch
x86-purgatory-use-cflags_remove-rather-than-reset-kbuild_cflags.patch
genirq-affinity-create-affinity-mask-for-single-vector.patch
gfs2-gfs2_walk_metadata-fix.patch
usb-host-xhci-rcar-fix-timeout-in-xhci_suspend.patch
usb-yurex-fix-use-after-free-in-yurex_delete.patch
usb-typec-ucsi-ccg-fix-uninitilized-symbol-error.patch
usb-typec-tcpm-free-log-buf-memory-when-remove-debug-file.patch
usb-typec-tcpm-remove-tcpm-dir-if-no-children.patch
usb-typec-tcpm-add-null-check-before-dereferencing-config.patch
usb-typec-tcpm-ignore-unsupported-unknown-alternate-mode-requests.patch
can-rcar_canfd-fix-possible-irq-storm-on-high-load.patch
can-flexcan-fix-stop-mode-acknowledgment.patch
can-flexcan-fix-an-use-after-free-in-flexcan_setup_stop_mode.patch
can-peak_usb-fix-potential-double-kfree_skb.patch
powerpc-fix-off-by-one-in-max_zone_pfn-initializatio.patch
netfilter-nfnetlink-avoid-deadlock-due-to-synchronou.patch
vfio-ccw-set-pa_nr-to-0-if-memory-allocation-fails-f.patch
vfio-ccw-don-t-call-cp_free-if-we-are-processing-a-c.patch
netfilter-fix-rpfilter-dropping-vrf-packets-by-mista.patch
netfilter-nf_tables-fix-module-autoload-for-redir.patch
netfilter-conntrack-always-store-window-size-un-scal.patch
netfilter-nft_hash-fix-symhash-with-modulus-one.patch
scripts-sphinx-pre-install-fix-script-for-rhel-cento.patch
scripts-sphinx-pre-install-don-t-use-latex-with-cent.patch
scripts-sphinx-pre-install-fix-latexmk-dependencies.patch
rq-qos-don-t-reset-has_sleepers-on-spurious-wakeups.patch
rq-qos-set-ourself-task_uninterruptible-after-we-sch.patch
rq-qos-use-a-mb-for-got_token.patch
netfilter-nf_tables-support-auto-loading-for-inet-na.patch
drm-amd-display-no-audio-endpoint-for-dell-mst-displ.patch
drm-amd-display-clock-does-not-lower-in-updateplanes.patch
drm-amd-display-wait-for-backlight-programming-compl.patch
drm-amd-display-fix-dmcu-hang-when-going-into-modern.patch
drm-amd-display-use-encoder-s-engine-id-to-find-matc.patch
drm-amd-display-put-back-front-end-initialization-se.patch
drm-amd-display-allocate-4-ddc-engines-for-rv2.patch
drm-amd-display-fix-dc_create-failure-handling-and-6.patch
drm-amd-display-only-enable-audio-if-speaker-allocat.patch
drm-amd-display-increase-size-of-audios-array.patch
iscsi_ibft-make-iscsi_ibft-dependson-acpi-instead-of.patch
nl80211-fix-nl80211_he_max_capability_len.patch
mac80211-fix-possible-memory-leak-in-ieee80211_assig.patch
mac80211-don-t-warn-about-cw-params-when-not-using-t.patch
allocate_flower_entry-should-check-for-null-deref.patch
hwmon-occ-fix-division-by-zero-issue.patch
hwmon-nct6775-fix-register-address-and-added-missed-.patch
arm-dts-imx6ul-fix-clock-frequency-property-name-of-.patch
powerpc-papr_scm-force-a-scm-unbind-if-initial-scm-b.patch
arm64-force-ssbs-on-context-switch.patch
arm64-entry-sp-alignment-fault-doesn-t-write-to-far_.patch
iommu-vt-d-check-if-domain-pgd-was-allocated.patch
drm-msm-dpu-correct-dpu-encoder-spinlock-initializat.patch
drm-silence-variable-conn-set-but-not-used.patch
arm64-dts-imx8mm-correct-sai3-rxc-txfs-pin-s-mux-opt.patch
arm64-dts-imx8mq-fix-sai-compatible.patch
cpufreq-pasemi-fix-use-after-free-in-pas_cpufreq_cpu.patch
s390-qdio-add-sanity-checks-to-the-fast-requeue-path.patch
alsa-compress-fix-regression-on-compressed-capture-s.patch
alsa-compress-prevent-bypasses-of-set_params.patch
alsa-compress-don-t-allow-paritial-drain-operations-.patch
alsa-compress-be-more-restrictive-about-when-a-drain.patch
perf-script-fix-off-by-one-in-brstackinsn-ipc-comput.patch
perf-tools-fix-proper-buffer-size-for-feature-proces.patch
perf-stat-fix-segfault-for-event-group-in-repeat-mod.patch
perf-session-fix-loading-of-compressed-data-split-ac.patch
perf-probe-avoid-calling-freeing-routine-multiple-ti.patch
drbd-dynamically-allocate-shash-descriptor.patch
acpi-iort-fix-off-by-one-check-in-iort_dev_find_its_.patch
nvme-ignore-subnqn-for-adata-sx6000lnp.patch
nvme-fix-memory-leak-caused-by-incorrect-subsystem-f.patch
arm-davinci-fix-sleep.s-build-error-on-armv4.patch
arm-dts-bcm-bcm47094-add-missing-cells-for-mdio-bus-.patch
scsi-megaraid_sas-fix-panic-on-loading-firmware-cras.patch
scsi-ibmvfc-fix-warn_on-during-event-pool-release.patch
scsi-scsi_dh_alua-always-use-a-2-second-delay-before.patch
test_firmware-fix-a-memory-leak-bug.patch
tty-ldsem-locking-rwsem-add-missing-acquire-to-read_.patch
perf-x86-intel-fix-slots-pebs-event-constraint.patch
perf-x86-intel-fix-invalid-bit-13-for-icelake-msr_of.patch
perf-x86-apply-more-accurate-check-on-hypervisor-pla.patch
perf-core-fix-creating-kernel-counters-for-pmus-that.patch
s390-dma-provide-proper-arch_zone_dma_bits-value.patch
gen_compile_commands-lower-the-entry-count-threshold.patch
hid-sony-fix-race-condition-between-rumble-and-device-remove.patch
alsa-usb-audio-fix-a-memory-leak-bug.patch
kvm-nsvm-properly-map-nested-vmcb.patch
can-peak_usb-pcan_usb_pro-fix-info-leaks-to-usb-devices.patch
can-peak_usb-pcan_usb_fd-fix-info-leaks-to-usb-devices.patch
hwmon-nct7802-fix-wrong-detection-of-in4-presence.patch
hwmon-lm75-fixup-tmp75b-clr_mask.patch
drm-i915-fix-wrong-escape-clock-divisor-init-for-glk.patch
alsa-firewire-fix-a-memory-leak-bug.patch
alsa-hiface-fix-multiple-memory-leak-bugs.patch
alsa-hda-don-t-override-global-pcm-hw-info-flag.patch
alsa-hda-workaround-for-crackled-sound-on-amd-controller-1022-1457.patch
mac80211-don-t-warn-on-short-wmm-parameters-from-ap.patch
dax-dax_layout_busy_page-should-not-unmap-cow-pages.patch
smb3-fix-deadlock-in-validate-negotiate-hits-reconnect.patch
smb3-send-cap_dfs-capability-during-session-setup.patch
nfsv4-fix-delegation-state-recovery.patch
nfsv4-check-the-return-value-of-update_open_stateid.patch
nfsv4-fix-an-oops-in-nfs4_do_setattr.patch
kvm-fix-leak-vcpu-s-vmcs-value-into-other-pcpu.patch
kvm-arm-arm64-sync-ich_vmcr_el2-back-when-about-to-block.patch
mwifiex-fix-802.11n-wpa-detection.patch
iwlwifi-don-t-unmap-as-page-memory-that-was-mapped-as-single.patch
iwlwifi-mvm-fix-an-out-of-bound-access.patch
iwlwifi-mvm-fix-a-use-after-free-bug-in-iwl_mvm_tx_tso_segment.patch
iwlwifi-mvm-don-t-send-geo_tx_power_limit-on-version-41.patch
iwlwifi-mvm-fix-version-check-for-geo_tx_power_limit-support.patch
Compile testing
---------------
We compiled the kernel for 3 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
ppc64le:
Host 1:
✅ Boot test [0]
✅ xfstests: xfs [1]
✅ selinux-policy: serge-testsuite [2]
✅ lvm thinp sanity [3]
✅ storage: software RAID testing [4]
🚧 ✅ Storage blktests [5]
Host 2:
✅ Boot test [0]
✅ Podman system integration test (as root) [6]
✅ Podman system integration test (as user) [6]
✅ LTP lite [7]
✅ Loopdev Sanity [8]
✅ jvm test suite [9]
✅ AMTU (Abstract Machine Test Utility) [10]
✅ LTP: openposix test suite [11]
✅ Ethernet drivers sanity [12]
✅ Networking socket: fuzz [13]
✅ audit: audit testsuite test [14]
✅ httpd: mod_ssl smoke sanity [15]
✅ iotop: sanity [16]
✅ tuned: tune-processes-through-perf [17]
✅ pciutils: update pci ids test [18]
✅ Usex - version 1.9-29 [19]
x86_64:
Host 1:
✅ Boot test [0]
✅ xfstests: xfs [1]
✅ selinux-policy: serge-testsuite [2]
✅ lvm thinp sanity [3]
✅ storage: software RAID testing [4]
🚧 ✅ Storage blktests [5]
Host 2:
✅ Boot test [0]
✅ Podman system integration test (as root) [6]
✅ Podman system integration test (as user) [6]
✅ LTP lite [7]
✅ Loopdev Sanity [8]
✅ jvm test suite [9]
✅ AMTU (Abstract Machine Test Utility) [10]
✅ LTP: openposix test suite [11]
✅ Ethernet drivers sanity [12]
✅ Networking socket: fuzz [13]
✅ audit: audit testsuite test [14]
✅ httpd: mod_ssl smoke sanity [15]
✅ iotop: sanity [16]
✅ tuned: tune-processes-through-perf [17]
✅ pciutils: sanity smoke test [20]
✅ pciutils: update pci ids test [18]
✅ Usex - version 1.9-29 [19]
✅ storage: SCSI VPD [21]
✅ stress: stress-ng [22]
Test source:
💚 Pull requests are welcome for new tests or improvements to existing tests!
[0]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[1]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/filesystems…
[2]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/packages/se…
[3]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/lvm/…
[4]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/swra…
[5]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/blk
[6]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/container/p…
[7]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[8]: https://github.com/CKI-project/tests-beaker/archive/master.zip#filesystems/…
[9]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/jvm
[10]: https://github.com/CKI-project/tests-beaker/archive/master.zip#misc/amtu
[11]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[12]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/networking/…
[13]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/networking/…
[14]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/aud…
[15]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/htt…
[16]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/iot…
[17]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/tun…
[18]: https://github.com/CKI-project/tests-beaker/archive/master.zip#pciutils/upd…
[19]: https://github.com/CKI-project/tests-beaker/archive/master.zip#standards/us…
[20]: https://github.com/CKI-project/tests-beaker/archive/master.zip#pciutils/san…
[21]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/scsi…
[22]: https://github.com/CKI-project/tests-beaker/archive/master.zip#stress/stres…
Waived tests (marked with 🚧)
-----------------------------
This test run included waived tests. Such tests are executed but their results
are not taken into account. Tests are waived when their results are not
reliable enough, e.g. when they're just introduced or are being fixed.
FS_IOC_GETFSLABEL/FS_IOC_SETFSLABEL were added in linux-4.18
in xfs, but not in the compat_ioctl case, so add them here.
FITRIM was added earlier and also lacks a line the same function,
but this is ok because there is an entry in fs/compat_ioctl.c for
it.
Adding all three here to keep the native and compat ioctl handlers
in sync.
Cc: stable(a)vger.kernel.org
Fixes: f7664b31975b ("xfs: implement online get/set fs label")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
fs/xfs/xfs_ioctl32.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
index ad91e81a2fcf..63bdc4c1535b 100644
--- a/fs/xfs/xfs_ioctl32.c
+++ b/fs/xfs/xfs_ioctl32.c
@@ -576,6 +576,9 @@ xfs_file_compat_ioctl(
case XFS_IOC_SCRUB_METADATA:
case XFS_IOC_BULKSTAT:
case XFS_IOC_INUMBERS:
+ case FITRIM:
+ case FS_IOC_GETFSLABEL:
+ case FS_IOC_SETFSLABEL:
return xfs_file_ioctl(filp, cmd, (unsigned long)arg);
#if !defined(BROKEN_X86_ALIGNMENT) || defined(CONFIG_X86_X32)
/*
--
2.20.0
From: Florian Westphal <fw(a)strlen.de>
[ Upstream commit 1b0890cd60829bd51455dc5ad689ed58c4408227 ]
Thomas and Juliana report a deadlock when running:
(rmmod nf_conntrack_netlink/xfrm_user)
conntrack -e NEW -E &
modprobe -v xfrm_user
They provided following analysis:
conntrack -e NEW -E
netlink_bind()
netlink_lock_table() -> increases "nl_table_users"
nfnetlink_bind()
# does not unlock the table as it's locked by netlink_bind()
__request_module()
call_usermodehelper_exec()
This triggers "modprobe nf_conntrack_netlink" from kernel, netlink_bind()
won't return until modprobe process is done.
"modprobe xfrm_user":
xfrm_user_init()
register_pernet_subsys()
-> grab pernet_ops_rwsem
..
netlink_table_grab()
calls schedule() as "nl_table_users" is non-zero
so modprobe is blocked because netlink_bind() increased
nl_table_users while also holding pernet_ops_rwsem.
"modprobe nf_conntrack_netlink" runs and inits nf_conntrack_netlink:
ctnetlink_init()
register_pernet_subsys()
-> blocks on "pernet_ops_rwsem" thanks to xfrm_user module
both modprobe processes wait on one another -- neither can make
progress.
Switch netlink_bind() to "nowait" modprobe -- this releases the netlink
table lock, which then allows both modprobe instances to complete.
Reported-by: Thomas Jarosch <thomas.jarosch(a)intra2net.com>
Reported-by: Juliana Rodrigueiro <juliana.rodrigueiro(a)intra2net.com>
Signed-off-by: Florian Westphal <fw(a)strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
net/netfilter/nfnetlink.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 916913454624f..7f2c1915763f8 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -575,7 +575,7 @@ static int nfnetlink_bind(struct net *net, int group)
ss = nfnetlink_get_subsys(type << 8);
rcu_read_unlock();
if (!ss)
- request_module("nfnetlink-subsys-%d", type);
+ request_module_nowait("nfnetlink-subsys-%d", type);
return 0;
}
#endif
--
2.20.1
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From df612421fe2566654047769c6852ffae1a31df16 Mon Sep 17 00:00:00 2001
From: Brian Norris <briannorris(a)chromium.org>
Date: Wed, 24 Jul 2019 12:46:34 -0700
Subject: [PATCH] mwifiex: fix 802.11n/WPA detection
Commit 63d7ef36103d ("mwifiex: Don't abort on small, spec-compliant
vendor IEs") adjusted the ieee_types_vendor_header struct, which
inadvertently messed up the offsets used in
mwifiex_is_wpa_oui_present(). Add that offset back in, mirroring
mwifiex_is_rsn_oui_present().
As it stands, commit 63d7ef36103d breaks compatibility with WPA (not
WPA2) 802.11n networks, since we hit the "info: Disable 11n if AES is
not supported by AP" case in mwifiex_is_network_compatible().
Fixes: 63d7ef36103d ("mwifiex: Don't abort on small, spec-compliant vendor IEs")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
Signed-off-by: Kalle Valo <kvalo(a)codeaurora.org>
diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h
index 3e442c7f7882..095837fba300 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.h
+++ b/drivers/net/wireless/marvell/mwifiex/main.h
@@ -124,6 +124,7 @@ enum {
#define MWIFIEX_MAX_TOTAL_SCAN_TIME (MWIFIEX_TIMER_10S - MWIFIEX_TIMER_1S)
+#define WPA_GTK_OUI_OFFSET 2
#define RSN_GTK_OUI_OFFSET 2
#define MWIFIEX_OUI_NOT_PRESENT 0
diff --git a/drivers/net/wireless/marvell/mwifiex/scan.c b/drivers/net/wireless/marvell/mwifiex/scan.c
index 0d6d41727037..21dda385f6c6 100644
--- a/drivers/net/wireless/marvell/mwifiex/scan.c
+++ b/drivers/net/wireless/marvell/mwifiex/scan.c
@@ -181,7 +181,8 @@ mwifiex_is_wpa_oui_present(struct mwifiex_bssdescriptor *bss_desc, u32 cipher)
u8 ret = MWIFIEX_OUI_NOT_PRESENT;
if (has_vendor_hdr(bss_desc->bcn_wpa_ie, WLAN_EID_VENDOR_SPECIFIC)) {
- iebody = (struct ie_body *) bss_desc->bcn_wpa_ie->data;
+ iebody = (struct ie_body *)((u8 *)bss_desc->bcn_wpa_ie->data +
+ WPA_GTK_OUI_OFFSET);
oui = &mwifiex_wpa_oui[cipher][0];
ret = mwifiex_search_oui_in_ie(iebody, oui);
if (ret)
Checking bypass_reg is incorrect for calculating the cnt_clk rates.
Instead we should be checking that there is a proper hardware register
that holds the clock divider.
Cc: stable(a)vger.kernel.org
Signed-off-by: Dinh Nguyen <dinguyen(a)kernel.org>
---
drivers/clk/socfpga/clk-periph-s10.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/socfpga/clk-periph-s10.c b/drivers/clk/socfpga/clk-periph-s10.c
index 5c50e723ecae..1a191eeeebba 100644
--- a/drivers/clk/socfpga/clk-periph-s10.c
+++ b/drivers/clk/socfpga/clk-periph-s10.c
@@ -38,7 +38,7 @@ static unsigned long clk_peri_cnt_clk_recalc_rate(struct clk_hw *hwclk,
if (socfpgaclk->fixed_div) {
div = socfpgaclk->fixed_div;
} else {
- if (!socfpgaclk->bypass_reg)
+ if (socfpgaclk->hw.reg)
div = ((readl(socfpgaclk->hw.reg) & 0x7ff) + 1);
}
--
2.20.0
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3d584a3c85d6fe2cf878f220d4ad7145e7f89218 Mon Sep 17 00:00:00 2001
From: Anders Roxell <anders.roxell(a)linaro.org>
Date: Fri, 26 Jul 2019 13:27:05 +0200
Subject: [PATCH] arm64: KVM: regmap: Fix unexpected switch fall-through
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When fall-through warnings was enabled by default, commit d93512ef0f0e
("Makefile: Globally enable fall-through warning"), the following
warnings was starting to show up:
In file included from ../arch/arm64/include/asm/kvm_emulate.h:19,
from ../arch/arm64/kvm/regmap.c:13:
../arch/arm64/kvm/regmap.c: In function ‘vcpu_write_spsr32’:
../arch/arm64/include/asm/kvm_hyp.h:31:3: warning: this statement may fall
through [-Wimplicit-fallthrough=]
asm volatile(ALTERNATIVE(__msr_s(r##nvh, "%x0"), \
^~~
../arch/arm64/include/asm/kvm_hyp.h:46:31: note: in expansion of macro ‘write_sysreg_elx’
#define write_sysreg_el1(v,r) write_sysreg_elx(v, r, _EL1, _EL12)
^~~~~~~~~~~~~~~~
../arch/arm64/kvm/regmap.c:180:3: note: in expansion of macro ‘write_sysreg_el1’
write_sysreg_el1(v, SYS_SPSR);
^~~~~~~~~~~~~~~~
../arch/arm64/kvm/regmap.c:181:2: note: here
case KVM_SPSR_ABT:
^~~~
In file included from ../arch/arm64/include/asm/cputype.h:132,
from ../arch/arm64/include/asm/cache.h:8,
from ../include/linux/cache.h:6,
from ../include/linux/printk.h:9,
from ../include/linux/kernel.h:15,
from ../include/asm-generic/bug.h:18,
from ../arch/arm64/include/asm/bug.h:26,
from ../include/linux/bug.h:5,
from ../include/linux/mmdebug.h:5,
from ../include/linux/mm.h:9,
from ../arch/arm64/kvm/regmap.c:11:
../arch/arm64/include/asm/sysreg.h:837:2: warning: this statement may fall
through [-Wimplicit-fallthrough=]
asm volatile("msr " __stringify(r) ", %x0" \
^~~
../arch/arm64/kvm/regmap.c:182:3: note: in expansion of macro ‘write_sysreg’
write_sysreg(v, spsr_abt);
^~~~~~~~~~~~
../arch/arm64/kvm/regmap.c:183:2: note: here
case KVM_SPSR_UND:
^~~~
Rework to add a 'break;' in the swich-case since it didn't have that,
leading to an interresting set of bugs.
Cc: stable(a)vger.kernel.org # v4.17+
Fixes: a892819560c4 ("KVM: arm64: Prepare to handle deferred save/restore of 32-bit registers")
Signed-off-by: Anders Roxell <anders.roxell(a)linaro.org>
[maz: reworked commit message, fixed stable range]
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
diff --git a/arch/arm64/kvm/regmap.c b/arch/arm64/kvm/regmap.c
index 0d60e4f0af66..a900181e3867 100644
--- a/arch/arm64/kvm/regmap.c
+++ b/arch/arm64/kvm/regmap.c
@@ -178,13 +178,18 @@ void vcpu_write_spsr32(struct kvm_vcpu *vcpu, unsigned long v)
switch (spsr_idx) {
case KVM_SPSR_SVC:
write_sysreg_el1(v, SYS_SPSR);
+ break;
case KVM_SPSR_ABT:
write_sysreg(v, spsr_abt);
+ break;
case KVM_SPSR_UND:
write_sysreg(v, spsr_und);
+ break;
case KVM_SPSR_IRQ:
write_sysreg(v, spsr_irq);
+ break;
case KVM_SPSR_FIQ:
write_sysreg(v, spsr_fiq);
+ break;
}
}
The patch below does not apply to the 5.2-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3d584a3c85d6fe2cf878f220d4ad7145e7f89218 Mon Sep 17 00:00:00 2001
From: Anders Roxell <anders.roxell(a)linaro.org>
Date: Fri, 26 Jul 2019 13:27:05 +0200
Subject: [PATCH] arm64: KVM: regmap: Fix unexpected switch fall-through
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When fall-through warnings was enabled by default, commit d93512ef0f0e
("Makefile: Globally enable fall-through warning"), the following
warnings was starting to show up:
In file included from ../arch/arm64/include/asm/kvm_emulate.h:19,
from ../arch/arm64/kvm/regmap.c:13:
../arch/arm64/kvm/regmap.c: In function ‘vcpu_write_spsr32’:
../arch/arm64/include/asm/kvm_hyp.h:31:3: warning: this statement may fall
through [-Wimplicit-fallthrough=]
asm volatile(ALTERNATIVE(__msr_s(r##nvh, "%x0"), \
^~~
../arch/arm64/include/asm/kvm_hyp.h:46:31: note: in expansion of macro ‘write_sysreg_elx’
#define write_sysreg_el1(v,r) write_sysreg_elx(v, r, _EL1, _EL12)
^~~~~~~~~~~~~~~~
../arch/arm64/kvm/regmap.c:180:3: note: in expansion of macro ‘write_sysreg_el1’
write_sysreg_el1(v, SYS_SPSR);
^~~~~~~~~~~~~~~~
../arch/arm64/kvm/regmap.c:181:2: note: here
case KVM_SPSR_ABT:
^~~~
In file included from ../arch/arm64/include/asm/cputype.h:132,
from ../arch/arm64/include/asm/cache.h:8,
from ../include/linux/cache.h:6,
from ../include/linux/printk.h:9,
from ../include/linux/kernel.h:15,
from ../include/asm-generic/bug.h:18,
from ../arch/arm64/include/asm/bug.h:26,
from ../include/linux/bug.h:5,
from ../include/linux/mmdebug.h:5,
from ../include/linux/mm.h:9,
from ../arch/arm64/kvm/regmap.c:11:
../arch/arm64/include/asm/sysreg.h:837:2: warning: this statement may fall
through [-Wimplicit-fallthrough=]
asm volatile("msr " __stringify(r) ", %x0" \
^~~
../arch/arm64/kvm/regmap.c:182:3: note: in expansion of macro ‘write_sysreg’
write_sysreg(v, spsr_abt);
^~~~~~~~~~~~
../arch/arm64/kvm/regmap.c:183:2: note: here
case KVM_SPSR_UND:
^~~~
Rework to add a 'break;' in the swich-case since it didn't have that,
leading to an interresting set of bugs.
Cc: stable(a)vger.kernel.org # v4.17+
Fixes: a892819560c4 ("KVM: arm64: Prepare to handle deferred save/restore of 32-bit registers")
Signed-off-by: Anders Roxell <anders.roxell(a)linaro.org>
[maz: reworked commit message, fixed stable range]
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
diff --git a/arch/arm64/kvm/regmap.c b/arch/arm64/kvm/regmap.c
index 0d60e4f0af66..a900181e3867 100644
--- a/arch/arm64/kvm/regmap.c
+++ b/arch/arm64/kvm/regmap.c
@@ -178,13 +178,18 @@ void vcpu_write_spsr32(struct kvm_vcpu *vcpu, unsigned long v)
switch (spsr_idx) {
case KVM_SPSR_SVC:
write_sysreg_el1(v, SYS_SPSR);
+ break;
case KVM_SPSR_ABT:
write_sysreg(v, spsr_abt);
+ break;
case KVM_SPSR_UND:
write_sysreg(v, spsr_und);
+ break;
case KVM_SPSR_IRQ:
write_sysreg(v, spsr_irq);
+ break;
case KVM_SPSR_FIQ:
write_sysreg(v, spsr_fiq);
+ break;
}
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5eeaf10eec394b28fad2c58f1f5c3a5da0e87d1c Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Fri, 2 Aug 2019 10:28:32 +0100
Subject: [PATCH] KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block
Since commit commit 328e56647944 ("KVM: arm/arm64: vgic: Defer
touching GICH_VMCR to vcpu_load/put"), we leave ICH_VMCR_EL2 (or
its GICv2 equivalent) loaded as long as we can, only syncing it
back when we're scheduled out.
There is a small snag with that though: kvm_vgic_vcpu_pending_irq(),
which is indirectly called from kvm_vcpu_check_block(), needs to
evaluate the guest's view of ICC_PMR_EL1. At the point were we
call kvm_vcpu_check_block(), the vcpu is still loaded, and whatever
changes to PMR is not visible in memory until we do a vcpu_put().
Things go really south if the guest does the following:
mov x0, #0 // or any small value masking interrupts
msr ICC_PMR_EL1, x0
[vcpu preempted, then rescheduled, VMCR sampled]
mov x0, #ff // allow all interrupts
msr ICC_PMR_EL1, x0
wfi // traps to EL2, so samping of VMCR
[interrupt arrives just after WFI]
Here, the hypervisor's view of PMR is zero, while the guest has enabled
its interrupts. kvm_vgic_vcpu_pending_irq() will then say that no
interrupts are pending (despite an interrupt being received) and we'll
block for no reason. If the guest doesn't have a periodic interrupt
firing once it has blocked, it will stay there forever.
To avoid this unfortuante situation, let's resync VMCR from
kvm_arch_vcpu_blocking(), ensuring that a following kvm_vcpu_check_block()
will observe the latest value of PMR.
This has been found by booting an arm64 Linux guest with the pseudo NMI
feature, and thus using interrupt priorities to mask interrupts instead
of the usual PSTATE masking.
Cc: stable(a)vger.kernel.org # 4.12
Fixes: 328e56647944 ("KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/put")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 46bbc949c20a..7a30524a80ee 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -350,6 +350,7 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
void kvm_vgic_load(struct kvm_vcpu *vcpu);
void kvm_vgic_put(struct kvm_vcpu *vcpu);
+void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu);
#define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel))
#define vgic_initialized(k) ((k)->arch.vgic.initialized)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index c704fa696184..482b20256fa8 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -323,6 +323,17 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
{
+ /*
+ * If we're about to block (most likely because we've just hit a
+ * WFI), we need to sync back the state of the GIC CPU interface
+ * so that we have the lastest PMR and group enables. This ensures
+ * that kvm_arch_vcpu_runnable has up-to-date data to decide
+ * whether we have pending interrupts.
+ */
+ preempt_disable();
+ kvm_vgic_vmcr_sync(vcpu);
+ preempt_enable();
+
kvm_vgic_v4_enable_doorbell(vcpu);
}
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 6dd5ad706c92..96aab77d0471 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -484,10 +484,17 @@ void vgic_v2_load(struct kvm_vcpu *vcpu)
kvm_vgic_global_state.vctrl_base + GICH_APR);
}
-void vgic_v2_put(struct kvm_vcpu *vcpu)
+void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu)
{
struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
+}
+
+void vgic_v2_put(struct kvm_vcpu *vcpu)
+{
+ struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
+
+ vgic_v2_vmcr_sync(vcpu);
cpu_if->vgic_apr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_APR);
}
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index c2c9ce009f63..0c653a1e5215 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -662,12 +662,17 @@ void vgic_v3_load(struct kvm_vcpu *vcpu)
__vgic_v3_activate_traps(vcpu);
}
-void vgic_v3_put(struct kvm_vcpu *vcpu)
+void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu)
{
struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
if (likely(cpu_if->vgic_sre))
cpu_if->vgic_vmcr = kvm_call_hyp_ret(__vgic_v3_read_vmcr);
+}
+
+void vgic_v3_put(struct kvm_vcpu *vcpu)
+{
+ vgic_v3_vmcr_sync(vcpu);
kvm_call_hyp(__vgic_v3_save_aprs, vcpu);
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 04786c8ec77e..13d4b38a94ec 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -919,6 +919,17 @@ void kvm_vgic_put(struct kvm_vcpu *vcpu)
vgic_v3_put(vcpu);
}
+void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu)
+{
+ if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
+ return;
+
+ if (kvm_vgic_global_state.type == VGIC_V2)
+ vgic_v2_vmcr_sync(vcpu);
+ else
+ vgic_v3_vmcr_sync(vcpu);
+}
+
int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
{
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 57205beaa981..11adbdac1d56 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -193,6 +193,7 @@ int vgic_register_dist_iodev(struct kvm *kvm, gpa_t dist_base_address,
void vgic_v2_init_lrs(void);
void vgic_v2_load(struct kvm_vcpu *vcpu);
void vgic_v2_put(struct kvm_vcpu *vcpu);
+void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu);
void vgic_v2_save_state(struct kvm_vcpu *vcpu);
void vgic_v2_restore_state(struct kvm_vcpu *vcpu);
@@ -223,6 +224,7 @@ bool vgic_v3_check_base(struct kvm *kvm);
void vgic_v3_load(struct kvm_vcpu *vcpu);
void vgic_v3_put(struct kvm_vcpu *vcpu);
+void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu);
bool vgic_has_its(struct kvm *kvm);
int kvm_vgic_register_its_device(void);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5eeaf10eec394b28fad2c58f1f5c3a5da0e87d1c Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Fri, 2 Aug 2019 10:28:32 +0100
Subject: [PATCH] KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block
Since commit commit 328e56647944 ("KVM: arm/arm64: vgic: Defer
touching GICH_VMCR to vcpu_load/put"), we leave ICH_VMCR_EL2 (or
its GICv2 equivalent) loaded as long as we can, only syncing it
back when we're scheduled out.
There is a small snag with that though: kvm_vgic_vcpu_pending_irq(),
which is indirectly called from kvm_vcpu_check_block(), needs to
evaluate the guest's view of ICC_PMR_EL1. At the point were we
call kvm_vcpu_check_block(), the vcpu is still loaded, and whatever
changes to PMR is not visible in memory until we do a vcpu_put().
Things go really south if the guest does the following:
mov x0, #0 // or any small value masking interrupts
msr ICC_PMR_EL1, x0
[vcpu preempted, then rescheduled, VMCR sampled]
mov x0, #ff // allow all interrupts
msr ICC_PMR_EL1, x0
wfi // traps to EL2, so samping of VMCR
[interrupt arrives just after WFI]
Here, the hypervisor's view of PMR is zero, while the guest has enabled
its interrupts. kvm_vgic_vcpu_pending_irq() will then say that no
interrupts are pending (despite an interrupt being received) and we'll
block for no reason. If the guest doesn't have a periodic interrupt
firing once it has blocked, it will stay there forever.
To avoid this unfortuante situation, let's resync VMCR from
kvm_arch_vcpu_blocking(), ensuring that a following kvm_vcpu_check_block()
will observe the latest value of PMR.
This has been found by booting an arm64 Linux guest with the pseudo NMI
feature, and thus using interrupt priorities to mask interrupts instead
of the usual PSTATE masking.
Cc: stable(a)vger.kernel.org # 4.12
Fixes: 328e56647944 ("KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/put")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 46bbc949c20a..7a30524a80ee 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -350,6 +350,7 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
void kvm_vgic_load(struct kvm_vcpu *vcpu);
void kvm_vgic_put(struct kvm_vcpu *vcpu);
+void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu);
#define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel))
#define vgic_initialized(k) ((k)->arch.vgic.initialized)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index c704fa696184..482b20256fa8 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -323,6 +323,17 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
{
+ /*
+ * If we're about to block (most likely because we've just hit a
+ * WFI), we need to sync back the state of the GIC CPU interface
+ * so that we have the lastest PMR and group enables. This ensures
+ * that kvm_arch_vcpu_runnable has up-to-date data to decide
+ * whether we have pending interrupts.
+ */
+ preempt_disable();
+ kvm_vgic_vmcr_sync(vcpu);
+ preempt_enable();
+
kvm_vgic_v4_enable_doorbell(vcpu);
}
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 6dd5ad706c92..96aab77d0471 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -484,10 +484,17 @@ void vgic_v2_load(struct kvm_vcpu *vcpu)
kvm_vgic_global_state.vctrl_base + GICH_APR);
}
-void vgic_v2_put(struct kvm_vcpu *vcpu)
+void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu)
{
struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
+}
+
+void vgic_v2_put(struct kvm_vcpu *vcpu)
+{
+ struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
+
+ vgic_v2_vmcr_sync(vcpu);
cpu_if->vgic_apr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_APR);
}
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index c2c9ce009f63..0c653a1e5215 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -662,12 +662,17 @@ void vgic_v3_load(struct kvm_vcpu *vcpu)
__vgic_v3_activate_traps(vcpu);
}
-void vgic_v3_put(struct kvm_vcpu *vcpu)
+void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu)
{
struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
if (likely(cpu_if->vgic_sre))
cpu_if->vgic_vmcr = kvm_call_hyp_ret(__vgic_v3_read_vmcr);
+}
+
+void vgic_v3_put(struct kvm_vcpu *vcpu)
+{
+ vgic_v3_vmcr_sync(vcpu);
kvm_call_hyp(__vgic_v3_save_aprs, vcpu);
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 04786c8ec77e..13d4b38a94ec 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -919,6 +919,17 @@ void kvm_vgic_put(struct kvm_vcpu *vcpu)
vgic_v3_put(vcpu);
}
+void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu)
+{
+ if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
+ return;
+
+ if (kvm_vgic_global_state.type == VGIC_V2)
+ vgic_v2_vmcr_sync(vcpu);
+ else
+ vgic_v3_vmcr_sync(vcpu);
+}
+
int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
{
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 57205beaa981..11adbdac1d56 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -193,6 +193,7 @@ int vgic_register_dist_iodev(struct kvm *kvm, gpa_t dist_base_address,
void vgic_v2_init_lrs(void);
void vgic_v2_load(struct kvm_vcpu *vcpu);
void vgic_v2_put(struct kvm_vcpu *vcpu);
+void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu);
void vgic_v2_save_state(struct kvm_vcpu *vcpu);
void vgic_v2_restore_state(struct kvm_vcpu *vcpu);
@@ -223,6 +224,7 @@ bool vgic_v3_check_base(struct kvm *kvm);
void vgic_v3_load(struct kvm_vcpu *vcpu);
void vgic_v3_put(struct kvm_vcpu *vcpu);
+void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu);
bool vgic_has_its(struct kvm *kvm);
int kvm_vgic_register_its_device(void);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e3c8dc761ead061da2220ee8f8132f729ac3ddfe Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Mon, 29 Jul 2019 18:25:00 +0100
Subject: [PATCH] NFSv4: Check the return value of update_open_stateid()
Ensure that we always check the return value of update_open_stateid()
so that we can retry if the update of local state failed. This fixes
infinite looping on state recovery.
Fixes: e23008ec81ef3 ("NFSv4 reduce attribute requests for open reclaim")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Cc: stable(a)vger.kernel.org # v3.7+
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c9e14ce0b7b2..3e0b93f2b61a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1915,8 +1915,9 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
update:
- update_open_stateid(state, &data->o_res.stateid, NULL,
- data->o_arg.fmode);
+ if (!update_open_stateid(state, &data->o_res.stateid,
+ NULL, data->o_arg.fmode))
+ return ERR_PTR(-EAGAIN);
refcount_inc(&state->count);
return state;
@@ -1981,8 +1982,11 @@ _nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
- update_open_stateid(state, &data->o_res.stateid, NULL,
- data->o_arg.fmode);
+ if (!update_open_stateid(state, &data->o_res.stateid,
+ NULL, data->o_arg.fmode)) {
+ nfs4_put_open_state(state);
+ state = ERR_PTR(-EAGAIN);
+ }
out:
nfs_release_seqid(data->o_arg.seqid);
return state;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e3c8dc761ead061da2220ee8f8132f729ac3ddfe Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Mon, 29 Jul 2019 18:25:00 +0100
Subject: [PATCH] NFSv4: Check the return value of update_open_stateid()
Ensure that we always check the return value of update_open_stateid()
so that we can retry if the update of local state failed. This fixes
infinite looping on state recovery.
Fixes: e23008ec81ef3 ("NFSv4 reduce attribute requests for open reclaim")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Cc: stable(a)vger.kernel.org # v3.7+
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c9e14ce0b7b2..3e0b93f2b61a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1915,8 +1915,9 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
update:
- update_open_stateid(state, &data->o_res.stateid, NULL,
- data->o_arg.fmode);
+ if (!update_open_stateid(state, &data->o_res.stateid,
+ NULL, data->o_arg.fmode))
+ return ERR_PTR(-EAGAIN);
refcount_inc(&state->count);
return state;
@@ -1981,8 +1982,11 @@ _nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
- update_open_stateid(state, &data->o_res.stateid, NULL,
- data->o_arg.fmode);
+ if (!update_open_stateid(state, &data->o_res.stateid,
+ NULL, data->o_arg.fmode)) {
+ nfs4_put_open_state(state);
+ state = ERR_PTR(-EAGAIN);
+ }
out:
nfs_release_seqid(data->o_arg.seqid);
return state;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e3c8dc761ead061da2220ee8f8132f729ac3ddfe Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Mon, 29 Jul 2019 18:25:00 +0100
Subject: [PATCH] NFSv4: Check the return value of update_open_stateid()
Ensure that we always check the return value of update_open_stateid()
so that we can retry if the update of local state failed. This fixes
infinite looping on state recovery.
Fixes: e23008ec81ef3 ("NFSv4 reduce attribute requests for open reclaim")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Cc: stable(a)vger.kernel.org # v3.7+
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c9e14ce0b7b2..3e0b93f2b61a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1915,8 +1915,9 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
update:
- update_open_stateid(state, &data->o_res.stateid, NULL,
- data->o_arg.fmode);
+ if (!update_open_stateid(state, &data->o_res.stateid,
+ NULL, data->o_arg.fmode))
+ return ERR_PTR(-EAGAIN);
refcount_inc(&state->count);
return state;
@@ -1981,8 +1982,11 @@ _nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
- update_open_stateid(state, &data->o_res.stateid, NULL,
- data->o_arg.fmode);
+ if (!update_open_stateid(state, &data->o_res.stateid,
+ NULL, data->o_arg.fmode)) {
+ nfs4_put_open_state(state);
+ state = ERR_PTR(-EAGAIN);
+ }
out:
nfs_release_seqid(data->o_arg.seqid);
return state;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From e3c8dc761ead061da2220ee8f8132f729ac3ddfe Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Mon, 29 Jul 2019 18:25:00 +0100
Subject: [PATCH] NFSv4: Check the return value of update_open_stateid()
Ensure that we always check the return value of update_open_stateid()
so that we can retry if the update of local state failed. This fixes
infinite looping on state recovery.
Fixes: e23008ec81ef3 ("NFSv4 reduce attribute requests for open reclaim")
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Cc: stable(a)vger.kernel.org # v3.7+
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c9e14ce0b7b2..3e0b93f2b61a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1915,8 +1915,9 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
update:
- update_open_stateid(state, &data->o_res.stateid, NULL,
- data->o_arg.fmode);
+ if (!update_open_stateid(state, &data->o_res.stateid,
+ NULL, data->o_arg.fmode))
+ return ERR_PTR(-EAGAIN);
refcount_inc(&state->count);
return state;
@@ -1981,8 +1982,11 @@ _nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
- update_open_stateid(state, &data->o_res.stateid, NULL,
- data->o_arg.fmode);
+ if (!update_open_stateid(state, &data->o_res.stateid,
+ NULL, data->o_arg.fmode)) {
+ nfs4_put_open_state(state);
+ state = ERR_PTR(-EAGAIN);
+ }
out:
nfs_release_seqid(data->o_arg.seqid);
return state;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3d92aa45fbfd7319e3a19f4ec59fd32b3862b723 Mon Sep 17 00:00:00 2001
From: Wenwen Wang <wenwen(a)cs.uga.edu>
Date: Wed, 7 Aug 2019 04:08:51 -0500
Subject: [PATCH] ALSA: hiface: fix multiple memory leak bugs
In hiface_pcm_init(), 'rt' is firstly allocated through kzalloc(). Later
on, hiface_pcm_init_urb() is invoked to initialize 'rt->out_urbs[i]'. In
hiface_pcm_init_urb(), 'rt->out_urbs[i].buffer' is allocated through
kzalloc(). However, if hiface_pcm_init_urb() fails, both 'rt' and
'rt->out_urbs[i].buffer' are not deallocated, leading to memory leak bugs.
Also, 'rt->out_urbs[i].buffer' is not deallocated if snd_pcm_new() fails.
To fix the above issues, free 'rt' and 'rt->out_urbs[i].buffer'.
Fixes: a91c3fb2f842 ("Add M2Tech hiFace USB-SPDIF driver")
Signed-off-by: Wenwen Wang <wenwen(a)cs.uga.edu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/usb/hiface/pcm.c b/sound/usb/hiface/pcm.c
index 14fc1e1d5d13..c406497c5919 100644
--- a/sound/usb/hiface/pcm.c
+++ b/sound/usb/hiface/pcm.c
@@ -600,14 +600,13 @@ int hiface_pcm_init(struct hiface_chip *chip, u8 extra_freq)
ret = hiface_pcm_init_urb(&rt->out_urbs[i], chip, OUT_EP,
hiface_pcm_out_urb_handler);
if (ret < 0)
- return ret;
+ goto error;
}
ret = snd_pcm_new(chip->card, "USB-SPDIF Audio", 0, 1, 0, &pcm);
if (ret < 0) {
- kfree(rt);
dev_err(&chip->dev->dev, "Cannot create pcm instance\n");
- return ret;
+ goto error;
}
pcm->private_data = rt;
@@ -620,4 +619,10 @@ int hiface_pcm_init(struct hiface_chip *chip, u8 extra_freq)
chip->pcm = rt;
return 0;
+
+error:
+ for (i = 0; i < PCM_N_URBS; i++)
+ kfree(rt->out_urbs[i].buffer);
+ kfree(rt);
+ return ret;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3d92aa45fbfd7319e3a19f4ec59fd32b3862b723 Mon Sep 17 00:00:00 2001
From: Wenwen Wang <wenwen(a)cs.uga.edu>
Date: Wed, 7 Aug 2019 04:08:51 -0500
Subject: [PATCH] ALSA: hiface: fix multiple memory leak bugs
In hiface_pcm_init(), 'rt' is firstly allocated through kzalloc(). Later
on, hiface_pcm_init_urb() is invoked to initialize 'rt->out_urbs[i]'. In
hiface_pcm_init_urb(), 'rt->out_urbs[i].buffer' is allocated through
kzalloc(). However, if hiface_pcm_init_urb() fails, both 'rt' and
'rt->out_urbs[i].buffer' are not deallocated, leading to memory leak bugs.
Also, 'rt->out_urbs[i].buffer' is not deallocated if snd_pcm_new() fails.
To fix the above issues, free 'rt' and 'rt->out_urbs[i].buffer'.
Fixes: a91c3fb2f842 ("Add M2Tech hiFace USB-SPDIF driver")
Signed-off-by: Wenwen Wang <wenwen(a)cs.uga.edu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/usb/hiface/pcm.c b/sound/usb/hiface/pcm.c
index 14fc1e1d5d13..c406497c5919 100644
--- a/sound/usb/hiface/pcm.c
+++ b/sound/usb/hiface/pcm.c
@@ -600,14 +600,13 @@ int hiface_pcm_init(struct hiface_chip *chip, u8 extra_freq)
ret = hiface_pcm_init_urb(&rt->out_urbs[i], chip, OUT_EP,
hiface_pcm_out_urb_handler);
if (ret < 0)
- return ret;
+ goto error;
}
ret = snd_pcm_new(chip->card, "USB-SPDIF Audio", 0, 1, 0, &pcm);
if (ret < 0) {
- kfree(rt);
dev_err(&chip->dev->dev, "Cannot create pcm instance\n");
- return ret;
+ goto error;
}
pcm->private_data = rt;
@@ -620,4 +619,10 @@ int hiface_pcm_init(struct hiface_chip *chip, u8 extra_freq)
chip->pcm = rt;
return 0;
+
+error:
+ for (i = 0; i < PCM_N_URBS; i++)
+ kfree(rt->out_urbs[i].buffer);
+ kfree(rt);
+ return ret;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3d92aa45fbfd7319e3a19f4ec59fd32b3862b723 Mon Sep 17 00:00:00 2001
From: Wenwen Wang <wenwen(a)cs.uga.edu>
Date: Wed, 7 Aug 2019 04:08:51 -0500
Subject: [PATCH] ALSA: hiface: fix multiple memory leak bugs
In hiface_pcm_init(), 'rt' is firstly allocated through kzalloc(). Later
on, hiface_pcm_init_urb() is invoked to initialize 'rt->out_urbs[i]'. In
hiface_pcm_init_urb(), 'rt->out_urbs[i].buffer' is allocated through
kzalloc(). However, if hiface_pcm_init_urb() fails, both 'rt' and
'rt->out_urbs[i].buffer' are not deallocated, leading to memory leak bugs.
Also, 'rt->out_urbs[i].buffer' is not deallocated if snd_pcm_new() fails.
To fix the above issues, free 'rt' and 'rt->out_urbs[i].buffer'.
Fixes: a91c3fb2f842 ("Add M2Tech hiFace USB-SPDIF driver")
Signed-off-by: Wenwen Wang <wenwen(a)cs.uga.edu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/usb/hiface/pcm.c b/sound/usb/hiface/pcm.c
index 14fc1e1d5d13..c406497c5919 100644
--- a/sound/usb/hiface/pcm.c
+++ b/sound/usb/hiface/pcm.c
@@ -600,14 +600,13 @@ int hiface_pcm_init(struct hiface_chip *chip, u8 extra_freq)
ret = hiface_pcm_init_urb(&rt->out_urbs[i], chip, OUT_EP,
hiface_pcm_out_urb_handler);
if (ret < 0)
- return ret;
+ goto error;
}
ret = snd_pcm_new(chip->card, "USB-SPDIF Audio", 0, 1, 0, &pcm);
if (ret < 0) {
- kfree(rt);
dev_err(&chip->dev->dev, "Cannot create pcm instance\n");
- return ret;
+ goto error;
}
pcm->private_data = rt;
@@ -620,4 +619,10 @@ int hiface_pcm_init(struct hiface_chip *chip, u8 extra_freq)
chip->pcm = rt;
return 0;
+
+error:
+ for (i = 0; i < PCM_N_URBS; i++)
+ kfree(rt->out_urbs[i].buffer);
+ kfree(rt);
+ return ret;
}
Hello,
We ran automated tests on a patchset that was proposed for merging into this
kernel tree. The patches were applied to:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: d36a8d2fb62c - Linux 5.2.8
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/99643
We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Merge testing
-------------
We cloned this repository and checked out the following commit:
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: d36a8d2fb62c - Linux 5.2.8
We grabbed the e939b44eeef5 commit of the stable queue repository.
We then merged the patchset with `git am`:
revert-pci-add-missing-link-delays-required-by-the-pcie-spec.patch
iio-ingenic-jz47xx-set-clock-divider-on-probe.patch
iio-cros_ec_accel_legacy-fix-incorrect-channel-setting.patch
iio-imu-mpu6050-add-missing-available-scan-masks.patch
iio-adc-gyroadc-fix-uninitialized-return-code.patch
iio-adc-max9611-fix-misuse-of-genmask-macro.patch
staging-gasket-apex-fix-copy-paste-typo.patch
staging-wilc1000-flush-the-workqueue-before-deinit-the-host.patch
staging-android-ion-bail-out-upon-sigkill-when-allocating-memory.patch
staging-fbtft-fix-probing-of-gpio-descriptor.patch
staging-fbtft-fix-reset-assertion-when-using-gpio-descriptor.patch
crypto-ccp-fix-oops-by-properly-managing-allocated-structures.patch
crypto-ccp-add-support-for-valid-authsize-values-less-than-16.patch
crypto-ccp-ignore-tag-length-when-decrypting-gcm-ciphertext.patch
driver-core-platform-return-enxio-for-missing-gpioint.patch
usb-usbfs-fix-double-free-of-usb-memory-upon-submiturb-error.patch
revert-usb-rio500-simplify-locking.patch
usb-iowarrior-fix-deadlock-on-disconnect.patch
sound-fix-a-memory-leak-bug.patch
mmc-cavium-set-the-correct-dma-max-segment-size-for-mmc_host.patch
mmc-cavium-add-the-missing-dma-unmap-when-the-dma-has-finished.patch
loop-set-pf_memalloc_noio-for-the-worker-thread.patch
bdev-fixup-error-handling-in-blkdev_get.patch
input-usbtouchscreen-initialize-pm-mutex-before-using-it.patch
input-elantech-enable-smbus-on-new-2018-systems.patch
input-synaptics-enable-rmi-mode-for-hp-spectre-x360.patch
x86-mm-check-for-pfn-instead-of-page-in-vmalloc_sync_one.patch
x86-mm-sync-also-unmappings-in-vmalloc_sync_all.patch
mm-vmalloc-sync-unmappings-in-__purge_vmap_area_lazy.patch
coresight-fix-debug_locks_warn_on-for-uninitialized-attribute.patch
perf-annotate-fix-s390-gap-between-kernel-end-and-module-start.patch
perf-db-export-fix-thread__exec_comm.patch
perf-record-fix-module-size-on-s390.patch
x86-purgatory-do-not-use-__builtin_memcpy-and-__builtin_memset.patch
x86-purgatory-use-cflags_remove-rather-than-reset-kbuild_cflags.patch
genirq-affinity-create-affinity-mask-for-single-vector.patch
gfs2-gfs2_walk_metadata-fix.patch
usb-host-xhci-rcar-fix-timeout-in-xhci_suspend.patch
usb-yurex-fix-use-after-free-in-yurex_delete.patch
usb-typec-ucsi-ccg-fix-uninitilized-symbol-error.patch
usb-typec-tcpm-free-log-buf-memory-when-remove-debug-file.patch
usb-typec-tcpm-remove-tcpm-dir-if-no-children.patch
usb-typec-tcpm-add-null-check-before-dereferencing-config.patch
usb-typec-tcpm-ignore-unsupported-unknown-alternate-mode-requests.patch
can-rcar_canfd-fix-possible-irq-storm-on-high-load.patch
can-flexcan-fix-stop-mode-acknowledgment.patch
can-flexcan-fix-an-use-after-free-in-flexcan_setup_stop_mode.patch
can-peak_usb-fix-potential-double-kfree_skb.patch
powerpc-fix-off-by-one-in-max_zone_pfn-initializatio.patch
netfilter-nfnetlink-avoid-deadlock-due-to-synchronou.patch
vfio-ccw-set-pa_nr-to-0-if-memory-allocation-fails-f.patch
vfio-ccw-don-t-call-cp_free-if-we-are-processing-a-c.patch
netfilter-fix-rpfilter-dropping-vrf-packets-by-mista.patch
netfilter-nf_tables-fix-module-autoload-for-redir.patch
netfilter-conntrack-always-store-window-size-un-scal.patch
netfilter-nft_hash-fix-symhash-with-modulus-one.patch
scripts-sphinx-pre-install-fix-script-for-rhel-cento.patch
scripts-sphinx-pre-install-don-t-use-latex-with-cent.patch
scripts-sphinx-pre-install-fix-latexmk-dependencies.patch
rq-qos-don-t-reset-has_sleepers-on-spurious-wakeups.patch
rq-qos-set-ourself-task_uninterruptible-after-we-sch.patch
rq-qos-use-a-mb-for-got_token.patch
netfilter-nf_tables-support-auto-loading-for-inet-na.patch
drm-amd-display-no-audio-endpoint-for-dell-mst-displ.patch
drm-amd-display-clock-does-not-lower-in-updateplanes.patch
drm-amd-display-wait-for-backlight-programming-compl.patch
drm-amd-display-fix-dmcu-hang-when-going-into-modern.patch
drm-amd-display-use-encoder-s-engine-id-to-find-matc.patch
drm-amd-display-put-back-front-end-initialization-se.patch
drm-amd-display-allocate-4-ddc-engines-for-rv2.patch
drm-amd-display-fix-dc_create-failure-handling-and-6.patch
drm-amd-display-only-enable-audio-if-speaker-allocat.patch
drm-amd-display-increase-size-of-audios-array.patch
iscsi_ibft-make-iscsi_ibft-dependson-acpi-instead-of.patch
nl80211-fix-nl80211_he_max_capability_len.patch
mac80211-fix-possible-memory-leak-in-ieee80211_assig.patch
mac80211-don-t-warn-about-cw-params-when-not-using-t.patch
allocate_flower_entry-should-check-for-null-deref.patch
hwmon-occ-fix-division-by-zero-issue.patch
hwmon-nct6775-fix-register-address-and-added-missed-.patch
arm-dts-imx6ul-fix-clock-frequency-property-name-of-.patch
powerpc-papr_scm-force-a-scm-unbind-if-initial-scm-b.patch
arm64-force-ssbs-on-context-switch.patch
arm64-entry-sp-alignment-fault-doesn-t-write-to-far_.patch
iommu-vt-d-check-if-domain-pgd-was-allocated.patch
drm-msm-dpu-correct-dpu-encoder-spinlock-initializat.patch
drm-silence-variable-conn-set-but-not-used.patch
arm64-dts-imx8mm-correct-sai3-rxc-txfs-pin-s-mux-opt.patch
arm64-dts-imx8mq-fix-sai-compatible.patch
cpufreq-pasemi-fix-use-after-free-in-pas_cpufreq_cpu.patch
s390-qdio-add-sanity-checks-to-the-fast-requeue-path.patch
alsa-compress-fix-regression-on-compressed-capture-s.patch
alsa-compress-prevent-bypasses-of-set_params.patch
alsa-compress-don-t-allow-paritial-drain-operations-.patch
alsa-compress-be-more-restrictive-about-when-a-drain.patch
perf-script-fix-off-by-one-in-brstackinsn-ipc-comput.patch
perf-tools-fix-proper-buffer-size-for-feature-proces.patch
perf-stat-fix-segfault-for-event-group-in-repeat-mod.patch
perf-session-fix-loading-of-compressed-data-split-ac.patch
perf-probe-avoid-calling-freeing-routine-multiple-ti.patch
drbd-dynamically-allocate-shash-descriptor.patch
acpi-iort-fix-off-by-one-check-in-iort_dev_find_its_.patch
nvme-ignore-subnqn-for-adata-sx6000lnp.patch
nvme-fix-memory-leak-caused-by-incorrect-subsystem-f.patch
arm-davinci-fix-sleep.s-build-error-on-armv4.patch
arm-dts-bcm-bcm47094-add-missing-cells-for-mdio-bus-.patch
scsi-megaraid_sas-fix-panic-on-loading-firmware-cras.patch
scsi-ibmvfc-fix-warn_on-during-event-pool-release.patch
scsi-scsi_dh_alua-always-use-a-2-second-delay-before.patch
test_firmware-fix-a-memory-leak-bug.patch
tty-ldsem-locking-rwsem-add-missing-acquire-to-read_.patch
perf-x86-intel-fix-slots-pebs-event-constraint.patch
perf-x86-intel-fix-invalid-bit-13-for-icelake-msr_of.patch
perf-x86-apply-more-accurate-check-on-hypervisor-pla.patch
perf-core-fix-creating-kernel-counters-for-pmus-that.patch
s390-dma-provide-proper-arch_zone_dma_bits-value.patch
gen_compile_commands-lower-the-entry-count-threshold.patch
Compile testing
---------------
We compiled the kernel for 3 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test [0]
✅ xfstests: xfs [1]
✅ selinux-policy: serge-testsuite [2]
✅ lvm thinp sanity [3]
✅ storage: software RAID testing [4]
🚧 ✅ Storage blktests [5]
Host 2:
✅ Boot test [0]
✅ Podman system integration test (as root) [6]
✅ Podman system integration test (as user) [6]
✅ LTP lite [7]
✅ Loopdev Sanity [8]
✅ jvm test suite [9]
✅ AMTU (Abstract Machine Test Utility) [10]
✅ LTP: openposix test suite [11]
✅ Ethernet drivers sanity [12]
✅ Networking socket: fuzz [13]
✅ audit: audit testsuite test [14]
✅ httpd: mod_ssl smoke sanity [15]
✅ iotop: sanity [16]
✅ tuned: tune-processes-through-perf [17]
✅ pciutils: update pci ids test [18]
✅ Usex - version 1.9-29 [19]
✅ storage: SCSI VPD [20]
ppc64le:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
x86_64:
Host 1:
✅ Boot test [0]
✅ xfstests: xfs [1]
✅ selinux-policy: serge-testsuite [2]
✅ lvm thinp sanity [3]
✅ storage: software RAID testing [4]
🚧 ✅ Storage blktests [5]
Host 2:
✅ Boot test [0]
✅ Podman system integration test (as root) [6]
✅ Podman system integration test (as user) [6]
✅ LTP lite [7]
✅ Loopdev Sanity [8]
✅ jvm test suite [9]
✅ AMTU (Abstract Machine Test Utility) [10]
✅ LTP: openposix test suite [11]
✅ Ethernet drivers sanity [12]
✅ Networking socket: fuzz [13]
✅ audit: audit testsuite test [14]
✅ httpd: mod_ssl smoke sanity [15]
✅ iotop: sanity [16]
✅ tuned: tune-processes-through-perf [17]
✅ pciutils: sanity smoke test [22]
✅ pciutils: update pci ids test [18]
✅ Usex - version 1.9-29 [19]
✅ storage: SCSI VPD [20]
✅ stress: stress-ng [21]
Test source:
💚 Pull requests are welcome for new tests or improvements to existing tests!
[0]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[1]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/filesystems…
[2]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/packages/se…
[3]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/lvm/…
[4]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/swra…
[5]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/blk
[6]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/container/p…
[7]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[8]: https://github.com/CKI-project/tests-beaker/archive/master.zip#filesystems/…
[9]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/jvm
[10]: https://github.com/CKI-project/tests-beaker/archive/master.zip#misc/amtu
[11]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[12]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/networking/…
[13]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/networking/…
[14]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/aud…
[15]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/htt…
[16]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/iot…
[17]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/tun…
[18]: https://github.com/CKI-project/tests-beaker/archive/master.zip#pciutils/upd…
[19]: https://github.com/CKI-project/tests-beaker/archive/master.zip#standards/us…
[20]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/scsi…
[21]: https://github.com/CKI-project/tests-beaker/archive/master.zip#stress/stres…
[22]: https://github.com/CKI-project/tests-beaker/archive/master.zip#pciutils/san…
Waived tests (marked with 🚧)
-----------------------------
This test run included waived tests. Such tests are executed but their results
are not taken into account. Tests are waived when their results are not
reliable enough, e.g. when they're just introduced or are being fixed.
On Tue, Aug 13, 2019 at 05:55:26PM +0000, Sasha Levin wrote:
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v5.2.8, v4.19.66, v4.14.138, v4.9.189, v4.4.189.
>
> v5.2.8: Build OK!
> v4.19.66: Failed to apply! Possible dependencies:
> 86c71d3deefa ("mt76: move eeprom utility routines in mt76x02_eeprom.h")
> d6500cf3700f ("mt76x0: add quirk to disable 2.4GHz band for Archer T1U")
> eef40d209ad0 ("mt76: move common eeprom definitions in mt76x02-lib module")
<snip>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
mt76x0e support was added in 4.20 , so it's fine just to apply this
commit in 5.2 .
Stanislaw
Since the role_store() uses strncmp(), it's possible to refer
out-of-memory if the sysfs data size is smaller than strlen("host").
This patch fixes it by using sysfs_streq() instead of strncmp().
Reported-by: Pavel Machek <pavel(a)denx.de>
Fixes: 9bb86777fb71 ("phy: rcar-gen3-usb2: add sysfs for usb role swap")
Cc: <stable(a)vger.kernel.org> # v4.10+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
---
Just a record. The role_store() doesn't need to check the count because
the sysfs_streq() checks the first argument is NULL or not.
On "if (sysfs_streq(buf, "host"))"
Example 1: echo ho > role
--> In this case, the count is 3 and the buf has "ho" + NULL.
So, the third character differs between NULL and 's'.
Example 2: echo host-is-not-used > role
--> In this case, the count is 17 and the buf has "host-is-not-used" + NULL.
So, the fifth character differs between '-' and NULL.
drivers/phy/renesas/phy-rcar-gen3-usb2.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/phy/renesas/phy-rcar-gen3-usb2.c b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
index 1322185..cc18970 100644
--- a/drivers/phy/renesas/phy-rcar-gen3-usb2.c
+++ b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
@@ -20,6 +20,7 @@
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
#include <linux/regulator/consumer.h>
+#include <linux/string.h>
#include <linux/usb/of.h>
#include <linux/workqueue.h>
@@ -317,9 +318,9 @@ static ssize_t role_store(struct device *dev, struct device_attribute *attr,
if (!ch->is_otg_channel || !rcar_gen3_is_any_rphy_initialized(ch))
return -EIO;
- if (!strncmp(buf, "host", strlen("host")))
+ if (sysfs_streq(buf, "host"))
new_mode = PHY_MODE_USB_HOST;
- else if (!strncmp(buf, "peripheral", strlen("peripheral")))
+ else if (sysfs_streq(buf, "peripheral"))
new_mode = PHY_MODE_USB_DEVICE;
else
return -EINVAL;
--
2.7.4
From: Joerg Roedel <jroedel(a)suse.de>
Backport commits from upstream to fix a data corruption
issue that gets exposed when using PTI on x86-32.
Please consider them for inclusion into stable-5.2.
Joerg Roedel (3):
x86/mm: Check for pfn instead of page in vmalloc_sync_one()
x86/mm: Sync also unmappings in vmalloc_sync_all()
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
arch/x86/mm/fault.c | 15 ++++++---------
mm/vmalloc.c | 9 +++++++++
2 files changed, 15 insertions(+), 9 deletions(-)
--
2.16.4
We have 3 new lenovo laptops which have conexant codec 0x14f11f86,
these 3 laptops also have the noise issue when rebooting, after
letting the codec enter D3 before rebooting or poweroff, the noise
disappers.
Instead of adding a new ID again in the reboot_notify(), let us make
this function apply to all conexant codec. In theory make codec enter
D3 before rebooting or poweroff is harmless, and I tested this change
on a couple of other Lenovo laptops which have different conexant
codecs, there is no side effect so far.
Cc: stable(a)vger.kernel.org
Signed-off-by: Hui Wang <hui.wang(a)canonical.com>
---
sound/pci/hda/patch_conexant.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
index f299f137eaea..93a303676aea 100644
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -163,15 +163,6 @@ static void cx_auto_reboot_notify(struct hda_codec *codec)
{
struct conexant_spec *spec = codec->spec;
- switch (codec->core.vendor_id) {
- case 0x14f12008: /* CX8200 */
- case 0x14f150f2: /* CX20722 */
- case 0x14f150f4: /* CX20724 */
- break;
- default:
- return;
- }
-
/* Turn the problematic codec into D3 to avoid spurious noises
from the internal speaker during (and after) reboot */
cx_auto_turn_eapd(codec, spec->num_eapds, spec->eapds, false);
--
2.17.1
Hello,
We ran automated tests on a patchset that was proposed for merging into this
kernel tree. The patches were applied to:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: d36a8d2fb62c - Linux 5.2.8
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/98975
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Merge testing
-------------
We cloned this repository and checked out the following commit:
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: d36a8d2fb62c - Linux 5.2.8
We grabbed the eedd757abb9e commit of the stable queue repository.
We then merged the patchset with `git am`:
revert-pci-add-missing-link-delays-required-by-the-pcie-spec.patch
iio-ingenic-jz47xx-set-clock-divider-on-probe.patch
iio-cros_ec_accel_legacy-fix-incorrect-channel-setting.patch
iio-imu-mpu6050-add-missing-available-scan-masks.patch
iio-adc-gyroadc-fix-uninitialized-return-code.patch
iio-adc-max9611-fix-misuse-of-genmask-macro.patch
staging-gasket-apex-fix-copy-paste-typo.patch
staging-wilc1000-flush-the-workqueue-before-deinit-the-host.patch
staging-android-ion-bail-out-upon-sigkill-when-allocating-memory.patch
staging-fbtft-fix-probing-of-gpio-descriptor.patch
staging-fbtft-fix-reset-assertion-when-using-gpio-descriptor.patch
crypto-ccp-fix-oops-by-properly-managing-allocated-structures.patch
crypto-ccp-add-support-for-valid-authsize-values-less-than-16.patch
crypto-ccp-ignore-tag-length-when-decrypting-gcm-ciphertext.patch
driver-core-platform-return-enxio-for-missing-gpioint.patch
usb-usbfs-fix-double-free-of-usb-memory-upon-submiturb-error.patch
revert-usb-rio500-simplify-locking.patch
usb-iowarrior-fix-deadlock-on-disconnect.patch
sound-fix-a-memory-leak-bug.patch
mmc-cavium-set-the-correct-dma-max-segment-size-for-mmc_host.patch
mmc-cavium-add-the-missing-dma-unmap-when-the-dma-has-finished.patch
loop-set-pf_memalloc_noio-for-the-worker-thread.patch
bdev-fixup-error-handling-in-blkdev_get.patch
input-usbtouchscreen-initialize-pm-mutex-before-using-it.patch
input-elantech-enable-smbus-on-new-2018-systems.patch
input-synaptics-enable-rmi-mode-for-hp-spectre-x360.patch
x86-mm-check-for-pfn-instead-of-page-in-vmalloc_sync_one.patch
x86-mm-sync-also-unmappings-in-vmalloc_sync_all.patch
mm-vmalloc-sync-unmappings-in-__purge_vmap_area_lazy.patch
coresight-fix-debug_locks_warn_on-for-uninitialized-attribute.patch
perf-annotate-fix-s390-gap-between-kernel-end-and-module-start.patch
perf-db-export-fix-thread__exec_comm.patch
perf-record-fix-module-size-on-s390.patch
x86-purgatory-do-not-use-__builtin_memcpy-and-__builtin_memset.patch
x86-purgatory-use-cflags_remove-rather-than-reset-kbuild_cflags.patch
genirq-affinity-create-affinity-mask-for-single-vector.patch
gfs2-gfs2_walk_metadata-fix.patch
usb-host-xhci-rcar-fix-timeout-in-xhci_suspend.patch
usb-yurex-fix-use-after-free-in-yurex_delete.patch
usb-typec-ucsi-ccg-fix-uninitilized-symbol-error.patch
usb-typec-tcpm-free-log-buf-memory-when-remove-debug-file.patch
usb-typec-tcpm-remove-tcpm-dir-if-no-children.patch
usb-typec-tcpm-add-null-check-before-dereferencing-config.patch
usb-typec-tcpm-ignore-unsupported-unknown-alternate-mode-requests.patch
can-rcar_canfd-fix-possible-irq-storm-on-high-load.patch
can-flexcan-fix-stop-mode-acknowledgment.patch
can-flexcan-fix-an-use-after-free-in-flexcan_setup_stop_mode.patch
can-peak_usb-fix-potential-double-kfree_skb.patch
Compile testing
---------------
We compiled the kernel for 3 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test [0]
✅ Podman system integration test (as root) [1]
✅ Podman system integration test (as user) [1]
✅ LTP lite [2]
✅ Loopdev Sanity [3]
✅ jvm test suite [4]
✅ AMTU (Abstract Machine Test Utility) [5]
✅ LTP: openposix test suite [6]
✅ audit: audit testsuite test [7]
✅ httpd: mod_ssl smoke sanity [8]
✅ iotop: sanity [9]
✅ tuned: tune-processes-through-perf [10]
✅ pciutils: update pci ids test [11]
✅ Usex - version 1.9-29 [12]
✅ stress: stress-ng [13]
Host 2:
✅ Boot test [0]
✅ xfstests: xfs [14]
✅ selinux-policy: serge-testsuite [15]
🚧 ✅ Storage blktests [16]
ppc64le:
Host 1:
✅ Boot test [0]
✅ Podman system integration test (as root) [1]
✅ Podman system integration test (as user) [1]
✅ LTP lite [2]
✅ Loopdev Sanity [3]
✅ jvm test suite [4]
✅ AMTU (Abstract Machine Test Utility) [5]
✅ LTP: openposix test suite [6]
✅ audit: audit testsuite test [7]
✅ httpd: mod_ssl smoke sanity [8]
✅ iotop: sanity [9]
✅ tuned: tune-processes-through-perf [10]
✅ pciutils: update pci ids test [11]
✅ Usex - version 1.9-29 [12]
Host 2:
✅ Boot test [0]
✅ xfstests: xfs [14]
✅ selinux-policy: serge-testsuite [15]
🚧 ✅ Storage blktests [16]
x86_64:
Host 1:
✅ Boot test [0]
✅ xfstests: xfs [14]
✅ selinux-policy: serge-testsuite [15]
🚧 ✅ Storage blktests [16]
Host 2:
✅ Boot test [0]
✅ Podman system integration test (as root) [1]
✅ Podman system integration test (as user) [1]
✅ LTP lite [2]
✅ Loopdev Sanity [3]
✅ jvm test suite [4]
✅ AMTU (Abstract Machine Test Utility) [5]
✅ LTP: openposix test suite [6]
✅ audit: audit testsuite test [7]
✅ httpd: mod_ssl smoke sanity [8]
✅ iotop: sanity [9]
✅ tuned: tune-processes-through-perf [10]
✅ pciutils: sanity smoke test [17]
✅ pciutils: update pci ids test [11]
✅ Usex - version 1.9-29 [12]
✅ stress: stress-ng [13]
Test source:
💚 Pull requests are welcome for new tests or improvements to existing tests!
[0]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[1]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/container/p…
[2]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[3]: https://github.com/CKI-project/tests-beaker/archive/master.zip#filesystems/…
[4]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/jvm
[5]: https://github.com/CKI-project/tests-beaker/archive/master.zip#misc/amtu
[6]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[7]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/aud…
[8]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/htt…
[9]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/iot…
[10]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/tun…
[11]: https://github.com/CKI-project/tests-beaker/archive/master.zip#pciutils/upd…
[12]: https://github.com/CKI-project/tests-beaker/archive/master.zip#standards/us…
[13]: https://github.com/CKI-project/tests-beaker/archive/master.zip#stress/stres…
[14]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/filesystems…
[15]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/packages/se…
[16]: https://github.com/CKI-project/tests-beaker/archive/master.zip#storage/blk
[17]: https://github.com/CKI-project/tests-beaker/archive/master.zip#pciutils/san…
Waived tests (marked with 🚧)
-----------------------------
This test run included waived tests. Such tests are executed but their results
are not taken into account. Tests are waived when their results are not
reliable enough, e.g. when they're just introduced or are being fixed.
From: Andrea Arcangeli <aarcange(a)redhat.com>
[ Upstream commit 03800e0526ee25ed7c843ca1e57b69ac2a5af642 ]
25078dc1f74be16b858e914f52cc8f4d03c2271a first introduced an off by
one error in the ZONE_DMA initialization of PPC_BOOK3E_64=y and since
9739ab7eda459f0669ec9807e0d9be5020bab88c the off by one applies to
PPC32=y too. This simply corrects the off by one and should resolve
crashes like below:
[ 65.179101] page 0x7fff outside node 0 zone DMA [ 0x0 - 0x7fff ]
Unfortunately in various MM places "max" means a non inclusive end of
range. free_area_init_nodes max_zone_pfn parameter is one case and
MAX_ORDER is another one (unrelated) that comes by memory.
Reported-by: Zorro Lang <zlang(a)redhat.com>
Fixes: 25078dc1f74b ("powerpc: use mm zones more sensibly")
Fixes: 9739ab7eda45 ("powerpc: enable a 30-bit ZONE_DMA for 32-bit pmac")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20190625141727.2883-1-aarcange@redhat.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/powerpc/mm/mem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 2540d3b2588c3..2eda1ec36f552 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -249,7 +249,7 @@ void __init paging_init(void)
#ifdef CONFIG_ZONE_DMA
max_zone_pfns[ZONE_DMA] = min(max_low_pfn,
- ((1UL << ARCH_ZONE_DMA_BITS) - 1) >> PAGE_SHIFT);
+ 1UL << (ARCH_ZONE_DMA_BITS - PAGE_SHIFT));
#endif
max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
#ifdef CONFIG_HIGHMEM
--
2.20.1
From: Mel Gorman <mgorman(a)techsingularity.net>
Subject: mm, vmscan: do not special-case slab reclaim when watermarks are boosted
Dave Chinner reported a problem pointing a finger at commit 1c30844d2dfe
("mm: reclaim small amounts of memory when an external fragmentation event
occurs"). The report is extensive (see
https://lore.kernel.org/linux-mm/20190807091858.2857-1-david@fromorbit.com/)
and it's worth recording the most relevant parts (colorful language and
typos included).
When running a simple, steady state 4kB file creation test to
simulate extracting tarballs larger than memory full of small
files into the filesystem, I noticed that once memory fills up
the cache balance goes to hell.
The workload is creating one dirty cached inode for every dirty
page, both of which should require a single IO each to clean and
reclaim, and creation of inodes is throttled by the rate at which
dirty writeback runs at (via balance dirty pages). Hence the ingest
rate of new cached inodes and page cache pages is identical and
steady. As a result, memory reclaim should quickly find a steady
balance between page cache and inode caches.
The moment memory fills, the page cache is reclaimed at a much
faster rate than the inode cache, and evidence suggests that
the inode cache shrinker is not being called when large batches
of pages are being reclaimed. In roughly the same time period
that it takes to fill memory with 50% pages and 50% slab caches,
memory reclaim reduces the page cache down to just dirty pages
and slab caches fill the entirety of memory.
The LRU is largely full of dirty pages, and we're getting spikes
of random writeback from memory reclaim so it's all going to shit.
Behaviour never recovers, the page cache remains pinned at just
dirty pages, and nothing I could tune would make any difference.
vfs_cache_pressure makes no difference - I would set it so high
it should trim the entire inode caches in a single pass, yet it
didn't do anything. It was clear from tracing and live telemetry
that the shrinkers were pretty much not running except when
there was absolutely no memory free at all, and then they did
the minimum necessary to free memory to make progress.
So I went looking at the code, trying to find places where pages
got reclaimed and the shrinkers weren't called. There's only one
- kswapd doing boosted reclaim as per commit 1c30844d2dfe ("mm:
reclaim small amounts of memory when an external fragmentation
event occurs").
The watermark boosting introduced by the commit is triggered in response
to an allocation "fragmentation event". The boosting was not intended to
target THP specifically and triggers even if THP is disabled. However,
with Dave's perfectly reasonable workload, fragmentation events can be
very common given the ratio of slab to page cache allocations so boosting
remains active for long periods of time.
As high-order allocations might use compaction and compaction cannot move
slab pages the decision was made in the commit to special-case kswapd when
watermarks are boosted -- kswapd avoids reclaiming slab as reclaiming slab
does not directly help compaction.
As Dave notes, this decision means that slab can be artificially protected
for long periods of time and messes up the balance with slab and page
caches.
Removing the special casing can still indirectly help avoid
fragmentation by avoiding fragmentation-causing events due to slab
allocation as pages from a slab pageblock will have some slab objects
freed. Furthermore, with the special casing, reclaim behaviour is
unpredictable as kswapd sometimes examines slab and sometimes does not
in a manner that is tricky to tune or analyse.
This patch removes the special casing. The downside is that this is not a
universal performance win. Some benchmarks that depend on the residency
of data when rereading metadata may see a regression when slab reclaim is
restored to its original behaviour. Similarly, some benchmarks that only
read-once or write-once may perform better when page reclaim is too
aggressive. The primary upside is that slab shrinker is less surprising
(arguably more sane but that's a matter of opinion), behaves consistently
regardless of the fragmentation state of the system and properly obeys VM
sysctls.
A fsmark benchmark configuration was constructed similar to what Dave
reported and is codified by the mmtest configuration
config-io-fsmark-small-file-stream. It was evaluated on a 1-socket
machine to avoid dealing with NUMA-related issues and the timing of
reclaim. The storage was an SSD Samsung Evo and a fresh trimmed XFS
filesystem was used for the test data.
This is not an exact replication of Dave's setup. The configuration
scales its parameters depending on the memory size of the SUT to behave
similarly across machines. The parameters mean the first sample reported
by fs_mark is using 50% of RAM which will barely be throttled and look
like a big outlier. Dave used fake NUMA to have multiple kswapd instances
which I didn't replicate. Finally, the number of iterations differ from
Dave's test as the target disk was not large enough. While not identical,
it should be representative.
fsmark
5.3.0-rc3 5.3.0-rc3
vanilla shrinker-v1r1
Min 1-files/sec 4444.80 ( 0.00%) 4765.60 ( 7.22%)
1st-qrtle 1-files/sec 5005.10 ( 0.00%) 5091.70 ( 1.73%)
2nd-qrtle 1-files/sec 4917.80 ( 0.00%) 4855.60 ( -1.26%)
3rd-qrtle 1-files/sec 4667.40 ( 0.00%) 4831.20 ( 3.51%)
Max-1 1-files/sec 11421.50 ( 0.00%) 9999.30 ( -12.45%)
Max-5 1-files/sec 11421.50 ( 0.00%) 9999.30 ( -12.45%)
Max-10 1-files/sec 11421.50 ( 0.00%) 9999.30 ( -12.45%)
Max-90 1-files/sec 4649.60 ( 0.00%) 4780.70 ( 2.82%)
Max-95 1-files/sec 4491.00 ( 0.00%) 4768.20 ( 6.17%)
Max-99 1-files/sec 4491.00 ( 0.00%) 4768.20 ( 6.17%)
Max 1-files/sec 11421.50 ( 0.00%) 9999.30 ( -12.45%)
Hmean 1-files/sec 5004.75 ( 0.00%) 5075.96 ( 1.42%)
Stddev 1-files/sec 1778.70 ( 0.00%) 1369.66 ( 23.00%)
CoeffVar 1-files/sec 33.70 ( 0.00%) 26.05 ( 22.71%)
BHmean-99 1-files/sec 5053.72 ( 0.00%) 5101.52 ( 0.95%)
BHmean-95 1-files/sec 5053.72 ( 0.00%) 5101.52 ( 0.95%)
BHmean-90 1-files/sec 5107.05 ( 0.00%) 5131.41 ( 0.48%)
BHmean-75 1-files/sec 5208.45 ( 0.00%) 5206.68 ( -0.03%)
BHmean-50 1-files/sec 5405.53 ( 0.00%) 5381.62 ( -0.44%)
BHmean-25 1-files/sec 6179.75 ( 0.00%) 6095.14 ( -1.37%)
5.3.0-rc3 5.3.0-rc3
vanillashrinker-v1r1
Duration User 501.82 497.29
Duration System 4401.44 4424.08
Duration Elapsed 8124.76 8358.05
This is showing a slight skew for the max result representing a large
outlier for the 1st, 2nd and 3rd quartile are similar indicating that the
bulk of the results show little difference. Note that an earlier version
of the fsmark configuration showed a regression but that included more
samples taken while memory was still filling.
Note that the elapsed time is higher. Part of this is that the
configuration included time to delete all the test files when the test
completes -- the test automation handles the possibility of testing fsmark
with multiple thread counts. Without the patch, many of these objects
would be memory resident which is part of what the patch is addressing.
There are other important observations that justify the patch.
1. With the vanilla kernel, the number of dirty pages in the system
is very low for much of the test. With this patch, dirty pages
is generally kept at 10% which matches vm.dirty_background_ratio
which is normal expected historical behaviour.
2. With the vanilla kernel, the ratio of Slab/Pagecache is close to
0.95 for much of the test i.e. Slab is being left alone and dominating
memory consumption. With the patch applied, the ratio varies between
0.35 and 0.45 with the bulk of the measured ratios roughly half way
between those values. This is a different balance to what Dave reported
but it was at least consistent.
3. Slabs are scanned throughout the entire test with the patch applied.
The vanille kernel has periods with no scan activity and then relatively
massive spikes.
4. Without the patch, kswapd scan rates are very variable. With the patch,
the scan rates remain quite stead.
4. Overall vmstats are closer to normal expectations
5.3.0-rc3 5.3.0-rc3
vanilla shrinker-v1r1
Ops Direct pages scanned 99388.00 328410.00
Ops Kswapd pages scanned 45382917.00 33451026.00
Ops Kswapd pages reclaimed 30869570.00 25239655.00
Ops Direct pages reclaimed 74131.00 5830.00
Ops Kswapd efficiency % 68.02 75.45
Ops Kswapd velocity 5585.75 4002.25
Ops Page reclaim immediate 1179721.00 430927.00
Ops Slabs scanned 62367361.00 73581394.00
Ops Direct inode steals 2103.00 1002.00
Ops Kswapd inode steals 570180.00 5183206.00
o Vanilla kernel is hitting direct reclaim more frequently,
not very much in absolute terms but the fact the patch
reduces it is interesting
o "Page reclaim immediate" in the vanilla kernel indicates
dirty pages are being encountered at the tail of the LRU.
This is generally bad and means in this case that the LRU
is not long enough for dirty pages to be cleaned by the
background flush in time. This is much reduced by the
patch.
o With the patch, kswapd is reclaiming 10 times more slab
pages than with the vanilla kernel. This is indicative
of the watermark boosting over-protecting slab
A more complete set of tests were run that were part of the basis
for introducing boosting and while there are some differences, they
are well within tolerances.
Bottom line, the special casing kswapd to avoid slab behaviour is
unpredictable and can lead to abnormal results for normal workloads. This
patch restores the expected behaviour that slab and page cache is balanced
consistently for a workload with a steady allocation ratio of
slab/pagecache pages. It also means that if there are workloads that
favour the preservation of slab over pagecache that it can be tuned via
vm.vfs_cache_pressure where as the vanilla kernel effectively ignores the
parameter when boosting is active.
Link: http://lkml.kernel.org/r/20190808182946.GM2739@techsingularity.net
Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
Signed-off-by: Mel Gorman <mgorman(a)techsingularity.net>
Reviewed-by: Dave Chinner <dchinner(a)redhat.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [5.0+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-do-not-special-case-slab-reclaim-when-watermarks-are-boosted
+++ a/mm/vmscan.c
@@ -88,9 +88,6 @@ struct scan_control {
/* Can pages be swapped as part of reclaim? */
unsigned int may_swap:1;
- /* e.g. boosted watermark reclaim leaves slabs alone */
- unsigned int may_shrinkslab:1;
-
/*
* Cgroups are not reclaimed below their configured memory.low,
* unless we threaten to OOM. If any cgroups are skipped due to
@@ -2714,10 +2711,8 @@ static bool shrink_node(pg_data_t *pgdat
shrink_node_memcg(pgdat, memcg, sc, &lru_pages);
node_lru_pages += lru_pages;
- if (sc->may_shrinkslab) {
- shrink_slab(sc->gfp_mask, pgdat->node_id,
- memcg, sc->priority);
- }
+ shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
+ sc->priority);
/* Record the group's reclaim efficiency */
vmpressure(sc->gfp_mask, memcg, false,
@@ -3194,7 +3189,6 @@ unsigned long try_to_free_pages(struct z
.may_writepage = !laptop_mode,
.may_unmap = 1,
.may_swap = 1,
- .may_shrinkslab = 1,
};
/*
@@ -3238,7 +3232,6 @@ unsigned long mem_cgroup_shrink_node(str
.may_unmap = 1,
.reclaim_idx = MAX_NR_ZONES - 1,
.may_swap = !noswap,
- .may_shrinkslab = 1,
};
unsigned long lru_pages;
@@ -3286,7 +3279,6 @@ unsigned long try_to_free_mem_cgroup_pag
.may_writepage = !laptop_mode,
.may_unmap = 1,
.may_swap = may_swap,
- .may_shrinkslab = 1,
};
set_task_reclaim_state(current, &sc.reclaim_state);
@@ -3598,7 +3590,6 @@ restart:
*/
sc.may_writepage = !laptop_mode && !nr_boost_reclaim;
sc.may_swap = !nr_boost_reclaim;
- sc.may_shrinkslab = !nr_boost_reclaim;
/*
* Do some background aging of the anon list, to give
_
From: NeilBrown <neilb(a)suse.com>
Subject: seq_file: fix problem when seeking mid-record
If you use lseek or similar (e.g. pread) to access a location in a
seq_file file that is within a record, rather than at a record boundary,
then the first read will return the remainder of the record, and the
second read will return the whole of that same record (instead of the next
record). When seeking to a record boundary, the next record is correctly
returned.
This bug was introduced by a recent patch (identified below). Before that
patch, seq_read() would increment m->index when the last of the buffer was
returned (m->count == 0). After that patch, we rely on ->next to
increment m->index after filling the buffer - but there was one place
where that didn't happen.
Link: https://lkml.kernel.org/lkml/877e7xl029.fsf@notabene.neil.brown.name/
Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface")
Signed-off-by: NeilBrown <neilb(a)suse.com>
Reported-by: Sergei Turchanov <turchanov(a)farpost.com>
Tested-by: Sergei Turchanov <turchanov(a)farpost.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Markus Elfring <Markus.Elfring(a)web.de>
Cc: <stable(a)vger.kernel.org> [4.19+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/seq_file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/seq_file.c~seq_file-fix-problem-when-seeking-mid-record
+++ a/fs/seq_file.c
@@ -119,6 +119,7 @@ static int traverse(struct seq_file *m,
}
if (seq_has_overflowed(m))
goto Eoverflow;
+ p = m->op->next(m, p, &m->index);
if (pos + m->count > offset) {
m->from = offset - pos;
m->count -= m->from;
@@ -126,7 +127,6 @@ static int traverse(struct seq_file *m,
}
pos += m->count;
m->count = 0;
- p = m->op->next(m, p, &m->index);
if (pos == offset)
break;
}
_
From: "Isaac J. Manjarres" <isaacm(a)codeaurora.org>
Subject: mm/usercopy: use memory range to be accessed for wraparound check
Currently, when checking to see if accessing n bytes starting at address
"ptr" will cause a wraparound in the memory addresses, the check in
check_bogus_address() adds an extra byte, which is incorrect, as the range
of addresses that will be accessed is [ptr, ptr + (n - 1)].
This can lead to incorrectly detecting a wraparound in the memory address,
when trying to read 4 KB from memory that is mapped to the the last
possible page in the virtual address space, when in fact, accessing that
range of memory would not cause a wraparound to occur.
Use the memory range that will actually be accessed when considering if
accessing a certain amount of bytes will cause the memory address to wrap
around.
Link: http://lkml.kernel.org/r/1564509253-23287-1-git-send-email-isaacm@codeauror…
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Co-developed-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Reviewed-by: William Kucharski <william.kucharski(a)oracle.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Trilok Soni <tsoni(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/usercopy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/usercopy.c~mm-usercopy-use-memory-range-to-be-accessed-for-wraparound-check
+++ a/mm/usercopy.c
@@ -147,7 +147,7 @@ static inline void check_bogus_address(c
bool to_user)
{
/* Reject if object wraps past end of memory. */
- if (ptr + n < ptr)
+ if (ptr + (n - 1) < ptr)
usercopy_abort("wrapped address", NULL, to_user, 0, ptr + n);
/* Reject if NULL or ZERO-allocation. */
_
From: Henry Burns <henryburns(a)google.com>
Subject: mm/z3fold.c: fix z3fold_destroy_pool() race condition
The constraint from the zpool use of z3fold_destroy_pool() is there are
no outstanding handles to memory (so no active allocations), but it is
possible for there to be outstanding work on either of the two wqs in
the pool.
Calling z3fold_deregister_migration() before the workqueues are drained
means that there can be allocated pages referencing a freed inode,
causing any thread in compaction to be able to trip over the bad
pointer in PageMovable().
Link: http://lkml.kernel.org/r/20190726224810.79660-2-henryburns@google.com
Fixes: 1f862989b04a ("mm/z3fold.c: support page migration")
Signed-off-by: Henry Burns <henryburns(a)google.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Jonathan Adams <jwadams(a)google.com>
Cc: Vitaly Vul <vitaly.vul(a)sony.com>
Cc: Vitaly Wool <vitalywool(a)gmail.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Henry Burns <henrywolfeburns(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/z3fold.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/z3fold.c~mm-z3foldc-fix-z3fold_destroy_pool-race-condition
+++ a/mm/z3fold.c
@@ -817,16 +817,19 @@ out:
static void z3fold_destroy_pool(struct z3fold_pool *pool)
{
kmem_cache_destroy(pool->c_handle);
- z3fold_unregister_migration(pool);
/*
* We need to destroy pool->compact_wq before pool->release_wq,
* as any pending work on pool->compact_wq will call
* queue_work(pool->release_wq, &pool->work).
+ *
+ * There are still outstanding pages until both workqueues are drained,
+ * so we cannot unregister migration until then.
*/
destroy_workqueue(pool->compact_wq);
destroy_workqueue(pool->release_wq);
+ z3fold_unregister_migration(pool);
kfree(pool);
}
_
From: Henry Burns <henryburns(a)google.com>
Subject: mm/z3fold.c: fix z3fold_destroy_pool() ordering
The constraint from the zpool use of z3fold_destroy_pool() is there are
no outstanding handles to memory (so no active allocations), but it is
possible for there to be outstanding work on either of the two wqs in
the pool.
If there is work queued on pool->compact_workqueue when it is called,
z3fold_destroy_pool() will do:
z3fold_destroy_pool()
destroy_workqueue(pool->release_wq)
destroy_workqueue(pool->compact_wq)
drain_workqueue(pool->compact_wq)
do_compact_page(zhdr)
kref_put(&zhdr->refcount)
__release_z3fold_page(zhdr, ...)
queue_work_on(pool->release_wq, &pool->work) *BOOM*
So compact_wq needs to be destroyed before release_wq.
Link: http://lkml.kernel.org/r/20190726224810.79660-1-henryburns@google.com
Fixes: 5d03a6613957 ("mm/z3fold.c: use kref to prevent page free/compact race")
Signed-off-by: Henry Burns <henryburns(a)google.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Jonathan Adams <jwadams(a)google.com>
Cc: Vitaly Vul <vitaly.vul(a)sony.com>
Cc: Vitaly Wool <vitalywool(a)gmail.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Al Viro <viro(a)zeniv.linux.org.uk
Cc: Henry Burns <henrywolfeburns(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/z3fold.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/mm/z3fold.c~mm-z3foldc-fix-z3fold_destroy_pool-ordering
+++ a/mm/z3fold.c
@@ -818,8 +818,15 @@ static void z3fold_destroy_pool(struct z
{
kmem_cache_destroy(pool->c_handle);
z3fold_unregister_migration(pool);
- destroy_workqueue(pool->release_wq);
+
+ /*
+ * We need to destroy pool->compact_wq before pool->release_wq,
+ * as any pending work on pool->compact_wq will call
+ * queue_work(pool->release_wq, &pool->work).
+ */
+
destroy_workqueue(pool->compact_wq);
+ destroy_workqueue(pool->release_wq);
kfree(pool);
}
_
From: Yang Shi <yang.shi(a)linux.alibaba.com>
Subject: mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind
When running syzkaller internally, we ran into the below bug on 4.9.x
kernel:
kernel BUG at mm/huge_memory.c:2124!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 1518 Comm: syz-executor107 Not tainted 4.9.168+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
task: ffff880067b34900 task.stack: ffff880068998000
RIP: 0010:[<ffffffff81895d6b>] [<ffffffff81895d6b>] split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
RSP: 0018:ffff88006899f980 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffffea00018f1700 RCX: 0000000000000000
RDX: 1ffffd400031e2e7 RSI: 0000000000000001 RDI: ffffea00018f1738
RBP: ffff88006899f9e8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: fffffbfff0d8b13e R12: ffffea00018f1400
R13: ffffea00018f1400 R14: ffffea00018f1720 R15: ffffea00018f1401
FS: 00007fa333996740(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000040 CR3: 0000000066b9c000 CR4: 00000000000606f0
Stack:
0000000000000246 ffff880067b34900 0000000000000000 ffff88007ffdc000
0000000000000000 ffff88006899f9e8 ffffffff812b4015 ffff880064c64e18
ffffea00018f1401 dffffc0000000000 ffffea00018f1700 0000000020ffd000
Call Trace:
[<ffffffff818490f1>] split_huge_page include/linux/huge_mm.h:100 [inline]
[<ffffffff818490f1>] queue_pages_pte_range+0x7e1/0x1480 mm/mempolicy.c:538
[<ffffffff817ed0da>] walk_pmd_range mm/pagewalk.c:50 [inline]
[<ffffffff817ed0da>] walk_pud_range mm/pagewalk.c:90 [inline]
[<ffffffff817ed0da>] walk_pgd_range mm/pagewalk.c:116 [inline]
[<ffffffff817ed0da>] __walk_page_range+0x44a/0xdb0 mm/pagewalk.c:208
[<ffffffff817edb94>] walk_page_range+0x154/0x370 mm/pagewalk.c:285
[<ffffffff81844515>] queue_pages_range+0x115/0x150 mm/mempolicy.c:694
[<ffffffff8184f493>] do_mbind mm/mempolicy.c:1241 [inline]
[<ffffffff8184f493>] SYSC_mbind+0x3c3/0x1030 mm/mempolicy.c:1370
[<ffffffff81850146>] SyS_mbind+0x46/0x60 mm/mempolicy.c:1352
[<ffffffff810097e2>] do_syscall_64+0x1d2/0x600 arch/x86/entry/common.c:282
[<ffffffff82ff6f93>] entry_SYSCALL_64_after_swapgs+0x5d/0xdb
Code: c7 80 1c 02 00 e8 26 0a 76 01 <0f> 0b 48 c7 c7 40 46 45 84 e8 4c
RIP [<ffffffff81895d6b>] split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
RSP <ffff88006899f980>
with the below test:
---8<---
uint64_t r[1] = {0xffffffffffffffff};
int main(void)
{
syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
intptr_t res = 0;
res = syscall(__NR_socket, 0x11, 3, 0x300);
if (res != -1)
r[0] = res;
*(uint32_t*)0x20000040 = 0x10000;
*(uint32_t*)0x20000044 = 1;
*(uint32_t*)0x20000048 = 0xc520;
*(uint32_t*)0x2000004c = 1;
syscall(__NR_setsockopt, r[0], 0x107, 0xd, 0x20000040, 0x10);
syscall(__NR_mmap, 0x20fed000, 0x10000, 0, 0x8811, r[0], 0);
*(uint64_t*)0x20000340 = 2;
syscall(__NR_mbind, 0x20ff9000, 0x4000, 0x4002, 0x20000340,
0x45d4, 3);
return 0;
}
---8<---
Actually the test does:
mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000
socket(AF_PACKET, SOCK_RAW, 768) = 3
setsockopt(3, SOL_PACKET, PACKET_TX_RING, {block_size=65536, block_nr=1, frame_size=50464, frame_nr=1}, 16) = 0
mmap(0x20fed000, 65536, PROT_NONE, MAP_SHARED|MAP_FIXED|MAP_POPULATE|MAP_DENYWRITE, 3, 0) = 0x20fed000
mbind(..., MPOL_MF_STRICT|MPOL_MF_MOVE) = 0
The setsockopt() would allocate compound pages (16 pages in this test) for
packet tx ring, then the mmap() would call packet_mmap() to map the pages
into the user address space specified by the mmap() call.
When calling mbind(), it would scan the vma to queue the pages for
migration to the new node. It would split any huge page since 4.9 doesn't
support THP migration, however, the packet tx ring compound pages are not
THP and even not movable. So, the above bug is triggered.
However, the later kernel is not hit by this issue due to
d44d363f65780f2ac2 ("mm: don't assume anonymous pages have SwapBacked
flag"), which just removes the PageSwapBacked check for a different
reason.
But, there is a deeper issue. According to the semantic of mbind(), it
should return -EIO if MPOL_MF_MOVE or MPOL_MF_MOVE_ALL was specified and
MPOL_MF_STRICT was also specified, but the kernel was unable to move all
existing pages in the range. The tx ring of the packet socket is
definitely not movable, however, mbind() returns success for this case.
Although the most socket file associates with non-movable pages, but XDP
may have movable pages from gup. So, it sounds not fine to just check the
underlying file type of vma in vma_migratable().
Change migrate_page_add() to check if the page is movable or not, if it is
unmovable, just return -EIO. But do not abort pte walk immediately, since
there may be pages off LRU temporarily. We should migrate other pages if
MPOL_MF_MOVE* is specified. Set has_unmovable flag if some paged could
not be not moved, then return -EIO for mbind() eventually.
With this change the above test would return -EIO as expected.
[yang.shi(a)linux.alibaba.com: fix review comments from Vlastimil]
Link: http://lkml.kernel.org/r/1563556862-54056-3-git-send-email-yang.shi@linux.a…
Link: http://lkml.kernel.org/r/1561162809-59140-3-git-send-email-yang.shi@linux.a…
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
--- a/mm/mempolicy.c~mm-mempolicy-handle-vma-with-unmovable-pages-mapped-correctly-in-mbind
+++ a/mm/mempolicy.c
@@ -403,7 +403,7 @@ static const struct mempolicy_operations
},
};
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags);
struct queue_pages {
@@ -463,12 +463,11 @@ static int queue_pages_pmd(pmd_t *pmd, s
flags = qp->flags;
/* go to thp migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
- if (!vma_migratable(walk->vma)) {
+ if (!vma_migratable(walk->vma) ||
+ migrate_page_add(page, qp->pagelist, flags)) {
ret = 1;
goto unlock;
}
-
- migrate_page_add(page, qp->pagelist, flags);
} else
ret = -EIO;
unlock:
@@ -532,7 +531,14 @@ static int queue_pages_pte_range(pmd_t *
has_unmovable = true;
break;
}
- migrate_page_add(page, qp->pagelist, flags);
+
+ /*
+ * Do not abort immediately since there may be
+ * temporary off LRU pages in the range. Still
+ * need migrate other LRU pages.
+ */
+ if (migrate_page_add(page, qp->pagelist, flags))
+ has_unmovable = true;
} else
break;
}
@@ -961,7 +967,7 @@ static long do_get_mempolicy(int *policy
/*
* page migration, thp tail pages can be passed.
*/
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags)
{
struct page *head = compound_head(page);
@@ -974,8 +980,19 @@ static void migrate_page_add(struct page
mod_node_page_state(page_pgdat(head),
NR_ISOLATED_ANON + page_is_file_cache(head),
hpage_nr_pages(head));
+ } else if (flags & MPOL_MF_STRICT) {
+ /*
+ * Non-movable page may reach here. And, there may be
+ * temporary off LRU pages or non-LRU movable pages.
+ * Treat them as unmovable pages since they can't be
+ * isolated, so they can't be moved at the moment. It
+ * should return -EIO for this case too.
+ */
+ return -EIO;
}
}
+
+ return 0;
}
/* page allocation callback for NUMA node migration */
@@ -1178,9 +1195,10 @@ static struct page *new_page(struct page
}
#else
-static void migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_page_add(struct page *page, struct list_head *pagelist,
unsigned long flags)
{
+ return -EIO;
}
int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
_
From: Yang Shi <yang.shi(a)linux.alibaba.com>
Subject: mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified
When both MPOL_MF_MOVE* and MPOL_MF_STRICT was specified, mbind() should
try best to migrate misplaced pages, if some of the pages could not be
migrated, then return -EIO.
There are three different sub-cases:
1. vma is not migratable
2. vma is migratable, but there are unmovable pages
3. vma is migratable, pages are movable, but migrate_pages() fails
If #1 happens, kernel would just abort immediately, then return -EIO,
after a7f40cfe3b7ada ("mm: mempolicy: make mbind() return -EIO when
MPOL_MF_STRICT is specified").
If #3 happens, kernel would set policy and migrate pages with best-effort,
but won't rollback the migrated pages and reset the policy back.
Before that commit, they behaves in the same way. It'd better to keep
their behavior consistent. But, rolling back the migrated pages and
resetting the policy back sounds not feasible, so just make #1 behave as
same as #3.
Userspace will know that not everything was successfully migrated (via
-EIO), and can take whatever steps it deems necessary - attempt rollback,
determine which exact page(s) are violating the policy, etc.
Make queue_pages_range() return 1 to indicate there are unmovable pages or
vma is not migratable.
The #2 is not handled correctly in the current kernel, the following patch
will fix it.
[yang.shi(a)linux.alibaba.com: fix review comments from Vlastimil]
Link: http://lkml.kernel.org/r/1563556862-54056-2-git-send-email-yang.shi@linux.a…
Link: http://lkml.kernel.org/r/1561162809-59140-2-git-send-email-yang.shi@linux.a…
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 68 +++++++++++++++++++++++++++++++++--------------
1 file changed, 48 insertions(+), 20 deletions(-)
--- a/mm/mempolicy.c~mm-mempolicy-make-the-behavior-consistent-when-mpol_mf_move-and-mpol_mf_strict-were-specified
+++ a/mm/mempolicy.c
@@ -429,11 +429,14 @@ static inline bool queue_pages_required(
}
/*
- * queue_pages_pmd() has three possible return values:
- * 1 - pages are placed on the right node or queued successfully.
- * 0 - THP was split.
- * -EIO - is migration entry or MPOL_MF_STRICT was specified and an existing
- * page was already on a node that does not follow the policy.
+ * queue_pages_pmd() has four possible return values:
+ * 0 - pages are placed on the right node or queued successfully.
+ * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * 2 - THP was split.
+ * -EIO - is migration entry or only MPOL_MF_STRICT was specified and an
+ * existing page was already on a node that does not follow the
+ * policy.
*/
static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
unsigned long end, struct mm_walk *walk)
@@ -451,19 +454,17 @@ static int queue_pages_pmd(pmd_t *pmd, s
if (is_huge_zero_page(page)) {
spin_unlock(ptl);
__split_huge_pmd(walk->vma, pmd, addr, false, NULL);
+ ret = 2;
goto out;
}
- if (!queue_pages_required(page, qp)) {
- ret = 1;
+ if (!queue_pages_required(page, qp))
goto unlock;
- }
- ret = 1;
flags = qp->flags;
/* go to thp migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
if (!vma_migratable(walk->vma)) {
- ret = -EIO;
+ ret = 1;
goto unlock;
}
@@ -479,6 +480,13 @@ out:
/*
* Scan through pages checking if pages follow certain conditions,
* and move them to the pagelist if they do.
+ *
+ * queue_pages_pte_range() has three possible return values:
+ * 0 - pages are placed on the right node or queued successfully.
+ * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * -EIO - only MPOL_MF_STRICT was specified and an existing page was already
+ * on a node that does not follow the policy.
*/
static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, struct mm_walk *walk)
@@ -488,17 +496,17 @@ static int queue_pages_pte_range(pmd_t *
struct queue_pages *qp = walk->private;
unsigned long flags = qp->flags;
int ret;
+ bool has_unmovable = false;
pte_t *pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
if (ptl) {
ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
- if (ret > 0)
- return 0;
- else if (ret < 0)
+ if (ret != 2)
return ret;
}
+ /* THP was split, fall through to pte walk */
if (pmd_trans_unstable(pmd))
return 0;
@@ -519,14 +527,21 @@ static int queue_pages_pte_range(pmd_t *
if (!queue_pages_required(page, qp))
continue;
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
- if (!vma_migratable(vma))
+ /* MPOL_MF_STRICT must be specified if we get here */
+ if (!vma_migratable(vma)) {
+ has_unmovable = true;
break;
+ }
migrate_page_add(page, qp->pagelist, flags);
} else
break;
}
pte_unmap_unlock(pte - 1, ptl);
cond_resched();
+
+ if (has_unmovable)
+ return 1;
+
return addr != end ? -EIO : 0;
}
@@ -639,7 +654,13 @@ static int queue_pages_test_walk(unsigne
*
* If pages found in a given range are on a set of nodes (determined by
* @nodes and @flags,) it's isolated and queued to the pagelist which is
- * passed via @private.)
+ * passed via @private.
+ *
+ * queue_pages_range() has three possible return values:
+ * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * specified.
+ * 0 - queue pages successfully or no misplaced page.
+ * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
*/
static int
queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
@@ -1182,6 +1203,7 @@ static long do_mbind(unsigned long start
struct mempolicy *new;
unsigned long end;
int err;
+ int ret;
LIST_HEAD(pagelist);
if (flags & ~(unsigned long)MPOL_MF_VALID)
@@ -1243,10 +1265,15 @@ static long do_mbind(unsigned long start
if (err)
goto mpol_out;
- err = queue_pages_range(mm, start, end, nmask,
+ ret = queue_pages_range(mm, start, end, nmask,
flags | MPOL_MF_INVERT, &pagelist);
- if (!err)
- err = mbind_range(mm, start, end, new);
+
+ if (ret < 0) {
+ err = -EIO;
+ goto up_out;
+ }
+
+ err = mbind_range(mm, start, end, new);
if (!err) {
int nr_failed = 0;
@@ -1259,13 +1286,14 @@ static long do_mbind(unsigned long start
putback_movable_pages(&pagelist);
}
- if (nr_failed && (flags & MPOL_MF_STRICT))
+ if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
err = -EIO;
} else
putback_movable_pages(&pagelist);
+up_out:
up_write(&mm->mmap_sem);
- mpol_out:
+mpol_out:
mpol_put(new);
return err;
}
_
From: Ralph Campbell <rcampbell(a)nvidia.com>
Subject: mm/hmm: fix bad subpage pointer in try_to_unmap_one
When migrating an anonymous private page to a ZONE_DEVICE private page,
the source page->mapping and page->index fields are copied to the
destination ZONE_DEVICE struct page and the page_mapcount() is increased.
This is so rmap_walk() can be used to unmap and migrate the page back to
system memory. However, try_to_unmap_one() computes the subpage pointer
from a swap pte which computes an invalid page pointer and a kernel panic
results such as:
BUG: unable to handle page fault for address: ffffea1fffffffc8
Currently, only single pages can be migrated to device private memory so
no subpage computation is needed and it can be set to "page".
[rcampbell(a)nvidia.com: add comment]
Link: http://lkml.kernel.org/r/20190724232700.23327-4-rcampbell@nvidia.com
Link: http://lkml.kernel.org/r/20190719192955.30462-4-rcampbell@nvidia.com
Fixes: a5430dda8a3a1c ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: "Jérôme Glisse" <jglisse(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Lai Jiangshan <jiangshanlai(a)gmail.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/rmap.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/mm/rmap.c~mm-hmm-fix-bad-subpage-pointer-in-try_to_unmap_one
+++ a/mm/rmap.c
@@ -1475,7 +1475,15 @@ static bool try_to_unmap_one(struct page
/*
* No need to invalidate here it will synchronize on
* against the special swap migration pte.
+ *
+ * The assignment to subpage above was computed from a
+ * swap PTE which results in an invalid pointer.
+ * Since only PAGE_SIZE pages can currently be
+ * migrated, just set it to page. This will need to be
+ * changed when hugepage migrations to device private
+ * memory are supported.
*/
+ subpage = page;
goto discard;
}
_
From: Ralph Campbell <rcampbell(a)nvidia.com>
Subject: mm/hmm: fix ZONE_DEVICE anon page mapping reuse
When a ZONE_DEVICE private page is freed, the page->mapping field can be
set. If this page is reused as an anonymous page, the previous value can
prevent the page from being inserted into the CPU's anon rmap table. For
example, when migrating a pte_none() page to device memory:
migrate_vma(ops, vma, start, end, src, dst, private)
migrate_vma_collect()
src[] = MIGRATE_PFN_MIGRATE
migrate_vma_prepare()
/* no page to lock or isolate so OK */
migrate_vma_unmap()
/* no page to unmap so OK */
ops->alloc_and_copy()
/* driver allocates ZONE_DEVICE page for dst[] */
migrate_vma_pages()
migrate_vma_insert_page()
page_add_new_anon_rmap()
__page_set_anon_rmap()
/* This check sees the page's stale mapping field */
if (PageAnon(page))
return
/* page->mapping is not updated */
The result is that the migration appears to succeed but a subsequent CPU
fault will be unable to migrate the page back to system memory or worse.
Clear the page->mapping field when freeing the ZONE_DEVICE page so stale
pointer data doesn't affect future page use.
Link: http://lkml.kernel.org/r/20190719192955.30462-3-rcampbell@nvidia.com
Fixes: b7a523109fb5c9d2d6dd ("mm: don't clear ->mapping in hmm_devmem_free")
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Jan Kara <jack(a)suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: "Jérôme Glisse" <jglisse(a)redhat.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai(a)gmail.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memremap.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
--- a/mm/memremap.c~mm-hmm-fix-zone_device-anon-page-mapping-reuse
+++ a/mm/memremap.c
@@ -403,6 +403,30 @@ void __put_devmap_managed_page(struct pa
mem_cgroup_uncharge(page);
+ /*
+ * When a device_private page is freed, the page->mapping field
+ * may still contain a (stale) mapping value. For example, the
+ * lower bits of page->mapping may still identify the page as
+ * an anonymous page. Ultimately, this entire field is just
+ * stale and wrong, and it will cause errors if not cleared.
+ * One example is:
+ *
+ * migrate_vma_pages()
+ * migrate_vma_insert_page()
+ * page_add_new_anon_rmap()
+ * __page_set_anon_rmap()
+ * ...checks page->mapping, via PageAnon(page) call,
+ * and incorrectly concludes that the page is an
+ * anonymous page. Therefore, it incorrectly,
+ * silently fails to set up the new anon rmap.
+ *
+ * For other types of ZONE_DEVICE pages, migration is either
+ * handled differently or not done at all, so there is no need
+ * to clear page->mapping.
+ */
+ if (is_device_private_page(page))
+ page->mapping = NULL;
+
page->pgmap->ops->page_free(page);
} else if (!count)
__put_page(page);
_
On Tue, Aug 13, 2019 at 02:31:17PM -0700, Andrew Morton wrote:
> On Mon, 12 Aug 2019 16:37:54 -0700 Roman Gushchin <guro(a)fb.com> wrote:
>
> > Similar to vmstats, percpu caching of local vmevents leads to an
> > accumulation of errors on non-leaf levels. This happens because
> > some leftovers may remain in percpu caches, so that they are
> > never propagated up by the cgroup tree and just disappear into
> > nonexistence with on releasing of the memory cgroup.
> >
> > To fix this issue let's accumulate and propagate percpu vmevents
> > values before releasing the memory cgroup similar to what we're
> > doing with vmstats.
> >
> > Since on cpu hotplug we do flush percpu vmstats anyway, we can
> > iterate only over online cpus.
> >
> > Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
>
> No cc:stable?
>
Here too, cc:stable is definitely missing. Adding now. Thanks!
The patch titled
Subject: mm, page_alloc: move_freepages should not examine struct page of reserved memory
has been added to the -mm tree. Its filename is
mm-page_alloc-move_freepages-should-not-examine-struct-page-of-reserved-memory.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-move_freepages-shoul…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-move_freepages-shoul…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: David Rientjes <rientjes(a)google.com>
Subject: mm, page_alloc: move_freepages should not examine struct page of reserved memory
After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"),
struct page of reserved memory is zeroed. This causes page->flags to be 0
and fixes issues related to reading /proc/kpageflags, for example, of
reserved memory.
The VM_BUG_ON() in move_freepages_block(), however, assumes that
page_zone() is meaningful even for reserved memory. That assumption is no
longer true after the aforementioned commit.
There's no reason why move_freepages_block() should be testing the
legitimacy of page_zone() for reserved memory; its scope is limited only
to pages on the zone's freelist.
Note that pfn_valid() can be true for reserved memory: there is a backing
struct page. The check for page_to_nid(page) is also buggy but reserved
memory normally only appears on node 0 so the zeroing doesn't affect this.
Move the debug checks to after verifying PageBuddy is true. This isolates
the scope of the checks to only be for buddy pages which are on the zone's
freelist which move_freepages_block() is operating on. In this case, an
incorrect node or zone is a bug worthy of being warned about (and the
examination of struct page is acceptable bcause this memory is not
reserved).
Why does move_freepages_block() gets called on reserved memory? It's
simply math after finding a valid free page from the per-zone free area to
use as fallback. We find the beginning and end of the pageblock of the
valid page and that can bring us into memory that was reserved per the
e820. pfn_valid() is still true (it's backed by a struct page), but since
it's zero'd we shouldn't make any inferences here about comparing its node
or zone. The current node check just happens to succeed most of the time
by luck because reserved memory typically appears on node 0.
The fix here is to validate that we actually have buddy pages before
testing if there's any type of zone or node strangeness going on.
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908122036560.10779@chino.kir.corp…
Fixes: 907ec5fca3dc ("mm: zero remaining unavailable struct pages")
Signed-off-by: David Rientjes <rientjes(a)google.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Masayoshi Mizuma <m.mizuma(a)jp.fujitsu.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Pavel Tatashin <pavel.tatashin(a)microsoft.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 19 ++++---------------
1 file changed, 4 insertions(+), 15 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-move_freepages-should-not-examine-struct-page-of-reserved-memory
+++ a/mm/page_alloc.c
@@ -2238,27 +2238,12 @@ static int move_freepages(struct zone *z
unsigned int order;
int pages_moved = 0;
-#ifndef CONFIG_HOLES_IN_ZONE
- /*
- * page_zone is not safe to call in this context when
- * CONFIG_HOLES_IN_ZONE is set. This bug check is probably redundant
- * anyway as we check zone boundaries in move_freepages_block().
- * Remove at a later date when no bug reports exist related to
- * grouping pages by mobility
- */
- VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) &&
- pfn_valid(page_to_pfn(end_page)) &&
- page_zone(start_page) != page_zone(end_page));
-#endif
for (page = start_page; page <= end_page;) {
if (!pfn_valid_within(page_to_pfn(page))) {
page++;
continue;
}
- /* Make sure we are not inadvertently changing nodes */
- VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
-
if (!PageBuddy(page)) {
/*
* We assume that pages that could be isolated for
@@ -2273,6 +2258,10 @@ static int move_freepages(struct zone *z
continue;
}
+ /* Make sure we are not inadvertently changing nodes */
+ VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
+ VM_BUG_ON_PAGE(page_zone(page) != zone, page);
+
order = page_order(page);
move_to_free_area(page, &zone->free_area[order], migratetype);
page += 1 << order;
_
Patches currently in -mm which might be from rientjes(a)google.com are
mm-page_alloc-move_freepages-should-not-examine-struct-page-of-reserved-memory.patch
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b059f801a937d164e03b33c1848bb3dca67c0b04 Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <ndesaulniers(a)google.com>
Date: Wed, 7 Aug 2019 15:15:33 -0700
Subject: [PATCH] x86/purgatory: Use CFLAGS_REMOVE rather than reset
KBUILD_CFLAGS
KBUILD_CFLAGS is very carefully built up in the top level Makefile,
particularly when cross compiling or using different build tools.
Resetting KBUILD_CFLAGS via := assignment is an antipattern.
The comment above the reset mentions that -pg is problematic. Other
Makefiles use `CFLAGS_REMOVE_file.o = $(CC_FLAGS_FTRACE)` when
CONFIG_FUNCTION_TRACER is set. Prefer that pattern to wiping out all of
the important KBUILD_CFLAGS then manually having to re-add them. Seems
also that __stack_chk_fail references are generated when using
CONFIG_STACKPROTECTOR or CONFIG_STACKPROTECTOR_STRONG.
Fixes: 8fc5b4d4121c ("purgatory: core purgatory functionality")
Reported-by: Vaibhav Rustagi <vaibhavrustagi(a)google.com>
Suggested-by: Peter Zijlstra <peterz(a)infradead.org>
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Vaibhav Rustagi <vaibhavrustagi(a)google.com>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20190807221539.94583-2-ndesaulniers@google.com
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 91ef244026d2..8901a1f89cf5 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -20,11 +20,34 @@ KCOV_INSTRUMENT := n
# Default KBUILD_CFLAGS can have -pg option set when FTRACE is enabled. That
# in turn leaves some undefined symbols like __fentry__ in purgatory and not
-# sure how to relocate those. Like kexec-tools, use custom flags.
-
-KBUILD_CFLAGS := -fno-strict-aliasing -Wall -Wstrict-prototypes -fno-zero-initialized-in-bss -fno-builtin -ffreestanding -c -Os -mcmodel=large
-KBUILD_CFLAGS += -m$(BITS)
-KBUILD_CFLAGS += $(call cc-option,-fno-PIE)
+# sure how to relocate those.
+ifdef CONFIG_FUNCTION_TRACER
+CFLAGS_REMOVE_sha256.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_purgatory.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_string.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_kexec-purgatory.o += $(CC_FLAGS_FTRACE)
+endif
+
+ifdef CONFIG_STACKPROTECTOR
+CFLAGS_REMOVE_sha256.o += -fstack-protector
+CFLAGS_REMOVE_purgatory.o += -fstack-protector
+CFLAGS_REMOVE_string.o += -fstack-protector
+CFLAGS_REMOVE_kexec-purgatory.o += -fstack-protector
+endif
+
+ifdef CONFIG_STACKPROTECTOR_STRONG
+CFLAGS_REMOVE_sha256.o += -fstack-protector-strong
+CFLAGS_REMOVE_purgatory.o += -fstack-protector-strong
+CFLAGS_REMOVE_string.o += -fstack-protector-strong
+CFLAGS_REMOVE_kexec-purgatory.o += -fstack-protector-strong
+endif
+
+ifdef CONFIG_RETPOLINE
+CFLAGS_REMOVE_sha256.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_purgatory.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_string.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_kexec-purgatory.o += $(RETPOLINE_CFLAGS)
+endif
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d4b890aec4bea7334ca2ca56fd3b12fb48a00cd1 Mon Sep 17 00:00:00 2001
From: Nikita Yushchenko <nikita.yoush(a)cogentembedded.com>
Date: Wed, 26 Jun 2019 16:08:48 +0300
Subject: [PATCH] can: rcar_canfd: fix possible IRQ storm on high load
We have observed rcar_canfd driver entering IRQ storm under high load,
with following scenario:
- rcar_canfd_global_interrupt() in entered due to Rx available,
- napi_schedule_prep() is called, and sets NAPIF_STATE_SCHED in state
- Rx fifo interrupts are masked,
- rcar_canfd_global_interrupt() is entered again, this time due to
error interrupt (e.g. due to overflow),
- since scheduled napi poller has not yet executed, condition for calling
napi_schedule_prep() from rcar_canfd_global_interrupt() remains true,
thus napi_schedule_prep() gets called and sets NAPIF_STATE_MISSED flag
in state,
- later, napi poller function rcar_canfd_rx_poll() gets executed, and
calls napi_complete_done(),
- due to NAPIF_STATE_MISSED flag in state, this call does not clear
NAPIF_STATE_SCHED flag from state,
- on return from napi_complete_done(), rcar_canfd_rx_poll() unmasks Rx
interrutps,
- Rx interrupt happens, rcar_canfd_global_interrupt() gets called
and calls napi_schedule_prep(),
- since NAPIF_STATE_SCHED is set in state at this time, this call
returns false,
- due to that false return, rcar_canfd_global_interrupt() returns
without masking Rx interrupt
- and this results into IRQ storm: unmasked Rx interrupt happens again
and again is misprocessed in the same way.
This patch fixes that scenario by unmasking Rx interrupts only when
napi_complete_done() returns true, which means it has cleared
NAPIF_STATE_SCHED in state.
Fixes: dd3bd23eb438 ("can: rcar_canfd: Add Renesas R-Car CAN FD driver")
Signed-off-by: Nikita Yushchenko <nikita.yoush(a)cogentembedded.com>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
index 05410008aa6b..de34a4b82d4a 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1508,10 +1508,11 @@ static int rcar_canfd_rx_poll(struct napi_struct *napi, int quota)
/* All packets processed */
if (num_pkts < quota) {
- napi_complete_done(napi, num_pkts);
- /* Enable Rx FIFO interrupts */
- rcar_canfd_set_bit(priv->base, RCANFD_RFCC(ridx),
- RCANFD_RFCC_RFIE);
+ if (napi_complete_done(napi, num_pkts)) {
+ /* Enable Rx FIFO interrupts */
+ rcar_canfd_set_bit(priv->base, RCANFD_RFCC(ridx),
+ RCANFD_RFCC_RFIE);
+ }
}
return num_pkts;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b059f801a937d164e03b33c1848bb3dca67c0b04 Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <ndesaulniers(a)google.com>
Date: Wed, 7 Aug 2019 15:15:33 -0700
Subject: [PATCH] x86/purgatory: Use CFLAGS_REMOVE rather than reset
KBUILD_CFLAGS
KBUILD_CFLAGS is very carefully built up in the top level Makefile,
particularly when cross compiling or using different build tools.
Resetting KBUILD_CFLAGS via := assignment is an antipattern.
The comment above the reset mentions that -pg is problematic. Other
Makefiles use `CFLAGS_REMOVE_file.o = $(CC_FLAGS_FTRACE)` when
CONFIG_FUNCTION_TRACER is set. Prefer that pattern to wiping out all of
the important KBUILD_CFLAGS then manually having to re-add them. Seems
also that __stack_chk_fail references are generated when using
CONFIG_STACKPROTECTOR or CONFIG_STACKPROTECTOR_STRONG.
Fixes: 8fc5b4d4121c ("purgatory: core purgatory functionality")
Reported-by: Vaibhav Rustagi <vaibhavrustagi(a)google.com>
Suggested-by: Peter Zijlstra <peterz(a)infradead.org>
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Vaibhav Rustagi <vaibhavrustagi(a)google.com>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20190807221539.94583-2-ndesaulniers@google.com
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 91ef244026d2..8901a1f89cf5 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -20,11 +20,34 @@ KCOV_INSTRUMENT := n
# Default KBUILD_CFLAGS can have -pg option set when FTRACE is enabled. That
# in turn leaves some undefined symbols like __fentry__ in purgatory and not
-# sure how to relocate those. Like kexec-tools, use custom flags.
-
-KBUILD_CFLAGS := -fno-strict-aliasing -Wall -Wstrict-prototypes -fno-zero-initialized-in-bss -fno-builtin -ffreestanding -c -Os -mcmodel=large
-KBUILD_CFLAGS += -m$(BITS)
-KBUILD_CFLAGS += $(call cc-option,-fno-PIE)
+# sure how to relocate those.
+ifdef CONFIG_FUNCTION_TRACER
+CFLAGS_REMOVE_sha256.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_purgatory.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_string.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_kexec-purgatory.o += $(CC_FLAGS_FTRACE)
+endif
+
+ifdef CONFIG_STACKPROTECTOR
+CFLAGS_REMOVE_sha256.o += -fstack-protector
+CFLAGS_REMOVE_purgatory.o += -fstack-protector
+CFLAGS_REMOVE_string.o += -fstack-protector
+CFLAGS_REMOVE_kexec-purgatory.o += -fstack-protector
+endif
+
+ifdef CONFIG_STACKPROTECTOR_STRONG
+CFLAGS_REMOVE_sha256.o += -fstack-protector-strong
+CFLAGS_REMOVE_purgatory.o += -fstack-protector-strong
+CFLAGS_REMOVE_string.o += -fstack-protector-strong
+CFLAGS_REMOVE_kexec-purgatory.o += -fstack-protector-strong
+endif
+
+ifdef CONFIG_RETPOLINE
+CFLAGS_REMOVE_sha256.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_purgatory.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_string.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_kexec-purgatory.o += $(RETPOLINE_CFLAGS)
+endif
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b059f801a937d164e03b33c1848bb3dca67c0b04 Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <ndesaulniers(a)google.com>
Date: Wed, 7 Aug 2019 15:15:33 -0700
Subject: [PATCH] x86/purgatory: Use CFLAGS_REMOVE rather than reset
KBUILD_CFLAGS
KBUILD_CFLAGS is very carefully built up in the top level Makefile,
particularly when cross compiling or using different build tools.
Resetting KBUILD_CFLAGS via := assignment is an antipattern.
The comment above the reset mentions that -pg is problematic. Other
Makefiles use `CFLAGS_REMOVE_file.o = $(CC_FLAGS_FTRACE)` when
CONFIG_FUNCTION_TRACER is set. Prefer that pattern to wiping out all of
the important KBUILD_CFLAGS then manually having to re-add them. Seems
also that __stack_chk_fail references are generated when using
CONFIG_STACKPROTECTOR or CONFIG_STACKPROTECTOR_STRONG.
Fixes: 8fc5b4d4121c ("purgatory: core purgatory functionality")
Reported-by: Vaibhav Rustagi <vaibhavrustagi(a)google.com>
Suggested-by: Peter Zijlstra <peterz(a)infradead.org>
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Vaibhav Rustagi <vaibhavrustagi(a)google.com>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20190807221539.94583-2-ndesaulniers@google.com
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 91ef244026d2..8901a1f89cf5 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -20,11 +20,34 @@ KCOV_INSTRUMENT := n
# Default KBUILD_CFLAGS can have -pg option set when FTRACE is enabled. That
# in turn leaves some undefined symbols like __fentry__ in purgatory and not
-# sure how to relocate those. Like kexec-tools, use custom flags.
-
-KBUILD_CFLAGS := -fno-strict-aliasing -Wall -Wstrict-prototypes -fno-zero-initialized-in-bss -fno-builtin -ffreestanding -c -Os -mcmodel=large
-KBUILD_CFLAGS += -m$(BITS)
-KBUILD_CFLAGS += $(call cc-option,-fno-PIE)
+# sure how to relocate those.
+ifdef CONFIG_FUNCTION_TRACER
+CFLAGS_REMOVE_sha256.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_purgatory.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_string.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_kexec-purgatory.o += $(CC_FLAGS_FTRACE)
+endif
+
+ifdef CONFIG_STACKPROTECTOR
+CFLAGS_REMOVE_sha256.o += -fstack-protector
+CFLAGS_REMOVE_purgatory.o += -fstack-protector
+CFLAGS_REMOVE_string.o += -fstack-protector
+CFLAGS_REMOVE_kexec-purgatory.o += -fstack-protector
+endif
+
+ifdef CONFIG_STACKPROTECTOR_STRONG
+CFLAGS_REMOVE_sha256.o += -fstack-protector-strong
+CFLAGS_REMOVE_purgatory.o += -fstack-protector-strong
+CFLAGS_REMOVE_string.o += -fstack-protector-strong
+CFLAGS_REMOVE_kexec-purgatory.o += -fstack-protector-strong
+endif
+
+ifdef CONFIG_RETPOLINE
+CFLAGS_REMOVE_sha256.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_purgatory.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_string.o += $(RETPOLINE_CFLAGS)
+CFLAGS_REMOVE_kexec-purgatory.o += $(RETPOLINE_CFLAGS)
+endif
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9f00baf74e4b6f79a3a3dfab44fb7bb2e797b551 Mon Sep 17 00:00:00 2001
From: Gary R Hook <gary.hook(a)amd.com>
Date: Tue, 30 Jul 2019 16:05:24 +0000
Subject: [PATCH] crypto: ccp - Add support for valid authsize values less than
16
AES GCM encryption allows for authsize values of 4, 8, and 12-16 bytes.
Validate the requested authsize, and retain it to save in the request
context.
Fixes: 36cf515b9bbe2 ("crypto: ccp - Enable support for AES GCM on v5 CCPs")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook(a)amd.com>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
diff --git a/drivers/crypto/ccp/ccp-crypto-aes-galois.c b/drivers/crypto/ccp/ccp-crypto-aes-galois.c
index d22631cb2bb3..02eba84028b3 100644
--- a/drivers/crypto/ccp/ccp-crypto-aes-galois.c
+++ b/drivers/crypto/ccp/ccp-crypto-aes-galois.c
@@ -58,6 +58,19 @@ static int ccp_aes_gcm_setkey(struct crypto_aead *tfm, const u8 *key,
static int ccp_aes_gcm_setauthsize(struct crypto_aead *tfm,
unsigned int authsize)
{
+ switch (authsize) {
+ case 16:
+ case 15:
+ case 14:
+ case 13:
+ case 12:
+ case 8:
+ case 4:
+ break;
+ default:
+ return -EINVAL;
+ }
+
return 0;
}
@@ -104,6 +117,7 @@ static int ccp_aes_gcm_crypt(struct aead_request *req, bool encrypt)
memset(&rctx->cmd, 0, sizeof(rctx->cmd));
INIT_LIST_HEAD(&rctx->cmd.entry);
rctx->cmd.engine = CCP_ENGINE_AES;
+ rctx->cmd.u.aes.authsize = crypto_aead_authsize(tfm);
rctx->cmd.u.aes.type = ctx->u.aes.type;
rctx->cmd.u.aes.mode = ctx->u.aes.mode;
rctx->cmd.u.aes.action = encrypt;
diff --git a/drivers/crypto/ccp/ccp-ops.c b/drivers/crypto/ccp/ccp-ops.c
index 59f9849c3662..ef723e2722a8 100644
--- a/drivers/crypto/ccp/ccp-ops.c
+++ b/drivers/crypto/ccp/ccp-ops.c
@@ -622,6 +622,7 @@ static int ccp_run_aes_gcm_cmd(struct ccp_cmd_queue *cmd_q,
unsigned long long *final;
unsigned int dm_offset;
+ unsigned int authsize;
unsigned int jobid;
unsigned int ilen;
bool in_place = true; /* Default value */
@@ -643,6 +644,21 @@ static int ccp_run_aes_gcm_cmd(struct ccp_cmd_queue *cmd_q,
if (!aes->key) /* Gotta have a key SGL */
return -EINVAL;
+ /* Zero defaults to 16 bytes, the maximum size */
+ authsize = aes->authsize ? aes->authsize : AES_BLOCK_SIZE;
+ switch (authsize) {
+ case 16:
+ case 15:
+ case 14:
+ case 13:
+ case 12:
+ case 8:
+ case 4:
+ break;
+ default:
+ return -EINVAL;
+ }
+
/* First, decompose the source buffer into AAD & PT,
* and the destination buffer into AAD, CT & tag, or
* the input into CT & tag.
@@ -657,7 +673,7 @@ static int ccp_run_aes_gcm_cmd(struct ccp_cmd_queue *cmd_q,
p_tag = scatterwalk_ffwd(sg_tag, p_outp, ilen);
} else {
/* Input length for decryption includes tag */
- ilen = aes->src_len - AES_BLOCK_SIZE;
+ ilen = aes->src_len - authsize;
p_tag = scatterwalk_ffwd(sg_tag, p_inp, ilen);
}
@@ -839,19 +855,19 @@ static int ccp_run_aes_gcm_cmd(struct ccp_cmd_queue *cmd_q,
if (aes->action == CCP_AES_ACTION_ENCRYPT) {
/* Put the ciphered tag after the ciphertext. */
- ccp_get_dm_area(&final_wa, 0, p_tag, 0, AES_BLOCK_SIZE);
+ ccp_get_dm_area(&final_wa, 0, p_tag, 0, authsize);
} else {
/* Does this ciphered tag match the input? */
- ret = ccp_init_dm_workarea(&tag, cmd_q, AES_BLOCK_SIZE,
+ ret = ccp_init_dm_workarea(&tag, cmd_q, authsize,
DMA_BIDIRECTIONAL);
if (ret)
goto e_tag;
- ret = ccp_set_dm_area(&tag, 0, p_tag, 0, AES_BLOCK_SIZE);
+ ret = ccp_set_dm_area(&tag, 0, p_tag, 0, authsize);
if (ret)
goto e_tag;
ret = crypto_memneq(tag.address, final_wa.address,
- AES_BLOCK_SIZE) ? -EBADMSG : 0;
+ authsize) ? -EBADMSG : 0;
ccp_dm_free(&tag);
}
diff --git a/include/linux/ccp.h b/include/linux/ccp.h
index 7e9c991c95e0..43ed9e77cf81 100644
--- a/include/linux/ccp.h
+++ b/include/linux/ccp.h
@@ -173,6 +173,8 @@ struct ccp_aes_engine {
enum ccp_aes_mode mode;
enum ccp_aes_action action;
+ u32 authsize;
+
struct scatterlist *key;
u32 key_len; /* In bytes */
Hi,
[This is an automated email]
This commit has been processed because it contains a "Fixes:" tag,
fixing commit: d600ad1f2bdb NFS41: pop some layoutget errors to application.
The bot has tested the following trees: v5.2.8, v4.19.66, v4.14.138, v4.9.189, v4.4.189.
v5.2.8: Build OK!
v4.19.66: Failed to apply! Possible dependencies:
078b5fd92c49 ("NFS: Clean up list moves of struct nfs_page")
v4.14.138: Failed to apply! Possible dependencies:
078b5fd92c49 ("NFS: Clean up list moves of struct nfs_page")
v4.9.189: Failed to apply! Possible dependencies:
078b5fd92c49 ("NFS: Clean up list moves of struct nfs_page")
v4.4.189: Failed to apply! Possible dependencies:
078b5fd92c49 ("NFS: Clean up list moves of struct nfs_page")
6272dcc6beeb ("NFS: Simplify nfs_request_add_commit_list() arguments")
b20135d0b243 ("NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid")
c18b96a1b862 ("nfs: clean up rest of reqs when failing to add one")
f57dcf4c7211 ("NFS: Fix I/O request leakages")
fe238e601d25 ("NFS: Save struct inode COPYING CREDITS Documentation Kbuild Kconfig MAINTAINERS Makefile README REPORTING-BUGS arch block certs crypto drivers firmware fs include init ipc kernel lib mm net samples scripts security sound tools usr virt inside nfs_commit_info to clarify usage of i_lock")
NOTE: The patch will not be queued to stable trees until it is upstream.
How should we proceed with this patch?
--
Thanks,
Sasha
On 2019-08-13 05:48, Sasha Levin wrote:
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: ee7e5516be4f generic: per-device coherent dma allocator.
>
> The bot has tested the following trees: v5.2.8, v4.19.66, v4.14.138,
> v4.9.189, v4.4.189.
>
> v5.2.8: Build OK!
> v4.19.66: Failed to apply! Possible dependencies:
> Unable to calculate
>
> v4.14.138: Failed to apply! Possible dependencies:
> Unable to calculate
>
> v4.9.189: Failed to apply! Possible dependencies:
> 43fc509c3efb ("dma-coherent: introduce interface for default DMA
> pool")
> 92f66f84d969 ("arm64: Fix the DMA mmap and get_sgtable API with
> DMA_ATTR_FORCE_CONTIGUOUS")
> 93228b44c33a ("drivers: dma-coherent: Introduce default DMA pool")
> c41f9ea998f3 ("drivers: dma-coherent: Account dma_pfn_offset when
> used with device tree")
>
> v4.4.189: Failed to apply! Possible dependencies:
> 052c96dbe33b ("arc: convert to dma_map_ops")
> 20d666e41166 ("dma-mapping: remove <asm-generic/dma-coherent.h>")
> 340f3039acd6 ("m68k: convert to dma_map_ops")
> 4605f04b2893 ("c6x: convert to dma_map_ops")
> 5348c1e9e0dc ("metag: convert to dma_map_ops")
> 5a1a67f1d7fe ("nios2: convert to dma_map_ops")
> 6f62097583e7 ("blackfin: convert to dma_map_ops")
> 79387179e2e4 ("parisc: convert to dma_map_ops")
> a34a517ac96c ("avr32: convert to dma_map_ops")
> e1c7e324539a ("dma-mapping: always provide the dma_map_ops based
> implementation")
> e20dd88995df ("cris: convert to dma_map_ops")
> eae075196305 ("frv: convert to dma_map_ops")
> f151341ca00e ("mn10300: convert to dma_map_ops")
>
>
> NOTE: The patch will not be queued to stable trees until it is
> upstream.
>
> How should we proceed with this patch?
>
> --
> Thanks,
> Sasha
If everyone is okay with this patch, then we can just apply it on the
mainline kernel.
The change can be backported to the older kernels at a later point in
time.
Thanks,
Isaac
From: Joerg Roedel <jroedel(a)suse.de>
Backport commits from upstream to fix a data corruption
issue that gets exposed when using PTI on x86-32.
Please consider them for inclusion into stable-4.19.
Joerg Roedel (3):
x86/mm: Check for pfn instead of page in vmalloc_sync_one()
x86/mm: Sync also unmappings in vmalloc_sync_all()
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
arch/x86/mm/fault.c | 15 ++++++---------
mm/vmalloc.c | 9 +++++++++
2 files changed, 15 insertions(+), 9 deletions(-)
--
2.16.4
Hi Sasha,
On Tue, Aug 13, 2019 at 5:48 AM Sasha Levin <sashal(a)kernel.org> wrote:
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v5.2.8, v4.19.66, v4.14.138, v4.9.189, v4.4.189.
>
> v5.2.8: Build OK!
> v4.19.66: Build OK!
> v4.14.138: Build OK!
> v4.4.189: Failed to apply! Possible dependencies:
> 4f2056873ff0 ("xtensa: extract common CPU reset code into separate function")
> bf15f86b343e ("xtensa: initialize MMU before jumping to reset vector")
> ea951c34ea95 ("xtensa: fix icountlevel setting in cpu_reset")
>
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
It should be applied to stable trees for linux versions 4.10 and newer.
--
Thanks.
-- Max
The G700 suffers from the same issue than the G502:
when plugging it in, the driver tries to contact it but it fails.
This timeout is problematic as it introduce a delay in the boot,
and having only the mouse event node means that the hardware
macros keys can not be relayed to the userspace.
Link: https://github.com/libratbag/libratbag/issues/797
Fixes: 91cf9a98ae41 ("HID: logitech-hidpp: make .probe usbhid capable")
Cc: stable(a)vger.kernel.org # v5.2
Signed-off-by: Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
---
drivers/hid/hid-logitech-hidpp.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c
index 343052b117a9..0179f7ed77e5 100644
--- a/drivers/hid/hid-logitech-hidpp.c
+++ b/drivers/hid/hid-logitech-hidpp.c
@@ -3751,8 +3751,6 @@ static const struct hid_device_id hidpp_devices[] = {
{ /* Logitech G403 Wireless Gaming Mouse over USB */
HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC082) },
- { /* Logitech G700 Gaming Mouse over USB */
- HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC06B) },
{ /* Logitech G703 Gaming Mouse over USB */
HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC087) },
{ /* Logitech G703 Hero Gaming Mouse over USB */
--
2.19.2
Dave Jones <davej(a)codemonkey.org.uk> 于2019年8月13日周二 上午1:05写道:
>
> On Wed, Aug 07, 2019 at 12:30:07AM +0000, Linux Kernel wrote:
> > Commit: 4b663366246be1d1d4b1b8b01245b2e88ad9e706
> > Parent: 16b2084a8afa1432d14ba72b7c97d7908e178178
> > Web: https://git.kernel.org/torvalds/c/4b663366246be1d1d4b1b8b01245b2e88ad9e706
> > Author: Alexis Bauvin <abauvin(a)scaleway.com>
> > AuthorDate: Tue Jul 23 16:23:01 2019 +0200
> >
> > tun: mark small packets as owned by the tap sock
> >
> > - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
> >
> > Small packets going out of a tap device go through an optimized code
> > path that uses build_skb() rather than sock_alloc_send_pskb(). The
> > latter calls skb_set_owner_w(), but the small packet code path does not.
> >
> > The net effect is that small packets are not owned by the userland
> > application's socket (e.g. QEMU), while large packets are.
> > This can be seen with a TCP session, where packets are not owned when
> > the window size is small enough (around PAGE_SIZE), while they are once
> > the window grows (note that this requires the host to support virtio
> > tso for the guest to offload segmentation).
> > All this leads to inconsistent behaviour in the kernel, especially on
> > netfilter modules that uses sk->socket (e.g. xt_owner).
> >
> > Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
> > Signed-off-by: Alexis Bauvin <abauvin(a)scaleway.com>
> > Acked-by: Jason Wang <jasowang(a)redhat.com>
>
> This commit breaks ipv6 routing when I deployed on it a linode.
> It seems to work briefly after boot, and then silently all packets get
> dropped. (Presumably, it's dropping RA or ND packets)
>
> With this reverted, everything works as it did in rc3.
>
> Dave
>
Thanks for reporting, Dave.
+cc stable
Just noticed, the patch has been backported to 4.14,4.19, 5.2
Regards,
Jack Wang
commit e9e08a07385e08f1a7f85c5d1e345c21c9564963 upstream.
With CONFIG_LKDTM=y and make OBJCOPY=llvm-objcopy, llvm-objcopy errors:
llvm-objcopy: error: --set-section-flags=.text conflicts with
--rename-section=.text=.rodata
Rather than support setting flags then renaming sections vs renaming
then setting flags, it's simpler to just change both at the same time
via --rename-section. Adding the load flag is required for GNU objcopy
to mark .rodata Type as PROGBITS after the rename.
This can be verified with:
$ readelf -S drivers/misc/lkdtm/rodata_objcopy.o
...
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[ 1] .rodata PROGBITS 0000000000000000 00000040
0000000000000004 0000000000000000 A 0 0 4
...
Which shows that .text is now renamed .rodata, the alloc flag A is set,
the type is PROGBITS, and the section is not flagged as writeable W.
Cc: stable(a)vger.kernel.org
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=24554
Link: https://github.com/ClangBuiltLinux/linux/issues/448
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Suggested-by: Alan Modra <amodra(a)gmail.com>
Suggested-by: Jordan Rupprect <rupprecht(a)google.com>
Suggested-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
This backport is for 4.14. It should allow us to use llvm-objcopy for
RELR relocations on arm64 in Android.
drivers/misc/Makefile | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index ad0e64fdba34..76f6a4f628b3 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -69,8 +69,7 @@ KCOV_INSTRUMENT_lkdtm_rodata.o := n
OBJCOPYFLAGS :=
OBJCOPYFLAGS_lkdtm_rodata_objcopy.o := \
- --set-section-flags .text=alloc,readonly \
- --rename-section .text=.rodata
+ --rename-section .text=.rodata,alloc,readonly,load
targets += lkdtm_rodata.o lkdtm_rodata_objcopy.o
$(obj)/lkdtm_rodata_objcopy.o: $(obj)/lkdtm_rodata.o FORCE
$(call if_changed,objcopy)
--
2.23.0.rc1.153.gdeed80330f-goog
I -thought- I had fixed this entirely, but it looks like that I didn't
test this thoroughly enough as we apparently still make one big mistake
with nv50_msto_atomic_check() - we don't handle the following scenario:
* CRTC #1 has n VCPI allocated to it, is attached to connector DP-4
which is attached to encoder #1. enabled=y active=n
* CRTC #1 is changed from DP-4 to DP-5, causing:
* DP-4 crtc=#1→NULL (VCPI n→0)
* DP-5 crtc=NULL→#1
* CRTC #1 steals encoder #1 back from DP-4 and gives it to DP-5
* CRTC #1 maintains the same mode as before, just with a different
connector
* mode_changed=n connectors_changed=y
(we _SHOULD_ do VCPI 0→n here, but don't)
Once the above scenario is repeated once, we'll attempt freeing VCPI
from the connector that we didn't allocate due to the connectors
changing, but the mode staying the same. Sigh.
Since nv50_msto_atomic_check() has broken a few times now, let's rethink
things a bit to be more careful: limit both VCPI/PBN allocations to
mode_changed || connectors_changed, since neither VCPI or PBN should
ever need to change outside of routing and mode changes.
Changes since v1:
* Fix accidental reversal of clock and bpp arguments in
drm_dp_calc_pbn_mode() - William Lewis
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Reported-by: Bohdan Milar <bmilar(a)redhat.com>
Tested-by: Bohdan Milar <bmilar(a)redhat.com>
Fixes: 232c9eec417a ("drm/nouveau: Use atomic VCPI helpers for MST")
References: 412e85b60531 ("drm/nouveau: Only release VCPI slots on mode changes")
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: David Airlie <airlied(a)redhat.com>
Cc: Jerry Zuo <Jerry.Zuo(a)amd.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: Juston Li <juston.li(a)intel.com>
Cc: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Cc: Karol Herbst <karolherbst(a)gmail.com>
Cc: Ilia Mirkin <imirkin(a)alum.mit.edu>
Cc: <stable(a)vger.kernel.org> # v5.1+
---
drivers/gpu/drm/nouveau/dispnv50/disp.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index 126703816794..5c36c75232e6 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -771,16 +771,20 @@ nv50_msto_atomic_check(struct drm_encoder *encoder,
struct nv50_head_atom *asyh = nv50_head_atom(crtc_state);
int slots;
- /* When restoring duplicated states, we need to make sure that the
- * bw remains the same and avoid recalculating it, as the connector's
- * bpc may have changed after the state was duplicated
- */
- if (!state->duplicated)
- asyh->dp.pbn =
- drm_dp_calc_pbn_mode(crtc_state->adjusted_mode.clock,
- connector->display_info.bpc * 3);
+ if (crtc_state->mode_changed || crtc_state->connectors_changed) {
+ /*
+ * When restoring duplicated states, we need to make sure that
+ * the bw remains the same and avoid recalculating it, as the
+ * connector's bpc may have changed after the state was
+ * duplicated
+ */
+ if (!state->duplicated) {
+ const int bpp = connector->display_info.bpc * 3;
+ const int clock = crtc_state->adjusted_mode.clock;
+
+ asyh->dp.pbn = drm_dp_calc_pbn_mode(clock, bpp);
+ }
- if (crtc_state->mode_changed) {
slots = drm_dp_atomic_find_vcpi_slots(state, &mstm->mgr,
mstc->port,
asyh->dp.pbn);
--
2.21.0
Sasha Levin <sashal(a)kernel.org> writes:
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: ba41e1e1ccb9 powerpc/mce: Hookup derror (load/store) UE errors.
>
> The bot has tested the following trees: v5.2.8, v4.19.66.
>
> v5.2.8: Build OK!
> v4.19.66: Failed to apply! Possible dependencies:
> 360cae313702 ("KVM: PPC: Book3S HV: Nested guest entry via hypercall")
> 41f4e631daf8 ("KVM: PPC: Book3S HV: Extract PMU save/restore operations as C-callable functions")
> 884dfb722db8 ("KVM: PPC: Book3S HV: Simplify machine check handling")
> 89329c0be8bd ("KVM: PPC: Book3S HV: Clear partition table entry on vm teardown")
> 8e3f5fc1045d ("KVM: PPC: Book3S HV: Framework and hcall stubs for nested virtualization")
> 95a6432ce903 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests")
> a43c1590426c ("powerpc/pseries: Flush SLB contents on SLB MCE errors.")
> c05772018491 ("powerpc/64s: Better printing of machine check info for guest MCEs")
> d24ea8a7336a ("KVM: PPC: Book3S: Simplify external interrupt handling")
> df709a296ef7 ("KVM: PPC: Book3S HV: Simplify real-mode interrupt handling")
> f7035ce9f1df ("KVM: PPC: Book3S HV: Move interrupt delivery on guest entry to C code")
>
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
I will send a backport once this has been merged upstream.
Thanks,
Santosh
>
> --
> Thanks,
> Sasha
This is the start of the stable review cycle for the 4.19.65 release.
There are 74 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed 07 Aug 2019 12:47:58 PM UTC.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.65-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.65-rc1
Suganath Prabu <suganath-prabu.subramani(a)broadcom.com>
scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
Andy Lutomirski <luto(a)kernel.org>
x86/vdso: Prevent segfaults due to hoisted vclock reads
Linus Torvalds <torvalds(a)linux-foundation.org>
gcc-9: properly declare the {pv,hv}clock_page storage
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Support GCC 9 cold subfunction naming scheme
Eugeniy Paltsev <Eugeniy.Paltsev(a)synopsys.com>
ARC: enable uboot support unconditionally
Jean Delvare <jdelvare(a)suse.de>
eeprom: at24: make spd world-readable again
Xiaolin Zhang <xiaolin.zhang(a)intel.com>
drm/i915/gvt: fix incorrect cache entry for guest page mapping
John Fleck <john.fleck(a)intel.com>
IB/hfi1: Check for error on call to alloc_rsm_map_table
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Fix clean_mr() to work in the expected order
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Use direct mkey destroy command upon UMR unreg failure
Yishai Hadas <yishaih(a)mellanox.com>
IB/mlx5: Fix unreg_umr to ignore the mkey state
Juergen Gross <jgross(a)suse.com>
xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
Munehisa Kamata <kamatam(a)amazon.com>
nbd: replace kill_bdev() with __invalidate_device() again
Will Deacon <will(a)kernel.org>
arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
Will Deacon <will(a)kernel.org>
arm64: compat: Allow single-byte watchpoints on all addresses
Will Deacon <will(a)kernel.org>
drivers/perf: arm_pmu: Fix failure path in PM notifier
Helge Deller <deller(a)gmx.de>
parisc: Fix build of compressed kernel even with debug enabled
Chris Down <chris(a)chrisdown.name>
cgroup: kselftest: relax fs_spec checks
Stefan Haberland <sth(a)linux.ibm.com>
s390/dasd: fix endless loop after read unit address configuration
Yang Shi <yang.shi(a)linux.alibaba.com>
mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker
Samuel Thibault <samuel.thibault(a)ens-lyon.org>
ALSA: hda: Fix 1-minute detection delay when i915 module is not available
Ondrej Mosnacek <omosnace(a)redhat.com>
selinux: fix memory leak in policydb_init()
Marco Felsch <m.felsch(a)pengutronix.de>
mtd: rawnand: micron: handle on-die "ECC-off" devices correctly
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
IB/hfi1: Fix Spectre v1 vulnerability
Michael Wu <michael.wu(a)vatics.com>
gpiolib: fix incorrect IRQ requesting of an active-low lineevent
Joe Perches <joe(a)perches.com>
mmc: meson-mx-sdio: Fix misuse of GENMASK macro
Douglas Anderson <dianders(a)chromium.org>
mmc: dw_mmc: Fix occasional hang after tuning on eMMC
Filipe Manana <fdmanana(a)suse.com>
Btrfs: fix race leading to fs corruption after transaction abort
Filipe Manana <fdmanana(a)suse.com>
Btrfs: fix incremental send failure after deduplication
Masahiro Yamada <yamada.masahiro(a)socionext.com>
kbuild: initialize CLANG_FLAGS correctly in the top Makefile
M. Vefa Bicakci <m.v.b(a)runbox.com>
kconfig: Clear "written" flag to avoid data loss
Yongxin Liu <yongxin.liu(a)windriver.com>
drm/nouveau: fix memory leak in nouveau_conn_reset()
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
x86, boot: Remove multiple copy of static function sanitize_boot_params()
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/paravirt: Fix callee-saved function ELF sizes
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/kvm: Don't call kvm_spurious_fault() from .fixup
Zhenzhong Duan <zhenzhong.duan(a)oracle.com>
xen/pv: Fix a boot up hang revealed by int3 self test
Petr Machata <petrm(a)mellanox.com>
mlxsw: spectrum_dcb: Configure DSCP map as the last rule is removed
Kees Cook <keescook(a)chromium.org>
ipc/mqueue.c: only perform resource calculation if user valid
Dan Carpenter <dan.carpenter(a)oracle.com>
drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
Mikko Rapeli <mikko.rapeli(a)iki.fi>
uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers
Sam Protsenko <semen.protsenko(a)linaro.org>
coda: fix build using bare-metal toolchain
Zhouyang Jia <jiazhouyang09(a)gmail.com>
coda: add error handling for fget
Peter Rosin <peda(a)axentia.se>
lib/test_string.c: avoid masking memset16/32/64 failures
Kees Cook <keescook(a)chromium.org>
lib/test_overflow.c: avoid tainting the kernel and fix wrap size
Doug Berger <opendmb(a)gmail.com>
mm/cma.c: fail if fixed declaration can't be honored
Arnd Bergmann <arnd(a)arndb.de>
x86: math-emu: Hide clang warnings for 16-bit overflow
Qian Cai <cai(a)lca.pw>
x86/apic: Silence -Wtype-limits compiler warnings
Benjamin Poirier <bpoirier(a)suse.com>
be2net: Signal that the device cannot transmit during reconfiguration
Arnd Bergmann <arnd(a)arndb.de>
ACPI: fix false-positive -Wuninitialized warning
Arnd Bergmann <arnd(a)arndb.de>
x86: kvm: avoid constant-conversion warning
Ravi Bangoria <ravi.bangoria(a)linux.ibm.com>
perf version: Fix segfault due to missing OPT_END()
Benjamin Block <bblock(a)linux.ibm.com>
scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
Arnd Bergmann <arnd(a)arndb.de>
ACPI: blacklist: fix clang warning for unused DMI table
Jeff Layton <jlayton(a)kernel.org>
ceph: return -ERANGE if virtual xattr value didn't fit in buffer
Andrea Parri <andrea.parri(a)amarulasolutions.com>
ceph: fix improper use of smp_mb__before_atomic()
Ronnie Sahlberg <lsahlber(a)redhat.com>
cifs: Fix a race condition with cifs_echo_request
Qu Wenruo <wqu(a)suse.com>
btrfs: qgroup: Don't hold qgroup_ioctl_lock in btrfs_qgroup_inherit()
David Sterba <dsterba(a)suse.com>
btrfs: fix minimum number of chunk errors for DUP
Chunyan Zhang <zhang.chunyan(a)linaro.org>
clk: sprd: Add check for return value of sprd_clk_regmap_init()
Russell King <rmk+kernel(a)armlinux.org.uk>
fs/adfs: super: fix use-after-free bug
JC Kuo <jckuo(a)nvidia.com>
clk: tegra210: fix PLLU and PLLU_OUT1
Geert Uytterhoeven <geert+renesas(a)glider.be>
dmaengine: rcar-dmac: Reject zero-length slave DMA requests
Petr Cvek <petrcvekcz(a)gmail.com>
MIPS: lantiq: Fix bitfield masking
Jean-Philippe Brucker <jean-philippe.brucker(a)arm.com>
firmware/psci: psci_checker: Park kthreads before stopping them
Prarit Bhargava <prarit(a)redhat.com>
kernel/module.c: Only return -EEXIST for modules that have finished loading
Helen Koike <helen.koike(a)collabora.com>
arm64: dts: rockchip: fix isp iommu clocks and power domain
Dmitry Osipenko <digetx(a)gmail.com>
dmaengine: tegra-apb: Error out if DMA_PREP_INTERRUPT flag is unset
Cheng Jian <cj.chengjian(a)huawei.com>
ftrace: Enable trampoline when rec count returns back to one
Douglas Anderson <dianders(a)chromium.org>
ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend
Douglas Anderson <dianders(a)chromium.org>
ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again
Douglas Anderson <dianders(a)chromium.org>
ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: riscpc: fix DMA
-------------
Diffstat:
Makefile | 7 +-
arch/arc/Kconfig | 13 ----
arch/arc/configs/nps_defconfig | 1 -
arch/arc/configs/vdk_hs38_defconfig | 1 -
arch/arc/configs/vdk_hs38_smp_defconfig | 2 -
arch/arc/kernel/head.S | 2 -
arch/arc/kernel/setup.c | 2 -
arch/arm/boot/dts/rk3288-veyron-mickey.dts | 4 --
arch/arm/boot/dts/rk3288-veyron-minnie.dts | 4 --
arch/arm/boot/dts/rk3288.dtsi | 1 +
arch/arm/mach-rpc/dma.c | 5 +-
arch/arm64/boot/dts/rockchip/rk3399.dtsi | 8 +--
arch/arm64/include/asm/cpufeature.h | 7 +-
arch/arm64/kernel/cpufeature.c | 8 ++-
arch/arm64/kernel/hw_breakpoint.c | 7 +-
arch/mips/lantiq/irq.c | 5 +-
arch/parisc/boot/compressed/vmlinux.lds.S | 4 +-
arch/x86/boot/compressed/misc.c | 1 +
arch/x86/boot/compressed/misc.h | 1 -
arch/x86/entry/entry_64.S | 1 -
arch/x86/entry/vdso/vclock_gettime.c | 19 ++++--
arch/x86/include/asm/apic.h | 2 +-
arch/x86/include/asm/kvm_host.h | 34 +++++-----
arch/x86/include/asm/paravirt.h | 1 +
arch/x86/include/asm/traps.h | 2 +-
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/kvm.c | 1 +
arch/x86/kvm/mmu.c | 6 +-
arch/x86/math-emu/fpu_emu.h | 2 +-
arch/x86/math-emu/reg_constant.c | 2 +-
arch/x86/xen/enlighten_pv.c | 2 +-
arch/x86/xen/xen-asm_64.S | 1 -
drivers/acpi/blacklist.c | 4 ++
drivers/block/nbd.c | 2 +-
drivers/clk/sprd/sc9860-clk.c | 5 +-
drivers/clk/tegra/clk-tegra210.c | 8 +--
drivers/dma/sh/rcar-dmac.c | 2 +-
drivers/dma/tegra20-apb-dma.c | 12 +++-
drivers/firmware/psci_checker.c | 10 +--
drivers/gpio/gpiolib.c | 6 +-
drivers/gpu/drm/i915/gvt/kvmgt.c | 12 ++++
drivers/gpu/drm/nouveau/nouveau_connector.c | 2 +-
drivers/infiniband/hw/hfi1/chip.c | 11 +++-
drivers/infiniband/hw/hfi1/verbs.c | 2 +
drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 +
drivers/infiniband/hw/mlx5/mr.c | 23 ++++---
drivers/infiniband/hw/mlx5/qp.c | 13 ++--
drivers/misc/eeprom/at24.c | 2 +-
drivers/mmc/host/dw_mmc.c | 3 +-
drivers/mmc/host/meson-mx-sdio.c | 2 +-
drivers/mtd/nand/raw/nand_micron.c | 14 +++-
drivers/net/ethernet/emulex/benet/be_main.c | 6 +-
drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c | 16 ++---
drivers/perf/arm_pmu.c | 2 +-
drivers/rapidio/devices/rio_mport_cdev.c | 2 +
drivers/s390/block/dasd_alias.c | 22 +++++--
drivers/s390/scsi/zfcp_erp.c | 7 ++
drivers/scsi/mpt3sas/mpt3sas_base.c | 12 ++--
drivers/xen/swiotlb-xen.c | 4 +-
fs/adfs/super.c | 5 +-
fs/btrfs/qgroup.c | 24 ++++++-
fs/btrfs/send.c | 77 +++++-----------------
fs/btrfs/transaction.c | 10 +++
fs/btrfs/volumes.c | 3 +-
fs/ceph/super.h | 7 +-
fs/ceph/xattr.c | 14 ++--
fs/cifs/connect.c | 8 +--
fs/coda/psdev.c | 5 +-
include/linux/acpi.h | 5 +-
include/linux/coda.h | 3 +-
include/linux/coda_psdev.h | 11 ++++
include/uapi/linux/coda_psdev.h | 13 ----
ipc/mqueue.c | 19 +++---
kernel/module.c | 6 +-
kernel/trace/ftrace.c | 28 ++++----
lib/test_overflow.c | 11 ++--
lib/test_string.c | 6 +-
mm/cma.c | 13 ++++
mm/vmscan.c | 9 ++-
scripts/kconfig/confdata.c | 4 ++
security/selinux/ss/policydb.c | 6 +-
sound/hda/hdac_i915.c | 10 +--
tools/objtool/elf.c | 2 +-
tools/perf/builtin-version.c | 1 +
tools/testing/selftests/cgroup/cgroup_util.c | 3 +-
85 files changed, 385 insertions(+), 281 deletions(-)
ITLB entry modifications must be followed by the isync instruction
before the new entries are possibly used. cpu_reset lacks one isync
between ITLB way 6 initialization and jump to the identity mapping.
Add missing isync to xtensa cpu_reset.
Cc: stable(a)vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc(a)gmail.com>
---
arch/xtensa/kernel/setup.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/xtensa/kernel/setup.c b/arch/xtensa/kernel/setup.c
index 5cb8a62e091c..7c3106093c75 100644
--- a/arch/xtensa/kernel/setup.c
+++ b/arch/xtensa/kernel/setup.c
@@ -511,6 +511,7 @@ void cpu_reset(void)
"add %2, %2, %7\n\t"
"addi %0, %0, -1\n\t"
"bnez %0, 1b\n\t"
+ "isync\n\t"
/* Jump to identity mapping */
"jx %3\n"
"2:\n\t"
--
2.11.0
This is a note to let you know that I've just added the patch titled
iio: adc: max9611: Fix temperature reading in probe
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From b9ddd5091160793ee9fac10da765cf3f53d2aaf0 Mon Sep 17 00:00:00 2001
From: Jacopo Mondi <jacopo+renesas(a)jmondi.org>
Date: Mon, 5 Aug 2019 17:55:15 +0200
Subject: iio: adc: max9611: Fix temperature reading in probe
The max9611 driver reads the die temperature at probe time to validate
the communication channel. Use the actual read value to perform the test
instead of the read function return value, which was mistakenly used so
far.
The temperature reading test was only successful because the 0 return
value is in the range of supported temperatures.
Fixes: 69780a3bbc0b ("iio: adc: Add Maxim max9611 ADC driver")
Signed-off-by: Jacopo Mondi <jacopo+renesas(a)jmondi.org>
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/max9611.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/adc/max9611.c b/drivers/iio/adc/max9611.c
index 0e3c6529fc4c..da073d72f649 100644
--- a/drivers/iio/adc/max9611.c
+++ b/drivers/iio/adc/max9611.c
@@ -480,7 +480,7 @@ static int max9611_init(struct max9611_dev *max9611)
if (ret)
return ret;
- regval = ret & MAX9611_TEMP_MASK;
+ regval &= MAX9611_TEMP_MASK;
if ((regval > MAX9611_TEMP_MAX_POS &&
regval < MAX9611_TEMP_MIN_NEG) ||
--
2.22.0
This is a note to let you know that I've just added the patch titled
iio: frequency: adf4371: Fix output frequency setting
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 82a5008a341d301da3ab529ca888c64f529bd075 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nuno=20S=C3=A1?= <nuno.sa(a)analog.com>
Date: Mon, 5 Aug 2019 15:37:16 +0200
Subject: iio: frequency: adf4371: Fix output frequency setting
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The fract1 word was not being properly programmed on the device leading
to wrong output frequencies.
Fixes: 7f699bd14913 (iio: frequency: adf4371: Add support for ADF4371 PLL)
Signed-off-by: Nuno Sá <nuno.sa(a)analog.com>
Reviewed-by: Stefan Popa <stefan.popa(a)analog.com>
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/frequency/adf4371.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/frequency/adf4371.c b/drivers/iio/frequency/adf4371.c
index e48f15cc9ab5..ff82863cbf42 100644
--- a/drivers/iio/frequency/adf4371.c
+++ b/drivers/iio/frequency/adf4371.c
@@ -276,11 +276,11 @@ static int adf4371_set_freq(struct adf4371_state *st, unsigned long long freq,
st->buf[0] = st->integer >> 8;
st->buf[1] = 0x40; /* REG12 default */
st->buf[2] = 0x00;
- st->buf[3] = st->fract2 & 0xFF;
- st->buf[4] = st->fract2 >> 7;
- st->buf[5] = st->fract2 >> 15;
+ st->buf[3] = st->fract1 & 0xFF;
+ st->buf[4] = st->fract1 >> 8;
+ st->buf[5] = st->fract1 >> 16;
st->buf[6] = ADF4371_FRAC2WORD_L(st->fract2 & 0x7F) |
- ADF4371_FRAC1WORD(st->fract1 >> 23);
+ ADF4371_FRAC1WORD(st->fract1 >> 24);
st->buf[7] = ADF4371_FRAC2WORD_H(st->fract2 >> 7);
st->buf[8] = st->mod2 & 0xFF;
st->buf[9] = ADF4371_MOD2WORD(st->mod2 >> 8);
--
2.22.0
This is a note to let you know that I've just added the patch titled
USB: core: Fix races in character device registration and
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 303911cfc5b95d33687d9046133ff184cf5043ff Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Mon, 12 Aug 2019 16:11:07 -0400
Subject: USB: core: Fix races in character device registration and
deregistraion
The syzbot fuzzer has found two (!) races in the USB character device
registration and deregistration routines. This patch fixes the races.
The first race results from the fact that usb_deregister_dev() sets
usb_minors[intf->minor] to NULL before calling device_destroy() on the
class device. This leaves a window during which another thread can
allocate the same minor number but will encounter a duplicate name
error when it tries to register its own class device. A typical error
message in the system log would look like:
sysfs: cannot create duplicate filename '/class/usbmisc/ldusb0'
The patch fixes this race by destroying the class device first.
The second race is in usb_register_dev(). When that routine runs, it
first allocates a minor number, then drops minor_rwsem, and then
creates the class device. If the device creation fails, the minor
number is deallocated and the whole routine returns an error. But
during the time while minor_rwsem was dropped, there is a window in
which the minor number is allocated and so another thread can
successfully open the device file. Typically this results in
use-after-free errors or invalid accesses when the other thread closes
its open file reference, because the kernel then tries to release
resources that were already deallocated when usb_register_dev()
failed. The patch fixes this race by keeping minor_rwsem locked
throughout the entire routine.
Reported-and-tested-by: syzbot+30cf45ebfe0b0c4847a1(a)syzkaller.appspotmail.com
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
CC: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1908121607590.1659-100000@iolanth…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/file.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/core/file.c b/drivers/usb/core/file.c
index 65de6f73b672..558890ada0e5 100644
--- a/drivers/usb/core/file.c
+++ b/drivers/usb/core/file.c
@@ -193,9 +193,10 @@ int usb_register_dev(struct usb_interface *intf,
intf->minor = minor;
break;
}
- up_write(&minor_rwsem);
- if (intf->minor < 0)
+ if (intf->minor < 0) {
+ up_write(&minor_rwsem);
return -EXFULL;
+ }
/* create a usb class device for this usb interface */
snprintf(name, sizeof(name), class_driver->name, minor - minor_base);
@@ -203,12 +204,11 @@ int usb_register_dev(struct usb_interface *intf,
MKDEV(USB_MAJOR, minor), class_driver,
"%s", kbasename(name));
if (IS_ERR(intf->usb_dev)) {
- down_write(&minor_rwsem);
usb_minors[minor] = NULL;
intf->minor = -1;
- up_write(&minor_rwsem);
retval = PTR_ERR(intf->usb_dev);
}
+ up_write(&minor_rwsem);
return retval;
}
EXPORT_SYMBOL_GPL(usb_register_dev);
@@ -234,12 +234,12 @@ void usb_deregister_dev(struct usb_interface *intf,
return;
dev_dbg(&intf->dev, "removing %d minor\n", intf->minor);
+ device_destroy(usb_class->class, MKDEV(USB_MAJOR, intf->minor));
down_write(&minor_rwsem);
usb_minors[intf->minor] = NULL;
up_write(&minor_rwsem);
- device_destroy(usb_class->class, MKDEV(USB_MAJOR, intf->minor));
intf->usb_dev = NULL;
intf->minor = -1;
destroy_usb_class();
--
2.22.0
Both the size parameter in the dma_alloc_from_[dev/global]_coherent()
functions and the size field in the dma_coherent_mem structure
are represented by a signed quantity, which makes it so that any
comparisons between these two quantities is a signed comparison.
When a reserved memory region is larger than or equal to 2GB in
size, this will cause the most significant bit to be set to 1,
thus, treating the size as a negative number in signed
comparisons.
This can result in allocation failures when an amount of
memory that is strictly less than 2 GB is requested from
this region. The allocation fails because the signed comparison
to prevent from allocating more memory than what is in the
region in __dma_alloc_from_coherent() evaluates to true since
the size of the region is treated as a negative number, but the
size of the request is treated as a positive number.
Thus, change the type of the size parameter in the allocation
functions to an unsigned type, and change the type of the
size field in the dma_coherent_mem structure to an unsigned type
as well, as it does not make sense for sizes to be represented
by signed quantities.
Fixes: ee7e5516be4f ("generic: per-device coherent dma allocator")
Signed-off-by: Isaac J. Manjarres <isaacm(a)codeaurora.org>
Cc: stable(a)vger.kernel.org
---
include/linux/dma-mapping.h | 6 +++---
kernel/dma/coherent.c | 8 ++++----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f7d1eea..06d446d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -159,14 +159,14 @@ static inline int is_device_dma_capable(struct device *dev)
* These three functions are only for dma allocator.
* Don't use them in device drivers.
*/
-int dma_alloc_from_dev_coherent(struct device *dev, ssize_t size,
+int dma_alloc_from_dev_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, void **ret);
int dma_release_from_dev_coherent(struct device *dev, int order, void *vaddr);
int dma_mmap_from_dev_coherent(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, size_t size, int *ret);
-void *dma_alloc_from_global_coherent(ssize_t size, dma_addr_t *dma_handle);
+void *dma_alloc_from_global_coherent(size_t size, dma_addr_t *dma_handle);
int dma_release_from_global_coherent(int order, void *vaddr);
int dma_mmap_from_global_coherent(struct vm_area_struct *vma, void *cpu_addr,
size_t size, int *ret);
@@ -176,7 +176,7 @@ int dma_mmap_from_global_coherent(struct vm_area_struct *vma, void *cpu_addr,
#define dma_release_from_dev_coherent(dev, order, vaddr) (0)
#define dma_mmap_from_dev_coherent(dev, vma, vaddr, order, ret) (0)
-static inline void *dma_alloc_from_global_coherent(ssize_t size,
+static inline void *dma_alloc_from_global_coherent(size_t size,
dma_addr_t *dma_handle)
{
return NULL;
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 29fd659..c671d5c 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -13,7 +13,7 @@ struct dma_coherent_mem {
void *virt_base;
dma_addr_t device_base;
unsigned long pfn_base;
- int size;
+ unsigned long size;
unsigned long *bitmap;
spinlock_t spinlock;
bool use_dev_dma_pfn_offset;
@@ -136,7 +136,7 @@ void dma_release_declared_memory(struct device *dev)
EXPORT_SYMBOL(dma_release_declared_memory);
static void *__dma_alloc_from_coherent(struct dma_coherent_mem *mem,
- ssize_t size, dma_addr_t *dma_handle)
+ size_t size, dma_addr_t *dma_handle)
{
int order = get_order(size);
unsigned long flags;
@@ -179,7 +179,7 @@ static void *__dma_alloc_from_coherent(struct dma_coherent_mem *mem,
* Returns 0 if dma_alloc_coherent should continue with allocating from
* generic memory areas, or !0 if dma_alloc_coherent should return @ret.
*/
-int dma_alloc_from_dev_coherent(struct device *dev, ssize_t size,
+int dma_alloc_from_dev_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, void **ret)
{
struct dma_coherent_mem *mem = dev_get_coherent_memory(dev);
@@ -191,7 +191,7 @@ int dma_alloc_from_dev_coherent(struct device *dev, ssize_t size,
return 1;
}
-void *dma_alloc_from_global_coherent(ssize_t size, dma_addr_t *dma_handle)
+void *dma_alloc_from_global_coherent(size_t size, dma_addr_t *dma_handle)
{
if (!dma_coherent_default_memory)
return NULL;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Processing of SDIO IRQs must obviously be prevented while the card is
system suspended, otherwise we may end up trying to communicate with an
uninitialized SDIO card.
Reports throughout the years shows that this is not only a theoretical
problem, but a real issue. So, let's finally fix this problem, by keeping
track of the state for the card and bail out before processing the SDIO
IRQ, in case the card is suspended.
Cc: stable(a)vger.kernel.org
Reported-by: Douglas Anderson <dianders(a)chromium.org>
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
---
This has only been compile tested so far, any help for real test on HW is
greatly appreciated.
Note that, this is only the initial part of what is needed to make power
management of SDIO card more robust, but let's start somewhere and continue to
improve things.
The next step I am looking at right now, is to make sure the SDIO IRQ is turned
off during system suspend, unless it's supported as a system wakeup (and enabled
to be used).
---
drivers/mmc/core/sdio.c | 7 +++++++
drivers/mmc/core/sdio_irq.c | 4 ++++
2 files changed, 11 insertions(+)
diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c
index d1aa1c7577bb..9951295d3220 100644
--- a/drivers/mmc/core/sdio.c
+++ b/drivers/mmc/core/sdio.c
@@ -937,6 +937,10 @@ static int mmc_sdio_pre_suspend(struct mmc_host *host)
*/
static int mmc_sdio_suspend(struct mmc_host *host)
{
+ /* Prevent processing of SDIO IRQs in suspended state. */
+ mmc_card_set_suspended(host->card);
+ cancel_delayed_work_sync(&host->sdio_irq_work);
+
mmc_claim_host(host);
if (mmc_card_keep_power(host) && mmc_card_wake_sdio_irq(host))
@@ -985,6 +989,9 @@ static int mmc_sdio_resume(struct mmc_host *host)
err = sdio_enable_4bit_bus(host->card);
}
+ /* Allow SDIO IRQs to be processed again. */
+ mmc_card_clr_suspended(host->card);
+
if (!err && host->sdio_irqs) {
if (!(host->caps2 & MMC_CAP2_SDIO_IRQ_NOTHREAD))
wake_up_process(host->sdio_irq_thread);
diff --git a/drivers/mmc/core/sdio_irq.c b/drivers/mmc/core/sdio_irq.c
index 931e6226c0b3..9f54a259a1b3 100644
--- a/drivers/mmc/core/sdio_irq.c
+++ b/drivers/mmc/core/sdio_irq.c
@@ -34,6 +34,10 @@ static int process_sdio_pending_irqs(struct mmc_host *host)
unsigned char pending;
struct sdio_func *func;
+ /* Don't process SDIO IRQs if the card is suspended. */
+ if (mmc_card_suspended(card))
+ return 0;
+
/*
* Optimization, if there is only 1 function interrupt registered
* and we know an IRQ was signaled then call irq handler directly.
--
2.17.1
During release of the syncpt, we remove it from the list of syncpt and
the tree, but only if it is not already been removed. However, during
signaling, we first remove the syncpt from the list. So, if we
concurrently free and signal the syncpt, the free may decide that it is
not part of the tree and immediately free itself -- meanwhile the
signaler goes onto to use the now freed datastructure.
In particular, we get struct by commit 0e2f733addbf ("dma-buf: make
dma_fence structure a bit smaller v2") as the cb_list is immediately
clobbered by the kfree_rcu.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111381
Fixes: d3862e44daa7 ("dma-buf/sw-sync: Fix locking around sync_timeline lists")
References: 0e2f733addbf ("dma-buf: make dma_fence structure a bit smaller v2")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Sean Paul <seanpaul(a)chromium.org>
Cc: Gustavo Padovan <gustavo(a)padovan.org>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org> # v4.14+
---
drivers/dma-buf/sw_sync.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 051f6c2873c7..27b1d549ed38 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -132,17 +132,14 @@ static void timeline_fence_release(struct dma_fence *fence)
{
struct sync_pt *pt = dma_fence_to_sync_pt(fence);
struct sync_timeline *parent = dma_fence_parent(fence);
+ unsigned long flags;
+ spin_lock_irqsave(fence->lock, flags);
if (!list_empty(&pt->link)) {
- unsigned long flags;
-
- spin_lock_irqsave(fence->lock, flags);
- if (!list_empty(&pt->link)) {
- list_del(&pt->link);
- rb_erase(&pt->node, &parent->pt_tree);
- }
- spin_unlock_irqrestore(fence->lock, flags);
+ list_del(&pt->link);
+ rb_erase(&pt->node, &parent->pt_tree);
}
+ spin_unlock_irqrestore(fence->lock, flags);
sync_timeline_put(parent);
dma_fence_free(fence);
--
2.23.0.rc1
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hello there,
The CKI is pleased to announce that we are enabling automated reports again for the stable mailing list! We now upload more logs to help with troubleshooting failures.
Build logs from a failed kernel compile are uploaded to our artifacts server instead of being sent with the report email as an attachment.
The logs from our testing phase (when we boot the kernel on a host and run tests) are also available on the artifacts server. This should make it easier to understand why a particular test failed. Take a look at a recent stable kernel test[0] to see the new test logs.
Our testing is focused heavily on the latest kernel updates, so we will keep tracking the latest stable kernel version. Testing against 4.19 has been challenging for us since it's not our main focus, so we dropped testing for 4.19 from our system.
Our team is eager to get your feedback! If you find something that you like, or if you find something that could be improved, please let us know anytime at cki-project(a)redhat.com.
[0] https://artifacts.cki-project.org/pipelines/94406/
- --
Major Hayden
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEEG/mSZJWWADNpjCUrc3BR4MEBH7EFAl1RexgACgkQc3BR4MEB
H7Fahw/+ODp9o56oa9RPxDUEBIHuviJ3/EcT6Y8wzmTwgLlBUbrtYuAHfRC8xG6q
piNTHa3bJpBbnXmL30BOaQnVgmpkdwFGGbBniGd6oLCmFBSWh60MO1f2lpVfeOfS
H0I5IyFT1o1diZ+BQCj9ptasuiVUH2OYmqdxfzA/mRP0wU7npyz70RdzCQ3lBxJb
wztD5gltZ3JpdDrZ6R5uhurM0FB6zsh5bY8MdX+dBIVvAnISlIxgcgfI21Q4b3x+
9DwG++idg1KdIyJGbEaAfwqB+7iJqhOKgtbISqyqNQXfbj6DeoicrP90Mb4Rwpmt
/yEOBkympp3o0/xsBLFjQxfoqLANimHakCh7PwJADFdcQXmE02qM4caLjujpLQyP
85ZyReJBqICIo/pA4OnuyB5zsvLQLemGIHaBhtw8loKgl4zEx3oAa6WoiuASGTWZ
Kpi42Eukahs44jo7BV7KSfuKRegUQ9Ut6FjoiD8VFlJWccsL/M+2Tb8VzcC+1jKh
iKaQTHStQO9n8ANNrvwnSLBBxTzbnqVNmaHy0lHzD926iRwukvIwf2qV9xD/UA24
3lnbglDCGZxm9LQTYUo4GVBl+b/FbhMSvnJe3+Ci2hdDalK+4/hBMohDNdOKg4wv
/FfvCg6VGz/0Bm/29j+XPySXS5WI1wjGtOlC+IbSCyu3kIBU6hE=
=jj+9
-----END PGP SIGNATURE-----
This is a note to let you know that I've just added the patch titled
staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From b4d98bc3fc93ec3a58459948a2c0e0c9b501cd88 Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Mon, 12 Aug 2019 12:15:17 +0100
Subject: staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
In `dt3k_ns_to_timer()` the following lines near the end of the function
result in a signed integer overflow:
prescale = 15;
base = timer_base * (1 << prescale);
divider = 65535;
*nanosec = divider * base;
(`divider`, `base` and `prescale` are type `int`, `timer_base` and
`*nanosec` are type `unsigned int`. The value of `timer_base` will be
either 50 or 100.)
The main reason for the overflow is that the calculation for `base` is
completely wrong. It should be:
base = timer_base * (prescale + 1);
which matches an earlier instance of this calculation in the same
function.
Reported-by: David Binderman <dcb314(a)hotmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20190812111517.26803-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/comedi/drivers/dt3000.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/comedi/drivers/dt3000.c b/drivers/staging/comedi/drivers/dt3000.c
index 2edf3ee91300..4ad176fc14ad 100644
--- a/drivers/staging/comedi/drivers/dt3000.c
+++ b/drivers/staging/comedi/drivers/dt3000.c
@@ -368,7 +368,7 @@ static int dt3k_ns_to_timer(unsigned int timer_base, unsigned int *nanosec,
}
prescale = 15;
- base = timer_base * (1 << prescale);
+ base = timer_base * (prescale + 1);
divider = 65535;
*nanosec = divider * base;
return (prescale << 16) | (divider);
--
2.22.0
This is a note to let you know that I've just added the patch titled
staging: comedi: dt3000: Fix rounding up of timer divisor
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 8e2a589a3fc36ce858d42e767c3bcd8fc62a512b Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Mon, 12 Aug 2019 13:08:14 +0100
Subject: staging: comedi: dt3000: Fix rounding up of timer divisor
`dt3k_ns_to_timer()` determines the prescaler and divisor to use to
produce a desired timing period. It is influenced by a rounding mode
and can round the divisor up, down, or to the nearest value. However,
the code for rounding up currently does the same as rounding down! Fix
ir by using the `DIV_ROUND_UP()` macro to calculate the divisor when
rounding up.
Also, change the types of the `divider`, `base` and `prescale` variables
from `int` to `unsigned int` to avoid mixing signed and unsigned types
in the calculations.
Also fix a typo in a nearby comment: "improvment" => "improvement".
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20190812120814.21188-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/comedi/drivers/dt3000.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/staging/comedi/drivers/dt3000.c b/drivers/staging/comedi/drivers/dt3000.c
index 4ad176fc14ad..caf4d4df4bd3 100644
--- a/drivers/staging/comedi/drivers/dt3000.c
+++ b/drivers/staging/comedi/drivers/dt3000.c
@@ -342,9 +342,9 @@ static irqreturn_t dt3k_interrupt(int irq, void *d)
static int dt3k_ns_to_timer(unsigned int timer_base, unsigned int *nanosec,
unsigned int flags)
{
- int divider, base, prescale;
+ unsigned int divider, base, prescale;
- /* This function needs improvment */
+ /* This function needs improvement */
/* Don't know if divider==0 works. */
for (prescale = 0; prescale < 16; prescale++) {
@@ -358,7 +358,7 @@ static int dt3k_ns_to_timer(unsigned int timer_base, unsigned int *nanosec,
divider = (*nanosec) / base;
break;
case CMDF_ROUND_UP:
- divider = (*nanosec) / base;
+ divider = DIV_ROUND_UP(*nanosec, base);
break;
}
if (divider < 65536) {
--
2.22.0
This is a note to let you know that I've just added the patch titled
usb: gadget: udc: renesas_usb3: Fix sysfs interface of "role"
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 5dac665cf403967bb79a7aeb8c182a621fe617ff Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Wed, 31 Jul 2019 19:15:43 +0900
Subject: usb: gadget: udc: renesas_usb3: Fix sysfs interface of "role"
Since the role_store() uses strncmp(), it's possible to refer
out-of-memory if the sysfs data size is smaller than strlen("host").
This patch fixes it by using sysfs_streq() instead of strncmp().
Fixes: cc995c9ec118 ("usb: gadget: udc: renesas_usb3: add support for usb role swap")
Cc: <stable(a)vger.kernel.org> # v4.12+
Reviewed-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi(a)linux.intel.com>
---
drivers/usb/gadget/udc/renesas_usb3.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/gadget/udc/renesas_usb3.c b/drivers/usb/gadget/udc/renesas_usb3.c
index 87062d22134d..1f4c3fbd1df8 100644
--- a/drivers/usb/gadget/udc/renesas_usb3.c
+++ b/drivers/usb/gadget/udc/renesas_usb3.c
@@ -19,6 +19,7 @@
#include <linux/pm_runtime.h>
#include <linux/sizes.h>
#include <linux/slab.h>
+#include <linux/string.h>
#include <linux/sys_soc.h>
#include <linux/uaccess.h>
#include <linux/usb/ch9.h>
@@ -2450,9 +2451,9 @@ static ssize_t role_store(struct device *dev, struct device_attribute *attr,
if (usb3->forced_b_device)
return -EBUSY;
- if (!strncmp(buf, "host", strlen("host")))
+ if (sysfs_streq(buf, "host"))
new_mode_is_host = true;
- else if (!strncmp(buf, "peripheral", strlen("peripheral")))
+ else if (sysfs_streq(buf, "peripheral"))
new_mode_is_host = false;
else
return -EINVAL;
--
2.22.0
In `dt3k_ns_to_timer()` the following lines near the end of the function
result in a signed integer overflow:
prescale = 15;
base = timer_base * (1 << prescale);
divider = 65535;
*nanosec = divider * base;
(`divider`, `base` and `prescale` are type `int`, `timer_base` and
`*nanosec` are type `unsigned int`. The value of `timer_base` will be
either 50 or 100.)
The main reason for the overflow is that the calculation for `base` is
completely wrong. It should be:
base = timer_base * (prescale + 1);
which matches an earlier instance of this calculation in the same
function.
Reported-by: David Binderman <dcb314(a)hotmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
---
N.B. Greg: The original report suggested an actual build error, so
may be prudent to queue this on your 'staging-linus' queue.
drivers/staging/comedi/drivers/dt3000.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/comedi/drivers/dt3000.c b/drivers/staging/comedi/drivers/dt3000.c
index 2edf3ee91300..4ad176fc14ad 100644
--- a/drivers/staging/comedi/drivers/dt3000.c
+++ b/drivers/staging/comedi/drivers/dt3000.c
@@ -368,7 +368,7 @@ static int dt3k_ns_to_timer(unsigned int timer_base, unsigned int *nanosec,
}
prescale = 15;
- base = timer_base * (1 << prescale);
+ base = timer_base * (prescale + 1);
divider = 65535;
*nanosec = divider * base;
return (prescale << 16) | (divider);
--
2.20.1
From: Alastair D'Silva <alastair(a)d-silva.org>
When calling flush_icache_range with a size >4GB, we were masking
off the upper 32 bits, so we would incorrectly flush a range smaller
than intended.
This patch replaces the 32 bit shifts with 64 bit ones, so that
the full size is accounted for.
Heads-up for backporters: the old version of flush_dcache_range is
subject to a similar bug (this has since been replaced with a C
implementation).
Signed-off-by: Alastair D'Silva <alastair(a)d-silva.org>
---
arch/powerpc/kernel/misc_64.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index b55a7b4cb543..9bc0aa9aeb65 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -82,7 +82,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
subf r8,r6,r4 /* compute length */
add r8,r8,r5 /* ensure we get enough */
lwz r9,DCACHEL1LOGBLOCKSIZE(r10) /* Get log-2 of cache block size */
- srw. r8,r8,r9 /* compute line count */
+ srd. r8,r8,r9 /* compute line count */
beqlr /* nothing to do? */
mtctr r8
1: dcbst 0,r6
@@ -98,7 +98,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
subf r8,r6,r4 /* compute length */
add r8,r8,r5
lwz r9,ICACHEL1LOGBLOCKSIZE(r10) /* Get log-2 of Icache block size */
- srw. r8,r8,r9 /* compute line count */
+ srd. r8,r8,r9 /* compute line count */
beqlr /* nothing to do? */
mtctr r8
2: icbi 0,r6
--
2.21.0
The patch below does not apply to the 5.2-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c6303c5d52d5ec3e5bce2e6a5480fa2a1baa45e6 Mon Sep 17 00:00:00 2001
From: Baolin Wang <baolin.wang(a)linaro.org>
Date: Thu, 25 Jul 2019 11:14:22 +0800
Subject: [PATCH] mmc: sdhci-sprd: Fix the incorrect soft reset operation when
runtime resuming
The SD host controller specification defines 3 types software reset:
software reset for data line, software reset for command line and software
reset for all. Software reset for all means this reset affects the entire
Host controller except for the card detection circuit.
In sdhci_runtime_resume_host() we always do a software "reset for all",
which causes the Spreadtrum variant controller to work abnormally after
resuming. To fix the problem, let's do a software reset for the data and
the command part, rather than "for all".
However, as sdhci_runtime_resume() is a common sdhci function and we don't
want to change the behaviour for other variants, let's introduce a new
in-parameter for it. This enables the caller to decide if a "reset for all"
shall be done or not.
Signed-off-by: Baolin Wang <baolin.wang(a)linaro.org>
Fixes: fb8bd90f83c4 ("mmc: sdhci-sprd: Add Spreadtrum's initial host controller")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/host/sdhci-acpi.c b/drivers/mmc/host/sdhci-acpi.c
index b3a130a9ee23..1604f512c7bd 100644
--- a/drivers/mmc/host/sdhci-acpi.c
+++ b/drivers/mmc/host/sdhci-acpi.c
@@ -883,7 +883,7 @@ static int sdhci_acpi_runtime_resume(struct device *dev)
sdhci_acpi_byt_setting(&c->pdev->dev);
- return sdhci_runtime_resume_host(c->host);
+ return sdhci_runtime_resume_host(c->host, 0);
}
#endif
diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
index c391510e9ef4..776a94216248 100644
--- a/drivers/mmc/host/sdhci-esdhc-imx.c
+++ b/drivers/mmc/host/sdhci-esdhc-imx.c
@@ -1705,7 +1705,7 @@ static int sdhci_esdhc_runtime_resume(struct device *dev)
esdhc_pltfm_set_clock(host, imx_data->actual_clock);
}
- err = sdhci_runtime_resume_host(host);
+ err = sdhci_runtime_resume_host(host, 0);
if (err)
goto disable_ipg_clk;
diff --git a/drivers/mmc/host/sdhci-of-at91.c b/drivers/mmc/host/sdhci-of-at91.c
index e377b9bc55a4..d4e7e8b7be77 100644
--- a/drivers/mmc/host/sdhci-of-at91.c
+++ b/drivers/mmc/host/sdhci-of-at91.c
@@ -289,7 +289,7 @@ static int sdhci_at91_runtime_resume(struct device *dev)
}
out:
- return sdhci_runtime_resume_host(host);
+ return sdhci_runtime_resume_host(host, 0);
}
#endif /* CONFIG_PM */
diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
index 4041878eb0f3..7d06e2860c36 100644
--- a/drivers/mmc/host/sdhci-pci-core.c
+++ b/drivers/mmc/host/sdhci-pci-core.c
@@ -167,7 +167,7 @@ static int sdhci_pci_runtime_suspend_host(struct sdhci_pci_chip *chip)
err_pci_runtime_suspend:
while (--i >= 0)
- sdhci_runtime_resume_host(chip->slots[i]->host);
+ sdhci_runtime_resume_host(chip->slots[i]->host, 0);
return ret;
}
@@ -181,7 +181,7 @@ static int sdhci_pci_runtime_resume_host(struct sdhci_pci_chip *chip)
if (!slot)
continue;
- ret = sdhci_runtime_resume_host(slot->host);
+ ret = sdhci_runtime_resume_host(slot->host, 0);
if (ret)
return ret;
}
diff --git a/drivers/mmc/host/sdhci-pxav3.c b/drivers/mmc/host/sdhci-pxav3.c
index 3ddecf479295..e55037ceda73 100644
--- a/drivers/mmc/host/sdhci-pxav3.c
+++ b/drivers/mmc/host/sdhci-pxav3.c
@@ -554,7 +554,7 @@ static int sdhci_pxav3_runtime_resume(struct device *dev)
if (!IS_ERR(pxa->clk_core))
clk_prepare_enable(pxa->clk_core);
- return sdhci_runtime_resume_host(host);
+ return sdhci_runtime_resume_host(host, 0);
}
#endif
diff --git a/drivers/mmc/host/sdhci-s3c.c b/drivers/mmc/host/sdhci-s3c.c
index 8e4a8ba33f05..f5753aef7151 100644
--- a/drivers/mmc/host/sdhci-s3c.c
+++ b/drivers/mmc/host/sdhci-s3c.c
@@ -745,7 +745,7 @@ static int sdhci_s3c_runtime_resume(struct device *dev)
clk_prepare_enable(busclk);
if (ourhost->cur_clk >= 0)
clk_prepare_enable(ourhost->clk_bus[ourhost->cur_clk]);
- ret = sdhci_runtime_resume_host(host);
+ ret = sdhci_runtime_resume_host(host, 0);
return ret;
}
#endif
diff --git a/drivers/mmc/host/sdhci-sprd.c b/drivers/mmc/host/sdhci-sprd.c
index 603a5d9f045a..83a4767ca680 100644
--- a/drivers/mmc/host/sdhci-sprd.c
+++ b/drivers/mmc/host/sdhci-sprd.c
@@ -696,7 +696,7 @@ static int sdhci_sprd_runtime_resume(struct device *dev)
if (ret)
goto clk_disable;
- sdhci_runtime_resume_host(host);
+ sdhci_runtime_resume_host(host, 1);
return 0;
clk_disable:
diff --git a/drivers/mmc/host/sdhci-xenon.c b/drivers/mmc/host/sdhci-xenon.c
index 8a18f14cf842..1dea1ba66f7b 100644
--- a/drivers/mmc/host/sdhci-xenon.c
+++ b/drivers/mmc/host/sdhci-xenon.c
@@ -638,7 +638,7 @@ static int xenon_runtime_resume(struct device *dev)
priv->restore_needed = false;
}
- ret = sdhci_runtime_resume_host(host);
+ ret = sdhci_runtime_resume_host(host, 0);
if (ret)
goto out;
return 0;
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 59acf8e3331e..a5dc5aae973e 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -3320,7 +3320,7 @@ int sdhci_runtime_suspend_host(struct sdhci_host *host)
}
EXPORT_SYMBOL_GPL(sdhci_runtime_suspend_host);
-int sdhci_runtime_resume_host(struct sdhci_host *host)
+int sdhci_runtime_resume_host(struct sdhci_host *host, int soft_reset)
{
struct mmc_host *mmc = host->mmc;
unsigned long flags;
@@ -3331,7 +3331,7 @@ int sdhci_runtime_resume_host(struct sdhci_host *host)
host->ops->enable_dma(host);
}
- sdhci_init(host, 0);
+ sdhci_init(host, soft_reset);
if (mmc->ios.power_mode != MMC_POWER_UNDEFINED &&
mmc->ios.power_mode != MMC_POWER_OFF) {
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 89fd96596a1f..902f855efe8f 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -781,7 +781,7 @@ void sdhci_adma_write_desc(struct sdhci_host *host, void **desc,
int sdhci_suspend_host(struct sdhci_host *host);
int sdhci_resume_host(struct sdhci_host *host);
int sdhci_runtime_suspend_host(struct sdhci_host *host);
-int sdhci_runtime_resume_host(struct sdhci_host *host);
+int sdhci_runtime_resume_host(struct sdhci_host *host, int soft_reset);
#endif
void sdhci_cqe_enable(struct mmc_host *mmc);
The patch below does not apply to the 5.2-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5511c0c309db4c526a6e9f8b2b8a1483771574bc Mon Sep 17 00:00:00 2001
From: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Date: Thu, 1 Aug 2019 11:23:23 -0600
Subject: [PATCH] coresight: Fix DEBUG_LOCKS_WARN_ON for uninitialized
attribute
While running the linux-next with CONFIG_DEBUG_LOCKS_ALLOC enabled,
I get the following splat.
BUG: key ffffcb5636929298 has not been registered!
------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 1 PID: 53 at kernel/locking/lockdep.c:3669 lockdep_init_map+0x164/0x1f0
CPU: 1 PID: 53 Comm: kworker/1:1 Tainted: G W 5.2.0-next-20190712-00015-g00ad4634222e-dirty #603
Workqueue: events amba_deferred_retry_func
pstate: 60c00005 (nZCv daif +PAN +UAO)
pc : lockdep_init_map+0x164/0x1f0
lr : lockdep_init_map+0x164/0x1f0
[ trimmed ]
Call trace:
lockdep_init_map+0x164/0x1f0
__kernfs_create_file+0x9c/0x158
sysfs_add_file_mode_ns+0xa8/0x1d0
sysfs_add_file_to_group+0x88/0xd8
etm_perf_add_symlink_sink+0xcc/0x138
coresight_register+0x110/0x280
tmc_probe+0x160/0x420
[ trimmed ]
---[ end trace ab4cc669615ba1b0 ]---
Fix this by initialising the dynamically allocated attribute properly.
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Fixes: bb8e370bdc14 ("coresight: perf: Add "sinks" group to PMU directory")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
[Fixed a typograhic error in the changelog]
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Link: https://lore.kernel.org/r/20190801172323.18359-2-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 5c1ca0df5cb0..84f1dcb69827 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -544,6 +544,7 @@ int etm_perf_add_symlink_sink(struct coresight_device *csdev)
/* See function coresight_get_sink_by_id() to know where this is used */
hash = hashlen_hash(hashlen_string(NULL, name));
+ sysfs_attr_init(&ea->attr.attr);
ea->attr.attr.name = devm_kstrdup(dev, name, GFP_KERNEL);
if (!ea->attr.attr.name)
return -ENOMEM;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 90c6260c1905a68fb596844087f2223bd4657fee Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Thu, 18 Jul 2019 15:57:49 +0200
Subject: [PATCH] iio: adc: gyroadc: fix uninitialized return code
gcc-9 complains about a blatant uninitialized variable use that
all earlier compiler versions missed:
drivers/iio/adc/rcar-gyroadc.c:510:5: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
Return -EINVAL instead here and a few lines above it where
we accidentally return 0 on failure.
Cc: stable(a)vger.kernel.org
Fixes: 059c53b32329 ("iio: adc: Add Renesas GyroADC driver")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Reviewed-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/adc/rcar-gyroadc.c b/drivers/iio/adc/rcar-gyroadc.c
index 2d685730f867..c37f201294b2 100644
--- a/drivers/iio/adc/rcar-gyroadc.c
+++ b/drivers/iio/adc/rcar-gyroadc.c
@@ -382,7 +382,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Only %i channels supported with %pOFn, but reg = <%i>.\n",
num_channels, child, reg);
- return ret;
+ return -EINVAL;
}
}
@@ -391,7 +391,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Channel %i uses different ADC mode than the rest.\n",
reg);
- return ret;
+ return -EINVAL;
}
/* Channel is valid, grab the regulator. */
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 90c6260c1905a68fb596844087f2223bd4657fee Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Thu, 18 Jul 2019 15:57:49 +0200
Subject: [PATCH] iio: adc: gyroadc: fix uninitialized return code
gcc-9 complains about a blatant uninitialized variable use that
all earlier compiler versions missed:
drivers/iio/adc/rcar-gyroadc.c:510:5: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
Return -EINVAL instead here and a few lines above it where
we accidentally return 0 on failure.
Cc: stable(a)vger.kernel.org
Fixes: 059c53b32329 ("iio: adc: Add Renesas GyroADC driver")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Reviewed-by: Wolfram Sang <wsa+renesas(a)sang-engineering.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/adc/rcar-gyroadc.c b/drivers/iio/adc/rcar-gyroadc.c
index 2d685730f867..c37f201294b2 100644
--- a/drivers/iio/adc/rcar-gyroadc.c
+++ b/drivers/iio/adc/rcar-gyroadc.c
@@ -382,7 +382,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Only %i channels supported with %pOFn, but reg = <%i>.\n",
num_channels, child, reg);
- return ret;
+ return -EINVAL;
}
}
@@ -391,7 +391,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Channel %i uses different ADC mode than the rest.\n",
reg);
- return ret;
+ return -EINVAL;
}
/* Channel is valid, grab the regulator. */
This is the start of the stable review cycle for the 4.9.189 release.
There are 32 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun 11 Aug 2019 01:38:45 PM UTC.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.189-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.189-rc1
Thomas Gleixner <tglx(a)linutronix.de>
x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/entry/64: Use JMP instead of JMPQ
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Enable Spectre v1 swapgs mitigations
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
Ben Hutchings <ben(a)decadent.org.uk>
x86: cpufeatures: Sort feature word 7
Lukas Wunner <lukas(a)wunner.de>
spi: bcm2835: Fix 3-wire mode if DMA is enabled
xiao jin <jin.xiao(a)intel.com>
block: blk_init_allocated_queue() set q->fq as NULL in the fail case
Sudarsana Reddy Kalluru <skalluru(a)marvell.com>
bnx2x: Disable multi-cos feature.
Cong Wang <xiyou.wangcong(a)gmail.com>
ife: error out when nla attributes are empty
Haishuang Yan <yanhaishuang(a)cmss.chinamobile.com>
ip6_tunnel: fix possible use-after-free on xmit
Arnd Bergmann <arnd(a)arndb.de>
compat_ioctl: pppoe: fix PPPOEIOCSFWD handling
Taras Kondratiuk <takondra(a)cisco.com>
tipc: compat: allow tipc commands without arguments
Jia-Ju Bai <baijiaju1990(a)gmail.com>
net: sched: Fix a possible null-pointer dereference in dequeue_func()
Mark Zhang <markz(a)mellanox.com>
net/mlx5: Use reversed order when unregister devices
Jiri Pirko <jiri(a)mellanox.com>
net: fix ifindex collision during namespace removal
Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
net: bridge: mcast: don't delete permanent entries when fast leave is enabled
Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
net: bridge: delete local fdb on device init failure
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
atm: iphase: Fix Spectre v1 vulnerability
Ilya Dryomov <idryomov(a)gmail.com>
libceph: use kbasename() and kill ceph_file_part()
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Add rewind_stack_do_exit() to the noreturn list
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Add machine_real_restart() to the noreturn list
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
IB: directly cast the sockaddr union to aockaddr
Jason Gunthorpe <jgg(a)mellanox.com>
RDMA: Directly cast the sockaddr union to sockaddr
Sebastian Parschauer <s.parschauer(a)gmx.de>
HID: Add quirk for HP X1200 PIXART OEM mouse
Aaron Armstrong Skomra <skomra(a)gmail.com>
HID: wacom: fix bit shift for Cintiq Companion 2
Eric Dumazet <edumazet(a)google.com>
tcp: be more careful in tcp_fragment()
Will Deacon <will(a)kernel.org>
arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
Will Deacon <will.deacon(a)arm.com>
arm64: cpufeature: Fix CTR_EL0 field definitions
Adam Ford <aford173(a)gmail.com>
ARM: dts: logicpd-som-lv: Fix Audio Mute
Adam Ford <aford173(a)gmail.com>
ARM: dts: Add pinmuxing for i2c2 and i2c3 for LogicPD torpedo
Adam Ford <aford173(a)gmail.com>
ARM: dts: Add pinmuxing for i2c2 and i2c3 for LogicPD SOM-LV
Hannes Reinecke <hare(a)suse.de>
scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure
-------------
Diffstat:
Documentation/kernel-parameters.txt | 9 +-
Makefile | 4 +-
arch/arm/boot/dts/logicpd-som-lv.dtsi | 18 ++++
arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 16 ++++
arch/arm64/include/asm/cpufeature.h | 7 +-
arch/arm64/kernel/cpufeature.c | 14 +++-
arch/x86/entry/calling.h | 18 ++++
arch/x86/entry/entry_64.S | 21 ++++-
arch/x86/include/asm/cpufeatures.h | 8 +-
arch/x86/kernel/cpu/bugs.c | 105 ++++++++++++++++++++++--
arch/x86/kernel/cpu/common.c | 42 ++++++----
block/blk-core.c | 1 +
drivers/atm/iphase.c | 8 +-
drivers/hid/hid-ids.h | 1 +
drivers/hid/usbhid/hid-quirks.c | 1 +
drivers/hid/wacom_wac.c | 12 +--
drivers/infiniband/core/addr.c | 15 ++--
drivers/infiniband/core/sa_query.c | 10 +--
drivers/infiniband/hw/ocrdma/ocrdma_ah.c | 5 +-
drivers/infiniband/hw/ocrdma/ocrdma_hw.c | 5 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/dev.c | 2 +-
drivers/net/ppp/pppoe.c | 3 +
drivers/net/ppp/pppox.c | 13 +++
drivers/net/ppp/pptp.c | 3 +
drivers/scsi/fcoe/fcoe_ctlr.c | 51 +++++-------
drivers/scsi/libfc/fc_rport.c | 5 +-
drivers/spi/spi-bcm2835.c | 3 +-
fs/compat_ioctl.c | 3 -
include/linux/ceph/ceph_debug.h | 6 +-
include/linux/if_pppox.h | 3 +
include/net/tcp.h | 17 ++++
include/scsi/libfcoe.h | 1 +
net/bridge/br_multicast.c | 3 +
net/bridge/br_vlan.c | 5 ++
net/ceph/ceph_common.c | 13 ---
net/core/dev.c | 2 +
net/ipv4/tcp_output.c | 11 ++-
net/ipv6/ip6_tunnel.c | 8 +-
net/l2tp/l2tp_ppp.c | 3 +
net/sched/act_ife.c | 3 +
net/sched/sch_codel.c | 6 +-
net/tipc/netlink_compat.c | 11 ++-
tools/objtool/check.c | 2 +
44 files changed, 363 insertions(+), 136 deletions(-)
This is the start of the stable review cycle for the 4.4.189 release.
There are 21 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun 11 Aug 2019 01:42:28 PM UTC.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.189-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.189-rc1
Thomas Gleixner <tglx(a)linutronix.de>
x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/entry/64: Use JMP instead of JMPQ
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Enable Spectre v1 swapgs mitigations
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
Wanpeng Li <wanpeng.li(a)hotmail.com>
x86/entry/64: Fix context tracking state warning when load_gs_index fails
Ben Hutchings <ben(a)decadent.org.uk>
x86: cpufeatures: Sort feature word 7
Lukas Wunner <lukas(a)wunner.de>
spi: bcm2835: Fix 3-wire mode if DMA is enabled
xiao jin <jin.xiao(a)intel.com>
block: blk_init_allocated_queue() set q->fq as NULL in the fail case
Arnd Bergmann <arnd(a)arndb.de>
compat_ioctl: pppoe: fix PPPOEIOCSFWD handling
Sudarsana Reddy Kalluru <skalluru(a)marvell.com>
bnx2x: Disable multi-cos feature.
Mark Zhang <markz(a)mellanox.com>
net/mlx5: Use reversed order when unregister devices
Jia-Ju Bai <baijiaju1990(a)gmail.com>
net: sched: Fix a possible null-pointer dereference in dequeue_func()
Taras Kondratiuk <takondra(a)cisco.com>
tipc: compat: allow tipc commands without arguments
Jiri Pirko <jiri(a)mellanox.com>
net: fix ifindex collision during namespace removal
Nikolay Aleksandrov <nikolay(a)cumulusnetworks.com>
net: bridge: delete local fdb on device init failure
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
atm: iphase: Fix Spectre v1 vulnerability
Eric Dumazet <edumazet(a)google.com>
tcp: be more careful in tcp_fragment()
Sebastian Parschauer <s.parschauer(a)gmx.de>
HID: Add quirk for HP X1200 PIXART OEM mouse
Phil Turnbull <phil.turnbull(a)oracle.com>
netfilter: nfnetlink_acct: validate NFACCT_QUOTA parameter
Will Deacon <will(a)kernel.org>
arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
Will Deacon <will.deacon(a)arm.com>
arm64: cpufeature: Fix CTR_EL0 field definitions
-------------
Diffstat:
Documentation/kernel-parameters.txt | 7 +-
Makefile | 4 +-
arch/arm64/include/asm/cpufeature.h | 7 +-
arch/arm64/kernel/cpufeature.c | 14 +++-
arch/x86/entry/calling.h | 19 +++++
arch/x86/entry/entry_64.S | 25 +++++-
arch/x86/include/asm/cpufeatures.h | 10 ++-
arch/x86/kernel/cpu/bugs.c | 105 ++++++++++++++++++++++--
arch/x86/kernel/cpu/common.c | 42 ++++++----
block/blk-core.c | 1 +
drivers/atm/iphase.c | 8 +-
drivers/hid/hid-ids.h | 1 +
drivers/hid/usbhid/hid-quirks.c | 1 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/main.c | 2 +-
drivers/net/ppp/pppoe.c | 3 +
drivers/net/ppp/pppox.c | 13 +++
drivers/net/ppp/pptp.c | 3 +
drivers/spi/spi-bcm2835.c | 3 +-
fs/compat_ioctl.c | 3 -
include/linux/if_pppox.h | 3 +
include/net/tcp.h | 17 ++++
net/bridge/br_vlan.c | 5 ++
net/core/dev.c | 2 +
net/ipv4/tcp_output.c | 11 ++-
net/l2tp/l2tp_ppp.c | 3 +
net/netfilter/nfnetlink_acct.c | 2 +
net/sched/sch_codel.c | 3 +-
net/tipc/netlink_compat.c | 11 ++-
29 files changed, 272 insertions(+), 58 deletions(-)