This is a note to let you know that I've just added the patch titled
zsmalloc: calling zs_map_object() from irq is a bug
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
zsmalloc-calling-zs_map_object-from-irq-is-a-bug.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Dec 12 10:32:42 CET 2017
From: Sergey Senozhatsky <sergey.senozhatsky.work(a)gmail.com>
Date: Wed, 15 Nov 2017 17:34:03 -0800
Subject: zsmalloc: calling zs_map_object() from irq is a bug
From: Sergey Senozhatsky <sergey.senozhatsky.work(a)gmail.com>
[ Upstream commit 1aedcafbf32b3f232c159b14cd0d423fcfe2b861 ]
Use BUG_ON(in_interrupt()) in zs_map_object(). This is not a new
BUG_ON(), it's always been there, but was recently changed to
VM_BUG_ON(). There are several problems there. First, we use use
per-CPU mappings both in zsmalloc and in zram, and interrupt may easily
corrupt those buffers. Second, and more importantly, we believe it's
possible to start leaking sensitive information. Consider the following
case:
-> process P
swap out
zram
per-cpu mapping CPU1
compress page A
-> IRQ
swap out
zram
per-cpu mapping CPU1
compress page B
write page from per-cpu mapping CPU1 to zsmalloc pool
iret
-> process P
write page from per-cpu mapping CPU1 to zsmalloc pool [*]
return
* so we store overwritten data that actually belongs to another
page (task) and potentially contains sensitive data. And when
process P will page fault it's going to read (swap in) that
other task's data.
Link: http://lkml.kernel.org/r/20170929045140.4055-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Acked-by: Minchan Kim <minchan(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/zsmalloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1349,7 +1349,7 @@ void *zs_map_object(struct zs_pool *pool
* pools/users, we can't allow mapping in interrupt context
* because it can corrupt another users mappings.
*/
- WARN_ON_ONCE(in_interrupt());
+ BUG_ON(in_interrupt());
/* From now on, migration cannot move the object */
pin_tag(handle);
Patches currently in stable-queue which might be from sergey.senozhatsky.work(a)gmail.com are
queue-4.14/zsmalloc-calling-zs_map_object-from-irq-is-a-bug.patch
This is a note to let you know that I've just added the patch titled
xfs: fix forgotten rcu read unlock when skipping inode reclaim
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xfs-fix-forgotten-rcu-read-unlock-when-skipping-inode-reclaim.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Dec 12 10:32:42 CET 2017
From: "Darrick J. Wong" <darrick.wong(a)oracle.com>
Date: Tue, 14 Nov 2017 16:34:44 -0800
Subject: xfs: fix forgotten rcu read unlock when skipping inode reclaim
From: "Darrick J. Wong" <darrick.wong(a)oracle.com>
[ Upstream commit 962cc1ad6caddb5abbb9f0a43e5abe7131a71f18 ]
In commit f2e9ad21 ("xfs: check for race with xfs_reclaim_inode"), we
skip an inode if we're racing with freeing the inode via
xfs_reclaim_inode, but we forgot to release the rcu read lock when
dumping the inode, with the result that we exit to userspace with a lock
held. Don't do that; generic/320 with a 1k block size fails this
very occasionally.
================================================
WARNING: lock held when returning to user space!
4.14.0-rc6-djwong #4 Tainted: G W
------------------------------------------------
rm/30466 is leaving the kernel with locks still held!
1 lock held by rm/30466:
#0: (rcu_read_lock){....}, at: [<ffffffffa01364d3>] xfs_ifree_cluster.isra.17+0x2c3/0x6f0 [xfs]
------------[ cut here ]------------
WARNING: CPU: 1 PID: 30466 at kernel/rcu/tree_plugin.h:329 rcu_note_context_switch+0x71/0x700
Modules linked in: deadline_iosched dm_snapshot dm_bufio ext4 mbcache jbd2 dm_flakey xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: scsi_debug]
CPU: 1 PID: 30466 Comm: rm Tainted: G W 4.14.0-rc6-djwong #4
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
task: ffff880037680000 task.stack: ffffc90001064000
RIP: 0010:rcu_note_context_switch+0x71/0x700
RSP: 0000:ffffc90001067e50 EFLAGS: 00010002
RAX: 0000000000000001 RBX: ffff880037680000 RCX: ffff88003e73d200
RDX: 0000000000000002 RSI: ffffffff819e53e9 RDI: ffffffff819f4375
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff880062c900d0
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880037680000
R13: 0000000000000000 R14: ffffc90001067eb8 R15: ffff880037680690
FS: 00007fa3b8ce8700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f69bf77c000 CR3: 000000002450a000 CR4: 00000000000006e0
Call Trace:
__schedule+0xb8/0xb10
schedule+0x40/0x90
exit_to_usermode_loop+0x6b/0xa0
prepare_exit_to_usermode+0x7a/0x90
retint_user+0x8/0x20
RIP: 0033:0x7fa3b87fda87
RSP: 002b:00007ffe41206568 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff02
RAX: 0000000000000000 RBX: 00000000010e88c0 RCX: 00007fa3b87fda87
RDX: 0000000000000000 RSI: 00000000010e89c8 RDI: 0000000000000005
RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000000
R10: 000000000000015e R11: 0000000000000246 R12: 00000000010c8060
R13: 00007ffe41206690 R14: 0000000000000000 R15: 0000000000000000
---[ end trace e88f83bf0cfbd07d ]---
Fixes: f2e9ad212def50bcf4c098c6288779dd97fff0f0
Cc: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: Darrick J. Wong <darrick.wong(a)oracle.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/xfs/xfs_inode.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2378,6 +2378,7 @@ retry:
*/
if (ip->i_ino != inum + i) {
xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ rcu_read_unlock();
continue;
}
}
Patches currently in stable-queue which might be from darrick.wong(a)oracle.com are
queue-4.14/xfs-fix-forgotten-rcu-read-unlock-when-skipping-inode-reclaim.patch
This is a note to let you know that I've just added the patch titled
xfrm: Copy policy family in clone_policy
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xfrm-copy-policy-family-in-clone_policy.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Dec 12 10:32:42 CET 2017
From: Herbert Xu <herbert(a)gondor.apana.org.au>
Date: Fri, 10 Nov 2017 14:14:06 +1100
Subject: xfrm: Copy policy family in clone_policy
From: Herbert Xu <herbert(a)gondor.apana.org.au>
[ Upstream commit 0e74aa1d79a5bbc663e03a2804399cae418a0321 ]
The syzbot found an ancient bug in the IPsec code. When we cloned
a socket policy (for example, for a child TCP socket derived from a
listening socket), we did not copy the family field. This results
in a live policy with a zero family field. This triggers a BUG_ON
check in the af_key code when the cloned policy is retrieved.
This patch fixes it by copying the family field over.
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/xfrm/xfrm_policy.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1306,6 +1306,7 @@ static struct xfrm_policy *clone_policy(
newp->xfrm_nr = old->xfrm_nr;
newp->index = old->index;
newp->type = old->type;
+ newp->family = old->family;
memcpy(newp->xfrm_vec, old->xfrm_vec,
newp->xfrm_nr*sizeof(struct xfrm_tmpl));
spin_lock_bh(&net->xfrm.xfrm_policy_lock);
Patches currently in stable-queue which might be from herbert(a)gondor.apana.org.au are
queue-4.14/xfrm-copy-policy-family-in-clone_policy.patch
queue-4.14/crypto-talitos-fix-aead-for-sha224-on-non-sha224-capable-chips.patch
queue-4.14/crypto-talitos-fix-memory-corruption-on-sec2.patch
queue-4.14/crypto-talitos-fix-use-of-sg_link_tbl_len.patch
queue-4.14/crypto-talitos-fix-setkey-to-check-key-weakness.patch
queue-4.14/crypto-talitos-fix-aead-test-failures.patch
queue-4.14/crypto-talitos-fix-ctr-aes-talitos.patch
This is a note to let you know that I've just added the patch titled
x86/mpx/selftests: Fix up weird arrays
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mpx-selftests-fix-up-weird-arrays.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Dec 12 10:32:42 CET 2017
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Fri, 10 Nov 2017 16:12:29 -0800
Subject: x86/mpx/selftests: Fix up weird arrays
From: Dave Hansen <dave.hansen(a)linux.intel.com>
[ Upstream commit a6400120d042397675fcf694060779d21e9e762d ]
The MPX hardware data structurse are defined in a weird way: they define
their size in bytes and then union that with the type with which we want
to access them.
Yes, this is weird, but it does work. But, new GCC's complain that we
are accessing the array out of bounds. Just make it a zero-sized array
so gcc will stop complaining. There was not really a bug here.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Link: http://lkml.kernel.org/r/20171111001229.58A7933D@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/testing/selftests/x86/mpx-hw.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/x86/mpx-hw.h
+++ b/tools/testing/selftests/x86/mpx-hw.h
@@ -52,14 +52,14 @@
struct mpx_bd_entry {
union {
char x[MPX_BOUNDS_DIR_ENTRY_SIZE_BYTES];
- void *contents[1];
+ void *contents[0];
};
} __attribute__((packed));
struct mpx_bt_entry {
union {
char x[MPX_BOUNDS_TABLE_ENTRY_SIZE_BYTES];
- unsigned long contents[1];
+ unsigned long contents[0];
};
} __attribute__((packed));
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mpx-selftests-fix-up-weird-arrays.patch
queue-4.14/x86-pci-make-broadcom_postcore_init-check-acpi_disabled.patch
This is a note to let you know that I've just added the patch titled
tls: Use kzalloc for aead_request allocation
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tls-use-kzalloc-for-aead_request-allocation.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Dec 12 10:32:42 CET 2017
From: Ilya Lesokhin <ilyal(a)mellanox.com>
Date: Mon, 13 Nov 2017 10:22:44 +0200
Subject: tls: Use kzalloc for aead_request allocation
From: Ilya Lesokhin <ilyal(a)mellanox.com>
[ Upstream commit 61ef6da622aa7b66bf92991bd272490eea6c712e ]
Use kzalloc for aead_request allocation as
we don't set all the bits in the request.
Fixes: 3c4d7559159b ('tls: kernel TLS support')
Signed-off-by: Ilya Lesokhin <ilyal(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tls/tls_sw.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -219,7 +219,7 @@ static int tls_do_encryption(struct tls_
struct aead_request *aead_req;
int rc;
- aead_req = kmalloc(req_size, flags);
+ aead_req = kzalloc(req_size, flags);
if (!aead_req)
return -ENOMEM;
Patches currently in stable-queue which might be from ilyal(a)mellanox.com are
queue-4.14/tls-use-kzalloc-for-aead_request-allocation.patch
This is a note to let you know that I've just added the patch titled
sunrpc: Fix rpc_task_begin trace point
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sunrpc-fix-rpc_task_begin-trace-point.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Dec 12 10:32:42 CET 2017
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Fri, 3 Nov 2017 13:46:06 -0400
Subject: sunrpc: Fix rpc_task_begin trace point
From: Chuck Lever <chuck.lever(a)oracle.com>
[ Upstream commit b2bfe5915d5fe7577221031a39ac722a0a2a1199 ]
The rpc_task_begin trace point always display a task ID of zero.
Move the trace point call site so that it picks up the new task ID.
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sunrpc/sched.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -274,10 +274,9 @@ static inline void rpc_task_set_debuginf
static void rpc_set_active(struct rpc_task *task)
{
- trace_rpc_task_begin(task->tk_client, task, NULL);
-
rpc_task_set_debuginfo(task);
set_bit(RPC_TASK_ACTIVE, &task->tk_runstate);
+ trace_rpc_task_begin(task->tk_client, task, NULL);
}
/*
Patches currently in stable-queue which might be from chuck.lever(a)oracle.com are
queue-4.14/sunrpc-fix-rpc_task_begin-trace-point.patch
This is a note to let you know that I've just added the patch titled
sparc64/mm: set fields in deferred pages
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sparc64-mm-set-fields-in-deferred-pages.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Dec 12 10:32:42 CET 2017
From: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Date: Wed, 15 Nov 2017 17:36:18 -0800
Subject: sparc64/mm: set fields in deferred pages
From: Pavel Tatashin <pasha.tatashin(a)oracle.com>
[ Upstream commit 2a20aa171071a334d80c4e5d5af719d8374702fc ]
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to
first initializing struct pages by going through __init_single_page().
With deferred struct page feature enabled there is a case where we set
some fields prior to initializing:
mem_init() {
register_page_bootmem_info();
free_all_bootmem();
...
}
When register_page_bootmem_info() is called only non-deferred struct
pages are initialized. But, this function goes through some reserved
pages which might be part of the deferred, and thus are not yet
initialized.
mem_init
register_page_bootmem_info
register_page_bootmem_info_node
get_page_bootmem
.. setting fields here ..
such as: page->freelist = (void *)type;
free_all_bootmem()
free_low_memory_core_early()
for_each_reserved_mem_region()
reserve_bootmem_region()
init_reserved_page() <- Only if this is deferred reserved page
__init_single_pfn()
__init_single_page()
memset(0) <-- Loose the set fields here
We end up with similar issue as in the previous patch, where currently
we do not observe problem as memory is zeroed. But, if flag asserts are
changed we can start hitting issues.
Also, because in this patch series we will stop zeroing struct page
memory during allocation, we must make sure that struct pages are
properly initialized prior to using them.
The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.
Link: http://lkml.kernel.org/r/20171013173214.27300-4-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Reviewed-by: Steven Sistare <steven.sistare(a)oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Reviewed-by: Bob Picco <bob.picco(a)oracle.com>
Acked-by: David S. Miller <davem(a)davemloft.net>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Will Deacon <will.deacon(a)arm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/sparc/mm/init_64.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2540,10 +2540,17 @@ void __init mem_init(void)
{
high_memory = __va(last_valid_pfn << PAGE_SHIFT);
- register_page_bootmem_info();
free_all_bootmem();
/*
+ * Must be done after boot memory is put on freelist, because here we
+ * might set fields in deferred struct pages that have not yet been
+ * initialized, and free_all_bootmem() initializes all the reserved
+ * deferred pages for us.
+ */
+ register_page_bootmem_info();
+
+ /*
* Set up the zero page, mark it reserved, so that page count
* is not manipulated when freeing the page from user ptes.
*/
Patches currently in stable-queue which might be from pasha.tatashin(a)oracle.com are
queue-4.14/sparc64-mm-set-fields-in-deferred-pages.patch
This is a note to let you know that I've just added the patch titled
slub: fix sysfs duplicate filename creation when slub_debug=O
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
slub-fix-sysfs-duplicate-filename-creation-when-slub_debug-o.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Dec 12 10:32:42 CET 2017
From: Miles Chen <miles.chen(a)mediatek.com>
Date: Wed, 15 Nov 2017 17:32:25 -0800
Subject: slub: fix sysfs duplicate filename creation when slub_debug=O
From: Miles Chen <miles.chen(a)mediatek.com>
[ Upstream commit 11066386efa692f77171484c32ea30f6e5a0d729 ]
When slub_debug=O is set. It is possible to clear debug flags for an
"unmergeable" slab cache in kmem_cache_open(). It makes the "unmergeable"
cache became "mergeable" in sysfs_slab_add().
These caches will generate their "unique IDs" by create_unique_id(), but
it is possible to create identical unique IDs. In my experiment,
sgpool-128, names_cache, biovec-256 generate the same ID ":Ft-0004096" and
the kernel reports "sysfs: cannot create duplicate filename
'/kernel/slab/:Ft-0004096'".
To repeat my experiment, set disable_higher_order_debug=1,
CONFIG_SLUB_DEBUG_ON=y in kernel-4.14.
Fix this issue by setting unmergeable=1 if slub_debug=O and the the
default slub_debug contains any no-merge flags.
call path:
kmem_cache_create()
__kmem_cache_alias() -> we set SLAB_NEVER_MERGE flags here
create_cache()
__kmem_cache_create()
kmem_cache_open() -> clear DEBUG_METADATA_FLAGS
sysfs_slab_add() -> the slab cache is mergeable now
sysfs: cannot create duplicate filename '/kernel/slab/:Ft-0004096'
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.14.0-rc7ajb-00131-gd4c2e9f-dirty #123
Hardware name: linux,dummy-virt (DT)
task: ffffffc07d4e0080 task.stack: ffffff8008008000
PC is at sysfs_warn_dup+0x60/0x7c
LR is at sysfs_warn_dup+0x60/0x7c
pc : lr : pstate: 60000145
Call trace:
sysfs_warn_dup+0x60/0x7c
sysfs_create_dir_ns+0x98/0xa0
kobject_add_internal+0xa0/0x294
kobject_init_and_add+0x90/0xb4
sysfs_slab_add+0x90/0x200
__kmem_cache_create+0x26c/0x438
kmem_cache_create+0x164/0x1f4
sg_pool_init+0x60/0x100
do_one_initcall+0x38/0x12c
kernel_init_freeable+0x138/0x1d4
kernel_init+0x10/0xfc
ret_from_fork+0x10/0x18
Link: http://lkml.kernel.org/r/1510365805-5155-1-git-send-email-miles.chen@mediat…
Signed-off-by: Miles Chen <miles.chen(a)mediatek.com>
Acked-by: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/slub.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5704,6 +5704,10 @@ static int sysfs_slab_add(struct kmem_ca
return 0;
}
+ if (!unmergeable && disable_higher_order_debug &&
+ (slub_debug & DEBUG_METADATA_FLAGS))
+ unmergeable = 1;
+
if (unmergeable) {
/*
* Slabcache can never be merged so we can use the name proper.
Patches currently in stable-queue which might be from miles.chen(a)mediatek.com are
queue-4.14/slub-fix-sysfs-duplicate-filename-creation-when-slub_debug-o.patch