From: Eric Biggers <ebiggers(a)google.com>
Subject: fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
After request_module(), nothing is stopping the module from being unloaded
until someone takes a reference to it via try_get_module().
The WARN_ONCE() in get_fs_type() is thus user-reachable, via userspace
running 'rmmod' concurrently.
Since WARN_ONCE() is for kernel bugs only, not for user-reachable
situations, downgrade this warning to pr_warn_once().
Keep it printed once only, since the intent of this warning is to detect a
bug in modprobe at boot time. Printing the warning more than once
wouldn't really provide any useful extra information.
Link: http://lkml.kernel.org/r/20200312202552.241885-3-ebiggers@kernel.org
Fixes: 41124db869b7 ("fs: warn in case userspace lied about modprobe return")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Reviewed-by: Jessica Yu <jeyu(a)kernel.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv(a)google.com>
Cc: Jessica Yu <jeyu(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: NeilBrown <neilb(a)suse.com>
Cc: <stable(a)vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/filesystems.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/filesystems.c~fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once
+++ a/fs/filesystems.c
@@ -272,7 +272,9 @@ struct file_system_type *get_fs_type(con
fs = __get_fs_type(name, len);
if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
fs = __get_fs_type(name, len);
- WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
+ if (!fs)
+ pr_warn_once("request_module fs-%.*s succeeded, but still no fs?\n",
+ len, name);
}
if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
_
From: Eric Biggers <ebiggers(a)google.com>
Subject: kmod: make request_module() return an error when autoloading is disabled
Patch series "module autoloading fixes and cleanups", v5.
This series fixes a bug where request_module() was reporting success to
kernel code when module autoloading had been completely disabled via 'echo
> /proc/sys/kernel/modprobe'.
It also addresses the issues raised on the original thread
(https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/…)
by documenting the modprobe sysctl, adding a self-test for the empty path
case, and downgrading a user-reachable WARN_ONCE().
This patch (of 4):
It's long been possible to disable kernel module autoloading completely
(while still allowing manual module insertion) by setting
/proc/sys/kernel/modprobe to the empty string. This can be preferable to
setting it to a nonexistent file since it avoids the overhead of an
attempted execve(), avoids potential deadlocks, and avoids the call to
security_kernel_module_request() and thus on SELinux-based systems
eliminates the need to write SELinux rules to dontaudit module_request.
However, when module autoloading is disabled in this way, request_module()
returns 0. This is broken because callers expect 0 to mean that the
module was successfully loaded.
Apparently this was never noticed because this method of disabling module
autoloading isn't used much, and also most callers don't use the return
value of request_module() since it's always necessary to check whether the
module registered its functionality or not anyway. But improperly
returning 0 can indeed confuse a few callers, for example get_fs_type() in
fs/filesystems.c where it causes a WARNING to be hit:
if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
fs = __get_fs_type(name, len);
WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
}
This is easily reproduced with:
echo > /proc/sys/kernel/modprobe
mount -t NONEXISTENT none /
It causes:
request_module fs-NONEXISTENT succeeded, but still no fs?
WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
[...]
This should actually use pr_warn_once() rather than WARN_ONCE(), since
it's also user-reachable if userspace immediately unloads the module.
Regardless, request_module() should correctly return an error when it
fails. So let's make it return -ENOENT, which matches the error when the
modprobe binary doesn't exist.
I've also sent patches to document and test this case.
Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Luis Chamberlain <mcgrof(a)kernel.org>
Reviewed-by: Jessica Yu <jeyu(a)kernel.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv(a)google.com>
Cc: Ben Hutchings <benh(a)debian.org>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/kmod.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/kmod.c~kmod-make-request_module-return-an-error-when-autoloading-is-disabled
+++ a/kernel/kmod.c
@@ -120,7 +120,7 @@ out:
* invoke it.
*
* If module auto-loading support is disabled then this function
- * becomes a no-operation.
+ * simply returns -ENOENT.
*/
int __request_module(bool wait, const char *fmt, ...)
{
@@ -137,7 +137,7 @@ int __request_module(bool wait, const ch
WARN_ON_ONCE(wait && current_is_async());
if (!modprobe_path[0])
- return 0;
+ return -ENOENT;
va_start(args, fmt);
ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
_
From: Changwei Ge <chge(a)linux.alibaba.com>
Subject: ocfs2: no need try to truncate file beyond i_size
Linux fallocate(2) with FALLOC_FL_PUNCH_HOLE mode set, its offset can
exceed inode size. Ocfs2 now does't allow that offset beyond inode size.
This restriction is not necessary and voilates fallocate(2) semantics.
If fallocate(2) offset is beyond inode size, just return success and do
nothing further.
Otherwise, ocfs2 will crash the kernel.
kernel BUG at fs/ocfs2//alloc.c:7264!
ocfs2_truncate_inline+0x20f/0x360 [ocfs2]
? ocfs2_read_blocks+0x2f3/0x5f0 [ocfs2]
ocfs2_remove_inode_range+0x23c/0xcb0 [ocfs2]
? ocfs2_read_inode_block+0x10/0x20 [ocfs2]
? ocfs2_allocate_extend_trans+0x1a0/0x1a0 [ocfs2]
__ocfs2_change_file_space+0x4a5/0x650 [ocfs2]
ocfs2_fallocate+0x83/0xa0 [ocfs2]
? __audit_syscall_entry+0xb8/0x100
? __sb_start_write+0x3b/0x70
vfs_fallocate+0x148/0x230
SyS_fallocate+0x48/0x80
do_syscall_64+0x79/0x170
Link: http://lkml.kernel.org/r/20200407082754.17565-1-chge@linux.alibaba.com
Signed-off-by: Changwei Ge <chge(a)linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/alloc.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/ocfs2/alloc.c~ocfs2-no-need-try-to-truncate-file-beyond-i_size
+++ a/fs/ocfs2/alloc.c
@@ -7402,6 +7402,10 @@ int ocfs2_truncate_inline(struct inode *
struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data;
struct ocfs2_inline_data *idata = &di->id2.i_data;
+ /* No need to punch hole beyond i_size. */
+ if (start >= i_size_read(inode))
+ return 0;
+
if (end > i_size_read(inode))
end = i_size_read(inode);
_
From: Jakub Kicinski <kuba(a)kernel.org>
Subject: mm, memcg: do not high throttle allocators based on wraparound
If a cgroup violates its memory.high constraints, we may end up unduly
penalising it. For example, for the following hierarchy:
A: max high, 20 usage
A/B: 9 high, 10 usage
A/C: max high, 10 usage
We would end up doing the following calculation below when calculating
high delay for A/B:
A/B: 10 - 9 = 1...
A: 20 - PAGE_COUNTER_MAX = 21, so set max_overage to 21.
This gets worse with higher disparities in usage in the parent.
I have no idea how this disappeared from the final version of the patch,
but it is certainly Not Good(tm). This wasn't obvious in testing because,
for a simple cgroup hierarchy with only one child, the result is usually
roughly the same. It's only in more complex hierarchies that things go
really awry (although still, the effects are limited to a maximum of 2
seconds in schedule_timeout_killable at a maximum).
[chris(a)chrisdown.name: changelog]
Link: http://lkml.kernel.org/r/20200331152424.GA1019937@chrisdown.name
Fixes: e26733e0d0ec ("mm, memcg: throttle allocators based on ancestral memory.high")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Chris Down <chris(a)chrisdown.name>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: <stable(a)vger.kernel.org> [5.4.x]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/memcontrol.c~mm-memcg-do-not-high-throttle-allocators-based-on-wraparound
+++ a/mm/memcontrol.c
@@ -2336,6 +2336,9 @@ static unsigned long calculate_high_dela
usage = page_counter_read(&memcg->memory);
high = READ_ONCE(memcg->high);
+ if (usage <= high)
+ continue;
+
/*
* Prevent division by 0 in overage calculation by acting as if
* it was a threshold of 1 page
_
From: Simon Gander <simon(a)tuxera.com>
Subject: hfsplus: fix crash and filesystem corruption when deleting files
When removing files containing extended attributes, the hfsplus driver may
remove the wrong entries from the attributes b-tree, causing major
filesystem damage and in some cases even kernel crashes.
To remove a file, all its extended attributes have to be removed as well.
The driver does this by looking up all keys in the attributes b-tree with
the cnid of the file. Each of these entries then gets deleted using the
key used for searching, which doesn't contain the attribute's name when it
should. Since the key doesn't contain the name, the deletion routine will
not find the correct entry and instead remove the one in front of it. If
parent nodes have to be modified, these become corrupt as well. This
causes invalid links and unsorted entries that not even macOS's fsck_hfs
is able to fix.
To fix this, modify the search key before an entry is deleted from the
attributes b-tree by copying the found entry's key into the search key,
therefore ensuring that the correct entry gets removed from the tree.
Link: http://lkml.kernel.org/r/20200327155541.1521-1-simon@tuxera.com
Signed-off-by: Simon Gander <simon(a)tuxera.com>
Reviewed-by: Anton Altaparmakov <anton(a)tuxera.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hfsplus/attributes.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/hfsplus/attributes.c~hfsplus-fix-crash-and-filesystem-corruption-when-deleting-files
+++ a/fs/hfsplus/attributes.c
@@ -292,6 +292,10 @@ static int __hfsplus_delete_attr(struct
return -ENOENT;
}
+ /* Avoid btree corruption */
+ hfs_bnode_read(fd->bnode, fd->search_key,
+ fd->keyoffset, fd->keylength);
+
err = hfs_brec_remove(fd);
if (err)
return err;
_
Hi!
Please pick the backports for the following upstream commits:
4fbc0c711b24 "ceph: remove the extra slashes in the server path"
b27a939e8376 "ceph: canonicalize server path in place"
They fix an ancient bug that can be reproduced in kernels as old as 4.9 (I
couldn't reproduced it with 4.4).
Cheers,
--
Luis Henriques
Ilya Dryomov (1):
ceph: canonicalize server path in place
Xiubo Li (1):
ceph: remove the extra slashes in the server path
subject of the patch:
ARM: imx: Enable ARM_ERRATA_814220 for i.MX6UL and i.MX7D
the commit ID:
4562fa4c86c92a2df635fe0697c9e06379738741
why you think it should be applied:
ARM_ERRATA_814220 is already available in v5.4.x, but not yet automatically selected for i.MX6UL(L)
and what kernel version you wish it to be applied to.
linux-5.4.y
subject of the patch:
ARM: imx: only select ARM_ERRATA_814220 for ARMv7-A
the commit ID:
c74067a0f776c1d695a713a4388c3b6a094ee40a
why you think it should be applied:
This is a follow up for the previous patch.
and what kernel version you wish it to be applied to.
linux-5.4.y
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1ad53d9fa3f6168ebcf48a50e08b170432da2257 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook(a)chromium.org>
Date: Wed, 1 Apr 2020 21:04:23 -0700
Subject: [PATCH] slub: improve bit diffusion for freelist ptr obfuscation
Under CONFIG_SLAB_FREELIST_HARDENED=y, the obfuscation was relatively weak
in that the ptr and ptr address were usually so close that the first XOR
would result in an almost entirely 0-byte value[1], leaving most of the
"secret" number ultimately being stored after the third XOR. A single
blind memory content exposure of the freelist was generally sufficient to
learn the secret.
Add a swab() call to mix bits a little more. This is a cheap way (1
cycle) to make attacks need more than a single exposure to learn the
secret (or to know _where_ the exposure is in memory).
kmalloc-32 freelist walk, before:
ptr ptr_addr stored value secret
ffff90c22e019020@ffff90c22e019000 is 86528eb656b3b5bd (86528eb656b3b59d)
ffff90c22e019040@ffff90c22e019020 is 86528eb656b3b5fd (86528eb656b3b59d)
ffff90c22e019060@ffff90c22e019040 is 86528eb656b3b5bd (86528eb656b3b59d)
ffff90c22e019080@ffff90c22e019060 is 86528eb656b3b57d (86528eb656b3b59d)
ffff90c22e0190a0@ffff90c22e019080 is 86528eb656b3b5bd (86528eb656b3b59d)
...
after:
ptr ptr_addr stored value secret
ffff9eed6e019020@ffff9eed6e019000 is 793d1135d52cda42 (86528eb656b3b59d)
ffff9eed6e019040@ffff9eed6e019020 is 593d1135d52cda22 (86528eb656b3b59d)
ffff9eed6e019060@ffff9eed6e019040 is 393d1135d52cda02 (86528eb656b3b59d)
ffff9eed6e019080@ffff9eed6e019060 is 193d1135d52cdae2 (86528eb656b3b59d)
ffff9eed6e0190a0@ffff9eed6e019080 is f93d1135d52cdac2 (86528eb656b3b59d)
[1] https://blog.infosectcbr.com.au/2020/03/weaknesses-in-linux-kernel-heap.html
Fixes: 2482ddec670f ("mm: add SLUB free list pointer obfuscation")
Reported-by: Silvio Cesare <silvio.cesare(a)gmail.com>
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: <stable(a)vger.kernel.org>
Link: http://lkml.kernel.org/r/202003051623.AF4F8CB@keescook
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/slub.c b/mm/slub.c
index fc911c222b11..bc949e3428c9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -259,7 +259,7 @@ static inline void *freelist_ptr(const struct kmem_cache *s, void *ptr,
* freepointer to be restored incorrectly.
*/
return (void *)((unsigned long)ptr ^ s->random ^
- (unsigned long)kasan_reset_tag((void *)ptr_addr));
+ swab((unsigned long)kasan_reset_tag((void *)ptr_addr)));
#else
return ptr;
#endif