From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
This reverts commit 536d3bf261a2fc3b05b3e91e7eef7383443015cf, as it can
cause writers to memory.high to get stuck in the kernel forever,
performing page reclaim and consuming excessive amounts of CPU cycles.
Before the patch, a write to memory.high would first put the new limit in
place for the workload, and then reclaim the requested delta. After the
patch, the kernel tries to reclaim the delta before putting the new limit
into place, in order to not overwhelm the workload with a sudden, large
excess over the limit. However, if reclaim is actively racing with new
allocations from the uncurbed workload, it can keep the write() working
inside the kernel indefinitely.
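The difference between the two orderings can be illustrated with a toy
userspace model. This is only a sketch under stated assumptions, not kernel
code: the struct and the toy_* helpers are hypothetical, the workload is
modeled as simply stopping at the limit (the real kernel makes it help with
reclaim instead), and both sides move a fixed number of pages per step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a cgroup; not a kernel structure. */
struct toy_cgroup {
	unsigned long usage;	/* pages currently charged */
	unsigned long high;	/* current memory.high, in pages */
};

/* The workload only grows while it is below the limit in place. */
static void toy_workload_step(struct toy_cgroup *cg, unsigned long alloc)
{
	if (cg->usage < cg->high)
		cg->usage += alloc;
}

/* The writer reclaims a fixed number of pages per loop iteration. */
static void toy_reclaim_step(struct toy_cgroup *cg, unsigned long reclaim)
{
	cg->usage -= (reclaim < cg->usage) ? reclaim : cg->usage;
}

/*
 * Model of a write to memory.high. Returns the number of reclaim
 * iterations the writer performed; max_iters stands in for "forever".
 */
static unsigned int toy_write_high(struct toy_cgroup *cg,
				   unsigned long new_high,
				   bool set_limit_first,
				   unsigned long alloc,
				   unsigned long reclaim,
				   unsigned int max_iters)
{
	unsigned int iters = 0;

	assert(reclaim > 0);

	if (set_limit_first)
		cg->high = new_high;	/* the ordering this revert restores */

	while (cg->usage > new_high && iters < max_iters) {
		toy_reclaim_step(cg, reclaim);
		toy_workload_step(cg, alloc);	/* racing allocations */
		iters++;
	}

	cg->high = new_high;	/* the reverted patch set the limit here */
	return iters;
}
```

In this model, when the workload allocates as fast as the writer reclaims,
the reclaim-first ordering never converges and only stops at max_iters,
whereas the limit-first ordering finishes after a handful of iterations.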
This is causing problems in Facebook production. A privileged
system-level daemon that adjusts memory.high for various workloads running
on a host can get unexpectedly stuck in the kernel and essentially turn
into a sort of involuntary kswapd for one of the workloads. We've
observed that daemon busy-spin in a write() for minutes at a time,
neglecting its other duties on the system, and expending privileged system
resources on behalf of a workload.
To remedy this, we first considered changing the reclaim logic to
break out after a couple of loops - whether the workload has converged to
the new limit or not - and bound the write() call this way. However, the
root cause that inspired the sequence change in the first place has been
fixed through other means, and so a revert back to the proven
limit-setting sequence, also used by memory.max, is preferable.
The sequence was changed to avoid extreme latencies in the workload when
the limit was lowered: the sudden, large excess created by the limit
lowering would erroneously trigger the penalty sleeping code that is meant
to throttle excessive growth from below. Allocating threads could end up
sleeping long after the write() had already reclaimed the delta for which
they were being punished.
However, erroneous throttling also caused problems in other scenarios at
around the same time. This resulted in commit b3ff92916af3 ("mm, memcg:
reclaim more aggressively before high allocator throttling"), included in
the same release as the offending commit. When allocating threads now
encounter large excess caused by a racing write() to memory.high, instead
of entering punitive sleeps, they will simply be tasked with helping
reclaim down the excess, and will be held no longer than it takes to
accomplish that. This is in line with regular limit enforcement - i.e.
if the workload allocates up against or over an otherwise unchanged limit
from below.
With the patch breaking userspace, and the root cause addressed by other
means already, revert it again.
Link: https://lkml.kernel.org/r/20210122184341.292461-1-hannes@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Chris Down <chris(a)chrisdown.name>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Michal Koutný <mkoutny(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/memcontrol.c~revert-mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh
+++ a/mm/memcontrol.c
@@ -6271,6 +6271,8 @@ static ssize_t memory_high_write(struct
if (err)
return err;
+ page_counter_set_high(&memcg->memory, high);
+
for (;;) {
unsigned long nr_pages = page_counter_read(&memcg->memory);
unsigned long reclaimed;
@@ -6294,10 +6296,7 @@ static ssize_t memory_high_write(struct
break;
}
- page_counter_set_high(&memcg->memory, high);
-
memcg_wb_domain_size_changed(memcg);
-
return nbytes;
}
_
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
Currently there is an assumption in tmpfs that 64-bit architectures also
have a 64-bit ino_t. This is not true on s390 which has a 32-bit ino_t.
With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, but passing the "inode64" mount
option will fail. This leads to the following behavior:
    # mkdir mnt
    # mount -t tmpfs nodev mnt
    # mount -o remount,rw mnt
    mount: /home/ubuntu/mnt: mount point not mounted or bad option.
The remount fails because mount sees "inode64" in the mount options and
thus passes it in the options for the remount.
So prevent CONFIG_TMPFS_INODE64 from being selected on s390.
Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: <stable(a)vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/Kconfig~tmpfs-disallow-config_tmpfs_inode64-on-s390
+++ a/fs/Kconfig
@@ -203,7 +203,7 @@ config TMPFS_XATTR
config TMPFS_INODE64
bool "Use 64-bit ino_t by default in tmpfs"
- depends on TMPFS && 64BIT
+ depends on TMPFS && 64BIT && !S390
default n
help
tmpfs has historically used only inode numbers as wide as an unsigned
_
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in xattr id lookup
Syzbot has reported a warning where a kmalloc() attempt exceeds the
maximum limit. This has been identified as corruption of the xattr_ids
count when reading the xattr id lookup table.
This patch adds a number of additional sanity checks to detect this
corruption and others.
1. It checks for a corrupted xattr index read from the inode. This could
be because the metadata block is uncompressed, or because the
"compression" bit has been corrupted (turning a compressed block
into an uncompressed block). This would cause an out of bounds read.
2. It checks against corruption of the xattr_ids count. This can either
lead to the above kmalloc failure, or a smaller than expected
table to be read.
3. It checks the contents of the index table for corruption.
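As a rough userspace illustration of check 3, the index table validation
could be sketched as follows. This is not the kernel function: the name
index_table_is_sane() is hypothetical, the entries are assumed to be
already converted to host byte order, and METADATA_SIZE is assumed to
mirror SQUASHFS_METADATA_SIZE (8192).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define METADATA_SIZE 8192u	/* assumed value of SQUASHFS_METADATA_SIZE */

/*
 * Each entry must be strictly less than the next, and no more than
 * METADATA_SIZE bytes before it. The last entry must sit below
 * table_start, again by at most METADATA_SIZE bytes.
 */
static bool index_table_is_sane(const uint64_t *table, size_t indexes,
				uint64_t table_start)
{
	uint64_t last;
	size_t n;

	assert(table != NULL && indexes > 0);

	for (n = 0; n + 1 < indexes; n++) {
		uint64_t start = table[n];
		uint64_t end = table[n + 1];

		if (start >= end || (end - start) > METADATA_SIZE)
			return false;
	}

	last = table[indexes - 1];
	if (last >= table_start || (table_start - last) > METADATA_SIZE)
		return false;

	return true;
}
```

Checking "start >= end" before subtracting keeps the unsigned difference
from wrapping around, which is why the two conditions appear in that order.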
[phillip(a)squashfs.org.uk: fix checkpatch issue]
Link: https://lkml.kernel.org/r/270245655.754655.1612770082682@webmail.123-reg.co…
Link: https://lkml.kernel.org/r/20210204130249.4495-5-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+2ccea6339d368360800d(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/xattr_id.c | 66 +++++++++++++++++++++++++++++++++------
1 file changed, 57 insertions(+), 9 deletions(-)
--- a/fs/squashfs/xattr_id.c~squashfs-add-more-sanity-checks-in-xattr-id-lookup
+++ a/fs/squashfs/xattr_id.c
@@ -31,10 +31,15 @@ int squashfs_xattr_lookup(struct super_b
struct squashfs_sb_info *msblk = sb->s_fs_info;
int block = SQUASHFS_XATTR_BLOCK(index);
int offset = SQUASHFS_XATTR_BLOCK_OFFSET(index);
- u64 start_block = le64_to_cpu(msblk->xattr_id_table[block]);
+ u64 start_block;
struct squashfs_xattr_id id;
int err;
+ if (index >= msblk->xattr_ids)
+ return -EINVAL;
+
+ start_block = le64_to_cpu(msblk->xattr_id_table[block]);
+
err = squashfs_read_metadata(sb, &id, &start_block, &offset,
sizeof(id));
if (err < 0)
@@ -50,13 +55,17 @@ int squashfs_xattr_lookup(struct super_b
/*
* Read uncompressed xattr id lookup table indexes from disk into memory
*/
-__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start,
+__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start,
u64 *xattr_table_start, int *xattr_ids)
{
- unsigned int len;
+ struct squashfs_sb_info *msblk = sb->s_fs_info;
+ unsigned int len, indexes;
struct squashfs_xattr_id_table *id_table;
+ __le64 *table;
+ u64 start, end;
+ int n;
- id_table = squashfs_read_table(sb, start, sizeof(*id_table));
+ id_table = squashfs_read_table(sb, table_start, sizeof(*id_table));
if (IS_ERR(id_table))
return (__le64 *) id_table;
@@ -70,13 +79,52 @@ __le64 *squashfs_read_xattr_id_table(str
if (*xattr_ids == 0)
return ERR_PTR(-EINVAL);
- /* xattr_table should be less than start */
- if (*xattr_table_start >= start)
+ len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
+ indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
+
+ /*
+ * The computed size of the index table (len bytes) should exactly
+ * match the table start and end points
+ */
+ start = table_start + sizeof(*id_table);
+ end = msblk->bytes_used;
+
+ if (len != (end - start))
return ERR_PTR(-EINVAL);
- len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
+ table = squashfs_read_table(sb, start, len);
+ if (IS_ERR(table))
+ return table;
+
+ /* table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed xattr id blocks. Each entry should be less than
+ * the next (i.e. table[0] < table[1]), and the difference between them
+ * should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1]
+ * should be less than table_start, and again the difference
+ * should be SQUASHFS_METADATA_SIZE or less.
+ *
+ * Finally xattr_table_start should be less than table[0].
+ */
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
- TRACE("In read_xattr_index_table, length %d\n", len);
+ if (*xattr_table_start >= le64_to_cpu(table[0])) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
- return squashfs_read_table(sb, start + sizeof(*id_table), len);
+ return table;
}
_
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in inode lookup
Syzbot has reported a "slab-out-of-bounds read" error which has been
identified as being caused by a corrupted "ino_num" value read from the
inode. This could be because the metadata block is uncompressed, or
because the "compression" bit has been corrupted (turning a compressed
block into an uncompressed block).
This patch adds additional sanity checks to detect this, as well as the
following corruption:
1. It checks against corruption of the inodes count. This can either
lead to a larger table to be read, or a smaller than expected
table to be read.
In the case of a too large inodes count, this would often have been
trapped by the existing sanity checks, but this patch introduces
a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
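To see why an exact size check catches counts that are too small, consider
the following userspace sketch. It is an illustration only: the helper
names are hypothetical, and the size arithmetic is assumed to mirror
SQUASHFS_LOOKUP_BLOCK_BYTES() with an 8192-byte metadata block and 8-byte
entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define METADATA_SIZE 8192u	/* assumed value of SQUASHFS_METADATA_SIZE */

/* Bytes taken by the index table that locates `inodes` lookup entries. */
static uint64_t lookup_index_bytes(unsigned int inodes)
{
	uint64_t bytes, blocks;

	assert(inodes > 0);

	bytes = (uint64_t)inodes * sizeof(uint64_t);
	blocks = (bytes + METADATA_SIZE - 1) / METADATA_SIZE;
	return blocks * sizeof(uint64_t);
}

/* Old-style check: the table merely must not run into the next table. */
static bool old_length_check(uint64_t start, uint64_t next,
			     unsigned int inodes)
{
	return start + lookup_index_bytes(inodes) <= next;
}

/* New-style check: the table must fill the gap exactly. */
static bool new_length_check(uint64_t start, uint64_t next,
			     unsigned int inodes)
{
	return lookup_index_bytes(inodes) == next - start;
}
```

With a 24-byte gap between the table start and the next table, a genuine
count of 3000 inodes passes both checks, while a corrupted count of 1000
(an 8-byte table) passes only the old one.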
[phillip(a)squashfs.org.uk: fix checkpatch issue]
Link: https://lkml.kernel.org/r/527909353.754618.1612769948607@webmail.123-reg.co…
Link: https://lkml.kernel.org/r/20210204130249.4495-4-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+04419e3ff19d2970ea28(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/export.c | 41 +++++++++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 8 deletions(-)
--- a/fs/squashfs/export.c~squashfs-add-more-sanity-checks-in-inode-lookup
+++ a/fs/squashfs/export.c
@@ -41,12 +41,17 @@ static long long squashfs_inode_lookup(s
struct squashfs_sb_info *msblk = sb->s_fs_info;
int blk = SQUASHFS_LOOKUP_BLOCK(ino_num - 1);
int offset = SQUASHFS_LOOKUP_BLOCK_OFFSET(ino_num - 1);
- u64 start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+ u64 start;
__le64 ino;
int err;
TRACE("Entered squashfs_inode_lookup, inode_number = %d\n", ino_num);
+ if (ino_num == 0 || (ino_num - 1) >= msblk->inodes)
+ return -EINVAL;
+
+ start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+
err = squashfs_read_metadata(sb, &ino, &start, &offset, sizeof(ino));
if (err < 0)
return err;
@@ -111,7 +116,10 @@ __le64 *squashfs_read_inode_lookup_table
u64 lookup_table_start, u64 next_table, unsigned int inodes)
{
unsigned int length = SQUASHFS_LOOKUP_BLOCK_BYTES(inodes);
+ unsigned int indexes = SQUASHFS_LOOKUP_BLOCKS(inodes);
+ int n;
__le64 *table;
+ u64 start, end;
TRACE("In read_inode_lookup_table, length %d\n", length);
@@ -121,20 +129,37 @@ __le64 *squashfs_read_inode_lookup_table
if (inodes == 0)
return ERR_PTR(-EINVAL);
- /* length bytes should not extend into the next table - this check
- * also traps instances where lookup_table_start is incorrectly larger
- * than the next table start
+ /*
+ * The computed size of the lookup table (length bytes) should exactly
+ * match the table start and end points
*/
- if (lookup_table_start + length > next_table)
+ if (length != (next_table - lookup_table_start))
return ERR_PTR(-EINVAL);
table = squashfs_read_table(sb, lookup_table_start, length);
+ if (IS_ERR(table))
+ return table;
/*
- * table[0] points to the first inode lookup table metadata block,
- * this should be less than lookup_table_start
+ * table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed inode lookup blocks. Each entry should be
+ * less than the next (i.e. table[0] < table[1]), and the difference
+ * between them should be SQUASHFS_METADATA_SIZE or less.
+ * table[indexes - 1] should be less than lookup_table_start, and
+ * again the difference should be SQUASHFS_METADATA_SIZE or less
*/
- if (!IS_ERR(table) && le64_to_cpu(table[0]) >= lookup_table_start) {
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) {
kfree(table);
return ERR_PTR(-EINVAL);
}
_
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: add more sanity checks in id lookup
Syzbot has reported a number of "slab-out-of-bounds reads" and
"use-after-free read" errors which have been identified as being caused by
a corrupted index value read from the inode. This could be because the
metadata block is uncompressed, or because the "compression" bit has been
corrupted (turning a compressed block into an uncompressed block).
This patch adds additional sanity checks to detect this, as well as the
following corruption:
1. It checks against corruption of the ids count. This can either
lead to a larger table to be read, or a smaller than expected
table to be read.
In the case of a too large ids count, this would often have been
trapped by the existing sanity checks, but this patch introduces
a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
Link: https://lkml.kernel.org/r/20210204130249.4495-3-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: syzbot+b06d57ba83f604522af2(a)syzkaller.appspotmail.com
Reported-by: syzbot+c021ba012da41ee9807c(a)syzkaller.appspotmail.com
Reported-by: syzbot+5024636e8b5fd19f0f19(a)syzkaller.appspotmail.com
Reported-by: syzbot+bcbc661df46657d0fa4f(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/id.c | 40 ++++++++++++++++++++++++++-------
fs/squashfs/squashfs_fs_sb.h | 1
fs/squashfs/super.c | 6 ++--
fs/squashfs/xattr.h | 10 +++++++-
4 files changed, 45 insertions(+), 12 deletions(-)
--- a/fs/squashfs/id.c~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/id.c
@@ -35,10 +35,15 @@ int squashfs_get_id(struct super_block *
struct squashfs_sb_info *msblk = sb->s_fs_info;
int block = SQUASHFS_ID_BLOCK(index);
int offset = SQUASHFS_ID_BLOCK_OFFSET(index);
- u64 start_block = le64_to_cpu(msblk->id_table[block]);
+ u64 start_block;
__le32 disk_id;
int err;
+ if (index >= msblk->ids)
+ return -EINVAL;
+
+ start_block = le64_to_cpu(msblk->id_table[block]);
+
err = squashfs_read_metadata(sb, &disk_id, &start_block, &offset,
sizeof(disk_id));
if (err < 0)
@@ -56,7 +61,10 @@ __le64 *squashfs_read_id_index_table(str
u64 id_table_start, u64 next_table, unsigned short no_ids)
{
unsigned int length = SQUASHFS_ID_BLOCK_BYTES(no_ids);
+ unsigned int indexes = SQUASHFS_ID_BLOCKS(no_ids);
+ int n;
__le64 *table;
+ u64 start, end;
TRACE("In read_id_index_table, length %d\n", length);
@@ -67,20 +75,36 @@ __le64 *squashfs_read_id_index_table(str
return ERR_PTR(-EINVAL);
/*
- * length bytes should not extend into the next table - this check
- * also traps instances where id_table_start is incorrectly larger
- * than the next table start
+ * The computed size of the index table (length bytes) should exactly
+ * match the table start and end points
*/
- if (id_table_start + length > next_table)
+ if (length != (next_table - id_table_start))
return ERR_PTR(-EINVAL);
table = squashfs_read_table(sb, id_table_start, length);
+ if (IS_ERR(table))
+ return table;
/*
- * table[0] points to the first id lookup table metadata block, this
- * should be less than id_table_start
+ * table[0], table[1], ... table[indexes - 1] store the locations
+ * of the compressed id blocks. Each entry should be less than
+ * the next (i.e. table[0] < table[1]), and the difference between them
+ * should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1]
+ * should be less than id_table_start, and again the difference
+ * should be SQUASHFS_METADATA_SIZE or less
*/
- if (!IS_ERR(table) && le64_to_cpu(table[0]) >= id_table_start) {
+ for (n = 0; n < (indexes - 1); n++) {
+ start = le64_to_cpu(table[n]);
+ end = le64_to_cpu(table[n + 1]);
+
+ if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+ kfree(table);
+ return ERR_PTR(-EINVAL);
+ }
+ }
+
+ start = le64_to_cpu(table[indexes - 1]);
+ if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) {
kfree(table);
return ERR_PTR(-EINVAL);
}
--- a/fs/squashfs/squashfs_fs_sb.h~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/squashfs_fs_sb.h
@@ -64,5 +64,6 @@ struct squashfs_sb_info {
unsigned int inodes;
unsigned int fragments;
int xattr_ids;
+ unsigned int ids;
};
#endif
--- a/fs/squashfs/super.c~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/super.c
@@ -166,6 +166,7 @@ static int squashfs_fill_super(struct su
msblk->directory_table = le64_to_cpu(sblk->directory_table_start);
msblk->inodes = le32_to_cpu(sblk->inodes);
msblk->fragments = le32_to_cpu(sblk->fragments);
+ msblk->ids = le16_to_cpu(sblk->no_ids);
flags = le16_to_cpu(sblk->flags);
TRACE("Found valid superblock on %pg\n", sb->s_bdev);
@@ -177,7 +178,7 @@ static int squashfs_fill_super(struct su
TRACE("Block size %d\n", msblk->block_size);
TRACE("Number of inodes %d\n", msblk->inodes);
TRACE("Number of fragments %d\n", msblk->fragments);
- TRACE("Number of ids %d\n", le16_to_cpu(sblk->no_ids));
+ TRACE("Number of ids %d\n", msblk->ids);
TRACE("sblk->inode_table_start %llx\n", msblk->inode_table);
TRACE("sblk->directory_table_start %llx\n", msblk->directory_table);
TRACE("sblk->fragment_table_start %llx\n",
@@ -236,8 +237,7 @@ static int squashfs_fill_super(struct su
allocate_id_index_table:
/* Allocate and read id index table */
msblk->id_table = squashfs_read_id_index_table(sb,
- le64_to_cpu(sblk->id_table_start), next_table,
- le16_to_cpu(sblk->no_ids));
+ le64_to_cpu(sblk->id_table_start), next_table, msblk->ids);
if (IS_ERR(msblk->id_table)) {
errorf(fc, "unable to read id index table");
err = PTR_ERR(msblk->id_table);
--- a/fs/squashfs/xattr.h~squashfs-add-more-sanity-checks-in-id-lookup
+++ a/fs/squashfs/xattr.h
@@ -17,8 +17,16 @@ extern int squashfs_xattr_lookup(struct
static inline __le64 *squashfs_read_xattr_id_table(struct super_block *sb,
u64 start, u64 *xattr_table_start, int *xattr_ids)
{
+ struct squashfs_xattr_id_table *id_table;
+
+ id_table = squashfs_read_table(sb, start, sizeof(*id_table));
+ if (IS_ERR(id_table))
+ return (__le64 *) id_table;
+
+ *xattr_table_start = le64_to_cpu(id_table->xattr_table_start);
+ kfree(id_table);
+
ERROR("Xattrs in filesystem, these will be ignored\n");
- *xattr_table_start = start;
return ERR_PTR(-ENOTSUPP);
}
_
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: avoid out of bounds writes in decompressors
Patch series "Squashfs: fix BIO migration regression and add sanity checks".
Patch [1/4] fixes a regression introduced by the "migrate from ll_rw_block
usage to BIO" patch, which has produced a number of Syzbot/Syzkaller
reports.
Patches [2/4], [3/4], and [4/4] fix a number of filesystem corruption
issues which have produced Syzbot reports in the id, inode and xattr
lookup code.
Each patch has been tested against the Syzbot reproducers using the given
kernel configuration. They have the appropriate "Reported-by:" lines
added.
Additionally, all of the reproducer filesystems are indirectly fixed by
patch [4/4] due to the fact they all have xattr corruption which is now
detected there.
Additional testing with other configurations and architectures (32bit, big
endian), and normal filesystems has also been done to trap any inadvertent
regressions caused by the additional sanity checks.
This patch (of 4):
This is a regression introduced by the patch "migrate from ll_rw_block
usage to BIO".
Syzbot/Syzkaller has reported a number of "out of bounds writes" and
"unable to handle kernel paging request in squashfs_decompress" errors
which have been identified as a regression introduced by the above patch.
Specifically, the patch removed the following sanity check
if (length < 0 || length > output->length ||
(index + length) > msblk->bytes_used)
This check did two things:
1. It ensured any reads were not beyond the end of the filesystem
2. It ensured that the "length" field read from the filesystem
was within the expected maximum length. Without this any
corrupted values can over-run allocated buffers.
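The two properties can be shown with a small userspace predicate. This is
a sketch only: the function name and its parameters are hypothetical
stand-ins for "length", "output->length", "index" and "msblk->bytes_used"
in the check quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when a block of `length` bytes starting at byte `index`
 * fits both the output buffer and the filesystem image.
 */
static bool block_read_is_sane(int length, int output_length,
			       uint64_t index, uint64_t bytes_used)
{
	assert(output_length >= 0);

	/* A corrupted length must not over-run the allocated buffer. */
	if (length < 0 || length > output_length)
		return false;

	/* The read must not go beyond the end of the filesystem. */
	if (index + (uint64_t)length > bytes_used)
		return false;

	return true;
}
```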
Link: https://lkml.kernel.org/r/20210204130249.4495-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20210204130249.4495-2-phillip@squashfs.org.uk
Fixes: 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO")
Reported-by: syzbot+6fba78f99b9afd4b5634(a)syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Cc: Philippe Liard <pliard(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/block.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/squashfs/block.c~squashfs-avoid-out-of-bounds-writes-in-decompressors
+++ a/fs/squashfs/block.c
@@ -196,9 +196,15 @@ int squashfs_read_data(struct super_bloc
length = SQUASHFS_COMPRESSED_SIZE(length);
index += 2;
- TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
+ TRACE("Block @ 0x%llx, %scompressed size %d\n", index - 2,
compressed ? "" : "un", length);
}
+ if (length < 0 || length > output->length ||
+ (index + length) > msblk->bytes_used) {
+ res = -EIO;
+ goto out;
+ }
+
if (next_index)
*next_index = index + length;
_
This is the start of the stable review cycle for the 4.9.257 release.
There are 43 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.257-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.257-rc1
Shih-Yuan Lee (FourDollars) <sylee(a)canonical.com>
ALSA: hda/realtek - Fix typo of pincfg for Dell quirk
Nadav Amit <namit(a)vmware.com>
iommu/vt-d: Do not use flush-queue when caching-mode is on
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
ACPI: thermal: Do not call acpi_thermal_check() directly
Benjamin Valentin <benpicco(a)googlemail.com>
Input: xpad - sync supported devices with fork on GitHub
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/apic: Add extra serialization for non-serializing MSRs
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/build: Disable CET instrumentation in the kernel
Hugh Dickins <hughd(a)google.com>
mm: thp: fix MADV_REMOVE deadlock on shmem THP
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlb: fix a race between isolating and freeing page
Muchun Song <songmuchun(a)bytedance.com>
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Russell King <rmk+kernel(a)armlinux.org.uk>
ARM: footbridge: fix dc21285 PCI configuration accessors
Fengnan Chang <fengnanchang(a)gmail.com>
mmc: core: Limit retries when analyse of SDIO tuples fails
Aurelien Aptel <aaptel(a)suse.com>
cifs: report error instead of invalid when revalidating a dentry fails
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: fix bounce buffer usage for non-sg list case
Wang ShaoBo <bobo.shaobowang(a)huawei.com>
kretprobe: Avoid re-registration of the same kretprobe earlier
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix station rate table updates on assoc
Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
usb: dwc2: Fix endpoint direction check in ep_from_windex
Jeremy Figgins <kernel(a)jeremyfiggins.com>
USB: usblp: don't call usb_set_interface if there's a single alt
Dan Carpenter <dan.carpenter(a)oracle.com>
USB: gadget: legacy: fix an error code in eth_bind()
Arnd Bergmann <arnd(a)arndb.de>
elfcore: fix building with clang
Xie He <xie.he.0141(a)gmail.com>
net: lapb: Copy the skb before sending a packet
Alexey Dobriyan <adobriyan(a)gmail.com>
Input: i8042 - unbreak Pegatron C15B
Christoph Schemmel <christoph.schemmel(a)gmail.com>
USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin <bg4akv(a)hotmail.com>
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran <Pho.Tran(a)silabs.com>
USB: serial: cp210x: add pid/vid for WSDA-200-USB
Sasha Levin <sashal(a)kernel.org>
stable: clamp SUBLEVEL in 4.4 and 4.9
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't fail on missing symbol table
Brian King <brking(a)linux.vnet.ibm.com>
scsi: ibmvfc: Set default timeout to avoid crash during migration
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix fast-rx encryption check
Javed Hasan <jhasan(a)marvell.com>
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Thomas Gleixner <tglx(a)linutronix.de>
futex: Handle faults correctly for PI futexes
Thomas Gleixner <tglx(a)linutronix.de>
futex: Simplify fixup_pi_state_owner()
Thomas Gleixner <tglx(a)linutronix.de>
futex: Use pi_state_update_owner() in put_pi_state()
Thomas Gleixner <tglx(a)linutronix.de>
rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
Thomas Gleixner <tglx(a)linutronix.de>
futex: Provide and use pi_state_update_owner()
Thomas Gleixner <tglx(a)linutronix.de>
futex: Replace pointless printk in fixup_owner()
Peter Zijlstra <peterz(a)infradead.org>
futex: Avoid violating the 10th rule of futex
Peter Zijlstra <peterz(a)infradead.org>
futex: Rework inconsistent rt_mutex/futex_q state
Peter Zijlstra <peterz(a)infradead.org>
futex: Remove rt_mutex_deadlock_account_*()
Peter Zijlstra <peterz(a)infradead.org>
futex,rt_mutex: Provide futex specific rt_mutex API
Eric Dumazet <edumazet(a)google.com>
net_sched: reject silly cell_log in qdisc_get_rtab()
Lijun Pan <ljp(a)linux.ibm.com>
ibmvnic: Ensure that CRQ entry read are correctly ordered
Pan Bian <bianpan2016(a)163.com>
net: dsa: bcm_sf2: put device node before return
-------------
Diffstat:
Makefile | 12 +-
arch/arm/mach-footbridge/dc21285.c | 12 +-
arch/x86/Makefile | 3 +
arch/x86/include/asm/apic.h | 10 --
arch/x86/include/asm/barrier.h | 18 +++
arch/x86/kernel/apic/apic.c | 4 +
arch/x86/kernel/apic/x2apic_cluster.c | 6 +-
arch/x86/kernel/apic/x2apic_phys.c | 6 +-
drivers/acpi/thermal.c | 55 ++++---
drivers/input/joystick/xpad.c | 17 ++-
drivers/input/serio/i8042-x86ia64io.h | 2 +
drivers/iommu/intel-iommu.c | 6 +
drivers/mmc/core/sdio_cis.c | 6 +
drivers/net/dsa/bcm_sf2.c | 8 +-
drivers/net/ethernet/ibm/ibmvnic.c | 6 +
drivers/scsi/ibmvscsi/ibmvfc.c | 4 +-
drivers/scsi/libfc/fc_exch.c | 16 +-
drivers/usb/class/usblp.c | 19 ++-
drivers/usb/dwc2/gadget.c | 8 +-
drivers/usb/gadget/legacy/ether.c | 4 +-
drivers/usb/host/xhci-ring.c | 31 ++--
drivers/usb/serial/cp210x.c | 2 +
drivers/usb/serial/option.c | 6 +
fs/cifs/dir.c | 22 ++-
fs/hugetlbfs/inode.c | 3 +-
include/linux/elfcore.h | 22 +++
include/linux/hugetlb.h | 3 +
kernel/Makefile | 1 -
kernel/elfcore.c | 25 ---
kernel/futex.c | 276 +++++++++++++++++++---------------
kernel/kprobes.c | 4 +
kernel/locking/rtmutex-debug.c | 9 --
kernel/locking/rtmutex-debug.h | 3 -
kernel/locking/rtmutex.c | 127 ++++++++++------
kernel/locking/rtmutex.h | 2 -
kernel/locking/rtmutex_common.h | 12 +-
mm/huge_memory.c | 37 +++--
mm/hugetlb.c | 9 +-
net/lapb/lapb_out.c | 3 +-
net/mac80211/driver-ops.c | 5 +-
net/mac80211/rate.c | 3 +-
net/mac80211/rx.c | 2 +
net/sched/sch_api.c | 3 +-
sound/pci/hda/patch_realtek.c | 2 +-
tools/objtool/elf.c | 7 +-
45 files changed, 521 insertions(+), 320 deletions(-)
This is the start of the stable review cycle for the 4.14.220 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 07 Feb 2021 14:06:42 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.220-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.220-rc1
Peter Zijlstra <peterz(a)infradead.org>
kthread: Extract KTHREAD_IS_PER_CPU
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't fail on missing symbol table
Brian King <brking(a)linux.vnet.ibm.com>
scsi: ibmvfc: Set default timeout to avoid crash during migration
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix fast-rx encryption check
Javed Hasan <jhasan(a)marvell.com>
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Martin Wilck <mwilck(a)suse.com>
scsi: scsi_transport_srp: Don't block target in failfast state
Peter Zijlstra <peterz(a)infradead.org>
x86: __always_inline __{rd,wr}msr()
Tony Lindgren <tony(a)atomide.com>
phy: cpcap-usb: Fix warning for missing regulator_disable
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
driver core: Extend device_is_dependent()
Benjamin Gaignard <benjamin.gaignard(a)linaro.org>
base: core: Remove WARN_ON from link dependencies check
Eric Dumazet <edumazet(a)google.com>
net_sched: gen_estimator: support large ewma log
Eric Dumazet <edumazet(a)google.com>
net_sched: reject silly cell_log in qdisc_get_rtab()
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
ACPI: thermal: Do not call acpi_thermal_check() directly
Lijun Pan <ljp(a)linux.ibm.com>
ibmvnic: Ensure that CRQ entry read are correctly ordered
Pan Bian <bianpan2016(a)163.com>
net: dsa: bcm_sf2: put device node before return
-------------
Diffstat:
 Makefile                             |  4 +--
 arch/x86/include/asm/msr.h           |  4 +--
 drivers/acpi/thermal.c               | 55 +++++++++++++++++++++++++-----------
 drivers/base/core.c                  | 19 +++++++++++--
 drivers/net/dsa/bcm_sf2.c            |  8 ++++--
 drivers/net/ethernet/ibm/ibmvnic.c   |  6 ++++
 drivers/phy/motorola/phy-cpcap-usb.c | 19 +++++++++----
 drivers/scsi/ibmvscsi/ibmvfc.c       |  4 ++-
 drivers/scsi/libfc/fc_exch.c         | 16 +++++++++--
 drivers/scsi/scsi_transport_srp.c    |  9 +++++-
 include/linux/kthread.h              |  3 ++
 kernel/kthread.c                     | 27 +++++++++++++++++-
 kernel/smpboot.c                     |  1 +
 net/core/gen_estimator.c             | 11 +++++---
 net/mac80211/rx.c                    |  2 ++
 net/sched/sch_api.c                  |  3 +-
 tools/objtool/elf.c                  |  7 +++--
 17 files changed, 155 insertions(+), 43 deletions(-)
This is the start of the stable review cycle for the 4.14.221 release.
There are 30 patches in this series; all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.221-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
    Linux 4.14.221-rc1
DENG Qingfang <dqfext(a)gmail.com>
    net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Nadav Amit <namit(a)vmware.com>
    iommu/vt-d: Do not use flush-queue when caching-mode is on
Benjamin Valentin <benpicco(a)googlemail.com>
    Input: xpad - sync supported devices with fork on GitHub
Dave Hansen <dave.hansen(a)linux.intel.com>
    x86/apic: Add extra serialization for non-serializing MSRs
Josh Poimboeuf <jpoimboe(a)redhat.com>
    x86/build: Disable CET instrumentation in the kernel
Hugh Dickins <hughd(a)google.com>
    mm: thp: fix MADV_REMOVE deadlock on shmem THP
Muchun Song <songmuchun(a)bytedance.com>
    mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song <songmuchun(a)bytedance.com>
    mm: hugetlb: fix a race between isolating and freeing page
Muchun Song <songmuchun(a)bytedance.com>
    mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Russell King <rmk+kernel(a)armlinux.org.uk>
    ARM: footbridge: fix dc21285 PCI configuration accessors
Thorsten Leemhuis <linux(a)leemhuis.info>
    nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Fengnan Chang <fengnanchang(a)gmail.com>
    mmc: core: Limit retries when analyse of SDIO tuples fails
Gustavo A. R. Silva <gustavoars(a)kernel.org>
    smb3: Fix out-of-bounds bug in SMB2_negotiate()
Aurelien Aptel <aaptel(a)suse.com>
    cifs: report error instead of invalid when revalidating a dentry fails
Mathias Nyman <mathias.nyman(a)linux.intel.com>
    xhci: fix bounce buffer usage for non-sg list case
Wang ShaoBo <bobo.shaobowang(a)huawei.com>
    kretprobe: Avoid re-registration of the same kretprobe earlier
Felix Fietkau <nbd(a)nbd.name>
    mac80211: fix station rate table updates on assoc
Liangyan <liangyan.peng(a)linux.alibaba.com>
    ovl: fix dentry leak in ovl_get_redirect
Heiko Stuebner <heiko.stuebner(a)theobroma-systems.com>
    usb: dwc2: Fix endpoint direction check in ep_from_windex
Jeremy Figgins <kernel(a)jeremyfiggins.com>
    USB: usblp: don't call usb_set_interface if there's a single alt
Dan Carpenter <dan.carpenter(a)oracle.com>
    USB: gadget: legacy: fix an error code in eth_bind()
Wei Wang <weiwan(a)google.com>
    ipv4: fix race condition between route lookup and invalidation
Arnd Bergmann <arnd(a)arndb.de>
    elfcore: fix building with clang
Josh Poimboeuf <jpoimboe(a)redhat.com>
    objtool: Support Clang non-section symbols in ORC generation
Xie He <xie.he.0141(a)gmail.com>
    net: lapb: Copy the skb before sending a packet
Zyta Szpak <zr(a)semihalf.com>
    arm64: dts: ls1046a: fix dcfg address range
Alexey Dobriyan <adobriyan(a)gmail.com>
    Input: i8042 - unbreak Pegatron C15B
Christoph Schemmel <christoph.schemmel(a)gmail.com>
    USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin <bg4akv(a)hotmail.com>
    USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran <Pho.Tran(a)silabs.com>
    USB: serial: cp210x: add pid/vid for WSDA-200-USB
-------------
Diffstat:
 Makefile                                       | 10 ++-----
 arch/arm/mach-footbridge/dc21285.c             | 12 ++++----
 arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi |  2 +-
 arch/x86/Makefile                              |  3 ++
 arch/x86/include/asm/apic.h                    | 10 -------
 arch/x86/include/asm/barrier.h                 | 18 ++++++++++++
 arch/x86/kernel/apic/apic.c                    |  4 +++
 arch/x86/kernel/apic/x2apic_cluster.c          |  6 ++--
 arch/x86/kernel/apic/x2apic_phys.c             |  6 ++--
 drivers/input/joystick/xpad.c                  | 17 +++++++++++-
 drivers/input/serio/i8042-x86ia64io.h          |  2 ++
 drivers/iommu/intel-iommu.c                    |  6 ++++
 drivers/mmc/core/sdio_cis.c                    |  6 ++++
 drivers/net/dsa/mv88e6xxx/chip.c               |  6 +++-
 drivers/nvme/host/pci.c                        |  2 ++
 drivers/usb/class/usblp.c                      | 19 +++++++------
 drivers/usb/dwc2/gadget.c                      |  8 +-----
 drivers/usb/gadget/legacy/ether.c              |  4 ++-
 drivers/usb/host/xhci-ring.c                   | 31 +++++++++++++--------
 drivers/usb/serial/cp210x.c                    |  2 ++
 drivers/usb/serial/option.c                    |  6 ++++
 fs/cifs/dir.c                                  | 22 +++++++++++++--
 fs/cifs/smb2pdu.h                              |  2 +-
 fs/hugetlbfs/inode.c                           |  3 +-
 fs/overlayfs/dir.c                             |  2 +-
 include/linux/elfcore.h                        | 22 +++++++++++++++
 include/linux/hugetlb.h                        |  3 ++
 kernel/Makefile                                |  1 -
 kernel/elfcore.c                               | 26 ------------------
 kernel/kprobes.c                               |  4 +++
 mm/huge_memory.c                               | 37 +++++++++++++++----------
 mm/hugetlb.c                                   |  9 +++---
 net/ipv4/route.c                               | 38 +++++++++++++-------------
 net/lapb/lapb_out.c                            |  3 +-
 net/mac80211/driver-ops.c                      |  5 +++-
 net/mac80211/rate.c                            |  3 +-
 tools/objtool/orc_gen.c                        | 33 +++++++++++++++++-----
 37 files changed, 255 insertions(+), 138 deletions(-)