This series improves the defensive posture of sysfs's use of seq_file to gain the vmap guard pages at the end of vmalloc buffers to stop a class of recurring flaw[1]. The long-term goal is to switch sysfs from a buffer to using seq_file directly, but this will take time to refactor.
Included is also a Clang fix for NULL arithmetic and an LKDTM test to validate vmalloc guard pages.
v4: - fix NULL arithmetic (Arnd) - add lkdtm test - reword commit message v3: https://lore.kernel.org/lkml/20210401022145.2019422-1-keescook@chromium.org/ v2: https://lore.kernel.org/lkml/20210315174851.622228-1-keescook@chromium.org/ v1: https://lore.kernel.org/lkml/20210312205558.2947488-1-keescook@chromium.org/
Thanks!
-Kees
Arnd Bergmann (1): seq_file: Fix clang warning for NULL pointer arithmetic
Kees Cook (2): lkdtm/heap: Add vmalloc linear overflow test sysfs: Unconditionally use vmalloc for buffer
drivers/misc/lkdtm/core.c | 3 ++- drivers/misc/lkdtm/heap.c | 21 +++++++++++++++++- drivers/misc/lkdtm/lkdtm.h | 3 ++- fs/kernfs/file.c | 9 +++++--- fs/seq_file.c | 5 ++++- fs/sysfs/file.c | 29 +++++++++++++++++++++++++ include/linux/seq_file.h | 6 +++++ tools/testing/selftests/lkdtm/tests.txt | 3 ++- 8 files changed, 71 insertions(+), 8 deletions(-)
Similar to the existing slab overflow and stack exhaustion tests, add VMALLOC_LINEAR_OVERFLOW (and rename the slab test SLAB_LINEAR_OVERFLOW).
Signed-off-by: Kees Cook keescook@chromium.org --- drivers/misc/lkdtm/core.c | 3 ++- drivers/misc/lkdtm/heap.c | 21 ++++++++++++++++++++- drivers/misc/lkdtm/lkdtm.h | 3 ++- tools/testing/selftests/lkdtm/tests.txt | 3 ++- 4 files changed, 26 insertions(+), 4 deletions(-)
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c index b2aff4d87c01..c3a5ad21b3d2 100644 --- a/drivers/misc/lkdtm/core.c +++ b/drivers/misc/lkdtm/core.c @@ -119,7 +119,8 @@ static const struct crashtype crashtypes[] = { CRASHTYPE(UNALIGNED_LOAD_STORE_WRITE), CRASHTYPE(FORTIFY_OBJECT), CRASHTYPE(FORTIFY_SUBOBJECT), - CRASHTYPE(OVERWRITE_ALLOCATION), + CRASHTYPE(SLAB_LINEAR_OVERFLOW), + CRASHTYPE(VMALLOC_LINEAR_OVERFLOW), CRASHTYPE(WRITE_AFTER_FREE), CRASHTYPE(READ_AFTER_FREE), CRASHTYPE(WRITE_BUDDY_AFTER_FREE), diff --git a/drivers/misc/lkdtm/heap.c b/drivers/misc/lkdtm/heap.c index 1323bc16f113..5d491c22e09a 100644 --- a/drivers/misc/lkdtm/heap.c +++ b/drivers/misc/lkdtm/heap.c @@ -5,18 +5,37 @@ */ #include "lkdtm.h" #include <linux/slab.h> +#include <linux/vmalloc.h> #include <linux/sched.h>
static struct kmem_cache *double_free_cache; static struct kmem_cache *a_cache; static struct kmem_cache *b_cache;
+/* + * If there aren't guard pages, it's likely that a consecutive allocation will + * let us overflow into the second allocation without overwriting something real. + */ +void lkdtm_VMALLOC_LINEAR_OVERFLOW(void) +{ + char *one, *two; + + one = vzalloc(PAGE_SIZE); + two = vzalloc(PAGE_SIZE); + + pr_info("Attempting vmalloc linear overflow ...\n"); + memset(one, 0xAA, PAGE_SIZE + 1); + + vfree(two); + vfree(one); +} + /* * This tries to stay within the next largest power-of-2 kmalloc cache * to avoid actually overwriting anything important if it's not detected * correctly. */ -void lkdtm_OVERWRITE_ALLOCATION(void) +void lkdtm_SLAB_LINEAR_OVERFLOW(void) { size_t len = 1020; u32 *data = kmalloc(len, GFP_KERNEL); diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h index 5ae48c64df24..5a852d0beee0 100644 --- a/drivers/misc/lkdtm/lkdtm.h +++ b/drivers/misc/lkdtm/lkdtm.h @@ -38,7 +38,8 @@ void lkdtm_FORTIFY_SUBOBJECT(void); /* heap.c */ void __init lkdtm_heap_init(void); void __exit lkdtm_heap_exit(void); -void lkdtm_OVERWRITE_ALLOCATION(void); +void lkdtm_VMALLOC_LINEAR_OVERFLOW(void); +void lkdtm_SLAB_LINEAR_OVERFLOW(void); void lkdtm_WRITE_AFTER_FREE(void); void lkdtm_READ_AFTER_FREE(void); void lkdtm_WRITE_BUDDY_AFTER_FREE(void); diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt index 11ef159be0fd..322a1d2439e3 100644 --- a/tools/testing/selftests/lkdtm/tests.txt +++ b/tools/testing/selftests/lkdtm/tests.txt @@ -15,7 +15,8 @@ UNSET_SMEP CR4 bits went missing DOUBLE_FAULT CORRUPT_PAC UNALIGNED_LOAD_STORE_WRITE -#OVERWRITE_ALLOCATION Corrupts memory on failure +VMALLOC_LINEAR_OVERFLOW +SLAB_LINEAR_OVERFLOW #WRITE_AFTER_FREE Corrupts memory on failure READ_AFTER_FREE #WRITE_BUDDY_AFTER_FREE Corrupts memory on failure
From: Arnd Bergmann arnd@arndb.de
Clang points out that adding something to NULL is not allowed in standard C:
fs/kernfs/file.c:127:15: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] return NULL + !*ppos; ~~~~ ^ fs/seq_file.c:529:14: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] return NULL + (*pos == 0);
Rephrase the code to be extra explicit about the valid, giving them named SEQ_OPEN_EOF and SEQ_OPEN_SINGLE definitions. The instance in kernfs was copied from single_start, so fix both at once.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Fixes: c2b19daf6760 ("sysfs, kernfs: prepare read path for kernfs") Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/20201028151202.3074398-1-arnd@kernel.org --- fs/kernfs/file.c | 9 ++++++--- fs/seq_file.c | 5 ++++- include/linux/seq_file.h | 6 ++++++ 3 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index c75719312147..721bcbc1d4d0 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -122,10 +122,13 @@ static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos) return next; } else { /* - * The same behavior and code as single_open(). Returns - * !NULL if pos is at the beginning; otherwise, NULL. + * The same behavior and code as single_open(). Continues + * if pos is at the beginning; otherwise, NULL. */ - return NULL + !*ppos; + if (*ppos) + return NULL; + + return SEQ_OPEN_SINGLE; } }
diff --git a/fs/seq_file.c b/fs/seq_file.c index cb11a34fb871..1b5bd95d0a48 100644 --- a/fs/seq_file.c +++ b/fs/seq_file.c @@ -542,7 +542,10 @@ EXPORT_SYMBOL(seq_dentry);
static void *single_start(struct seq_file *p, loff_t *pos) { - return NULL + (*pos == 0); + if (*pos) + return NULL; + + return SEQ_OPEN_SINGLE; }
static void *single_next(struct seq_file *p, void *v, loff_t *pos) diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h index b83b3ae3c877..51c870765bfd 100644 --- a/include/linux/seq_file.h +++ b/include/linux/seq_file.h @@ -37,6 +37,12 @@ struct seq_operations {
#define SEQ_SKIP 1
+/* + * op->start must return a non-NULL pointer for single_open(), + * this is used when we don't care about the specific value. + */ +#define SEQ_OPEN_SINGLE ((void *)1) + /** * seq_has_overflowed - check if the buffer has overflowed * @m: the seq_file handle
The sysfs interface to seq_file continues to be rather fragile (seq_get_buf() should not be used outside of seq_file), as seen with some recent exploits[1]. Move the seq_file buffer to the vmap area (while retaining the accounting flag), since it has guard pages that will catch and stop linear overflows. This seems justified given that sysfs's use of seq_file almost always already uses PAGE_SIZE allocations, has normally short-lived allocations, and is not normally on a performance critical path.
Once seq_get_buf() has been removed (and all sysfs callbacks using seq_file directly), this change can also be removed.
[1] https://blog.grimm-co.com/2021/03/new-old-bugs-in-linux-kernel.html
Signed-off-by: Kees Cook keescook@chromium.org --- fs/sysfs/file.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c index 9aefa7779b29..351ff75a97b1 100644 --- a/fs/sysfs/file.c +++ b/fs/sysfs/file.c @@ -16,6 +16,7 @@ #include <linux/mutex.h> #include <linux/seq_file.h> #include <linux/mm.h> +#include <linux/vmalloc.h>
#include "sysfs.h"
@@ -32,6 +33,31 @@ static const struct sysfs_ops *sysfs_file_ops(struct kernfs_node *kn) return kobj->ktype ? kobj->ktype->sysfs_ops : NULL; }
+/* + * To be proactively defensive against sysfs show() handlers that do not + * correctly stay within their PAGE_SIZE buffer, use the vmap area to gain + * the trailing guard page which will stop linear buffer overflows. + */ +static void *sysfs_kf_seq_start(struct seq_file *sf, loff_t *ppos) +{ + struct kernfs_open_file *of = sf->private; + struct kernfs_node *kn = of->kn; + + WARN_ON_ONCE(sf->buf); + sf->buf = __vmalloc(kn->attr.size, GFP_KERNEL_ACCOUNT); + if (!sf->buf) + return ERR_PTR(-ENOMEM); + sf->size = kn->attr.size; + + /* + * Use the same behavior and code as single_open(): continue + * if pos is at the beginning; otherwise, NULL. + */ + if (*ppos) + return NULL; + return SEQ_OPEN_SINGLE; +} + /* * Reads on sysfs are handled through seq_file, which takes care of hairy * details like buffering and seeking. The following function pipes @@ -206,14 +232,17 @@ static const struct kernfs_ops sysfs_file_kfops_empty = { };
static const struct kernfs_ops sysfs_file_kfops_ro = { + .seq_start = sysfs_kf_seq_start, .seq_show = sysfs_kf_seq_show, };
static const struct kernfs_ops sysfs_file_kfops_wo = { + .seq_start = sysfs_kf_seq_start, .write = sysfs_kf_write, };
static const struct kernfs_ops sysfs_file_kfops_rw = { + .seq_start = sysfs_kf_seq_start, .seq_show = sysfs_kf_seq_show, .write = sysfs_kf_write, };
On Thu, Apr 01, 2021 at 03:13:20PM -0700, Kees Cook wrote:
The sysfs interface to seq_file continues to be rather fragile (seq_get_buf() should not be used outside of seq_file), as seen with some recent exploits[1]. Move the seq_file buffer to the vmap area (while retaining the accounting flag), since it has guard pages that will catch and stop linear overflows. This seems justified given that sysfs's use of seq_file almost always already uses PAGE_SIZE allocations, has normally short-lived allocations, and is not normally on a performance critical path.
This looks completely weird to me. In the end sysfs uses nothing of the seq_file infrastructure, so why do we even pretend to use it? Just switch sysfs_file_kfops_ro and sysfs_file_kfops_rw from using ->seq_show to ->read and do the vmalloc there instead of pretending this is a seq_file.
Once seq_get_buf() has been removed (and all sysfs callbacks using seq_file directly), this change can also be removed.
And with sysfs out of the way I think kiling off the other few users should be pretty easy as well.
On Fri, Apr 02, 2021 at 08:32:21AM +0200, Christoph Hellwig wrote:
On Thu, Apr 01, 2021 at 03:13:20PM -0700, Kees Cook wrote:
The sysfs interface to seq_file continues to be rather fragile (seq_get_buf() should not be used outside of seq_file), as seen with some recent exploits[1]. Move the seq_file buffer to the vmap area (while retaining the accounting flag), since it has guard pages that will catch and stop linear overflows. This seems justified given that sysfs's use of seq_file almost always already uses PAGE_SIZE allocations, has normally short-lived allocations, and is not normally on a performance critical path.
This looks completely weird to me. In the end sysfs uses nothing of the seq_file infrastructure, so why do we even pretend to use it? Just switch sysfs_file_kfops_ro and sysfs_file_kfops_rw from using ->seq_show to ->read and do the vmalloc there instead of pretending this is a seq_file.
As far as I can tell it's a result of kernfs using seq_file, but sysfs never converted all its callbacks to use seq_file.
Once seq_get_buf() has been removed (and all sysfs callbacks using seq_file directly), this change can also be removed.
And with sysfs out of the way I think kiling off the other few users should be pretty easy as well.
Let me look at switching to "read" ... it is a twisty maze. :)
linux-kselftest-mirror@lists.linaro.org