Userspace generally expects APIs that return -EMSGSIZE to allow for them to adjust their buffer size and retry the operation. However, the fscontext log would previously clear the message even in the -EMSGSIZE case.
Given that it is very cheap for us to check whether the buffer is too small before we remove the message from the ring buffer, let's just do that instead. While we're at it, refactor some fscontext_read() into a separate helper to make the ring buffer logic a bit easier to read.
Fixes: 007ec26cdc9f ("vfs: Implement logging through fs_context")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
Changes in v3:
- selftests: use EXPECT_STREQ()
- v2: https://lore.kernel.org/r/20250806-fscontext-log-cleanups-v2-0-88e9d34d142f@cyphar.com

Changes in v2:
- Refactor message fetching to fetch_message_locked() which returns
  ERR_PTR() in error cases. [Al Viro]
- v1: https://lore.kernel.org/r/20250806-fscontext-log-cleanups-v1-0-880597d42a5a@cyphar.com

---
Aleksa Sarai (2):
      fscontext: do not consume log entries when returning -EMSGSIZE
      selftests/filesystems: add basic fscontext log tests

 fs/fsopen.c                                    |  54 +++++-----
 tools/testing/selftests/filesystems/.gitignore |   1 +
 tools/testing/selftests/filesystems/Makefile   |   2 +-
 tools/testing/selftests/filesystems/fclog.c    | 130 +++++++++++++++++++++++++
 4 files changed, 162 insertions(+), 25 deletions(-)
---
base-commit: 66639db858112bf6b0f76677f7517643d586e575
change-id: 20250806-fscontext-log-cleanups-50f0143674ae
Best regards,
Userspace generally expects APIs that return -EMSGSIZE to allow for them to adjust their buffer size and retry the operation. However, the fscontext log would previously clear the message even in the -EMSGSIZE case.
Given that it is very cheap for us to check whether the buffer is too small before we remove the message from the ring buffer, let's just do that instead. While we're at it, refactor some fscontext_read() into a separate helper to make the ring buffer logic a bit easier to read.
Fixes: 007ec26cdc9f ("vfs: Implement logging through fs_context")
Cc: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 fs/fsopen.c | 54 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/fs/fsopen.c b/fs/fsopen.c
index 1aaf4cb2afb2..538fdf814fbf 100644
--- a/fs/fsopen.c
+++ b/fs/fsopen.c
@@ -18,47 +18,53 @@
 #include "internal.h"
 #include "mount.h"
 
+static inline const char *fetch_message_locked(struct fc_log *log, size_t len,
+					       bool *need_free)
+{
+	const char *p;
+	int index;
+
+	if (unlikely(log->head == log->tail))
+		return ERR_PTR(-ENODATA);
+
+	index = log->tail & (ARRAY_SIZE(log->buffer) - 1);
+	p = log->buffer[index];
+	if (unlikely(strlen(p) > len))
+		return ERR_PTR(-EMSGSIZE);
+
+	log->buffer[index] = NULL;
+	*need_free = log->need_free & (1 << index);
+	log->need_free &= ~(1 << index);
+	log->tail++;
+
+	return p;
+}
+
 /*
  * Allow the user to read back any error, warning or informational messages.
+ * Only one message is returned for each read(2) call.
  */
 static ssize_t fscontext_read(struct file *file,
 			      char __user *_buf, size_t len, loff_t *pos)
 {
 	struct fs_context *fc = file->private_data;
-	struct fc_log *log = fc->log.log;
-	unsigned int logsize = ARRAY_SIZE(log->buffer);
 	ssize_t ret;
-	char *p;
+	const char *p;
 	bool need_free;
-	int index, n;
+	int n;
 
 	ret = mutex_lock_interruptible(&fc->uapi_mutex);
 	if (ret < 0)
 		return ret;
-
-	if (log->head == log->tail) {
-		mutex_unlock(&fc->uapi_mutex);
-		return -ENODATA;
-	}
-
-	index = log->tail & (logsize - 1);
-	p = log->buffer[index];
-	need_free = log->need_free & (1 << index);
-	log->buffer[index] = NULL;
-	log->need_free &= ~(1 << index);
-	log->tail++;
+	p = fetch_message_locked(fc->log.log, len, &need_free);
 	mutex_unlock(&fc->uapi_mutex);
+	if (IS_ERR(p))
+		return PTR_ERR(p);
 
-	ret = -EMSGSIZE;
 	n = strlen(p);
-	if (n > len)
-		goto err_free;
-	ret = -EFAULT;
-	if (copy_to_user(_buf, p, n) != 0)
-		goto err_free;
+	if (copy_to_user(_buf, p, n))
+		n = -EFAULT;
 	ret = n;
-
-err_free:
 	if (need_free)
 		kfree(p);
 	return ret;
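The ordering this patch introduces (check the size first, dequeue second) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_log`, `model_push` and `model_fetch` are hypothetical names, the eight-slot size is arbitrary, errors are reported through an `int *` instead of ERR_PTR(), and there is no locking or need_free tracking. Refusing new messages when the ring is full is a simplification of the model, not a claim about what the kernel does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SLOTS 8	/* hypothetical size; must be a power of two */

/* Hypothetical userspace stand-in for the kernel's log ring buffer. */
struct model_log {
	unsigned char head;	/* index of the next slot to write */
	unsigned char tail;	/* index of the next slot to read */
	const char *buffer[MODEL_SLOTS];
};

/* Queue a message; returns false if the ring is full (model simplification). */
static bool model_push(struct model_log *log, const char *msg)
{
	if ((unsigned char)(log->head - log->tail) == MODEL_SLOTS)
		return false;
	log->buffer[log->head & (MODEL_SLOTS - 1)] = msg;
	log->head++;
	return true;
}

/*
 * Return the oldest message if it fits in len bytes, and consume it.
 * On failure, return NULL, store a negative errno in *err and leave the
 * ring untouched, so the caller can retry with a larger buffer.
 */
static const char *model_fetch(struct model_log *log, size_t len, int *err)
{
	const char *p;
	unsigned int index;

	if (log->head == log->tail) {
		*err = -ENODATA;
		return NULL;
	}

	index = log->tail & (MODEL_SLOTS - 1);
	p = log->buffer[index];
	if (strlen(p) > len) {
		/* The size check happens before any state is modified. */
		*err = -EMSGSIZE;
		return NULL;
	}

	log->buffer[index] = NULL;
	log->tail++;
	*err = 0;
	return p;
}
```

In this model, a fetch with a too-small `len` leaves `tail` where it was, so a later fetch with a large enough `len` still returns the same message.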
On Thu, Aug 07, 2025 at 03:55:23AM +1000, Aleksa Sarai wrote:
> -		goto err_free;
> -	ret = -EFAULT;
> -	if (copy_to_user(_buf, p, n) != 0)
> -		goto err_free;
> +	if (copy_to_user(_buf, p, n))
> +		n = -EFAULT;
>  	ret = n;
> -
> -err_free:
>  	if (need_free)
>  		kfree(p);
>  	return ret;
Minor nit: seeing that there's only one path to that return, I would rather turn it into return n; and dropped the assignment to ret a few lines above. Anyway, that's trivially done when applying...
Anyway, who's carrying fscontext-related stuff this cycle? I've got a short series in that area, but there won't be much from me around there - a plenty of tree-in-dcache stuff, quite a bit of mount-related work, etc., but not a lot around the options-parsing machinery.
Christian, do you have any plans around that area?
On 2025-08-06, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Thu, Aug 07, 2025 at 03:55:23AM +1000, Aleksa Sarai wrote:
> > -		goto err_free;
> > -	ret = -EFAULT;
> > -	if (copy_to_user(_buf, p, n) != 0)
> > -		goto err_free;
> > +	if (copy_to_user(_buf, p, n))
> > +		n = -EFAULT;
> >  	ret = n;
> > -
> > -err_free:
> >  	if (need_free)
> >  		kfree(p);
> >  	return ret;
>
> Minor nit: seeing that there's only one path to that return, I would rather turn it into return n; and dropped the assignment to ret a few lines above. Anyway, that's trivially done when applying...

It felt odd to use "return ret;" at the start and switch to "return n;" at the end, but feel free to change it when applying.

> Anyway, who's carrying fscontext-related stuff this cycle? I've got a short series in that area, but there won't be much from me around there - a plenty of tree-in-dcache stuff, quite a bit of mount-related work, etc., but not a lot around the options-parsing machinery.
>
> Christian, do you have any plans around that area?
On Thu, Aug 07, 2025 at 12:46:17PM +1000, Aleksa Sarai wrote:
> It felt odd to use "return ret;" at the start and switch to "return n;" at the end, but feel free to change it when applying.
s/ret/err/ would take care of that, as well as clarifying the intent - not that "mutex_lock_interruptible() returns 0 or error" hadn't been the basic common knowledge...
Anyway, that's really nitpicking at this point.
On Wed, Aug 06, 2025 at 08:07:51PM +0100, Al Viro wrote:
> On Thu, Aug 07, 2025 at 03:55:23AM +1000, Aleksa Sarai wrote:
> > -		goto err_free;
> > -	ret = -EFAULT;
> > -	if (copy_to_user(_buf, p, n) != 0)
> > -		goto err_free;
> > +	if (copy_to_user(_buf, p, n))
> > +		n = -EFAULT;
> >  	ret = n;
> > -
> > -err_free:
> >  	if (need_free)
> >  		kfree(p);
> >  	return ret;
>
> Minor nit: seeing that there's only one path to that return, I would rather turn it into return n; and dropped the assignment to ret a few lines above. Anyway, that's trivially done when applying...
>
> Anyway, who's carrying fscontext-related stuff this cycle? I've got a short series in that area, but there won't be much from me around there - a plenty of tree-in-dcache stuff, quite a bit of mount-related work, etc., but not a lot around the options-parsing machinery.
>
> Christian, do you have any plans around that area?
I've got a tree for that already and have applied related stuff there. I've fixed up the comments from this thread.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 tools/testing/selftests/filesystems/.gitignore |   1 +
 tools/testing/selftests/filesystems/Makefile   |   2 +-
 tools/testing/selftests/filesystems/fclog.c    | 130 +++++++++++++++++++++++++
 3 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/filesystems/.gitignore b/tools/testing/selftests/filesystems/.gitignore
index fcbdb1297e24..64ac0dfa46b7 100644
--- a/tools/testing/selftests/filesystems/.gitignore
+++ b/tools/testing/selftests/filesystems/.gitignore
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 dnotify_test
 devpts_pts
+fclog
 file_stressor
 anon_inode_test
 kernfs_test
diff --git a/tools/testing/selftests/filesystems/Makefile b/tools/testing/selftests/filesystems/Makefile
index 73d4650af1a5..85427d7f19b9 100644
--- a/tools/testing/selftests/filesystems/Makefile
+++ b/tools/testing/selftests/filesystems/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 
 CFLAGS += $(KHDR_INCLUDES)
-TEST_GEN_PROGS := devpts_pts file_stressor anon_inode_test kernfs_test
+TEST_GEN_PROGS := devpts_pts file_stressor anon_inode_test kernfs_test fclog
 TEST_GEN_PROGS_EXTENDED := dnotify_test
 
 include ../lib.mk
diff --git a/tools/testing/selftests/filesystems/fclog.c b/tools/testing/selftests/filesystems/fclog.c
new file mode 100644
index 000000000000..912a8b755c3b
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fclog.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Author: Aleksa Sarai <cyphar@cyphar.com>
+ * Copyright (C) 2025 SUSE LLC.
+ */
+
+#include <assert.h>
+#include <errno.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/mount.h>
+
+#include "../kselftest_harness.h"
+
+#define ASSERT_ERRNO(expected, _t, seen) \
+	__EXPECT(expected, #expected, \
+		({__typeof__(seen) _tmp_seen = (seen); \
+		  _tmp_seen >= 0 ? _tmp_seen : -errno; }), #seen, _t, 1)
+
+#define ASSERT_ERRNO_EQ(expected, seen) \
+	ASSERT_ERRNO(expected, ==, seen)
+
+#define ASSERT_SUCCESS(seen) \
+	ASSERT_ERRNO(0, <=, seen)
+
+FIXTURE(ns)
+{
+	int host_mntns;
+};
+
+FIXTURE_SETUP(ns)
+{
+	/* Stash the old mntns. */
+	self->host_mntns = open("/proc/self/ns/mnt", O_RDONLY|O_CLOEXEC);
+	ASSERT_SUCCESS(self->host_mntns);
+
+	/* Create a new mount namespace and make it private. */
+	ASSERT_SUCCESS(unshare(CLONE_NEWNS));
+	ASSERT_SUCCESS(mount(NULL, "/", NULL, MS_PRIVATE|MS_REC, NULL));
+}
+
+FIXTURE_TEARDOWN(ns)
+{
+	ASSERT_SUCCESS(setns(self->host_mntns, CLONE_NEWNS));
+	ASSERT_SUCCESS(close(self->host_mntns));
+}
+
+TEST_F(ns, fscontext_log_enodata)
+{
+	int fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
+	ASSERT_SUCCESS(fsfd);
+
+	/* A brand new fscontext has no log entries. */
+	char buf[128] = {};
+	for (int i = 0; i < 16; i++)
+		ASSERT_ERRNO_EQ(-ENODATA, read(fsfd, buf, sizeof(buf)));
+
+	ASSERT_SUCCESS(close(fsfd));
+}
+
+TEST_F(ns, fscontext_log_errorfc)
+{
+	int fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
+	ASSERT_SUCCESS(fsfd);
+
+	ASSERT_ERRNO_EQ(-EINVAL, fsconfig(fsfd, FSCONFIG_SET_STRING, "invalid-arg", "123", 0));
+
+	char buf[128] = {};
+	ASSERT_SUCCESS(read(fsfd, buf, sizeof(buf)));
+	EXPECT_STREQ("e tmpfs: Unknown parameter 'invalid-arg'\n", buf);
+
+	/* The message has been consumed. */
+	ASSERT_ERRNO_EQ(-ENODATA, read(fsfd, buf, sizeof(buf)));
+	ASSERT_SUCCESS(close(fsfd));
+}
+
+TEST_F(ns, fscontext_log_errorfc_after_fsmount)
+{
+	int fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
+	ASSERT_SUCCESS(fsfd);
+
+	ASSERT_ERRNO_EQ(-EINVAL, fsconfig(fsfd, FSCONFIG_SET_STRING, "invalid-arg", "123", 0));
+
+	ASSERT_SUCCESS(fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0));
+	int mfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOEXEC | MOUNT_ATTR_NOSUID);
+	ASSERT_SUCCESS(mfd);
+	ASSERT_SUCCESS(move_mount(mfd, "", AT_FDCWD, "/tmp", MOVE_MOUNT_F_EMPTY_PATH));
+
+	/*
+	 * The fscontext log should still contain data even after
+	 * FSCONFIG_CMD_CREATE and fsmount().
+	 */
+	char buf[128] = {};
+	ASSERT_SUCCESS(read(fsfd, buf, sizeof(buf)));
+	EXPECT_STREQ("e tmpfs: Unknown parameter 'invalid-arg'\n", buf);
+
+	/* The message has been consumed. */
+	ASSERT_ERRNO_EQ(-ENODATA, read(fsfd, buf, sizeof(buf)));
+	ASSERT_SUCCESS(close(fsfd));
+}
+
+TEST_F(ns, fscontext_log_emsgsize)
+{
+	int fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
+	ASSERT_SUCCESS(fsfd);
+
+	ASSERT_ERRNO_EQ(-EINVAL, fsconfig(fsfd, FSCONFIG_SET_STRING, "invalid-arg", "123", 0));
+
+	char buf[128] = {};
+	/*
+	 * Attempting to read a message with too small a buffer should not
+	 * result in the message getting consumed.
+	 */
+	ASSERT_ERRNO_EQ(-EMSGSIZE, read(fsfd, buf, 0));
+	ASSERT_ERRNO_EQ(-EMSGSIZE, read(fsfd, buf, 1));
+	for (int i = 0; i < 16; i++)
+		ASSERT_ERRNO_EQ(-EMSGSIZE, read(fsfd, buf, 16));
+
+	ASSERT_SUCCESS(read(fsfd, buf, sizeof(buf)));
+	EXPECT_STREQ("e tmpfs: Unknown parameter 'invalid-arg'\n", buf);
+
+	/* The message has been consumed. */
+	ASSERT_ERRNO_EQ(-ENODATA, read(fsfd, buf, sizeof(buf)));
+	ASSERT_SUCCESS(close(fsfd));
+}
+
+TEST_HARNESS_MAIN
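For context, the userspace pattern that the fscontext_log_emsgsize test protects is "grow the buffer and retry on EMSGSIZE". The sketch below is one possible shape for such a loop, not code from this series. `read_one_message`, `log_read_fn`, `fd_read`, `fake_log` and `fake_read` are hypothetical names, and the starting size, doubling policy and 64 KiB cap are arbitrary assumptions. The callback indirection exists only so the loop can be exercised without a real fscontext fd.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type: behaves like read(2), returning -1 and setting errno. */
typedef ssize_t (*log_read_fn)(void *ctx, char *buf, size_t len);

/* Adapter for a real file descriptor, e.g. one returned by fsopen(2). */
static ssize_t fd_read(void *ctx, char *buf, size_t len)
{
	return read(*(int *)ctx, buf, len);
}

/* Hypothetical stand-in for an fscontext log holding a single message. */
struct fake_log {
	const char *msg;	/* pending message, or NULL once consumed */
	int attempts;		/* number of read attempts seen so far */
};

static ssize_t fake_read(void *ctx, char *buf, size_t len)
{
	struct fake_log *log = ctx;
	size_t n;

	log->attempts++;
	if (!log->msg) {
		errno = ENODATA;
		return -1;
	}
	n = strlen(log->msg);
	if (n > len) {
		errno = EMSGSIZE;	/* message is kept for the retry */
		return -1;
	}
	memcpy(buf, log->msg, n);
	log->msg = NULL;
	return (ssize_t)n;
}

/*
 * Read one message, doubling the buffer whenever the reader reports EMSGSIZE.
 * Returns a malloc()ed NUL-terminated string, or NULL with errno set.
 */
static char *read_one_message(log_read_fn read_fn, void *ctx)
{
	size_t cap = 64;		/* arbitrary starting size */
	const size_t max_cap = 1 << 16;	/* arbitrary upper bound */

	for (;;) {
		char *buf = malloc(cap + 1);
		ssize_t n;
		int saved;

		if (!buf)
			return NULL;
		n = read_fn(ctx, buf, cap);
		if (n >= 0) {
			buf[n] = '\0';
			return buf;
		}
		saved = errno;
		free(buf);
		if (saved != EMSGSIZE || cap >= max_cap) {
			errno = saved;
			return NULL;
		}
		cap *= 2;
	}
}
```

With a real fscontext fd this would be called as `read_one_message(fd_read, &fsfd)`. Such a loop only works if the kernel keeps the message after returning -EMSGSIZE, which is the behaviour the first patch in this series establishes.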
On Thu, 07 Aug 2025 03:55:22 +1000, Aleksa Sarai wrote:
> Userspace generally expects APIs that return -EMSGSIZE to allow for them to adjust their buffer size and retry the operation. However, the fscontext log would previously clear the message even in the -EMSGSIZE case.
>
> Given that it is very cheap for us to check whether the buffer is too small before we remove the message from the ring buffer, let's just do that instead. While we're at it, refactor some fscontext_read() into a separate helper to make the ring buffer logic a bit easier to read.
>
> [...]
Applied to the vfs-6.18.mount branch of the vfs/vfs.git tree. Patches in the vfs-6.18.mount branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase, trailer updates or similar. If in doubt, please check the listed branch.
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.18.mount

[1/2] fscontext: do not consume log entries when returning -EMSGSIZE
      https://git.kernel.org/vfs/vfs/c/b78c4328c498
[2/2] selftests/filesystems: add basic fscontext log tests
      https://git.kernel.org/vfs/vfs/c/d70b6ceebd29