From: Mattias Nissler <mnissler@chromium.org>
For mounts that have the new "nosymfollow" option, don't follow symlinks when resolving paths. The new option is similar in spirit to the existing "nodev", "noexec", and "nosuid" options, as well as to the LOOKUP_NO_SYMLINKS resolve flag in the openat2(2) syscall. Various BSD variants have been supporting the "nosymfollow" mount option for a long time with equivalent implementations.
Note that symlinks may still be created on file systems mounted with the "nosymfollow" option present. readlink() remains functional, so user space code that is aware of symlinks can still choose to follow them explicitly.
Setting the "nosymfollow" mount option helps prevent privileged writers from modifying files unintentionally in case there is an unexpected link along the accessed path. The "nosymfollow" option is thus useful as a defensive measure for systems that need to deal with untrusted file systems in privileged contexts.
More information on the history and motivation for this patch can be found here:
https://sites.google.com/a/chromium.org/dev/chromium-os/chromiumos-design-do...
Signed-off-by: Mattias Nissler <mnissler@chromium.org>
Signed-off-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>
---
Changes since v8 [1]:
 * Look for MNT_NOSYMFOLLOW in link->mnt->mnt_flags so we are testing the
   link itself rather than the directory holding the link. (Al Viro)
 * Rebased onto v5.9-rc2.
After this lands I will upstream changes to util-linux[2] and man-pages [3].
[1]: https://patchwork.kernel.org/patch/11724607/
[2]: https://github.com/rzwisler/util-linux/commit/7f8771acd85edb70d97921c026c55e...
[3]: https://github.com/rzwisler/man-pages/commit/b8fe8079f64b5068940c0144586e580...
---
 fs/namei.c                 | 3 ++-
 fs/namespace.c             | 2 ++
 fs/proc_namespace.c        | 1 +
 fs/statfs.c                | 2 ++
 include/linux/mount.h      | 3 ++-
 include/linux/statfs.h     | 1 +
 include/uapi/linux/mount.h | 1 +
 7 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index e99e2a9da0f7d..33e8c79bc761e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1626,7 +1626,8 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
 			return ERR_PTR(error);
 	}
 
-	if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS))
+	if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS) ||
+			unlikely(link->mnt->mnt_flags & MNT_NOSYMFOLLOW))
 		return ERR_PTR(-ELOOP);
 
 	if (!(nd->flags & LOOKUP_RCU)) {
diff --git a/fs/namespace.c b/fs/namespace.c
index bae0e95b3713a..6408788a649e1 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3160,6 +3160,8 @@ int path_mount(const char *dev_name, struct path *path,
 		mnt_flags &= ~(MNT_RELATIME | MNT_NOATIME);
 	if (flags & MS_RDONLY)
 		mnt_flags |= MNT_READONLY;
+	if (flags & MS_NOSYMFOLLOW)
+		mnt_flags |= MNT_NOSYMFOLLOW;
 
 	/* The default atime for remount is preservation */
 	if ((flags & MS_REMOUNT) &&
diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index 3059a9394c2d6..e59d4bb3a89e4 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -70,6 +70,7 @@ static void show_mnt_opts(struct seq_file *m, struct vfsmount *mnt)
 		{ MNT_NOATIME, ",noatime" },
 		{ MNT_NODIRATIME, ",nodiratime" },
 		{ MNT_RELATIME, ",relatime" },
+		{ MNT_NOSYMFOLLOW, ",nosymfollow" },
 		{ 0, NULL }
 	};
 	const struct proc_fs_opts *fs_infop;
diff --git a/fs/statfs.c b/fs/statfs.c
index 2616424012ea7..59f33752c1311 100644
--- a/fs/statfs.c
+++ b/fs/statfs.c
@@ -29,6 +29,8 @@ static int flags_by_mnt(int mnt_flags)
 		flags |= ST_NODIRATIME;
 	if (mnt_flags & MNT_RELATIME)
 		flags |= ST_RELATIME;
+	if (mnt_flags & MNT_NOSYMFOLLOW)
+		flags |= ST_NOSYMFOLLOW;
 	return flags;
 }
 
diff --git a/include/linux/mount.h b/include/linux/mount.h
index de657bd211fa6..aaf343b38671c 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -30,6 +30,7 @@ struct fs_context;
 #define MNT_NODIRATIME	0x10
 #define MNT_RELATIME	0x20
 #define MNT_READONLY	0x40	/* does the user want this to be r/o? */
+#define MNT_NOSYMFOLLOW	0x80
 
 #define MNT_SHRINKABLE	0x100
 #define MNT_WRITE_HOLD	0x200
@@ -46,7 +47,7 @@ struct fs_context;
 #define MNT_SHARED_MASK	(MNT_UNBINDABLE)
 #define MNT_USER_SETTABLE_MASK  (MNT_NOSUID | MNT_NODEV | MNT_NOEXEC \
 				 | MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME \
-				 | MNT_READONLY)
+				 | MNT_READONLY | MNT_NOSYMFOLLOW)
 #define MNT_ATIME_MASK (MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME )
 
 #define MNT_INTERNAL_FLAGS (MNT_SHARED | MNT_WRITE_HOLD | MNT_INTERNAL | \
diff --git a/include/linux/statfs.h b/include/linux/statfs.h
index 9bc69edb8f188..fac4356ea1bfc 100644
--- a/include/linux/statfs.h
+++ b/include/linux/statfs.h
@@ -40,6 +40,7 @@ struct kstatfs {
 #define ST_NOATIME	0x0400	/* do not update access times */
 #define ST_NODIRATIME	0x0800	/* do not update directory access times */
 #define ST_RELATIME	0x1000	/* update atime relative to mtime/ctime */
+#define ST_NOSYMFOLLOW	0x2000	/* do not follow symlinks */
 
 struct dentry;
 extern int vfs_get_fsid(struct dentry *dentry, __kernel_fsid_t *fsid);
diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
index 96a0240f23fed..dd8306ea336c1 100644
--- a/include/uapi/linux/mount.h
+++ b/include/uapi/linux/mount.h
@@ -16,6 +16,7 @@
 #define MS_REMOUNT	32	/* Alter flags of a mounted FS */
 #define MS_MANDLOCK	64	/* Allow mandatory locks on an FS */
 #define MS_DIRSYNC	128	/* Directory modifications are synchronous */
+#define MS_NOSYMFOLLOW	256	/* Do not follow symlinks */
 #define MS_NOATIME	1024	/* Do not update access times. */
 #define MS_NODIRATIME	2048	/* Do not update directory access times */
 #define MS_BIND		4096
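The commit message above notes that readlink() stays functional on a "nosymfollow" mount, so symlink-aware user space code can still follow links explicitly. A minimal sketch of what that might look like is given here. The helper open_via_readlink() is hypothetical and not part of the patch, and it assumes the link target is an absolute path.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): read the symlink's target
 * with readlink(2) and open that target directly, so the kernel never has
 * to traverse the link itself.
 *
 * Assumption: only absolute targets are handled. A relative target would
 * have to be resolved against the directory containing the link.
 *
 * Returns a file descriptor, or -1 with errno set.
 */
static int open_via_readlink(const char *link_path, int flags)
{
	char target[PATH_MAX];
	ssize_t len;

	assert(link_path != NULL);

	/* Leave room for the terminator: readlink() does not add one. */
	len = readlink(link_path, target, sizeof(target) - 1);
	if (len < 0)
		return -1;	/* e.g. EINVAL if link_path is not a symlink */
	target[len] = '\0';

	if (target[0] != '/') {
		errno = EINVAL;	/* relative targets are out of scope here */
		return -1;
	}

	/* O_NOFOLLOW: fail with ELOOP if the target is itself a symlink. */
	return open(target, flags | O_NOFOLLOW);
}
```

A caller would typically try a plain open() first and fall back to a helper like this only when open() fails with ELOOP. Any symlink in an intermediate component of the target path on a "nosymfollow" mount still produces ELOOP, so this follows exactly one link and nothing more.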
Add tests for the new 'nosymfollow' mount option. We test to make sure that symlink traversal fails with ELOOP when 'nosymfollow' is set, but that readlink(2) and realpath(3) still work as expected. We also verify that statfs(2) correctly returns ST_NOSYMFOLLOW when we are mounted with the 'nosymfollow' option.
Signed-off-by: Ross Zwisler <zwisler@google.com>
---
 tools/testing/selftests/mount/.gitignore      |   1 +
 tools/testing/selftests/mount/Makefile        |   4 +-
 .../selftests/mount/nosymfollow-test.c        | 218 ++++++++++++++++++
 .../selftests/mount/run_nosymfollow.sh        |   4 +
 ...n_tests.sh => run_unprivileged_remount.sh} |   0
 5 files changed, 225 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/mount/nosymfollow-test.c
 create mode 100755 tools/testing/selftests/mount/run_nosymfollow.sh
 rename tools/testing/selftests/mount/{run_tests.sh => run_unprivileged_remount.sh} (100%)
diff --git a/tools/testing/selftests/mount/.gitignore b/tools/testing/selftests/mount/.gitignore
index 0bc64a6d4c181..17f2d84151622 100644
--- a/tools/testing/selftests/mount/.gitignore
+++ b/tools/testing/selftests/mount/.gitignore
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 unprivileged-remount-test
+nosymfollow-test
diff --git a/tools/testing/selftests/mount/Makefile b/tools/testing/selftests/mount/Makefile
index 026890744215b..2d9454841644a 100644
--- a/tools/testing/selftests/mount/Makefile
+++ b/tools/testing/selftests/mount/Makefile
@@ -3,7 +3,7 @@
 CFLAGS = -Wall \
          -O2
 
-TEST_PROGS := run_tests.sh
-TEST_GEN_FILES := unprivileged-remount-test
+TEST_PROGS := run_unprivileged_remount.sh run_nosymfollow.sh
+TEST_GEN_FILES := unprivileged-remount-test nosymfollow-test
 
 include ../lib.mk
diff --git a/tools/testing/selftests/mount/nosymfollow-test.c b/tools/testing/selftests/mount/nosymfollow-test.c
new file mode 100644
index 0000000000000..650d6d80a1d27
--- /dev/null
+++ b/tools/testing/selftests/mount/nosymfollow-test.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <limits.h>
+#include <sched.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mount.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/vfs.h>
+#include <unistd.h>
+
+#ifndef MS_NOSYMFOLLOW
+# define MS_NOSYMFOLLOW 256     /* Do not follow symlinks */
+#endif
+
+#ifndef ST_NOSYMFOLLOW
+# define ST_NOSYMFOLLOW 0x2000  /* Do not follow symlinks */
+#endif
+
+#define DATA "/tmp/data"
+#define LINK "/tmp/symlink"
+#define TMP  "/tmp"
+
+static void die(char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	va_end(ap);
+	exit(EXIT_FAILURE);
+}
+
+static void vmaybe_write_file(bool enoent_ok, char *filename, char *fmt,
+		va_list ap)
+{
+	ssize_t written;
+	char buf[4096];
+	int buf_len;
+	int fd;
+
+	buf_len = vsnprintf(buf, sizeof(buf), fmt, ap);
+	if (buf_len < 0)
+		die("vsnprintf failed: %s\n", strerror(errno));
+
+	if (buf_len >= sizeof(buf))
+		die("vsnprintf output truncated\n");
+
+	fd = open(filename, O_WRONLY);
+	if (fd < 0) {
+		if ((errno == ENOENT) && enoent_ok)
+			return;
+		die("open of %s failed: %s\n", filename, strerror(errno));
+	}
+
+	written = write(fd, buf, buf_len);
+	if (written != buf_len) {
+		if (written >= 0) {
+			die("short write to %s\n", filename);
+		} else {
+			die("write to %s failed: %s\n",
+				filename, strerror(errno));
+		}
+	}
+
+	if (close(fd) != 0)
+		die("close of %s failed: %s\n", filename, strerror(errno));
+}
+
+static void maybe_write_file(char *filename, char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	vmaybe_write_file(true, filename, fmt, ap);
+	va_end(ap);
+}
+
+static void write_file(char *filename, char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	vmaybe_write_file(false, filename, fmt, ap);
+	va_end(ap);
+}
+
+static void create_and_enter_ns(void)
+{
+	uid_t uid = getuid();
+	gid_t gid = getgid();
+
+	if (unshare(CLONE_NEWUSER) != 0)
+		die("unshare(CLONE_NEWUSER) failed: %s\n", strerror(errno));
+
+	maybe_write_file("/proc/self/setgroups", "deny");
+	write_file("/proc/self/uid_map", "0 %d 1", uid);
+	write_file("/proc/self/gid_map", "0 %d 1", gid);
+
+	if (setgid(0) != 0)
+		die("setgid(0) failed %s\n", strerror(errno));
+	if (setuid(0) != 0)
+		die("setuid(0) failed %s\n", strerror(errno));
+
+	if (unshare(CLONE_NEWNS) != 0)
+		die("unshare(CLONE_NEWNS) failed: %s\n", strerror(errno));
+}
+
+static void setup_symlink(void)
+{
+	int data, err;
+
+	data = creat(DATA, O_RDWR);
+	if (data < 0)
+		die("creat failed: %s\n", strerror(errno));
+
+	err = symlink(DATA, LINK);
+	if (err < 0)
+		die("symlink failed: %s\n", strerror(errno));
+
+	if (close(data) != 0)
+		die("close of %s failed: %s\n", DATA, strerror(errno));
+}
+
+static void test_link_traversal(bool nosymfollow)
+{
+	int link;
+
+	link = open(LINK, 0, O_RDWR);
+	if (nosymfollow) {
+		if ((link != -1 || errno != ELOOP)) {
+			die("link traversal unexpected result: %d, %s\n",
+					link, strerror(errno));
+		}
+	} else {
+		if (link < 0)
+			die("link traversal failed: %s\n", strerror(errno));
+
+		if (close(link) != 0)
+			die("close of link failed: %s\n", strerror(errno));
+	}
+}
+
+static void test_readlink(void)
+{
+	char buf[4096];
+	ssize_t ret;
+
+	bzero(buf, sizeof(buf));
+
+	ret = readlink(LINK, buf, sizeof(buf));
+	if (ret < 0)
+		die("readlink failed: %s\n", strerror(errno));
+	if (strcmp(buf, DATA) != 0)
+		die("readlink strcmp failed: '%s' '%s'\n", buf, DATA);
+}
+
+static void test_realpath(void)
+{
+	char *path = realpath(LINK, NULL);
+
+	if (!path)
+		die("realpath failed: %s\n", strerror(errno));
+	if (strcmp(path, DATA) != 0)
+		die("realpath strcmp failed\n");
+
+	free(path);
+}
+
+static void test_statfs(bool nosymfollow)
+{
+	struct statfs buf;
+	int ret;
+
+	ret = statfs(TMP, &buf);
+	if (ret)
+		die("statfs failed: %s\n", strerror(errno));
+
+	if (nosymfollow) {
+		if ((buf.f_flags & ST_NOSYMFOLLOW) == 0)
+			die("ST_NOSYMFOLLOW not set on %s\n", TMP);
+	} else {
+		if ((buf.f_flags & ST_NOSYMFOLLOW) != 0)
+			die("ST_NOSYMFOLLOW set on %s\n", TMP);
+	}
+}
+
+static void run_tests(bool nosymfollow)
+{
+	test_link_traversal(nosymfollow);
+	test_readlink();
+	test_realpath();
+	test_statfs(nosymfollow);
+}
+
+int main(int argc, char **argv)
+{
+	create_and_enter_ns();
+
+	if (mount("testing", TMP, "ramfs", 0, NULL) != 0)
+		die("mount failed: %s\n", strerror(errno));
+
+	setup_symlink();
+	run_tests(false);
+
+	if (mount("testing", TMP, "ramfs", MS_REMOUNT|MS_NOSYMFOLLOW, NULL) != 0)
+		die("remount failed: %s\n", strerror(errno));
+
+	run_tests(true);
+
+	return EXIT_SUCCESS;
+}
diff --git a/tools/testing/selftests/mount/run_nosymfollow.sh b/tools/testing/selftests/mount/run_nosymfollow.sh
new file mode 100755
index 0000000000000..5fbbf03043a2e
--- /dev/null
+++ b/tools/testing/selftests/mount/run_nosymfollow.sh
@@ -0,0 +1,4 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+./nosymfollow-test
diff --git a/tools/testing/selftests/mount/run_tests.sh b/tools/testing/selftests/mount/run_unprivileged_remount.sh
similarity index 100%
rename from tools/testing/selftests/mount/run_tests.sh
rename to tools/testing/selftests/mount/run_unprivileged_remount.sh
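The selftest verifies that statfs(2) reports ST_NOSYMFOLLOW for a mount carrying the new option. The same check can be made from ordinary, unprivileged code; a minimal sketch follows. The helper mount_has_nosymfollow() is a hypothetical name, not part of the patch, and the fallback definition of ST_NOSYMFOLLOW mirrors the one in the selftest for C libraries whose headers predate the flag.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>

#ifndef ST_NOSYMFOLLOW
# define ST_NOSYMFOLLOW 0x2000	/* same value as in include/linux/statfs.h */
#endif

/*
 * Hypothetical helper (not part of the patch): report whether the mount
 * containing `path` was mounted with "nosymfollow".
 *
 * Returns 1 if the flag is set, 0 if it is clear, and -1 if statfs(2)
 * fails (errno is left as set by statfs).
 */
static int mount_has_nosymfollow(const char *path)
{
	struct statfs buf;

	assert(path != NULL);

	if (statfs(path, &buf) != 0)
		return -1;

	return (buf.f_flags & ST_NOSYMFOLLOW) ? 1 : 0;
}
```

On a kernel without this patch the flag is never reported, so a return value of 0 there does not distinguish "option not set" from "option not supported".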
On Thu, Aug 27, 2020 at 11:09:46AM -0600, Ross Zwisler wrote:
> Changes since v8 [1]:
>  * Look for MNT_NOSYMFOLLOW in link->mnt->mnt_flags so we are testing the
>    link itself rather than the directory holding the link. (Al Viro)
>  * Rebased onto v5.9-rc2.
AFAICS, it applies clean to -rc1; what was the rebase about?
On Thu, Aug 27, 2020 at 09:08:01PM +0100, Al Viro wrote:
> AFAICS, it applies clean to -rc1; what was the rebase about?
Applied (to -rc1) and pushed
On Thu, Aug 27, 2020 at 09:10:15PM +0100, Al Viro wrote:
> On Thu, Aug 27, 2020 at 09:08:01PM +0100, Al Viro wrote:
> > AFAICS, it applies clean to -rc1; what was the rebase about?
Oh, sorry if that was confusing. I just wanted to make sure that it still applied cleanly to the latest -rc so that you didn't hit a merge conflict.
Yes, these patches apply cleanly to both -rc1 and -rc2.
> Applied (to -rc1) and pushed
Many thanks!
On Thu, Aug 27, 2020 at 2:25 PM Ross Zwisler <zwisler@google.com> wrote:
> On Thu, Aug 27, 2020 at 09:10:15PM +0100, Al Viro wrote:
> > On Thu, Aug 27, 2020 at 09:08:01PM +0100, Al Viro wrote:
> > Applied (to -rc1) and pushed
> Many thanks!
(apologies for the resend, the previous one had HTML and was rejected by the lists)
Just FYI, here is the related commit in upstream util-linux:
https://github.com/karelzak/util-linux/commit/50a531f667c31d54fbb920d394e600...
and the thread to linux-man, which I will ping when the v5.10 merge window closes:
https://lore.kernel.org/linux-man/CAKgNAkiAkyUjd=cUvASaT2tyhaCdiMF48KA3Ov_1m...