Hi,
This seven series fix an issue reported by kernel test robot [3].
Shuah, I (as well as Kees and Sean [4]) think this should be in -next really soon to make sure everything works fine for the v6.9 release, which is not currently the case. I cannot test against all kselftests though. I would prefer to let you handle this, but I guess you're not able to do so and I'll push it on my branch without reply from you. Even if I push it on my branch, please push it on yours too as soon as you see this and I'll remove it from mine.
Mark, Jakub, could you please test this series?
As reported by Kernel Test Robot [1] and Sean Christopherson [2], some tests fail since v6.9-rc1 . This is due to the use of vfork() which introduced some side effects. Similarly, while making it more generic, a previous commit made some Landlock file system tests flaky, and subject to the host's file system mount configuration.
This series fixes all these side effects by replacing vfork() with clone3() and CLONE_VFORK, which is cleaner (no arbitrary shared memory) and makes the Kselftest framework more robust.
I tried different approaches and I found this one to be the cleaner and less invasive for current test cases.
I successfully ran the following tests (using TEST_F and fork/clone/clone3, and KVM_ONE_VCPU_TEST) with this series: - kvm:fix_hypercall_test - kvm:sync_regs_test - kvm:userspace_msr_exit_test - kvm:vmx_pmu_caps_test - landlock:fs_test - landlock:net_test - landlock:ptrace_test - move_mount_set_group:move_mount_set_group_test - net/af_unix:scm_pidfd - perf_events:remove_on_exec - pidfd:pidfd_getfd_test - pidfd:pidfd_setns_test - seccomp:seccomp_bpf - user_events:abi_test
[1] https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com [2] https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com [3] https://lore.kernel.org/r/202405100339.vfBe0t9C-lkp@intel.com [4] https://lore.kernel.org/r/202405061002.01D399877A@keescook
Previous versions: v1: https://lore.kernel.org/r/20240426172252.1862930-1-mic@digikod.net v2: https://lore.kernel.org/r/20240429130931.2394118-1-mic@digikod.net v3: https://lore.kernel.org/r/20240429191911.2552580-1-mic@digikod.net v4: https://lore.kernel.org/r/20240502210926.145539-1-mic@digikod.net v5: https://lore.kernel.org/r/20240503105820.300927-1-mic@digikod.net v6: https://lore.kernel.org/r/20240506165518.474504-1-mic@digikod.net
Regards,
Mickaël Salaün (10): selftests/pidfd: Fix config for pidfd_setns_test selftests/landlock: Fix FS tests when run on a private mount point selftests/harness: Fix fixture teardown selftests/harness: Fix interleaved scheduling leading to race conditions selftests/landlock: Do not allocate memory in fixture data selftests/harness: Constify fixture variants selftests/pidfd: Fix wrong expectation selftests/harness: Share _metadata between forked processes selftests/harness: Fix vfork() side effects selftests/harness: Handle TEST_F()'s explicit exit codes
tools/testing/selftests/kselftest_harness.h | 127 +++++++++++++----- tools/testing/selftests/landlock/fs_test.c | 83 +++++++----- tools/testing/selftests/pidfd/config | 2 + .../selftests/pidfd/pidfd_setns_test.c | 2 +- 4 files changed, 147 insertions(+), 67 deletions(-)
base-commit: e67572cd2204894179d89bd7b984072f19313b03
Required by switch_timens() to open /proc/self/ns/time_for_children.
CONFIG_GENERIC_VDSO_TIME_NS is not available on UML, so pidfd_setns_test cannot be run successfully on this architecture.
Cc: Shuah Khan skhan@linuxfoundation.org Fixes: 2b40c5db73e2 ("selftests/pidfd: add pidfd setns tests") Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Christian Brauner brauner@kernel.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-2-mic@digikod.net --- tools/testing/selftests/pidfd/config | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/pidfd/config b/tools/testing/selftests/pidfd/config index f6f2965e17af..6133524710f7 100644 --- a/tools/testing/selftests/pidfd/config +++ b/tools/testing/selftests/pidfd/config @@ -3,5 +3,7 @@ CONFIG_IPC_NS=y CONFIG_USER_NS=y CONFIG_PID_NS=y CONFIG_NET_NS=y +CONFIG_TIME_NS=y +CONFIG_GENERIC_VDSO_TIME_NS=y CONFIG_CGROUPS=y CONFIG_CHECKPOINT_RESTORE=y
According to the test environment, the mount point of the test's working directory may be shared or not, which changes the visibility of the nested "tmp" mount point for the test's parent process calling umount("tmp").
This was spotted while running tests in containers [1], where mount points are private.
Cc: Günther Noack gnoack@google.com Cc: Shuah Khan skhan@linuxfoundation.org Link: https://github.com/landlock-lsm/landlock-test-tools/pull/4 [1] Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling") Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-3-mic@digikod.net ---
Changes since v1: * Update commit description. --- tools/testing/selftests/landlock/fs_test.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c index 9a6036fbf289..46b9effd53e4 100644 --- a/tools/testing/selftests/landlock/fs_test.c +++ b/tools/testing/selftests/landlock/fs_test.c @@ -293,7 +293,15 @@ static void prepare_layout(struct __test_metadata *const _metadata) static void cleanup_layout(struct __test_metadata *const _metadata) { set_cap(_metadata, CAP_SYS_ADMIN); - EXPECT_EQ(0, umount(TMP_DIR)); + if (umount(TMP_DIR)) { + /* + * According to the test environment, the mount point of the + * current directory may be shared or not, which changes the + * visibility of the nested TMP_DIR mount point for the test's + * parent process doing this cleanup. + */ + ASSERT_EQ(EINVAL, errno); + } clear_cap(_metadata, CAP_SYS_ADMIN); EXPECT_EQ(0, remove_path(TMP_DIR)); }
Make sure fixture teardowns are run when test cases failed, including when _metadata->teardown_parent is set to true.
Make sure only one fixture teardown is run per test case, handling the case where the test child forks.
Cc: Jakub Kicinski kuba@kernel.org Cc: Shengyu Li shengyu.li.evgeny@gmail.com Cc: Shuah Khan skhan@linuxfoundation.org Fixes: 72d7cb5c190b ("selftests/harness: Prevent infinite loop due to Assert in FIXTURE_TEARDOWN") Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()") Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-4-mic@digikod.net --- tools/testing/selftests/kselftest_harness.h | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index d98702b6955d..55699a762c45 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -382,7 +382,10 @@ FIXTURE_DATA(fixture_name) self; \ pid_t child = 1; \ int status = 0; \ - bool jmp = false; \ + /* Makes sure there is only one teardown, even when child forks again. */ \ + bool *teardown = mmap(NULL, sizeof(*teardown), \ + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); \ + *teardown = false; \ memset(&self, 0, sizeof(FIXTURE_DATA(fixture_name))); \ if (setjmp(_metadata->env) == 0) { \ /* Use the same _metadata. */ \ @@ -399,15 +402,16 @@ _metadata->exit_code = KSFT_FAIL; \ } \ } \ - else \ - jmp = true; \ if (child == 0) { \ - if (_metadata->setup_completed && !_metadata->teardown_parent && !jmp) \ + if (_metadata->setup_completed && !_metadata->teardown_parent && \ + __sync_bool_compare_and_swap(teardown, false, true)) \ fixture_name##_teardown(_metadata, &self, variant->data); \ _exit(0); \ } \ - if (_metadata->setup_completed && _metadata->teardown_parent) \ + if (_metadata->setup_completed && _metadata->teardown_parent && \ + __sync_bool_compare_and_swap(teardown, false, true)) \ fixture_name##_teardown(_metadata, &self, variant->data); \ + munmap(teardown, sizeof(*teardown)); \ if (!WIFEXITED(status) && WIFSIGNALED(status)) \ /* Forward signal to __wait_for_test(). */ \ kill(getpid(), WTERMSIG(status)); \
Fix a race condition when running several FIXTURE_TEARDOWN() managing the same resource. This fixes a race condition in the Landlock file system tests when creating or unmounting the same directory.
Using clone3() with CLONE_VFORK guarantees that the child and grandchild test processes are sequentially scheduled. This is implemented with a new clone3_vfork() helper replacing the fork() call.
This avoids triggering this error in __wait_for_test(): Test ended in some other way [127]
Cc: Christian Brauner brauner@kernel.org Cc: David S. Miller davem@davemloft.net Cc: Günther Noack gnoack@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Mark Brown broonie@kernel.org Cc: Shuah Khan shuah@kernel.org Cc: Will Drewry wad@chromium.org Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling") Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-5-mic@digikod.net ---
Changes since v2: * Replace __attribute__((__unused__)) with inline for clone3_vfork() (suggested by Kees and Jakub) --- tools/testing/selftests/kselftest_harness.h | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index 55699a762c45..9d7178a71c2c 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -66,6 +66,8 @@ #include <sys/wait.h> #include <unistd.h> #include <setjmp.h> +#include <syscall.h> +#include <linux/sched.h>
#include "kselftest.h"
@@ -80,6 +82,17 @@ # define TH_LOG_ENABLED 1 #endif
+/* Wait for the child process to end but without sharing memory mapping. */ +static inline pid_t clone3_vfork(void) +{ + struct clone_args args = { + .flags = CLONE_VFORK, + .exit_signal = SIGCHLD, + }; + + return syscall(__NR_clone3, &args, sizeof(args)); +} + /** * TH_LOG() * @@ -1183,7 +1196,7 @@ void __run_test(struct __fixture_metadata *f, fflush(stdout); fflush(stderr);
- t->pid = fork(); + t->pid = clone3_vfork(); if (t->pid < 0) { ksft_print_msg("ERROR SPAWNING TEST CHILD\n"); t->exit_code = KSFT_FAIL;
On Sat, May 11, 2024 at 07:14:39PM +0200, Mickaël Salaün wrote:
Fix a race condition when running several FIXTURE_TEARDOWN() managing the same resource. This fixes a race condition in the Landlock file system tests when creating or unmounting the same directory.
Using clone3() with CLONE_VFORK guarantees that the child and grandchild test processes are sequentially scheduled. This is implemented with a new clone3_vfork() helper replacing the fork() call.
This is now in mainline and appears to be causing several tests (at least the ptrace vmaccess global_attach test on arm64, possibly also some of the epoll tests) that previously were timed out by the harness to to hang instead. A bisect seems to point at this patch in particular, there was a bunch of discussion of the fallout of these patches but I'm afraid I lost track of it, is there something in flight for this? -next is affected as well from the looks of it.
Log of the ptrace issue (not that it's useful, the job just gets killed by the test runner):
https://lava.sirena.org.uk/scheduler/job/307984
Bisect log:
git bisect start # status: waiting for both good and bad commits # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8 git bisect good e8f897f4afef0031fe618a8e94127a0934896aba # status: waiting for bad commit, 1 good commit known # bad: [a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6] Linux 6.9 git bisect bad a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6 # good: [480e035fc4c714fb5536e64ab9db04fedc89e910] Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel git bisect good 480e035fc4c714fb5536e64ab9db04fedc89e910 # good: [2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c] Merge tag 'hwlock-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux git bisect good 2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c # good: [e858beeddfa3a400844c0e22d2118b3b52f1ea5e] bcachefs: If we run merges at a lower watermark, they must be nonblocking git bisect good e858beeddfa3a400844c0e22d2118b3b52f1ea5e # good: [e43afae4a335ac0bf54c7a8f23ed65dd55449649] Merge tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux git bisect good e43afae4a335ac0bf54c7a8f23ed65dd55449649 # good: [09e10499ee6a5a89fc352f25881276398a49596a] Merge tag 'drm-misc-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes git bisect good 09e10499ee6a5a89fc352f25881276398a49596a # good: [3c15237018eb8a9a56bb49a5dbf4d8eeb154b5cc] Merge tag 'usb-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb git bisect good 3c15237018eb8a9a56bb49a5dbf4d8eeb154b5cc # good: [62788b0f225da1837ad38101112e2c49123470ee] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux git bisect good 62788b0f225da1837ad38101112e2c49123470ee # good: [ed44935c330a2633440e8d2660db3c7538eeaf10] Merge tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi git bisect good ed44935c330a2633440e8d2660db3c7538eeaf10 # good: [c22c3e0753807feee1391a22228b0d5e6ba39b74] Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm git bisect good c22c3e0753807feee1391a22228b0d5e6ba39b74 # good: [b61821bb32c5577272408e1b05e6a0879a64257f] Merge tag 'drm-misc-fixes-2024-05-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes git bisect good b61821bb32c5577272408e1b05e6a0879a64257f # bad: [323feb3bdb67649bfa5614eb24ec9cb92a60cf33] selftests/harness: Handle TEST_F()'s explicit exit codes git bisect bad 323feb3bdb67649bfa5614eb24ec9cb92a60cf33 # bad: [323feb3bdb67649bfa5614eb24ec9cb92a60cf33] selftests/harness: Handle TEST_F()'s explicit exit codes git bisect bad 323feb3bdb67649bfa5614eb24ec9cb92a60cf33 # bad: [3656bc23429a4d539c81b5cb8f17ceeeeca8901a] selftests/landlock: Do not allocate memory in fixture data git bisect bad 3656bc23429a4d539c81b5cb8f17ceeeeca8901a # good: [7e4042abe2ee7c0977fd8bb049a6991b174a5e6f] selftests/landlock: Fix FS tests when run on a private mount point git bisect good 7e4042abe2ee7c0977fd8bb049a6991b174a5e6f # bad: [a86f18903db9211e265cc130b61adb175b7a4c42] selftests/harness: Fix interleaved scheduling leading to race conditions git bisect bad a86f18903db9211e265cc130b61adb175b7a4c42 # good: [fff37bd32c7605d93bf900c4c318d56d12000048] selftests/harness: Fix fixture teardown git bisect good fff37bd32c7605d93bf900c4c318d56d12000048 # first bad commit: [a86f18903db9211e265cc130b61adb175b7a4c42] selftests/harness: Fix interleaved scheduling leading to race conditions
On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
On Sat, May 11, 2024 at 07:14:39PM +0200, Mickaël Salaün wrote:
Fix a race condition when running several FIXTURE_TEARDOWN() managing the same resource. This fixes a race condition in the Landlock file system tests when creating or unmounting the same directory.
Using clone3() with CLONE_VFORK guarantees that the child and grandchild test processes are sequentially scheduled. This is implemented with a new clone3_vfork() helper replacing the fork() call.
This is now in mainline and appears to be causing several tests (at least the ptrace vmaccess global_attach test on arm64, possibly also some of the epoll tests) that previously were timed out by the harness to to hang instead. A bisect seems to point at this patch in particular, there was a bunch of discussion of the fallout of these patches but I'm afraid I lost track of it, is there something in flight for this? -next is affected as well from the looks of it.
FWIW I'm still seeing this on -rc2...
On Mon, Jun 03, 2024 at 05:27:52PM +0100, Mark Brown wrote:
On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
This is now in mainline and appears to be causing several tests (at least the ptrace vmaccess global_attach test on arm64, possibly also some of the epoll tests) that previously were timed out by the harness to to hang instead. A bisect seems to point at this patch in particular, there was a bunch of discussion of the fallout of these patches but I'm afraid I lost track of it, is there something in flight for this? -next is affected as well from the looks of it.
FWIW I'm still seeing this on -rc2...
AFAICT this is due to the switch to using clone3() with CLONE_VFORK to start the test which means we never even call alarm() to set up the timeout for the test, let alone have the signal for it delivered. I'm a confused about how this could ever work, with clone_vfork() the parent shouldn't run until the child execs (which won't happen here) or exits. Since we don't call alarm() until after we started the child we never actually get that far, but even if we reorder things we'll not get the signal for the alarm if the child messes up since the parent is suspended.
I'm not clear what the original race being fixed here was but it seems like we should revert this since the timeout functionality is pretty important?
On Mon, Jun 03, 2024 at 06:22:32PM +0100, Mark Brown wrote:
On Mon, Jun 03, 2024 at 05:27:52PM +0100, Mark Brown wrote:
On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
This is now in mainline and appears to be causing several tests (at least the ptrace vmaccess global_attach test on arm64, possibly also some of the epoll tests) that previously were timed out by the harness to to hang instead. A bisect seems to point at this patch in particular, there was a bunch of discussion of the fallout of these patches but I'm afraid I lost track of it, is there something in flight for this? -next is affected as well from the looks of it.
Thanks for the heads up. I warned about not being able to test everything when fixing kselftest last time, but nobody show up. Is there an easy way to run most kselftests? We really need a (more accessible) CI...
FWIW I'm still seeing this on -rc2...
AFAICT this is due to the switch to using clone3() with CLONE_VFORK
I guess it started with the previous vfork() that was later replaced with CLONE_VFORK.
to start the test which means we never even call alarm() to set up the timeout for the test, let alone have the signal for it delivered. I'm a confused about how this could ever work, with clone_vfork() the parent shouldn't run until the child execs (which won't happen here) or exits. Since we don't call alarm() until after we started the child we never actually get that far, but even if we reorder things we'll not get the signal for the alarm if the child messes up since the parent is suspended.
I'm not clear what the original race being fixed here was but it seems like we should revert this since the timeout functionality is pretty important?
It took me a while to fix all the previous issues and it would be much easier to just fix this issue too.
I'm working on it.
On Tue, Jun 04, 2024 at 06:06:48PM +0200, Mickaël Salaün wrote:
Thanks for the heads up. I warned about not being able to test everything when fixing kselftest last time, but nobody show up. Is there an easy way to run most kselftests? We really need a (more accessible) CI...
You can just invoke the top level kselftests Makefile but between things being flaky and runtime requirements there's a bunch of noise there. KernelCI covers a bunch of it and would be my go to, I've got a good chunk of the selftests that actually build and run reliably in my personal CI but it has no visible UI. Part of the issue here might be platform specifics, I'm seeing this on arm64.
FWIW I'm still seeing this on -rc2...
AFAICT this is due to the switch to using clone3() with CLONE_VFORK
I guess it started with the previous vfork() that was later replaced with CLONE_VFORK.
Bisect did seem to point at this commit FWIW, I've not dug into any API differences or anything here. The immediate thing being replaced was a plain fork() though I see it was vfork() at some point before that, and I'd not have noticed if the individual testcases weren't hanging so the timeout was needed.
I'm not clear what the original race being fixed here was but it seems like we should revert this since the timeout functionality is pretty important?
It took me a while to fix all the previous issues and it would be much easier to just fix this issue too.
I'm working on it.
Great, thanks.
Do not allocate self->dir_path in the test process because this would not be visible in the FIXTURE_TEARDOWN() process when relying on fork()/clone3() instead of vfork().
This change is required for a following commit removing vfork() call to not break the layout3_fs.* test cases.
Cc: Günther Noack gnoack@google.com Cc: Shuah Khan skhan@linuxfoundation.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-6-mic@digikod.net ---
Changes since v1: * Extract change from a bigger patch (suggested by Kees). --- tools/testing/selftests/landlock/fs_test.c | 57 +++++++++++++--------- 1 file changed, 35 insertions(+), 22 deletions(-)
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c index 46b9effd53e4..1e2cffde02b5 100644 --- a/tools/testing/selftests/landlock/fs_test.c +++ b/tools/testing/selftests/landlock/fs_test.c @@ -9,6 +9,7 @@
#define _GNU_SOURCE #include <fcntl.h> +#include <libgen.h> #include <linux/landlock.h> #include <linux/magic.h> #include <sched.h> @@ -4624,7 +4625,6 @@ FIXTURE(layout3_fs) { bool has_created_dir; bool has_created_file; - char *dir_path; bool skip_test; };
@@ -4683,11 +4683,24 @@ FIXTURE_VARIANT_ADD(layout3_fs, hostfs) { .cwd_fs_magic = HOSTFS_SUPER_MAGIC, };
+static char *dirname_alloc(const char *path) +{ + char *dup; + + if (!path) + return NULL; + + dup = strdup(path); + if (!dup) + return NULL; + + return dirname(dup); +} + FIXTURE_SETUP(layout3_fs) { struct stat statbuf; - const char *slash; - size_t dir_len; + char *dir_path = dirname_alloc(variant->file_path);
if (!supports_filesystem(variant->mnt.type) || !cwd_matches_fs(variant->cwd_fs_magic)) { @@ -4697,25 +4710,15 @@ FIXTURE_SETUP(layout3_fs)
_metadata->teardown_parent = true;
- slash = strrchr(variant->file_path, '/'); - ASSERT_NE(slash, NULL); - dir_len = (size_t)slash - (size_t)variant->file_path; - ASSERT_LT(0, dir_len); - self->dir_path = malloc(dir_len + 1); - self->dir_path[dir_len] = '\0'; - strncpy(self->dir_path, variant->file_path, dir_len); - prepare_layout_opt(_metadata, &variant->mnt);
/* Creates directory when required. */ - if (stat(self->dir_path, &statbuf)) { + if (stat(dir_path, &statbuf)) { set_cap(_metadata, CAP_DAC_OVERRIDE); - EXPECT_EQ(0, mkdir(self->dir_path, 0700)) + EXPECT_EQ(0, mkdir(dir_path, 0700)) { TH_LOG("Failed to create directory "%s": %s", - self->dir_path, strerror(errno)); - free(self->dir_path); - self->dir_path = NULL; + dir_path, strerror(errno)); } self->has_created_dir = true; clear_cap(_metadata, CAP_DAC_OVERRIDE); @@ -4736,6 +4739,8 @@ FIXTURE_SETUP(layout3_fs) self->has_created_file = true; clear_cap(_metadata, CAP_DAC_OVERRIDE); } + + free(dir_path); }
FIXTURE_TEARDOWN(layout3_fs) @@ -4754,16 +4759,17 @@ FIXTURE_TEARDOWN(layout3_fs) }
if (self->has_created_dir) { + char *dir_path = dirname_alloc(variant->file_path); + set_cap(_metadata, CAP_DAC_OVERRIDE); /* * Don't check for error because the directory might already * have been removed (cf. release_inode test). */ - rmdir(self->dir_path); + rmdir(dir_path); clear_cap(_metadata, CAP_DAC_OVERRIDE); + free(dir_path); } - free(self->dir_path); - self->dir_path = NULL;
cleanup_layout(_metadata); } @@ -4830,7 +4836,10 @@ TEST_F_FORK(layout3_fs, tag_inode_dir_mnt)
TEST_F_FORK(layout3_fs, tag_inode_dir_child) { - layer3_fs_tag_inode(_metadata, self, variant, self->dir_path); + char *dir_path = dirname_alloc(variant->file_path); + + layer3_fs_tag_inode(_metadata, self, variant, dir_path); + free(dir_path); }
TEST_F_FORK(layout3_fs, tag_inode_file) @@ -4857,9 +4866,13 @@ TEST_F_FORK(layout3_fs, release_inodes) if (self->has_created_file) EXPECT_EQ(0, remove_path(variant->file_path));
- if (self->has_created_dir) + if (self->has_created_dir) { + char *dir_path = dirname_alloc(variant->file_path); + /* Don't check for error because of cgroup specificities. */ - remove_path(self->dir_path); + remove_path(dir_path); + free(dir_path); + }
ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_DIR, layer1);
FIXTURE_VARIANT_ADD() types are passed as const pointers to FIXTURE_TEARDOWN(). Make that explicit by constifying the variants declarations.
Cc: Shuah Khan skhan@linuxfoundation.org Cc: Will Drewry wad@chromium.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-7-mic@digikod.net ---
Changes since v1: * Extract change from a bigger patch (suggested by Kees). --- tools/testing/selftests/kselftest_harness.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index 9d7178a71c2c..201040207c85 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -338,7 +338,7 @@ static inline pid_t clone3_vfork(void) * variant. */ #define FIXTURE_VARIANT_ADD(fixture_name, variant_name) \ - extern FIXTURE_VARIANT(fixture_name) \ + extern const FIXTURE_VARIANT(fixture_name) \ _##fixture_name##_##variant_name##_variant; \ static struct __fixture_variant_metadata \ _##fixture_name##_##variant_name##_object = \ @@ -350,7 +350,7 @@ static inline pid_t clone3_vfork(void) __register_fixture_variant(&_##fixture_name##_fixture_object, \ &_##fixture_name##_##variant_name##_object); \ } \ - FIXTURE_VARIANT(fixture_name) \ + const FIXTURE_VARIANT(fixture_name) \ _##fixture_name##_##variant_name##_variant =
/**
Replace a wrong EXPECT_GT(self->child_pid_exited, 0) with EXPECT_GE(), which will be actually tested on the parent and child sides with a following commit.
Cc: Shuah Khan skhan@linuxfoundation.org Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Christian Brauner brauner@kernel.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-8-mic@digikod.net ---
Changes since v1: * Extract change from a bigger patch (suggested by Kees). --- tools/testing/selftests/pidfd/pidfd_setns_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/pidfd/pidfd_setns_test.c b/tools/testing/selftests/pidfd/pidfd_setns_test.c index 6e2f2cd400ca..47746b0c6acd 100644 --- a/tools/testing/selftests/pidfd/pidfd_setns_test.c +++ b/tools/testing/selftests/pidfd/pidfd_setns_test.c @@ -158,7 +158,7 @@ FIXTURE_SETUP(current_nsset) /* Create task that exits right away. */ self->child_pid_exited = create_child(&self->child_pidfd_exited, CLONE_NEWUSER | CLONE_NEWNET); - EXPECT_GT(self->child_pid_exited, 0); + EXPECT_GE(self->child_pid_exited, 0);
if (self->child_pid_exited == 0) _exit(EXIT_SUCCESS);
Unconditionally share _metadata between all forked processes, which enables to actually catch errors which were previously ignored.
This is required for a following commit replacing vfork() with clone3() and CLONE_VFORK (i.e. not sharing the full memory) . It should also be useful to share _metadata to extend expectations to test process's forks. For instance, this change identified a wrong expectation in pidfd_setns_test.
Because this _metadata is used by the new XFAIL_ADD(), use a global pointer initialized in TEST_F(). This is OK because only XFAIL_ADD() use it, and XFAIL_ADD() already depends on TEST_F().
Cc: Jakub Kicinski kuba@kernel.org Cc: Shuah Khan skhan@linuxfoundation.org Cc: Will Drewry wad@chromium.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-9-mic@digikod.net ---
Changes since v6: * Use a global pointer per TEST_F() and complete build of the __test_xfail object in the xfail constructor to fix XFAIL_ADD(): https://lore.kernel.org/r/202405100339.vfBe0t9C-lkp@intel.com/
Changes since v4: * Reset _metadata's aborted and setup_completed fields.
Changes since v1: * Extract change from a bigger patch (suggested by Kees). --- tools/testing/selftests/kselftest_harness.h | 26 ++++++++++++--------- 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index 201040207c85..28415798fa60 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -430,19 +430,19 @@ static inline pid_t clone3_vfork(void) kill(getpid(), WTERMSIG(status)); \ __test_check_assert(_metadata); \ } \ - static struct __test_metadata \ - _##fixture_name##_##test_name##_object = { \ - .name = #test_name, \ - .fn = &wrapper_##fixture_name##_##test_name, \ - .fixture = &_##fixture_name##_fixture_object, \ - .termsig = signal, \ - .timeout = tmout, \ - .teardown_parent = false, \ - }; \ + static struct __test_metadata *_##fixture_name##_##test_name##_object; \ static void __attribute__((constructor)) \ _register_##fixture_name##_##test_name(void) \ { \ - __register_test(&_##fixture_name##_##test_name##_object); \ + struct __test_metadata *object = mmap(NULL, sizeof(*object), \ + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); \ + object->name = #test_name; \ + object->fn = &wrapper_##fixture_name##_##test_name; \ + object->fixture = &_##fixture_name##_fixture_object; \ + object->termsig = signal; \ + object->timeout = tmout; \ + _##fixture_name##_##test_name##_object = object; \ + __register_test(object); \ } \ static void fixture_name##_##test_name( \ struct __test_metadata __attribute__((unused)) *_metadata, \ @@ -850,11 +850,12 @@ struct __test_xfail { { \ .fixture = &_##fixture_name##_fixture_object, \ .variant = &_##fixture_name##_##variant_name##_object, \ - .test = &_##fixture_name##_##test_name##_object, \ }; \ static void __attribute__((constructor)) \ _register_##fixture_name##_##variant_name##_##test_name##_xfail(void) \ { \ + _##fixture_name##_##variant_name##_##test_name##_xfail.test = \ + _##fixture_name##_##test_name##_object; \ __register_xfail(&_##fixture_name##_##variant_name##_##test_name##_xfail); \ }
@@ -1181,6 +1182,9 @@ void __run_test(struct __fixture_metadata *f, /* reset test struct */ t->exit_code = KSFT_PASS; t->trigger = 0; + t->aborted = false; + t->setup_completed = false; + memset(t->env, 0, sizeof(t->env)); memset(t->results->reason, 0, sizeof(t->results->reason));
if (asprintf(&test_name, "%s%s%s.%s", f->name,
Setting the time namespace with CLONE_NEWTIME returns -EUSERS if the calling thread shares memory with another thread (because of the shared vDSO), which is the case when it is created with vfork().
Fix pidfd_setns_test by replacing test harness's vfork() call with a clone3() call with CLONE_VFORK, and an explicit sharing of the _metadata and self objects.
Replace _metadata->teardown_parent with a new FIXTURE_TEARDOWN_PARENT() helper that can replace FIXTURE_TEARDOWN(). This is a cleaner approach and it enables to selectively share the fixture data between the child process running tests and the parent process running the fixture teardown. This also avoids updating several tests to not rely on the self object's copy-on-write property (e.g. storing the returned value of a fork() call).
Cc: Christian Brauner brauner@kernel.org Cc: David S. Miller davem@davemloft.net Cc: Günther Noack gnoack@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Mark Brown broonie@kernel.org Cc: Shuah Khan shuah@kernel.org Cc: Will Drewry wad@chromium.org Reported-by: kernel test robot oliver.sang@intel.com Closes: https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()") Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-10-mic@digikod.net ---
Changes since v1: * Split changes (suggested by Kees). * Improve documentation. * Remove the static fixture_name##_teardown_parent initialisation to false (as suggested by checkpatch.pl). --- tools/testing/selftests/kselftest_harness.h | 66 ++++++++++++++++----- tools/testing/selftests/landlock/fs_test.c | 16 ++--- 2 files changed, 57 insertions(+), 25 deletions(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index 28415798fa60..cbedb4a6cf7b 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -294,6 +294,32 @@ static inline pid_t clone3_vfork(void) * A bare "return;" statement may be used to return early. */ #define FIXTURE_TEARDOWN(fixture_name) \ + static const bool fixture_name##_teardown_parent; \ + __FIXTURE_TEARDOWN(fixture_name) + +/** + * FIXTURE_TEARDOWN_PARENT() + * *_metadata* is included so that EXPECT_*, ASSERT_* etc. work correctly. + * + * @fixture_name: fixture name + * + * .. code-block:: c + * + * FIXTURE_TEARDOWN_PARENT(fixture_name) { implementation } + * + * Same as FIXTURE_TEARDOWN() but run this code in a parent process. This + * enables the test process to drop its privileges without impacting the + * related FIXTURE_TEARDOWN_PARENT() (e.g. to remove files from a directory + * where write access was dropped). + * + * To make it possible for the parent process to use *self*, share (MAP_SHARED) + * the fixture data between all forked processes. + */ +#define FIXTURE_TEARDOWN_PARENT(fixture_name) \ + static const bool fixture_name##_teardown_parent = true; \ + __FIXTURE_TEARDOWN(fixture_name) + +#define __FIXTURE_TEARDOWN(fixture_name) \ void fixture_name##_teardown( \ struct __test_metadata __attribute__((unused)) *_metadata, \ FIXTURE_DATA(fixture_name) __attribute__((unused)) *self, \ @@ -368,10 +394,11 @@ static inline pid_t clone3_vfork(void) * Very similar to TEST() except that *self* is the setup instance of fixture's * datatype exposed for use by the implementation. * - * The @test_name code is run in a separate process sharing the same memory - * (i.e. vfork), which means that the test process can update its privileges - * without impacting the related FIXTURE_TEARDOWN() (e.g. to remove files from - * a directory where write access was dropped). + * The _metadata object is shared (MAP_SHARED) with all the potential forked + * processes, which enables them to use EXCEPT_*() and ASSERT_*(). + * + * The *self* object is only shared with the potential forked processes if + * FIXTURE_TEARDOWN_PARENT() is used instead of FIXTURE_TEARDOWN(). */ #define TEST_F(fixture_name, test_name) \ __TEST_F_IMPL(fixture_name, test_name, -1, TEST_TIMEOUT_DEFAULT) @@ -392,39 +419,49 @@ static inline pid_t clone3_vfork(void) struct __fixture_variant_metadata *variant) \ { \ /* fixture data is alloced, setup, and torn down per call. */ \ - FIXTURE_DATA(fixture_name) self; \ + FIXTURE_DATA(fixture_name) self_private, *self = NULL; \ pid_t child = 1; \ int status = 0; \ /* Makes sure there is only one teardown, even when child forks again. */ \ bool *teardown = mmap(NULL, sizeof(*teardown), \ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); \ *teardown = false; \ - memset(&self, 0, sizeof(FIXTURE_DATA(fixture_name))); \ + if (sizeof(*self) > 0) { \ + if (fixture_name##_teardown_parent) { \ + self = mmap(NULL, sizeof(*self), PROT_READ | PROT_WRITE, \ + MAP_SHARED | MAP_ANONYMOUS, -1, 0); \ + } else { \ + memset(&self_private, 0, sizeof(self_private)); \ + self = &self_private; \ + } \ + } \ if (setjmp(_metadata->env) == 0) { \ - /* Use the same _metadata. */ \ - child = vfork(); \ + /* _metadata and potentially self are shared with all forks. */ \ + child = clone3_vfork(); \ if (child == 0) { \ - fixture_name##_setup(_metadata, &self, variant->data); \ + fixture_name##_setup(_metadata, self, variant->data); \ /* Let setup failure terminate early. */ \ if (_metadata->exit_code) \ _exit(0); \ _metadata->setup_completed = true; \ - fixture_name##_##test_name(_metadata, &self, variant->data); \ + fixture_name##_##test_name(_metadata, self, variant->data); \ } else if (child < 0 || child != waitpid(child, &status, 0)) { \ ksft_print_msg("ERROR SPAWNING TEST GRANDCHILD\n"); \ _metadata->exit_code = KSFT_FAIL; \ } \ } \ if (child == 0) { \ - if (_metadata->setup_completed && !_metadata->teardown_parent && \ + if (_metadata->setup_completed && !fixture_name##_teardown_parent && \ __sync_bool_compare_and_swap(teardown, false, true)) \ - fixture_name##_teardown(_metadata, &self, variant->data); \ + fixture_name##_teardown(_metadata, self, variant->data); \ _exit(0); \ } \ - if (_metadata->setup_completed && _metadata->teardown_parent && \ + if (_metadata->setup_completed && fixture_name##_teardown_parent && \ __sync_bool_compare_and_swap(teardown, false, true)) \ - fixture_name##_teardown(_metadata, &self, variant->data); \ + fixture_name##_teardown(_metadata, self, variant->data); \ munmap(teardown, sizeof(*teardown)); \ + if (self && fixture_name##_teardown_parent) \ + munmap(self, sizeof(*self)); \ if (!WIFEXITED(status) && WIFSIGNALED(status)) \ /* Forward signal to __wait_for_test(). */ \ kill(getpid(), WTERMSIG(status)); \ @@ -898,7 +935,6 @@ struct __test_metadata { bool timed_out; /* did this test timeout instead of exiting? */ bool aborted; /* stopped test due to failed ASSERT */ bool setup_completed; /* did setup finish? */ - bool teardown_parent; /* run teardown in a parent process */ jmp_buf env; /* for exiting out of test early */ struct __test_results *results; struct __test_metadata *prev, *next; diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c index 1e2cffde02b5..27744524df51 100644 --- a/tools/testing/selftests/landlock/fs_test.c +++ b/tools/testing/selftests/landlock/fs_test.c @@ -286,8 +286,6 @@ static void prepare_layout_opt(struct __test_metadata *const _metadata,
static void prepare_layout(struct __test_metadata *const _metadata) { - _metadata->teardown_parent = true; - prepare_layout_opt(_metadata, &mnt_tmp); }
@@ -316,7 +314,7 @@ FIXTURE_SETUP(layout0) prepare_layout(_metadata); }
-FIXTURE_TEARDOWN(layout0) +FIXTURE_TEARDOWN_PARENT(layout0) { cleanup_layout(_metadata); } @@ -379,7 +377,7 @@ FIXTURE_SETUP(layout1) create_layout1(_metadata); }
-FIXTURE_TEARDOWN(layout1) +FIXTURE_TEARDOWN_PARENT(layout1) { remove_layout1(_metadata);
@@ -3692,7 +3690,7 @@ FIXTURE_SETUP(ftruncate) create_file(_metadata, file1_s1d1); }
-FIXTURE_TEARDOWN(ftruncate) +FIXTURE_TEARDOWN_PARENT(ftruncate) { EXPECT_EQ(0, remove_path(file1_s1d1)); cleanup_layout(_metadata); @@ -3870,7 +3868,7 @@ FIXTURE_SETUP(layout1_bind) clear_cap(_metadata, CAP_SYS_ADMIN); }
-FIXTURE_TEARDOWN(layout1_bind) +FIXTURE_TEARDOWN_PARENT(layout1_bind) { /* umount(dir_s2d2)) is handled by namespace lifetime. */
@@ -4275,7 +4273,7 @@ FIXTURE_SETUP(layout2_overlay) clear_cap(_metadata, CAP_SYS_ADMIN); }
-FIXTURE_TEARDOWN(layout2_overlay) +FIXTURE_TEARDOWN_PARENT(layout2_overlay) { if (self->skip_test) SKIP(return, "overlayfs is not supported (teardown)"); @@ -4708,8 +4706,6 @@ FIXTURE_SETUP(layout3_fs) SKIP(return, "this filesystem is not supported (setup)"); }
- _metadata->teardown_parent = true; - prepare_layout_opt(_metadata, &variant->mnt);
/* Creates directory when required. */ @@ -4743,7 +4739,7 @@ FIXTURE_SETUP(layout3_fs) free(dir_path); }
-FIXTURE_TEARDOWN(layout3_fs) +FIXTURE_TEARDOWN_PARENT(layout3_fs) { if (self->skip_test) SKIP(return, "this filesystem is not supported (teardown)");
If TEST_F() explicitly calls exit(code) with code different than 0, then _metadata->exit_code is set to this code (e.g. KVM_ONE_VCPU_TEST()). We need to keep in mind that _metadata->exit_code can be KSFT_SKIP while the process exit code is 0.
Cc: Jakub Kicinski kuba@kernel.org Cc: Kees Cook keescook@chromium.org Cc: Mark Brown broonie@kernel.org Cc: Shuah Khan shuah@kernel.org Cc: Will Drewry wad@chromium.org Reported-by: Sean Christopherson seanjc@google.com Tested-by: Sean Christopherson seanjc@google.com Closes: https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()") Signed-off-by: Mickaël Salaün mic@digikod.net Link: https://lore.kernel.org/r/20240511171445.904356-11-mic@digikod.net ---
Changes since v5: * Update commit message as suggested by Sean.
Changes since v4: * Check abort status when the grandchild exited. * Keep the _exit(0) calls because _metadata->exit_code is always checked. * Only set _metadata->exit_code to WEXITSTATUS() if it is not zero.
Changes since v3: * New patch mainly from Sean Christopherson. --- tools/testing/selftests/kselftest_harness.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index cbedb4a6cf7b..3c8f2965c285 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -462,9 +462,13 @@ static inline pid_t clone3_vfork(void) munmap(teardown, sizeof(*teardown)); \ if (self && fixture_name##_teardown_parent) \ munmap(self, sizeof(*self)); \ - if (!WIFEXITED(status) && WIFSIGNALED(status)) \ + if (WIFEXITED(status)) { \ + if (WEXITSTATUS(status)) \ + _metadata->exit_code = WEXITSTATUS(status); \ + } else if (WIFSIGNALED(status)) { \ /* Forward signal to __wait_for_test(). */ \ kill(getpid(), WTERMSIG(status)); \ + } \ __test_check_assert(_metadata); \ } \ static struct __test_metadata *_##fixture_name##_##test_name##_object; \
linux-kselftest-mirror@lists.linaro.org