The following sub-tests are failing in seccomp_bpf selftest:
18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed. 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
I did some bisecting and found that the failures started to happen with:
307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
Not sure if the test needs to be fixed after this commit, or if the commit is actually introducing an issue. I'll investigate more, unless someone knows already what's going on.
Thanks, -Andrea
On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
The following sub-tests are failing in seccomp_bpf selftest:
18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed. 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
I did some bisecting and found that the failures started to happen with:
307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
Not sure if the test needs to be fixed after this commit, or if the commit is actually introducing an issue. I'll investigate more, unless someone knows already what's going on.
Ah thanks for noticing; I will investigate...
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
The following sub-tests are failing in seccomp_bpf selftest:
18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed. 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
I did some bisecting and found that the failures started to happen with:
307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
Not sure if the test needs to be fixed after this commit, or if the commit is actually introducing an issue. I'll investigate more, unless someone knows already what's going on.
Ah thanks for noticing; I will investigate...
I just did a quick read through of the test and while I don't understand everything having a failure seems very weird.
I don't understand the comment: /* Tracer will redirect getpid to getppid, and we should die. */
As I think what happens is it the bpf programs loads the signal number. Tests to see if the signal number if GETPPID and allows that system call and causes any other system call to be terminated.
Which being single threaded would seem to cause the kernel to execute the changed code.
How there kernel at that point is having the process exit with anything except SIGSYS I am not immediately seeing.
The logic is the same as that for SECCOMP_RET_TRAP is there a test for that, that is also failing?
How do you run that test anyway?
Eric
On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
The following sub-tests are failing in seccomp_bpf selftest:
18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed. 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
I did some bisecting and found that the failures started to happen with:
307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
Not sure if the test needs to be fixed after this commit, or if the commit is actually introducing an issue. I'll investigate more, unless someone knows already what's going on.
Ah thanks for noticing; I will investigate...
I just did a quick read through of the test and while I don't understand everything having a failure seems very weird.
I don't understand the comment: /* Tracer will redirect getpid to getppid, and we should die. */
As I think what happens is it the bpf programs loads the signal number. Tests to see if the signal number if GETPPID and allows that system call and causes any other system call to be terminated.
The test suite runs a series of seccomp filter vs syscalls under tracing, either with ptrace or with seccomp SECCOMP_RET_TRACE, to validate the expected behavioral states. It seems that what's happened is that the SIGSYS has suddenly become non-killing:
# RUN TRACE_syscall.ptrace.kill_after ... # seccomp_bpf.c:1555:kill_after:Expected WSTOPSIG(status) & 0x80 (0) == 0x80 (128) # seccomp_bpf.c:1556:kill_after:WSTOPSIG: 31 # kill_after: Test exited normally instead of by signal (code: 12) # FAIL TRACE_syscall.ptrace.kill_after
i.e. the ptracer no longer sees a dead tracee, which would pass through here:
if (WIFSIGNALED(status) || WIFEXITED(status)) /* Child is dead. Time to go. */ return;
So the above saw a SIG_TRAP|SIGSYS rather than a killing SIGSYS. i.e. instead of WIFSIGNALED(stauts) being true, it instead catches a PTRACE_EVENT_STOP for SIGSYS, which should be impossible (the process should be getting killed).
Which being single threaded would seem to cause the kernel to execute the changed code.
How there kernel at that point is having the process exit with anything except SIGSYS I am not immediately seeing.
I've run out of time at the moment to debug further, but I've appended my changes to the test, and a brute-force change to kernel/seccomp.c to restore original behavior (though I haven't tested if coredumping works still). I'll return to this in a few hours...
The logic is the same as that for SECCOMP_RET_TRAP is there a test for that, that is also failing?
How do you run that test anyway?
cd tools/testing/selftests/seccomp make seccomp_bpf scp seccomp_bpf target: ssh target ./seccomp_bpf
diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 4d8f44a17727..b6c8c8f8bd69 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -1269,10 +1269,12 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd, syscall_rollback(current, current_pt_regs()); /* Trigger a coredump with SIGSYS */ force_sig_seccomp(this_syscall, data, true); - } else { - do_exit(SIGSYS); + do_group_exit(SIGSYS); } - return -1; /* skip the syscall go directly to signal handling */ + if (action == SECCOMP_RET_KILL_THREAD) + do_exit(SIGSYS); + else + do_group_exit(SIGSYS); }
unreachable(); diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c index 1d64891e6492..8f8c1df885d6 100644 --- a/tools/testing/selftests/seccomp/seccomp_bpf.c +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c @@ -1487,7 +1487,7 @@ TEST_F(precedence, log_is_fifth_in_any_order) #define PTRACE_EVENT_SECCOMP 7 #endif
-#define IS_SECCOMP_EVENT(status) ((status >> 16) == PTRACE_EVENT_SECCOMP) +#define PTRACE_EVENT_MASK(status) ((status) >> 16) bool tracer_running; void tracer_stop(int sig) { @@ -1536,17 +1536,34 @@ void start_tracer(struct __test_metadata *_metadata, int fd, pid_t tracee, /* Run until we're shut down. Must assert to stop execution. */ while (tracer_running) { int status; + bool run_callback = true;
if (wait(&status) != tracee) continue; + if (WIFSIGNALED(status) || WIFEXITED(status)) /* Child is dead. Time to go. */ return;
- /* Check if this is a seccomp event. */ - ASSERT_EQ(!ptrace_syscall, IS_SECCOMP_EVENT(status)); + /* Check if we got an expected event. */ + ASSERT_EQ(WIFCONTINUED(status), false); + ASSERT_EQ(WIFSTOPPED(status), true); + ASSERT_EQ(WSTOPSIG(status) & SIGTRAP, SIGTRAP) { + TH_LOG("WSTOPSIG: %d", WSTOPSIG(status)); + } + if (ptrace_syscall) { + EXPECT_EQ(WSTOPSIG(status) & 0x80, 0x80) { + TH_LOG("WSTOPSIG: %d", WSTOPSIG(status)); + run_callback = false; + }; + } else { + EXPECT_EQ(PTRACE_EVENT_MASK(status), PTRACE_EVENT_SECCOMP) { + run_callback = false; + }; + }
- tracer_func(_metadata, tracee, status, args); + if (run_callback) + tracer_func(_metadata, tracee, status, args);
ret = ptrace(ptrace_syscall ? PTRACE_SYSCALL : PTRACE_CONT, tracee, NULL, 0);
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
The following sub-tests are failing in seccomp_bpf selftest:
18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed. 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
I did some bisecting and found that the failures started to happen with:
307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
Not sure if the test needs to be fixed after this commit, or if the commit is actually introducing an issue. I'll investigate more, unless someone knows already what's going on.
Ah thanks for noticing; I will investigate...
I just did a quick read through of the test and while I don't understand everything having a failure seems very weird.
I don't understand the comment: /* Tracer will redirect getpid to getppid, and we should die. */
As I think what happens is it the bpf programs loads the signal number. Tests to see if the signal number if GETPPID and allows that system call and causes any other system call to be terminated.
The test suite runs a series of seccomp filter vs syscalls under tracing, either with ptrace or with seccomp SECCOMP_RET_TRACE, to validate the expected behavioral states. It seems that what's happened is that the SIGSYS has suddenly become non-killing:
# RUN TRACE_syscall.ptrace.kill_after ... # seccomp_bpf.c:1555:kill_after:Expected WSTOPSIG(status) & 0x80 (0) == 0x80 (128) # seccomp_bpf.c:1556:kill_after:WSTOPSIG: 31 # kill_after: Test exited normally instead of by signal (code: 12) # FAIL TRACE_syscall.ptrace.kill_after
i.e. the ptracer no longer sees a dead tracee, which would pass through here:
if (WIFSIGNALED(status) || WIFEXITED(status)) /* Child is dead. Time to go. */ return;
So the above saw a SIG_TRAP|SIGSYS rather than a killing SIGSYS. i.e. instead of WIFSIGNALED(stauts) being true, it instead catches a PTRACE_EVENT_STOP for SIGSYS, which should be impossible (the process should be getting killed).
Oh. This is being ptraced as part of the test?
Yes. The signal started being delivered. As far as that goes that sounds correct.
Ptrace is allowed to intercept even fatal signals. Everything except SIGKILL.
Is this a condition we don't want even ptrace to be able to catch?
I think we can arrange it so that even ptrace can't intercept this signal. I need to sit this problem on the back burner for a few minutes. It is an angle I had not considered.
Is it a problem that the debugger can see the signal if the process does not?
Eric
On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
The following sub-tests are failing in seccomp_bpf selftest:
18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed. 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
I did some bisecting and found that the failures started to happen with:
307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
Not sure if the test needs to be fixed after this commit, or if the commit is actually introducing an issue. I'll investigate more, unless someone knows already what's going on.
Ah thanks for noticing; I will investigate...
I just did a quick read through of the test and while I don't understand everything having a failure seems very weird.
I don't understand the comment: /* Tracer will redirect getpid to getppid, and we should die. */
As I think what happens is it the bpf programs loads the signal number. Tests to see if the signal number if GETPPID and allows that system call and causes any other system call to be terminated.
The test suite runs a series of seccomp filter vs syscalls under tracing, either with ptrace or with seccomp SECCOMP_RET_TRACE, to validate the expected behavioral states. It seems that what's happened is that the SIGSYS has suddenly become non-killing:
# RUN TRACE_syscall.ptrace.kill_after ... # seccomp_bpf.c:1555:kill_after:Expected WSTOPSIG(status) & 0x80 (0) == 0x80 (128) # seccomp_bpf.c:1556:kill_after:WSTOPSIG: 31 # kill_after: Test exited normally instead of by signal (code: 12) # FAIL TRACE_syscall.ptrace.kill_after
i.e. the ptracer no longer sees a dead tracee, which would pass through here:
if (WIFSIGNALED(status) || WIFEXITED(status)) /* Child is dead. Time to go. */ return;
So the above saw a SIG_TRAP|SIGSYS rather than a killing SIGSYS. i.e. instead of WIFSIGNALED(stauts) being true, it instead catches a PTRACE_EVENT_STOP for SIGSYS, which should be impossible (the process should be getting killed).
Oh. This is being ptraced as part of the test?
Yes. The signal started being delivered. As far as that goes that sounds correct.
Ptrace is allowed to intercept even fatal signals. Everything except SIGKILL.
Is this a condition we don't want even ptrace to be able to catch?
I think we can arrange it so that even ptrace can't intercept this signal. I need to sit this problem on the back burner for a few minutes. It is an angle I had not considered.
Is it a problem that the debugger can see the signal if the process does not?
Right, I'm trying to understand that too. However, my neighbor just lost power. :|
What I was in the middle of checking was what ptrace "sees" going through a fatal SIGSYS; my initial debugging attempts were weird.
-Kees
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
The following sub-tests are failing in seccomp_bpf selftest:
18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1) 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after ... 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0) 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0) 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after ... 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed. 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
I did some bisecting and found that the failures started to happen with:
307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
Not sure if the test needs to be fixed after this commit, or if the commit is actually introducing an issue. I'll investigate more, unless someone knows already what's going on.
Ah thanks for noticing; I will investigate...
I just did a quick read through of the test and while I don't understand everything having a failure seems very weird.
I don't understand the comment: /* Tracer will redirect getpid to getppid, and we should die. */
As I think what happens is it the bpf programs loads the signal number. Tests to see if the signal number if GETPPID and allows that system call and causes any other system call to be terminated.
The test suite runs a series of seccomp filter vs syscalls under tracing, either with ptrace or with seccomp SECCOMP_RET_TRACE, to validate the expected behavioral states. It seems that what's happened is that the SIGSYS has suddenly become non-killing:
# RUN TRACE_syscall.ptrace.kill_after ... # seccomp_bpf.c:1555:kill_after:Expected WSTOPSIG(status) & 0x80 (0) == 0x80 (128) # seccomp_bpf.c:1556:kill_after:WSTOPSIG: 31 # kill_after: Test exited normally instead of by signal (code: 12) # FAIL TRACE_syscall.ptrace.kill_after
i.e. the ptracer no longer sees a dead tracee, which would pass through here:
if (WIFSIGNALED(status) || WIFEXITED(status)) /* Child is dead. Time to go. */ return;
So the above saw a SIG_TRAP|SIGSYS rather than a killing SIGSYS. i.e. instead of WIFSIGNALED(stauts) being true, it instead catches a PTRACE_EVENT_STOP for SIGSYS, which should be impossible (the process should be getting killed).
Oh. This is being ptraced as part of the test?
Yes. The signal started being delivered. As far as that goes that sounds correct.
Ptrace is allowed to intercept even fatal signals. Everything except SIGKILL.
Is this a condition we don't want even ptrace to be able to catch?
I think we can arrange it so that even ptrace can't intercept this signal. I need to sit this problem on the back burner for a few minutes. It is an angle I had not considered.
Is it a problem that the debugger can see the signal if the process does not?
Right, I'm trying to understand that too. However, my neighbor just lost power. :|
What I was in the middle of checking was what ptrace "sees" going through a fatal SIGSYS; my initial debugging attempts were weird.
If we don't allow ptrace to see these signals, then it makes it possible for complete_signal to short circuit deliver them and ignore ptrace later on. Which seems nice, and allows for not needing to change sigaction at all in the future.
I don't know if it is strictly necessary. It is not like people using debuggers have complained yet.
I just posted a patch that solves this by setting an extra flag called SA_IMMUTABLE and disabling sigaction and ptrace when the flag is set.
I think that is a simple patch that sorts this out.
Eric
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
Is it a problem that the debugger can see the signal if the process does not?
Right, I'm trying to understand that too. However, my neighbor just lost power. :|
What I was in the middle of checking was what ptrace "sees" going through a fatal SIGSYS; my initial debugging attempts were weird.
Kees have you regained power and had a chance to see my SA_IMMUTABLE patch?
Does what I implemented seem like it will work for you?
I think it is a solid and simple solution to a pair of problems with my change to use the ordinary coredump path for seccomp. But I would very much love to hear it seems reasonable to you, as you were looking at the problem as well.
Eric
On Tue, Nov 02, 2021 at 01:22:19PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
Is it a problem that the debugger can see the signal if the process does not?
Right, I'm trying to understand that too. However, my neighbor just lost power. :|
What I was in the middle of checking was what ptrace "sees" going through a fatal SIGSYS; my initial debugging attempts were weird.
Kees have you regained power and had a chance to see my SA_IMMUTABLE patch?
Apologies; I got busy with other stuff, but I've tested this now. It's happy and I see the expected behaviors again. Note that I used the patch with this change:
-#define SA_IMMUTABLE 0x008000000 +#define SA_IMMUTABLE 0x00800000
Tested-by: Kees Cook keescook@chromium.org
Thanks!
-Kees
Kees Cook keescook@chromium.org writes:
On Tue, Nov 02, 2021 at 01:22:19PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
Kees Cook keescook@chromium.org writes:
On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
Is it a problem that the debugger can see the signal if the process does not?
Right, I'm trying to understand that too. However, my neighbor just lost power. :|
What I was in the middle of checking was what ptrace "sees" going through a fatal SIGSYS; my initial debugging attempts were weird.
Kees have you regained power and had a chance to see my SA_IMMUTABLE patch?
Apologies; I got busy with other stuff, but I've tested this now. It's happy and I see the expected behaviors again. Note that I used the patch with this change:
-#define SA_IMMUTABLE 0x008000000 +#define SA_IMMUTABLE 0x00800000
Tested-by: Kees Cook keescook@chromium.org
Thanks.
And I see you have written a test as well that should keep this kind of bug from happening again.
Eric
As Andy pointed out that there are races between force_sig_info_to_task and sigaction[1] when force_sig_info_task. As Kees discovered[2] ptrace is also able to change these signals.
In the case of seeccomp killing a process with a signal it is a security violation to allow the signal to be caught or manipulated.
Solve this problem by introducing a new flag SA_IMMUTABLE that prevents sigaction and ptrace from modifying these forced signals. This flag is carefully made kernel internal so that no new ABI is introduced.
Longer term I think this can be solved by guaranteeing short circuit delivery of signals in this case. Unfortunately reliable and guaranteed short circuit delivery of these signals is still a ways off from being implemented, tested, and merged. So I have implemented a much simpler alternative for now.
[1] https://lkml.kernel.org/r/b5d52d25-7bde-4030-a7b1-7c6f8ab90660@www.fastmail.... [2] https://lkml.kernel.org/r/202110281136.5CE65399A7@keescook Cc: stable@vger.kernel.org Fixes: 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com ---
I have tested this patch and this changed works for me to fix the issue.
I believe this closes all of the races that force_sig_info_to_task has when sigdfl is specified. So this should be enough for anything that needs a guaranteed that userspace can not race with the kernel is handled.
Can folks look this over and see if I missed something? Thank you, Eric
include/linux/signal_types.h | 3 +++ include/uapi/asm-generic/signal-defs.h | 1 + kernel/signal.c | 8 +++++++- 3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/include/linux/signal_types.h b/include/linux/signal_types.h index 34cb28b8f16c..927f7c0e5bff 100644 --- a/include/linux/signal_types.h +++ b/include/linux/signal_types.h @@ -70,6 +70,9 @@ struct ksignal { int sig; };
+/* Used to kill the race between sigaction and forced signals */ +#define SA_IMMUTABLE 0x008000000 + #ifndef __ARCH_UAPI_SA_FLAGS #ifdef SA_RESTORER #define __ARCH_UAPI_SA_FLAGS SA_RESTORER diff --git a/include/uapi/asm-generic/signal-defs.h b/include/uapi/asm-generic/signal-defs.h index fe929e7b77ca..7572f2f46ee8 100644 --- a/include/uapi/asm-generic/signal-defs.h +++ b/include/uapi/asm-generic/signal-defs.h @@ -45,6 +45,7 @@ #define SA_UNSUPPORTED 0x00000400 #define SA_EXPOSE_TAGBITS 0x00000800 /* 0x00010000 used on mips */ +/* 0x00800000 used for internal SA_IMMUTABLE */ /* 0x01000000 used on x86 */ /* 0x02000000 used on x86 */ /* diff --git a/kernel/signal.c b/kernel/signal.c index 6a5e1802b9a2..056a107e3cbc 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1336,6 +1336,7 @@ force_sig_info_to_task(struct kernel_siginfo *info, struct task_struct *t, bool blocked = sigismember(&t->blocked, sig); if (blocked || ignored || sigdfl) { action->sa.sa_handler = SIG_DFL; + action->sa.sa_flags |= SA_IMMUTABLE; if (blocked) { sigdelset(&t->blocked, sig); recalc_sigpending_and_wake(t); @@ -2760,7 +2761,8 @@ bool get_signal(struct ksignal *ksig) if (!signr) break; /* will return 0 */
- if (unlikely(current->ptrace) && signr != SIGKILL) { + if (unlikely(current->ptrace) && (signr != SIGKILL) && + !(sighand->action[signr -1].sa.sa_flags & SA_IMMUTABLE)) { signr = ptrace_signal(signr, &ksig->info); if (!signr) continue; @@ -4110,6 +4112,10 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact) k = &p->sighand->action[sig-1];
spin_lock_irq(&p->sighand->siglock); + if (k->sa.sa_flags & SA_IMMUTABLE) { + spin_unlock_irq(&p->sighand->siglock); + return -EINVAL; + } if (oact) *oact = *k;
On Fri, Oct 29, 2021 at 10:09:04AM -0500, Eric W. Biederman wrote:
As Andy pointed out that there are races between force_sig_info_to_task and sigaction[1] when force_sig_info_task. As Kees discovered[2] ptrace is also able to change these signals.
In the case of seeccomp killing a process with a signal it is a security violation to allow the signal to be caught or manipulated.
Solve this problem by introducing a new flag SA_IMMUTABLE that prevents sigaction and ptrace from modifying these forced signals. This flag is carefully made kernel internal so that no new ABI is introduced.
Longer term I think this can be solved by guaranteeing short circuit delivery of signals in this case. Unfortunately reliable and guaranteed short circuit delivery of these signals is still a ways off from being implemented, tested, and merged. So I have implemented a much simpler alternative for now.
[1] https://lkml.kernel.org/r/b5d52d25-7bde-4030-a7b1-7c6f8ab90660@www.fastmail.... [2] https://lkml.kernel.org/r/202110281136.5CE65399A7@keescook Cc: stable@vger.kernel.org Fixes: 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com
FWIW I've tested this patch and I confirm that it fixes the failure that I reported with the seccomp_bpf selftest.
Tested-by: Andrea Righi andrea.righi@canonical.com
Thanks! -Andrea
I have tested this patch and this changed works for me to fix the issue.
I believe this closes all of the races that force_sig_info_to_task has when sigdfl is specified. So this should be enough for anything that needs a guaranteed that userspace can not race with the kernel is handled.
Can folks look this over and see if I missed something? Thank you, Eric
include/linux/signal_types.h | 3 +++ include/uapi/asm-generic/signal-defs.h | 1 + kernel/signal.c | 8 +++++++- 3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/include/linux/signal_types.h b/include/linux/signal_types.h index 34cb28b8f16c..927f7c0e5bff 100644 --- a/include/linux/signal_types.h +++ b/include/linux/signal_types.h @@ -70,6 +70,9 @@ struct ksignal { int sig; }; +/* Used to kill the race between sigaction and forced signals */ +#define SA_IMMUTABLE 0x008000000
#ifndef __ARCH_UAPI_SA_FLAGS #ifdef SA_RESTORER #define __ARCH_UAPI_SA_FLAGS SA_RESTORER diff --git a/include/uapi/asm-generic/signal-defs.h b/include/uapi/asm-generic/signal-defs.h index fe929e7b77ca..7572f2f46ee8 100644 --- a/include/uapi/asm-generic/signal-defs.h +++ b/include/uapi/asm-generic/signal-defs.h @@ -45,6 +45,7 @@ #define SA_UNSUPPORTED 0x00000400 #define SA_EXPOSE_TAGBITS 0x00000800 /* 0x00010000 used on mips */ +/* 0x00800000 used for internal SA_IMMUTABLE */ /* 0x01000000 used on x86 */ /* 0x02000000 used on x86 */ /* diff --git a/kernel/signal.c b/kernel/signal.c index 6a5e1802b9a2..056a107e3cbc 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1336,6 +1336,7 @@ force_sig_info_to_task(struct kernel_siginfo *info, struct task_struct *t, bool blocked = sigismember(&t->blocked, sig); if (blocked || ignored || sigdfl) { action->sa.sa_handler = SIG_DFL;
if (blocked) { sigdelset(&t->blocked, sig); recalc_sigpending_and_wake(t);action->sa.sa_flags |= SA_IMMUTABLE;
@@ -2760,7 +2761,8 @@ bool get_signal(struct ksignal *ksig) if (!signr) break; /* will return 0 */
if (unlikely(current->ptrace) && signr != SIGKILL) {
if (unlikely(current->ptrace) && (signr != SIGKILL) &&
!(sighand->action[signr -1].sa.sa_flags & SA_IMMUTABLE)) { signr = ptrace_signal(signr, &ksig->info); if (!signr) continue;
@@ -4110,6 +4112,10 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact) k = &p->sighand->action[sig-1]; spin_lock_irq(&p->sighand->siglock);
- if (k->sa.sa_flags & SA_IMMUTABLE) {
spin_unlock_irq(&p->sighand->siglock);
return -EINVAL;
- } if (oact) *oact = *k;
2.20.1
Andrea Righi andrea.righi@canonical.com writes:
On Fri, Oct 29, 2021 at 10:09:04AM -0500, Eric W. Biederman wrote:
As Andy pointed out that there are races between force_sig_info_to_task and sigaction[1] when force_sig_info_task. As Kees discovered[2] ptrace is also able to change these signals.
In the case of seeccomp killing a process with a signal it is a security violation to allow the signal to be caught or manipulated.
Solve this problem by introducing a new flag SA_IMMUTABLE that prevents sigaction and ptrace from modifying these forced signals. This flag is carefully made kernel internal so that no new ABI is introduced.
Longer term I think this can be solved by guaranteeing short circuit delivery of signals in this case. Unfortunately reliable and guaranteed short circuit delivery of these signals is still a ways off from being implemented, tested, and merged. So I have implemented a much simpler alternative for now.
[1] https://lkml.kernel.org/r/b5d52d25-7bde-4030-a7b1-7c6f8ab90660@www.fastmail.... [2] https://lkml.kernel.org/r/202110281136.5CE65399A7@keescook Cc: stable@vger.kernel.org Fixes: 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com
FWIW I've tested this patch and I confirm that it fixes the failure that I reported with the seccomp_bpf selftest.
Tested-by: Andrea Righi andrea.righi@canonical.com
Sigh. Except for the extra 0 in the definition of SA_IMMUTABLE that caused it to conflict with the x86 specific signal numbers.
Eric
linux-kselftest-mirror@lists.linaro.org