This patch series enables secure computing (system call filtering) on arm64, and contains related enhancements and bug fixes.
This code was tested on the ARMv8 fast model with 64-bit/32-bit userspace using
* libseccomp v2.1.1 with modifications for arm64, especially its "live" tests: No.20, 21 and 24.
* a modified version of Kees' seccomp test for 'changing/skipping a syscall' and the seccomp() system call
* in-house tests for 'changing/skipping a system call' in tracing with ptrace(PTRACE_SYSCALL) (that is, without seccomp), with and without audit tracing.
Changes v5 -> v6:
* rebased to v3.17-rc
* changed the interface of changing/skipping a system call from re-writing x8 register [v5 1/3] to using dedicated PTRACE_SET_SYSCALL command [1/6, 2/6]
  Patch [1/6] contains a checkpatch error around a switch statement, but it won't be fixed as in compat_arch_ptrace().
* added a new system call, seccomp(), for compat task [4/6]
* added SIGSYS siginfo for compat task [5/6]
* changed to always execute audit exit tracing to avoid OOPs [2/6, 6/6]

Changes v4 -> v5:
* rebased to v3.16-rc
* add patch [1/3] to allow ptrace to change a system call (please note that this patch should be applied even without seccomp.)

Changes v3 -> v4:
* removed the following patch and moved it to "arm64: prerequisites for audit and ftrace" patchset since it is required for audit and ftrace in case of !COMPAT, too.
  "arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h"

Changes v2 -> v3:
* removed unnecessary 'type cast' operations [2/3]
* check for a return value (-1) of secure_computing() explicitly [2/3]
* aligned with the patch, "arm64: split syscall_trace() into separate functions for enter/exit" [2/3]
* changed default of CONFIG_SECCOMP to n [2/3]

Changes v1 -> v2:
* added generic seccomp.h for arm64 to utilize it [1,2/3]
* changed syscall_trace() to return more meaningful value (-EPERM) on seccomp failure case [2/3]
* aligned with the change in "arm64: make a single hook to syscall_trace() for all syscall features" v2 [2/3]
* removed is_compat_task() definition from compat.h [3/3]
AKASHI Takahiro (6):
  arm64: ptrace: add PTRACE_SET_SYSCALL
  arm64: ptrace: allow tracer to skip a system call
  asm-generic: add generic seccomp.h for secure computing mode 1
  arm64: add seccomp syscall for compat task
  arm64: add SIGSYS siginfo for compat task
  arm64: add seccomp support

 arch/arm64/Kconfig                   | 14 ++++++++++++
 arch/arm64/include/asm/compat.h      |  7 ++++++
 arch/arm64/include/asm/ptrace.h      |  9 ++++++++
 arch/arm64/include/asm/seccomp.h     | 25 ++++++++++++++++++++++
 arch/arm64/include/asm/unistd.h      |  5 ++++-
 arch/arm64/include/asm/unistd32.h    |  3 +++
 arch/arm64/include/uapi/asm/ptrace.h |  1 +
 arch/arm64/kernel/entry.S            |  6 ++++++
 arch/arm64/kernel/ptrace.c           | 39 +++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/signal32.c         |  8 +++++++
 include/asm-generic/seccomp.h        | 28 ++++++++++++++++++++++++
 11 files changed, 143 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm64/include/asm/seccomp.h
 create mode 100644 include/asm-generic/seccomp.h
To allow a tracer to change or skip a system call by rewriting the syscall number, there are several approaches:

(1) modify the x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or
(2) support ptrace(PTRACE_SET_SYSCALL) as on arm

Since user_pt_regs doesn't expose 'syscallno' to the tracer, and since secure_computing() expects a changed syscall number (especially -1) to be visible before it returns in syscall_trace_enter(), we'd better take (2).
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/include/uapi/asm/ptrace.h |  1 +
 arch/arm64/kernel/ptrace.c           | 14 +++++++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index 6913643..49c6174 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -23,6 +23,7 @@

 #include <asm/hwcap.h>

+#define PTRACE_SET_SYSCALL	23

 /*
  * PSR bits
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 0310811..8876049 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1077,7 +1077,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task)
 long arch_ptrace(struct task_struct *child, long request,
 		 unsigned long addr, unsigned long data)
 {
-	return ptrace_request(child, request, addr, data);
+	int ret;
+
+	switch (request) {
+		case PTRACE_SET_SYSCALL:
+			task_pt_regs(child)->syscallno = data;
+			ret = 0;
+			break;
+		default:
+			ret = ptrace_request(child, request, addr, data);
+			break;
+	}
+
+	return ret;
 }

 enum ptrace_syscall_dir {
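The dispatch added to arch_ptrace() can be tried out in user space with a small model. This is only a sketch under stated assumptions: struct model_pt_regs, model_ptrace_request() and model_arch_ptrace() are hypothetical stand-ins, not kernel interfaces, and the fallback simply returns -EIO because the model implements no generic requests. It shows the point made in the commit message: the request changes syscallno only and leaves x8 untouched.

```c
#include <errno.h>
#include <stdint.h>

/* Request number taken from the patch. */
#define PTRACE_SET_SYSCALL 23

/* Hypothetical, cut-down stand-in for the kernel's struct pt_regs. */
struct model_pt_regs {
	uint64_t regs[31];
	uint64_t syscallno;
};

/* Hypothetical stand-in for the generic ptrace_request() fallback. */
long model_ptrace_request(struct model_pt_regs *child, long request,
			  unsigned long addr, unsigned long data)
{
	(void)child;
	(void)request;
	(void)addr;
	(void)data;
	return -EIO;	/* the model implements no generic requests */
}

/* Mirrors the switch that the patch adds to arch_ptrace(). */
long model_arch_ptrace(struct model_pt_regs *child, long request,
		       unsigned long addr, unsigned long data)
{
	long ret;

	switch (request) {
	case PTRACE_SET_SYSCALL:
		/* Only syscallno changes; x8 keeps its old value. */
		child->syscallno = data;
		ret = 0;
		break;
	default:
		ret = model_ptrace_request(child, request, addr, data);
		break;
	}

	return ret;
}
```

In the model, a tracer that wants to skip the call passes -1 as data; any other request number falls through to the generic handler.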
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
Thanks, I like having this on both arm and arm64. I wonder if other archs should add this option too.
Reviewed-by: Kees Cook keescook@chromium.org
On 08/22/2014 01:47 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
Thanks, I like having this on both arm and arm64.
Yeah, having this simplified the code of syscall_trace_enter() a bit, but also imposes some restriction on arm64, too.
I wonder if other archs should add this option too.
Do you think so? I assumed that SET_SYSCALL is to be avoided if possible.
I also think that SET_SYSCALL should take an extra argument for a return value just in case of -1 (or we have SKIP_SYSCALL?).
-Takahiro AKASHI
Reviewed-by: Kees Cook keescook@chromium.org
On Fri, Aug 22, 2014 at 01:19:13AM +0100, AKASHI Takahiro wrote:
On 08/22/2014 01:47 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
Thanks, I like having this on both arm and arm64.
Yeah, having this simplified the code of syscall_trace_enter() a bit, but also imposes some restriction on arm64, too.
I wonder if other archs should add this option too.
Do you think so? I assumed that SET_SYSCALL is to be avoided if possible.
I also think that SET_SYSCALL should take an extra argument for a return value just in case of -1 (or we have SKIP_SYSCALL?).
I think we should propose this as a new request in the generic ptrace code. We can have an architecture-hook for actually setting the syscall, and allow architectures to define their own implementation of the request so they can be moved over one by one.
Will
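One way to picture the suggestion above (a generic request backed by an architecture hook) is the following user-space sketch. Everything in it is an assumption made for illustration: MODEL_PTRACE_SET_SYSCALL, struct model_arch_ops and the function names do not exist in the kernel, and the request number is arbitrary. The idea it shows is that the generic code owns the request while each architecture supplies its own setter, so architectures can be converted one by one.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical request number, for illustration only. */
#define MODEL_PTRACE_SET_SYSCALL 0x5000

/* Hypothetical, minimal task state. */
struct model_task {
	long syscallno;
};

/* Hypothetical per-architecture hook table. */
struct model_arch_ops {
	int (*set_syscall)(struct model_task *task, long nr);
};

/* One architecture's implementation of the hook. */
int model_arm64_set_syscall(struct model_task *task, long nr)
{
	task->syscallno = nr;
	return 0;
}

/* Generic side: handles the request and defers to the hook. */
long model_generic_ptrace(const struct model_arch_ops *ops,
			  struct model_task *task, long request,
			  unsigned long data)
{
	switch (request) {
	case MODEL_PTRACE_SET_SYSCALL:
		if (ops == NULL || ops->set_syscall == NULL)
			return -EIO;	/* architecture not converted yet */
		return ops->set_syscall(task, (long)data);
	default:
		return -EIO;
	}
}
```

An architecture that has not provided the hook keeps failing the request in this model, which is what lets the conversion happen incrementally.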
Kees,
On 08/27/2014 02:46 AM, Will Deacon wrote:
On Fri, Aug 22, 2014 at 01:19:13AM +0100, AKASHI Takahiro wrote:
On 08/22/2014 01:47 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
Thanks, I like having this on both arm and arm64.
Yeah, having this simplified the code of syscall_trace_enter() a bit, but also imposes some restriction on arm64, too.
I wonder if other archs should add this option too.
Do you think so? I assumed that SET_SYSCALL is to be avoided if possible.
I also think that SET_SYSCALL should take an extra argument for a return value just in case of -1 (or we have SKIP_SYSCALL?).
I think we should propose this as a new request in the generic ptrace code. We can have an architecture-hook for actually setting the syscall, and allow architectures to define their own implementation of the request so they can be moved over one by one.
What do you think about this request?
-Takahiro AKASHI
Will
On Tue, Aug 26, 2014 at 10:32 PM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
Kees,
On 08/27/2014 02:46 AM, Will Deacon wrote:
On Fri, Aug 22, 2014 at 01:19:13AM +0100, AKASHI Takahiro wrote:
On 08/22/2014 01:47 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
Thanks, I like having this on both arm and arm64.
Yeah, having this simplified the code of syscall_trace_enter() a bit, but also imposes some restriction on arm64, too.
I wonder if other archs should add this option too.
Do you think so? I assumed that SET_SYSCALL is to be avoided if possible.
I also think that SET_SYSCALL should take an extra argument for a return value just in case of -1 (or we have SKIP_SYSCALL?).
I think we should propose this as a new request in the generic ptrace code. We can have an architecture-hook for actually setting the syscall, and allow architectures to define their own implementation of the request so they can be moved over one by one.
What do you think about this request?
That sounds fine -- it doesn't need to be part of this series. I was just noticing this was a common issue across multiple architectures.
-Kees
If the tracer specifies -1 as the syscall number, the traced system call should be skipped and the value in x0 used as its return value. This patch implements these semantics, with one restriction:

    when syscall(-1) is issued by the user, the tracer cannot skip this system call and modify its return value at syscall entry.

Lifting this restriction would require treating whatever value is in x0 as the return value, but that could return a bogus value, especially when the tracer does nothing at this syscall. So we always return ENOSYS instead; the tracer still has another chance to change the return value at syscall exit.

Please also note:
* syscall entry tracing and syscall exit tracing (ftrace tracepoint and audit) are always executed, if enabled, even when a system call is skipped (that is, -1). This avoids a potential bug where audit_syscall_entry() is called although audit_syscall_exit() was never called for the previous system call, which would cause an oops in audit_syscall_entry().
* syscallno may also be set to -1 if a fatal signal (SIGKILL) is detected in tracehook_report_syscall_entry(), but since the value set in x0 (ENOSYS) is not used in this case, we can ignore it.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/include/asm/ptrace.h |  8 ++++++++
 arch/arm64/kernel/entry.S       |  4 ++++
 arch/arm64/kernel/ptrace.c      | 20 ++++++++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 501000f..a58cf62 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -65,6 +65,14 @@
 #define COMPAT_PT_TEXT_ADDR		0x10000
 #define COMPAT_PT_DATA_ADDR		0x10004
 #define COMPAT_PT_TEXT_END_ADDR	0x10008
+
+/*
+ * used to skip a system call when tracer changes its number to -1
+ * with ptrace(PTRACE_SET_SYSCALL)
+ */
+#define RET_SKIP_SYSCALL	-1
+#define IS_SKIP_SYSCALL(no)	((int)(no & 0xffffffff) == -1)
+
 #ifndef __ASSEMBLY__

 /* sizeof(struct user) for AArch32 */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f0b5e51..fdd6eae 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -25,6 +25,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/errno.h>
 #include <asm/esr.h>
+#include <asm/ptrace.h>
 #include <asm/thread_info.h>
 #include <asm/unistd.h>

@@ -671,6 +672,8 @@ ENDPROC(el0_svc)
 __sys_trace:
 	mov	x0, sp
 	bl	syscall_trace_enter
+	cmp	w0, #RET_SKIP_SYSCALL		// skip syscall?
+	b.eq	__sys_trace_return_skipped
 	adr	lr, __sys_trace_return		// return address
 	uxtw	scno, w0			// syscall number (possibly new)
 	mov	x1, sp				// pointer to regs
@@ -685,6 +688,7 @@ __sys_trace:

 __sys_trace_return:
 	str	x0, [sp]			// save returned x0
+__sys_trace_return_skipped:			// x0 already in regs[0]
 	mov	x0, sp
 	bl	syscall_trace_exit
 	b	ret_to_user
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 8876049..c54dbcc 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1121,9 +1121,29 @@ static void tracehook_report_syscall(struct pt_regs *regs,

 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
+	unsigned int saved_syscallno = regs->syscallno;
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);

+	if (IS_SKIP_SYSCALL(regs->syscallno)) {
+		/*
+		 * RESTRICTION: we can't modify a return value of user
+		 * issued syscall(-1) here. In order to ease this flavor,
+		 * we need to treat whatever value in x0 as a return value,
+		 * but this might result in a bogus value being returned.
+		 */
+		/*
+		 * NOTE: syscallno may also be set to -1 if fatal signal is
+		 * detected in tracehook_report_syscall_entry(), but since
+		 * a value set to x0 here is not used in this case, we may
+		 * neglect the case.
+		 */
+		if (!test_thread_flag(TIF_SYSCALL_TRACE) ||
+				(IS_SKIP_SYSCALL(saved_syscallno)))
+			regs->regs[0] = -ENOSYS;
+	}
+
 	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
 		trace_sys_enter(regs, regs->syscallno);

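The decision made in syscall_trace_enter() can be exercised outside the kernel with a small model. This is a sketch under assumptions, not the kernel code: struct model_regs, model_tracer_fn and both functions are hypothetical, the boolean "traced" stands in for TIF_SYSCALL_TRACE, and the macro differs from the patch only in parenthesising its argument. It shows when x0 is forced to -ENOSYS and when a value chosen by the tracer survives.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* As in the patch, with the macro argument parenthesised. */
#define IS_SKIP_SYSCALL(no)	((int)((no) & 0xffffffff) == -1)

/* Hypothetical, cut-down register state. */
struct model_regs {
	uint64_t x0;
	uint64_t syscallno;
};

/* Hypothetical callback standing in for what a tracer does at entry. */
typedef void (*model_tracer_fn)(struct model_regs *regs);

/* Example tracer: skip the call and make it "return" 42. */
void model_tracer_skip_with_42(struct model_regs *regs)
{
	regs->syscallno = (uint64_t)-1;
	regs->x0 = 42;
}

/* Mirrors the skip handling added to syscall_trace_enter(). */
int model_syscall_trace_enter(struct model_regs *regs, bool traced,
			      model_tracer_fn tracer)
{
	unsigned int saved_syscallno = (unsigned int)regs->syscallno;

	if (traced && tracer != NULL)
		tracer(regs);

	if (IS_SKIP_SYSCALL(regs->syscallno)) {
		/*
		 * Force -ENOSYS unless a tracer turned a valid number
		 * into -1; only then is x0 trusted as the return value.
		 */
		if (!traced || IS_SKIP_SYSCALL(saved_syscallno))
			regs->x0 = (uint64_t)-ENOSYS;
	}

	return (int)regs->syscallno;
}
```

The third case in the commit message, a traced task that itself issues syscall(-1), is the restriction: the model overwrites x0 with -ENOSYS even though the tracer set 42.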
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
If tracer specifies -1 as a syscall number, this traced system call should be skipped with a value in x0 used as a return value. This patch enables this semantics, but there is a restriction here:
when syscall(-1) is issued by user, tracer cannot skip this system call and modify a return value at syscall entry.
In order to ease this flavor, we need to treat whatever value in x0 as a return value, but this might result in a bogus value being returned, especially when tracer doesn't do anything at this syscall. So we always return ENOSYS instead, while we have another chance to change a return value at syscall exit.
Please also note:
syscall entry tracing and syscall exit tracing (ftrace tracepoint and audit) are always executed, if enabled, even when skipping a system call (that is, -1). In this way, we can avoid a potential bug where audit_syscall_entry() might be called without audit_syscall_exit() at the previous system call being called, that would cause OOPs in audit_syscall_entry().
syscallno may also be set to -1 if a fatal signal (SIGKILL) is detected in tracehook_report_syscall_entry(), but since a value set to x0 (ENOSYS) is not used in this case, we may neglect the case.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
 arch/arm64/include/asm/ptrace.h |  8 ++++++++
 arch/arm64/kernel/entry.S       |  4 ++++
 arch/arm64/kernel/ptrace.c      | 20 ++++++++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 501000f..a58cf62 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -65,6 +65,14 @@
 #define COMPAT_PT_TEXT_ADDR		0x10000
 #define COMPAT_PT_DATA_ADDR		0x10004
 #define COMPAT_PT_TEXT_END_ADDR	0x10008
+
+/*
+ * used to skip a system call when tracer changes its number to -1
+ * with ptrace(PTRACE_SET_SYSCALL)
+ */
+#define RET_SKIP_SYSCALL	-1
+#define IS_SKIP_SYSCALL(no)	((int)(no & 0xffffffff) == -1)
+
 #ifndef __ASSEMBLY__

 /* sizeof(struct user) for AArch32 */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f0b5e51..fdd6eae 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -25,6 +25,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/errno.h>
 #include <asm/esr.h>
+#include <asm/ptrace.h>
 #include <asm/thread_info.h>
 #include <asm/unistd.h>

@@ -671,6 +672,8 @@ ENDPROC(el0_svc)
 __sys_trace:
 	mov	x0, sp
 	bl	syscall_trace_enter
+	cmp	w0, #RET_SKIP_SYSCALL		// skip syscall?
+	b.eq	__sys_trace_return_skipped
 	adr	lr, __sys_trace_return		// return address
 	uxtw	scno, w0			// syscall number (possibly new)
 	mov	x1, sp				// pointer to regs
@@ -685,6 +688,7 @@ __sys_trace:

 __sys_trace_return:
 	str	x0, [sp]			// save returned x0
+__sys_trace_return_skipped:			// x0 already in regs[0]
 	mov	x0, sp
 	bl	syscall_trace_exit
 	b	ret_to_user
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 8876049..c54dbcc 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1121,9 +1121,29 @@ static void tracehook_report_syscall(struct pt_regs *regs,

 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
+	unsigned int saved_syscallno = regs->syscallno;
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);

+	if (IS_SKIP_SYSCALL(regs->syscallno)) {
+		/*
+		 * RESTRICTION: we can't modify a return value of user
+		 * issued syscall(-1) here. In order to ease this flavor,
+		 * we need to treat whatever value in x0 as a return value,
+		 * but this might result in a bogus value being returned.
+		 */
+		/*
+		 * NOTE: syscallno may also be set to -1 if fatal signal is
+		 * detected in tracehook_report_syscall_entry(), but since
+		 * a value set to x0 here is not used in this case, we may
+		 * neglect the case.
+		 */
+		if (!test_thread_flag(TIF_SYSCALL_TRACE) ||
+				(IS_SKIP_SYSCALL(saved_syscallno)))
+			regs->regs[0] = -ENOSYS;
+	}
I don't have a runtime environment yet for arm64, so I can't test this directly myself, so I'm just trying to eyeball this. :)
Once the seccomp logic is added here, I don't think using -2 as a special value will work. Doesn't this mean the Oops is possible by the user issuing a "-2" syscall? As in, if TIF_SYSCALL_WORK is set, and the user passed -2 as the syscall, audit will be called only on entry, and then skipped on exit?
-Kees
if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) trace_sys_enter(regs, regs->syscallno);
-- 1.7.9.5
On 08/22/2014 02:08 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
If tracer specifies -1 as a syscall number, this traced system call should be skipped with a value in x0 used as a return value. This patch enables this semantics, but there is a restriction here:
when syscall(-1) is issued by user, tracer cannot skip this system call and modify a return value at syscall entry.
In order to ease this flavor, we need to treat whatever value in x0 as a return value, but this might result in a bogus value being returned, especially when tracer doesn't do anything at this syscall. So we always return ENOSYS instead, while we have another chance to change a return value at syscall exit.
Please also note:
syscall entry tracing and syscall exit tracing (ftrace tracepoint and audit) are always executed, if enabled, even when skipping a system call (that is, -1). In this way, we can avoid a potential bug where audit_syscall_entry() might be called without audit_syscall_exit() at the previous system call being called, that would cause OOPs in audit_syscall_entry().
syscallno may also be set to -1 if a fatal signal (SIGKILL) is detected in tracehook_report_syscall_entry(), but since a value set to x0 (ENOSYS) is not used in this case, we may neglect the case.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
I don't have a runtime environment yet for arm64, so I can't test this directly myself, so I'm just trying to eyeball this. :)
Once the seccomp logic is added here, I don't think using -2 as a special value will work. Doesn't this mean the Oops is possible by the user issuing a "-2" syscall? As in, if TIF_SYSCALL_WORK is set, and the user passed -2 as the syscall, audit will be called only on entry, and then skipped on exit?
Oops, you're absolutely right. I didn't think of this case. syscall_trace_enter() should not return a syscallno directly, but always return -1 if syscallno < 0. (except when secure_computing() returns with -1) This also implies that tracehook_report_syscall() should also have a return value.
Will, is this fine with you?
-Takahiro AKASHI
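The problem and the proposed fix can be illustrated with a small user-space model. It is only a sketch and rests on assumptions: MODEL_RET_SKIP_NOTRACE (-2) stands for the special return value that Kees's comment implies bypasses exit tracing once seccomp is added, struct model_audit is a pair of hypothetical counters standing in for audit entry/exit, and the secure_computing() exception mentioned above is left out. The model contrasts returning the raw syscall number with collapsing every negative number to -1.

```c
#define RET_SKIP_SYSCALL	(-1)
/* Assumed special value that bypasses exit tracing (see lead-in). */
#define MODEL_RET_SKIP_NOTRACE	(-2)

/* Hypothetical counters standing in for audit entry/exit calls. */
struct model_audit {
	int entries;
	int exits;
};

/* Behaviour under discussion: the raw syscall number is returned. */
int model_enter_raw(struct model_audit *audit, int syscallno)
{
	audit->entries++;
	return syscallno;
}

/* Proposed behaviour: every negative syscall number becomes -1. */
int model_enter_clamped(struct model_audit *audit, int syscallno)
{
	audit->entries++;
	return syscallno < 0 ? RET_SKIP_SYSCALL : syscallno;
}

/* Model of what the entry code does with the returned value. */
void model_after_enter(struct model_audit *audit, int enter_ret)
{
	if (enter_ret == MODEL_RET_SKIP_NOTRACE)
		return;		/* back to user space, no exit tracing */

	/*
	 * -1 skips the syscall body and any other value runs it;
	 * both paths reach exit tracing.
	 */
	audit->exits++;
}
```

With the raw variant, a user-issued syscall(-2) leaves one audit entry without a matching exit in this model; with the clamped variant the counts stay balanced.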
-Kees
if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) trace_sys_enter(regs, regs->syscallno);
-- 1.7.9.5
On Fri, Aug 22, 2014 at 01:35:17AM +0100, AKASHI Takahiro wrote:
On 08/22/2014 02:08 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 8876049..c54dbcc 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1121,9 +1121,29 @@ static void tracehook_report_syscall(struct pt_regs *regs,

 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
+	unsigned int saved_syscallno = regs->syscallno;
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);

+	if (IS_SKIP_SYSCALL(regs->syscallno)) {
+		/*
+		 * RESTRICTION: we can't modify a return value of user
+		 * issued syscall(-1) here. In order to ease this flavor,
+		 * we need to treat whatever value in x0 as a return value,
+		 * but this might result in a bogus value being returned.
+		 */
+		/*
+		 * NOTE: syscallno may also be set to -1 if fatal signal is
+		 * detected in tracehook_report_syscall_entry(), but since
+		 * a value set to x0 here is not used in this case, we may
+		 * neglect the case.
+		 */
+		if (!test_thread_flag(TIF_SYSCALL_TRACE) ||
+				(IS_SKIP_SYSCALL(saved_syscallno)))
+			regs->regs[0] = -ENOSYS;
+	}
I don't have a runtime environment yet for arm64, so I can't test this directly myself, so I'm just trying to eyeball this. :)
Once the seccomp logic is added here, I don't think using -2 as a special value will work. Doesn't this mean the Oops is possible by the user issuing a "-2" syscall? As in, if TIF_SYSCALL_WORK is set, and the user passed -2 as the syscall, audit will be called only on entry, and then skipped on exit?
Oops, you're absolutely right. I didn't think of this case. syscall_trace_enter() should not return a syscallno directly, but always return -1 if syscallno < 0. (except when secure_computing() returns with -1) This also implies that tracehook_report_syscall() should also have a return value.
Will, is this fine with you?
Well, the first thing that jumps out at me is why this is being done completely differently for arm64 and arm. I thought adding the new ptrace requests would reconcile the differences?
Will
On 08/27/2014 02:51 AM, Will Deacon wrote:
On Fri, Aug 22, 2014 at 01:35:17AM +0100, AKASHI Takahiro wrote:
On 08/22/2014 02:08 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 8876049..c54dbcc 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1121,9 +1121,29 @@ static void tracehook_report_syscall(struct pt_regs *regs,

 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
+	unsigned int saved_syscallno = regs->syscallno;
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);

+	if (IS_SKIP_SYSCALL(regs->syscallno)) {
+		/*
+		 * RESTRICTION: we can't modify a return value of user
+		 * issued syscall(-1) here. In order to ease this flavor,
+		 * we need to treat whatever value in x0 as a return value,
+		 * but this might result in a bogus value being returned.
+		 */
+		/*
+		 * NOTE: syscallno may also be set to -1 if fatal signal is
+		 * detected in tracehook_report_syscall_entry(), but since
+		 * a value set to x0 here is not used in this case, we may
+		 * neglect the case.
+		 */
+		if (!test_thread_flag(TIF_SYSCALL_TRACE) ||
+				(IS_SKIP_SYSCALL(saved_syscallno)))
+			regs->regs[0] = -ENOSYS;
+	}
I don't have a runtime environment yet for arm64, so I can't test this directly myself, so I'm just trying to eyeball this. :)
Once the seccomp logic is added here, I don't think using -2 as a special value will work. Doesn't this mean the Oops is possible by the user issuing a "-2" syscall? As in, if TIF_SYSCALL_WORK is set, and the user passed -2 as the syscall, audit will be called only on entry, and then skipped on exit?
Oops, you're absolutely right. I didn't think of this case. syscall_trace_enter() should not return a syscallno directly, but always return -1 if syscallno < 0. (except when secure_computing() returns with -1) This also implies that tracehook_report_syscall() should also have a return value.
Will, is this fine with you?
Well, the first thing that jumps out at me is why this is being done completely differently for arm64 and arm. I thought adding the new ptrace requests would reconcile the differences?
I'm not sure what portion of my code you mentioned as "completely different", but
1) setting x0 to -ENOSYS is necessary because, otherwise, user-issued syscall(-1) will return a bogus value when audit tracing is on.
Please note that, on arm,
                      not traced    traced
                      ------        ------
    syscall(-1)       aborted       OOPs(BUG_ON)
    syscall(-3000)    aborted       aborted
    syscall(1000)     ENOSYS        ENOSYS

So, anyhow, it's a bit difficult and meaningless to mimic these invalid cases.

2) branching to a new label, __sys_trace_return_skipped (see entry.S), after syscall_trace_enter() is necessary in order to avoid an oops in audit_syscall_entry(), as we discussed.
Did I make it clear?
-Takahiro AKASHI
Will
On Wed, Aug 27, 2014 at 06:55:46AM +0100, AKASHI Takahiro wrote:
On 08/27/2014 02:51 AM, Will Deacon wrote:
On Fri, Aug 22, 2014 at 01:35:17AM +0100, AKASHI Takahiro wrote:
Oops, you're absolutely right. I didn't think of this case. syscall_trace_enter() should not return a syscallno directly, but always return -1 if syscallno < 0. (except when secure_computing() returns with -1) This also implies that tracehook_report_syscall() should also have a return value.
Will, is this fine with you?
Well, the first thing that jumps out at me is why this is being done completely differently for arm64 and arm. I thought adding the new ptrace requests would reconcile the differences?
I'm not sure what portion of my code you mentioned as "completely different", but
setting x0 to -ENOSYS is necessary because, otherwise, user-issued syscall(-1) will return a bogus value when audit tracing is on.
Please note that, on arm,

                    not traced    traced
                    ----------    ------
    syscall(-1)     aborted       OOPs(BUG_ON)
    syscall(-3000)  aborted       aborted
    syscall(1000)   ENOSYS        ENOSYS

So, anyhow, it's a bit difficult and meaningless to mimic these invalid cases.
I'm not suggesting we make ourselves bug-compatible with ARM. Instead, I'd rather see a series of patches getting the ARM code working correctly, before we go off doing something different for arm64.
branching to a new label, syscall_trace_return_skip (see entry.S), after syscall_trace_enter() is necessary in order to avoid an OOPs in audit_syscall_entry() as we discussed.
Did I make it clear?
Sure. So let's fix ARM, then look at the arm64 port after that. I really want to avoid divergence in this area.
Will
On 09/01/2014 08:37 PM, Will Deacon wrote:
On Wed, Aug 27, 2014 at 06:55:46AM +0100, AKASHI Takahiro wrote:
On 08/27/2014 02:51 AM, Will Deacon wrote:
On Fri, Aug 22, 2014 at 01:35:17AM +0100, AKASHI Takahiro wrote:
Oops, you're absolutely right. I didn't think of this case. syscall_trace_enter() should not return a syscallno directly, but always return -1 if syscallno < 0. (except when secure_computing() returns with -1) This also implies that tracehook_report_syscall() should also have a return value.
Will, is this fine with you?
Well, the first thing that jumps out at me is why this is being done completely differently for arm64 and arm. I thought adding the new ptrace requests would reconcile the differences?
I'm not sure what portion of my code you mentioned as "completely different", but
setting x0 to -ENOSYS is necessary because, otherwise, user-issued syscall(-1) will return a bogus value when audit tracing is on.
Please note that, on arm,

                    not traced    traced
                    ----------    ------
    syscall(-1)     aborted       OOPs(BUG_ON)
    syscall(-3000)  aborted       aborted
    syscall(1000)   ENOSYS        ENOSYS

So, anyhow, it's a bit difficult and meaningless to mimic these invalid cases.
I'm not suggesting we make ourselves bug-compatible with ARM. Instead, I'd rather see a series of patches getting the ARM code working correctly, before we go off doing something different for arm64.
I see.
branching to a new label, syscall_trace_return_skip (see entry.S), after syscall_trace_enter() is necessary in order to avoid an OOPs in audit_syscall_entry() as we discussed.
Did I make it clear?
Sure. So let's fix ARM, then look at the arm64 port after that. I really want to avoid divergence in this area.
Okay, I will start with fixing the issue on arm.
-Takahiro AKASHI
Will
On Wed, Aug 27, 2014 at 02:55:46PM +0900, AKASHI Takahiro wrote:
setting x0 to -ENOSYS is necessary because, otherwise, user-issued syscall(-1) will return a bogus value when audit tracing is on.
Please note that, on arm,

                    not traced    traced
                    ----------    ------
    syscall(-1)     aborted       OOPs(BUG_ON)
    syscall(-3000)  aborted       aborted
    syscall(1000)   ENOSYS        ENOSYS
Two points here:
1. You've found a case which causes a BUG_ON(). Where is the bug report for this, so the problem can be investigated and resolved?
2. What do you mean by "aborted" ?
Please, if you find a problem with 32-bit ARM, report it. Don't hide it, because hiding it can be a security issue or in the case of BUG_ON(), it could be a denial of service issue.
As you're part of Linaro, I would have thought you'd be more responsible in this regard - after all, Linaro is supposed to be about improving the ARM kernel... Maybe I got that wrong, and Linaro is actually about ensuring that the ARM kernel is stuffed full of broken features?
On 09/01/2014 08:47 PM, Russell King - ARM Linux wrote:
On Wed, Aug 27, 2014 at 02:55:46PM +0900, AKASHI Takahiro wrote:
setting x0 to -ENOSYS is necessary because, otherwise, user-issued syscall(-1) will return a bogus value when audit tracing is on.
Please note that, on arm,

                    not traced    traced
                    ----------    ------
    syscall(-1)     aborted       OOPs(BUG_ON)
    syscall(-3000)  aborted       aborted
    syscall(1000)   ENOSYS        ENOSYS
Two points here:
- You've found a case which causes a BUG_ON(). Where is the bug report for this, so the problem can be investigated and resolved?
I think that I mentioned it could also happen on arm somewhere in a talk with Will, but don't remember exactly when.
- What do you mean by "aborted" ?
I mean that the process will receive SIGILL and get aborted. A system call number like -1 or -3000 won't be caught by the *switch* statement in arm_syscall(), and the task ends up being signaled.
Please, if you find a problem with 32-bit ARM, report it. Don't hide it, because hiding it can be a security issue or in the case of BUG_ON(), it could be a denial of service issue.
As you're part of Linaro, I would have thought you'd be more responsible in this regard - after all, Linaro is supposed to be about improving the ARM kernel... Maybe I got that wrong, and Linaro is actually about ensuring that the ARM kernel is stuffed full of broken features?
I thought my first priority was arm64 (and then arm), but now that you and Will seem to want to see the fix on arm first, okay, I will start with the arm issue.
Thanks, -Takahiro AKASHI
On Tue, Sep 02, 2014 at 05:47:29PM +0900, AKASHI Takahiro wrote:
On 09/01/2014 08:47 PM, Russell King - ARM Linux wrote:
On Wed, Aug 27, 2014 at 02:55:46PM +0900, AKASHI Takahiro wrote:
setting x0 to -ENOSYS is necessary because, otherwise, user-issued syscall(-1) will return a bogus value when audit tracing is on.
Please note that, on arm,

                    not traced    traced
                    ----------    ------
    syscall(-1)     aborted       OOPs(BUG_ON)
    syscall(-3000)  aborted       aborted
    syscall(1000)   ENOSYS        ENOSYS
Two points here:
- You've found a case which causes a BUG_ON(). Where is the bug report for this, so the problem can be investigated and resolved?
I think that I mentioned it could also happen on arm somewhere in a talk with Will, but don't remember exactly when.
Sorry, not good enough. Please report this bug so it can be investigated and fixed.
- What do you mean by "aborted" ?
I mean that the process will receive SIGILL and get aborted. A system call number like -1 or -3000 won't be caught by the *switch* statement in arm_syscall(), and the task ends up being signaled.
That is correct behaviour - because numbers greater than 0xf0000 (or 0x9f0000 for OABI - the 0x900000 offset on the syscalls on OABI is to distinguish them from syscalls used by RISC OS) are not intended to be Linux syscalls per-se.
Please, if you find a problem with 32-bit ARM, report it. Don't hide it, because hiding it can be a security issue or in the case of BUG_ON(), it could be a denial of service issue.
As you're part of Linaro, I would have thought you'd be more responsible in this regard - after all, Linaro is supposed to be about improving the ARM kernel... Maybe I got that wrong, and Linaro is actually about ensuring that the ARM kernel is stuffed full of broken features?
I thought my first priority was arm64 (and then arm), but now that you and Will seem to want to see the fix on arm first, okay, I will start with the arm issue.
So what you're saying there is that if you find a bug in ARM code, which everyone is currently using, you can ignore it until you've sorted out ARM64 which almost no one is using.
This is absurd, and whoever has set your priorities is clearly on drugs.
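The "not traced" column of the table quoted in this exchange, together with the 0xf0000 threshold mentioned above, can be modelled as a simple range check. This is only a sketch under assumptions: MODEL_NR_SYSCALLS is a made-up table size, the OABI offset is ignored, and the handful of valid ARM-private calls just above the threshold are not modelled.

```c
#include <assert.h>

#define MODEL_NR_SYSCALLS 400u      /* hypothetical size of the syscall table */
#define MODEL_ARM_NR_BASE 0x0f0000u /* threshold quoted in the thread (EABI) */

enum model_outcome {
	MODEL_DISPATCH, /* ordinary Linux syscall, looked up in the table */
	MODEL_ENOSYS,   /* out of table but below the threshold */
	MODEL_SIGILL    /* not a Linux syscall at all: task is signaled */
};

static enum model_outcome model_classify(long scno)
{
	/* Negative numbers wrap to very large unsigned values. */
	unsigned int n = (unsigned int)scno;

	if (n < MODEL_NR_SYSCALLS)
		return MODEL_DISPATCH;
	if (n < MODEL_ARM_NR_BASE)
		return MODEL_ENOSYS;
	return MODEL_SIGILL;
}
```

Under these assumptions syscall(-1) and syscall(-3000) both land in the SIGILL range, while syscall(1000) returns ENOSYS, matching the untraced behaviour reported for arm.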
On Tue, Sep 02, 2014 at 10:16:22AM +0100, Russell King - ARM Linux wrote:
On Tue, Sep 02, 2014 at 05:47:29PM +0900, AKASHI Takahiro wrote:
On 09/01/2014 08:47 PM, Russell King - ARM Linux wrote:
On Wed, Aug 27, 2014 at 02:55:46PM +0900, AKASHI Takahiro wrote:
setting x0 to -ENOSYS is necessary because, otherwise, user-issued syscall(-1) will return a bogus value when audit tracing is on.
Please note that, on arm,

                    not traced    traced
                    ----------    ------
    syscall(-1)     aborted       OOPs(BUG_ON)
    syscall(-3000)  aborted       aborted
    syscall(1000)   ENOSYS        ENOSYS
Two points here:
- You've found a case which causes a BUG_ON(). Where is the bug report for this, so the problem can be investigated and resolved?
I think that I mentioned it could also happen on arm somewhere in a talk with Will, but don't remember exactly when.
Sorry, not good enough. Please report this bug so it can be investigated and fixed.
I'm going to go further than this, and tell you that you have been downright irresponsible here, and I'm disgusted by your behaviour over this.
You have revealed a potential security problem publicly, effectively giving details about how to cause it, but without having first reported it to people who can fix it, nor providing a fix for it.
Why is it a security problem? Although it can't be used to gain information, it can be used potentially to deny service. Any user can trace a task which they own, and then set the task's syscall to -1, which according to you results in a kernel oops.
If the kernel oops happens while holding any locks, that part of the system becomes non-functional and can result in all userland stopping dead.
On 09/02/2014 06:16 PM, Russell King - ARM Linux wrote:
On Tue, Sep 02, 2014 at 05:47:29PM +0900, AKASHI Takahiro wrote:
On 09/01/2014 08:47 PM, Russell King - ARM Linux wrote:
On Wed, Aug 27, 2014 at 02:55:46PM +0900, AKASHI Takahiro wrote:
setting x0 to -ENOSYS is necessary because, otherwise, user-issued syscall(-1) will return a bogus value when audit tracing is on.
Please note that, on arm,

                    not traced    traced
                    ----------    ------
    syscall(-1)     aborted       OOPs(BUG_ON)
    syscall(-3000)  aborted       aborted
    syscall(1000)   ENOSYS        ENOSYS
Two points here:
- You've found a case which causes a BUG_ON(). Where is the bug report for this, so the problem can be investigated and resolved?
I think that I mentioned it could also happen on arm somewhere in a talk with Will, but don't remember exactly when.
Sorry, not good enough. Please report this bug so it can be investigated and fixed.
Please review my patch as well as the commit message.
- What do you mean by "aborted" ?
I mean that the process will receive SIGILL and get aborted. A system call number like -1 or -3000 won't be caught by the *switch* statement in arm_syscall(), and the task ends up being signaled.
That is correct behaviour - because numbers greater than 0xf0000 (or 0x9f0000 for OABI - the 0x900000 offset on the syscalls on OABI is to distinguish them from syscalls used by RISC OS) are not intended to be Linux syscalls per-se.
I tried to make such invalid/pseudo syscalls behave in the same way whether or not a task is traced (by seccomp, ptrace or audit).
Please, if you find a problem with 32-bit ARM, report it. Don't hide it, because hiding it can be a security issue or in the case of BUG_ON(), it could be a denial of service issue.
As you're part of Linaro, I would have thought you'd be more responsible in this regard - after all, Linaro is supposed to be about improving the ARM kernel... Maybe I got that wrong, and Linaro is actually about ensuring that the ARM kernel is stuffed full of broken features?
I thought my first priority was arm64 (and then arm), but now that you and Will seem to want to see the fix on arm first, okay, I will start with the arm issue.
So what you're saying there is that if you find a bug in ARM code, which everyone is currently using, you can ignore it until you've sorted out ARM64 which almost no one is using.
This is absurd, and whoever has set your priorities is clearly on drugs.
Thank you. I learned a new English word, absurd.
-Takahiro AKASHI
Will,
When I was looking into syscall_trace_exit() more closely, I found another (big) problem. There are two system calls, execve() and rt_sigreturn(), which change 'syscallno' in pt_regs to -1 in start_thread() and restore_sigframe(), respectively.
Since syscallno is not valid anymore in syscall_trace_exit() for these system calls, we cannot create a correct syscall exit record for tracepoint in trace_sys_exit() (=> ftrace_syscall_exit()) and for audit in audit_syscall_exit().
This does not happen on arm because syscall numbers are kept in thread_info on arm.
How can we deal with this issue?
-Takahiro AKASHI
On 08/27/2014 02:51 AM, Will Deacon wrote:
On Fri, Aug 22, 2014 at 01:35:17AM +0100, AKASHI Takahiro wrote:
On 08/22/2014 02:08 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index 8876049..c54dbcc 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1121,9 +1121,29 @@ static void tracehook_report_syscall(struct pt_regs *regs,
asmlinkage int syscall_trace_enter(struct pt_regs *regs) {
unsigned int saved_syscallno = regs->syscallno;
if (test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
if (IS_SKIP_SYSCALL(regs->syscallno)) {
/*
* RESTRICTION: we can't modify a return value of user
* issued syscall(-1) here. In order to ease this flavor,
* we need to treat whatever value in x0 as a return value,
* but this might result in a bogus value being returned.
*/
/*
* NOTE: syscallno may also be set to -1 if fatal signal is
* detected in tracehook_report_syscall_entry(), but since
* a value set to x0 here is not used in this case, we may
* neglect the case.
*/
if (!test_thread_flag(TIF_SYSCALL_TRACE) ||
(IS_SKIP_SYSCALL(saved_syscallno)))
regs->regs[0] = -ENOSYS;
}
I don't have a runtime environment yet for arm64, so I can't test this directly myself, so I'm just trying to eyeball this. :)
Once the seccomp logic is added here, I don't think using -2 as a special value will work. Doesn't this mean the Oops is possible by the user issuing a "-2" syscall? As in, if TIF_SYSCALL_WORK is set, and the user passed -2 as the syscall, audit will be called only on entry, and then skipped on exit?
Oops, you're absolutely right. I didn't think of this case. syscall_trace_enter() should not return a syscallno directly, but always return -1 if syscallno < 0. (except when secure_computing() returns with -1) This also implies that tracehook_report_syscall() should also have a return value.
Will, is this fine with you?
Well, the first thing that jumps out at me is why this is being done completely differently for arm64 and arm. I thought adding the new ptrace requests would reconcile the differences?
Will
On Wed, Oct 01, 2014 at 12:08:05PM +0100, AKASHI Takahiro wrote:
Will,
When I was looking into syscall_trace_exit() more closely, I found another (big) problem. There are two system calls, execve() and rt_sigreturn(), which change 'syscallno' in pt_regs to -1 in start_thread() and restore_sigframe(), respectively.
Since syscallno is not valid anymore in syscall_trace_exit() for these system calls, we cannot create a correct syscall exit record for tracepoint in trace_sys_exit() (=> ftrace_syscall_exit()) and for audit in audit_syscall_exit().
This does not happen on arm because syscall numbers are kept in thread_info on arm.
How can we deal with this issue?
How is this handled on other architectures? x86, for example, seems to zero orig_ax when restoring the sigcontext, but leaves it alone in start_thread.
What is the impact of this problem? AFAICT, we just miss some exits, right (as opposed to an OOPs or the like)?
Will
On 10/04/2014 12:23 AM, Will Deacon wrote:
On Wed, Oct 01, 2014 at 12:08:05PM +0100, AKASHI Takahiro wrote:
Will,
When I was looking into syscall_trace_exit() more closely, I found another (big) problem. There are two system calls, execve() and rt_sigreturn(), which change 'syscallno' in pt_regs to -1 in start_thread() and restore_sigframe(), respectively.
I need to correct my misunderstandings here:
Since syscallno is not valid anymore in syscall_trace_exit() for these system calls, we cannot create a correct syscall exit record for tracepoint in trace_sys_exit() (=> ftrace_syscall_exit())
This is true, but since rt_sigreturn() doesn't have a syscall tracepoint (and so there is no entry under /sys/kernel/tracing/events/syscalls/), it cannot be traced anyway.
and for audit in audit_syscall_exit().
Not true. Since the syscall number is saved as 'major' in the per-thread audit context at audit_syscall_entry(), audit_syscall_exit() still produces a correct audit log for both system calls.
This does not happen on arm because syscall numbers are kept in thread_info on arm.
How can we deal with this issue?
How is this handled on other architectures? x86, for example, seems to zero orig_ax when restoring the sigcontext, but leaves it alone in start_thread.
What is the impact of this problem? AFAICT, we just miss some exits, right (as opposed to an OOPs or the like)?
So the impacts here are:
1) We just miss a syscall exit for the execve tracepoint (syscalls:sys_exit_execve). (no fatal errors like a kernel panic) (FYI, on x86, there is no tracepoint entry for execve nor sigreturn.)
2) From the viewpoint of my seccomp patch, we cannot skip syscall exit tracing for invalid system calls by adding a check for syscallno in the following way: (I'm not quite sure whether this might open up a DoS attack as Russell suggested.)

    syscall_trace_exit(struct pt_regs *regs)
    {
            if (regs->syscallno < NR_syscalls) { /* Adding this check */
                    audit_syscall_exit(regs);
                    if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
                            trace_sys_exit(regs, regs_return_value(regs));
            }
            ...
    }

As you can imagine, any system call after execve() will hit BUG_ON() in audit_syscall_entry() since audit_syscall_exit() is not called for execve().
Thanks, -Takahiro AKASHI
Will
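The pairing problem described in this message can be modelled in a few lines. The sketch assumes that the audit core refuses a new entry while the previous one is still open (modelled here as a -1 return instead of a BUG_ON()), and that 'major' is recorded at entry. All model_* names and the syscall numbers used with them are hypothetical.

```c
#include <assert.h>

#define MODEL_NR_SYSCALLS 400u /* hypothetical table size */

struct model_audit_ctx {
	int in_syscall; /* set between entry and exit */
	int major;      /* syscall number recorded at entry */
};

/* Returns 0 on success, -1 where the real code would trip a BUG_ON(). */
static int model_audit_entry(struct model_audit_ctx *ctx, int major)
{
	if (ctx->in_syscall)
		return -1;
	ctx->in_syscall = 1;
	ctx->major = major;
	return 0;
}

static void model_audit_exit(struct model_audit_ctx *ctx)
{
	ctx->in_syscall = 0;
}

/* Exit hook with the extra syscallno check discussed in the mail. */
static void model_trace_exit(struct model_audit_ctx *ctx, int syscallno)
{
	/* -1 wraps to a large unsigned value and fails the check. */
	if ((unsigned int)syscallno < MODEL_NR_SYSCALLS)
		model_audit_exit(ctx);
}
```

If execve() resets syscallno to -1 before the exit hook runs, the guarded exit is skipped, the context stays open, and the next entry is rejected, even though the recorded 'major' is still the execve number.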
Those values (__NR_seccomp_*) are used solely in secure_computing() to identify mode 1 system calls. If compat system calls have different syscall numbers, asm/seccomp.h may override them.
Acked-by: Arnd Bergmann arnd@arndb.de
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
---
 include/asm-generic/seccomp.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100644 include/asm-generic/seccomp.h

diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
new file mode 100644
index 0000000..5e97022
--- /dev/null
+++ b/include/asm-generic/seccomp.h
@@ -0,0 +1,28 @@
+/*
+ * include/asm-generic/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro takahiro.akashi@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_GENERIC_SECCOMP_H
+#define _ASM_GENERIC_SECCOMP_H
+
+#include <asm-generic/unistd.h>
+
+#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
+#define __NR_seccomp_read_32 __NR_read
+#define __NR_seccomp_write_32 __NR_write
+#define __NR_seccomp_exit_32 __NR_exit
+#define __NR_seccomp_sigreturn_32 __NR_rt_sigreturn
+#endif /* CONFIG_COMPAT && ! already defined */
+
+#define __NR_seccomp_read __NR_read
+#define __NR_seccomp_write __NR_write
+#define __NR_seccomp_exit __NR_exit
+#define __NR_seccomp_sigreturn __NR_rt_sigreturn
+
+#endif /* _ASM_GENERIC_SECCOMP_H */
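To show how the four __NR_seccomp_* values are meant to be consumed, here is a rough user-space model of a mode 1 (strict) check: a syscall is allowed only if its number appears in a short fixed list. The MODEL_* numbers are illustrative values chosen for this sketch, and model_mode1_allows() is a hypothetical helper, not the kernel's secure_computing().

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative syscall numbers, assumed for this sketch only. */
#define MODEL_NR_seccomp_read      63
#define MODEL_NR_seccomp_write     64
#define MODEL_NR_seccomp_exit      93
#define MODEL_NR_seccomp_sigreturn 139

static const int model_mode1_syscalls[] = {
	MODEL_NR_seccomp_read,
	MODEL_NR_seccomp_write,
	MODEL_NR_seccomp_exit,
	MODEL_NR_seccomp_sigreturn,
};

/* Returns 1 if a strict-mode task may issue syscall nr, 0 otherwise. */
static int model_mode1_allows(int nr)
{
	size_t i;
	size_t n = sizeof(model_mode1_syscalls) / sizeof(model_mode1_syscalls[0]);

	for (i = 0; i < n; i++)
		if (model_mode1_syscalls[i] == nr)
			return 1;
	return 0;
}
```

A compat task would need a second list built from the *_32 variants, which is why the header lets an architecture override them.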
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
Those values (__NR_seccomp_*) are used solely in secure_computing() to identify mode 1 system calls. If compat system calls have different syscall numbers, asm/seccomp.h may override them.
Acked-by: Arnd Bergmann arnd@arndb.de Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
Reviewed-by: Kees Cook keescook@chromium.org
 include/asm-generic/seccomp.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100644 include/asm-generic/seccomp.h

diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
new file mode 100644
index 0000000..5e97022
--- /dev/null
+++ b/include/asm-generic/seccomp.h
@@ -0,0 +1,28 @@
+/*
+ * include/asm-generic/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro takahiro.akashi@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_GENERIC_SECCOMP_H
+#define _ASM_GENERIC_SECCOMP_H
+
+#include <asm-generic/unistd.h>
While this isn't a problem for ARM, this should be linux/unistd.h for other architectures to get the right stuff.
+#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
+#define __NR_seccomp_read_32 __NR_read
+#define __NR_seccomp_write_32 __NR_write
+#define __NR_seccomp_exit_32 __NR_exit
+#define __NR_seccomp_sigreturn_32 __NR_rt_sigreturn
+#endif /* CONFIG_COMPAT && ! already defined */
+
+#define __NR_seccomp_read __NR_read
+#define __NR_seccomp_write __NR_write
+#define __NR_seccomp_exit __NR_exit
+#define __NR_seccomp_sigreturn __NR_rt_sigreturn
Some architectures use __NR_sigreturn, so this will need to be adjusted in the future into:
#ifndef __NR_seccomp_sigreturn
#define __NR_seccomp_sigreturn __NR_rt_sigreturn
#endif
After these changes, I was able to port x86 to using this asm-generic/seccomp.h too.
-Kees
+#endif /* _ASM_GENERIC_SECCOMP_H */
1.7.9.5
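The guard suggested in this review is presumably meant to let an architecture's own definition win over the generic default. A self-contained sketch of that override pattern, with the "arch" and "generic" parts folded into one file, could look as follows. The MODEL_* names and numbers are made up for illustration.

```c
#include <assert.h>

/* Illustrative numbers only, assumed for this sketch. */
#define MODEL_NR_rt_sigreturn 139
#define MODEL_NR_sigreturn    119
#define MODEL_NR_exit         93

/* What an architecture header might define before including the generic one. */
#define MODEL_NR_seccomp_sigreturn MODEL_NR_sigreturn

/* What the generic header would then do: fill in only what is missing. */
#ifndef MODEL_NR_seccomp_sigreturn
#define MODEL_NR_seccomp_sigreturn MODEL_NR_rt_sigreturn
#endif
#ifndef MODEL_NR_seccomp_exit
#define MODEL_NR_seccomp_exit MODEL_NR_exit
#endif

static int model_sigreturn_nr(void) { return MODEL_NR_seccomp_sigreturn; }
static int model_exit_nr(void)      { return MODEL_NR_seccomp_exit; }
```

The architecture's sigreturn number survives, while the exit number falls back to the generic default because nothing overrode it.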
On 08/22/2014 02:51 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
Those values (__NR_seccomp_*) are used solely in secure_computing() to identify mode 1 system calls. If compat system calls have different syscall numbers, asm/seccomp.h may override them.
Acked-by: Arnd Bergmann arnd@arndb.de Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
Reviewed-by: Kees Cook keescook@chromium.org
 include/asm-generic/seccomp.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100644 include/asm-generic/seccomp.h

diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
new file mode 100644
index 0000000..5e97022
--- /dev/null
+++ b/include/asm-generic/seccomp.h
@@ -0,0 +1,28 @@
+/*
+ * include/asm-generic/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro takahiro.akashi@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_GENERIC_SECCOMP_H
+#define _ASM_GENERIC_SECCOMP_H
+
+#include <asm-generic/unistd.h>
While this isn't a problem for ARM, this should be linux/unistd.h for other architectures to get the right stuff.
I will fix it.
+#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
+#define __NR_seccomp_read_32 __NR_read
+#define __NR_seccomp_write_32 __NR_write
+#define __NR_seccomp_exit_32 __NR_exit
+#define __NR_seccomp_sigreturn_32 __NR_rt_sigreturn
+#endif /* CONFIG_COMPAT && ! already defined */
+
+#define __NR_seccomp_read __NR_read
+#define __NR_seccomp_write __NR_write
+#define __NR_seccomp_exit __NR_exit
+#define __NR_seccomp_sigreturn __NR_rt_sigreturn
Some architectures use __NR_sigreturn, so this will need to be adjusted in the future into:
#ifndef __NR_seccomp_sigreturn
#define __NR_seccomp_sigreturn __NR_rt_sigreturn
#endif
I will fix it.
After these changes, I was able to port x86 to using this asm-generic/seccomp.h too.
Thanks, -Takahiro AKASHI
-Kees
+#endif /* _ASM_GENERIC_SECCOMP_H */
1.7.9.5
This patch allows compat task to issue seccomp() system call.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
---
 arch/arm64/include/asm/unistd.h   | 2 +-
 arch/arm64/include/asm/unistd32.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 4bc95d2..cf6ee31 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -41,7 +41,7 @@
 #define __ARM_NR_compat_cacheflush (__ARM_NR_COMPAT_BASE+2)
 #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5)
 
-#define __NR_compat_syscalls 383
+#define __NR_compat_syscalls 384
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index e242600..2922c40 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -787,3 +787,6 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr)
 __SYSCALL(__NR_sched_getattr, sys_sched_getattr)
 #define __NR_renameat2 382
 __SYSCALL(__NR_renameat2, sys_renameat2)
+#define __NR_seccomp 383
+__SYSCALL(__NR_seccomp, sys_seccomp)
+
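The reason the patch touches both files can be illustrated with a miniature table built from a __SYSCALL-style macro: the table size has to be one more than the highest syscall number, so adding entry 383 forces the count to 384. This is a simplified model; MODEL_SYSCALL, model_table and the handler functions are hypothetical stand-ins, not the arm64 compat table itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef long (*model_syscall_fn)(void);

static long model_sys_ni(void)        { return -ENOSYS; } /* "not implemented" */
static long model_sys_renameat2(void) { return 0; }
static long model_sys_seccomp(void)   { return 0; }

#define MODEL_NR_renameat2       382
#define MODEL_NR_seccomp         383
#define MODEL_NR_compat_syscalls 384 /* highest number + 1 */

/* Each entry lands in the slot named by its number. */
#define MODEL_SYSCALL(nr, sym) [nr] = sym,

/*
 * With a size of 383 the [383] initializer below would not compile,
 * which is why the count is bumped together with the new entry.
 */
static const model_syscall_fn model_table[MODEL_NR_compat_syscalls] = {
	MODEL_SYSCALL(MODEL_NR_renameat2, model_sys_renameat2)
	MODEL_SYSCALL(MODEL_NR_seccomp, model_sys_seccomp)
};

/* Unfilled slots and out-of-range numbers fall back to model_sys_ni. */
static model_syscall_fn model_lookup(unsigned int nr)
{
	if (nr >= MODEL_NR_compat_syscalls || model_table[nr] == NULL)
		return model_sys_ni;
	return model_table[nr];
}
```

The same reasoning explains the later rebase request: any other syscall added in between shifts both the new number and the count.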
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
This patch allows compat task to issue seccomp() system call.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 4bc95d2..cf6ee31 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -41,7 +41,7 @@ #define __ARM_NR_compat_cacheflush (__ARM_NR_COMPAT_BASE+2) #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5)
-#define __NR_compat_syscalls 383 +#define __NR_compat_syscalls 384 #endif
#define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index e242600..2922c40 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -787,3 +787,6 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr) __SYSCALL(__NR_sched_getattr, sys_sched_getattr) #define __NR_renameat2 382 __SYSCALL(__NR_renameat2, sys_renameat2) +#define __NR_seccomp 383 +__SYSCALL(__NR_seccomp, sys_seccomp)
Nit: this adds a trailing blank line. Other than that:
Reviewed-by: Kees Cook keescook@chromium.org
-Kees
-- 1.7.9.5
On 08/22/2014 02:52 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
This patch allows compat task to issue seccomp() system call.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 4bc95d2..cf6ee31 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -41,7 +41,7 @@ #define __ARM_NR_compat_cacheflush (__ARM_NR_COMPAT_BASE+2) #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5)
-#define __NR_compat_syscalls 383 +#define __NR_compat_syscalls 384 #endif
#define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index e242600..2922c40 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -787,3 +787,6 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr) __SYSCALL(__NR_sched_getattr, sys_sched_getattr) #define __NR_renameat2 382 __SYSCALL(__NR_renameat2, sys_renameat2) +#define __NR_seccomp 383 +__SYSCALL(__NR_seccomp, sys_seccomp)
Nit: this adds a trailing blank line. Other than that:
I will fix it. Thanks,
-Takahiro AKASHI
Reviewed-by: Kees Cook keescook@chromium.org
-Kees
-- 1.7.9.5
On Thu, Aug 21, 2014 at 09:56:43AM +0100, AKASHI Takahiro wrote:
This patch allows compat task to issue seccomp() system call.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 4bc95d2..cf6ee31 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -41,7 +41,7 @@ #define __ARM_NR_compat_cacheflush (__ARM_NR_COMPAT_BASE+2) #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5) -#define __NR_compat_syscalls 383 +#define __NR_compat_syscalls 384 #endif #define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index e242600..2922c40 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -787,3 +787,6 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr) __SYSCALL(__NR_sched_getattr, sys_sched_getattr) #define __NR_renameat2 382 __SYSCALL(__NR_renameat2, sys_renameat2) +#define __NR_seccomp 383 +__SYSCALL(__NR_seccomp, sys_seccomp)
This will need rebasing onto -rc2, as we've hooked up two new compat syscalls recently.
Will
On 08/27/2014 02:53 AM, Will Deacon wrote:
On Thu, Aug 21, 2014 at 09:56:43AM +0100, AKASHI Takahiro wrote:
This patch allows compat task to issue seccomp() system call.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 4bc95d2..cf6ee31 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -41,7 +41,7 @@ #define __ARM_NR_compat_cacheflush (__ARM_NR_COMPAT_BASE+2) #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE+5)
-#define __NR_compat_syscalls 383 +#define __NR_compat_syscalls 384 #endif
#define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index e242600..2922c40 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -787,3 +787,6 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr) __SYSCALL(__NR_sched_getattr, sys_sched_getattr) #define __NR_renameat2 382 __SYSCALL(__NR_renameat2, sys_renameat2) +#define __NR_seccomp 383 +__SYSCALL(__NR_seccomp, sys_seccomp)
This will need rebasing onto -rc2, as we've hooked up two new compat syscalls recently.
Thanks for the heads-up. Fixed it.
-Takahiro AKASHI
Will
SIGSYS is primarily used in secure computing to notify tracer. This patch allows signal handler on compat task to get correct information with SA_SIGINFO specified when this signal is delivered.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
---
 arch/arm64/include/asm/compat.h | 7 +++++++
 arch/arm64/kernel/signal32.c    | 8 ++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index 253e33b..c877915 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -205,6 +205,13 @@ typedef struct compat_siginfo {
 			compat_long_t _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGSYS */
+		struct {
+			compat_uptr_t _call_addr; /* calling user insn */
+			int _syscall;	/* triggering system call number */
+			unsigned int _arch;	/* AUDIT_ARCH_* of syscall */
+		} _sigsys;
 	} _sifields;
 } compat_siginfo_t;
 
diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index 1b9ad02..aa550d6 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -186,6 +186,14 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
 		err |= __put_user(from->si_uid, &to->si_uid);
 		err |= __put_user((compat_uptr_t)(unsigned long)from->si_ptr, &to->si_ptr);
 		break;
+#ifdef __ARCH_SIGSYS
+	case __SI_SYS:
+		err |= __put_user((compat_uptr_t)(unsigned long)
+				from->si_call_addr, &to->si_call_addr);
+		err |= __put_user(from->si_syscall, &to->si_syscall);
+		err |= __put_user(from->si_arch, &to->si_arch);
+		break;
+#endif
 	default: /* this is just in case for now ... */
 		err |= __put_user(from->si_pid, &to->si_pid);
 		err |= __put_user(from->si_uid, &to->si_uid);
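The core of the new __SI_SYS case is a field-by-field copy in which the 64-bit call address is narrowed to a 32-bit compat pointer. A user-space model of that copy is sketched below. The struct layouts and model_copy_sigsys() are simplified stand-ins invented for this example, with no user-space access checks.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified native view of the SIGSYS payload. */
struct model_sigsys {
	void *call_addr;   /* address of the calling instruction */
	int nr;            /* triggering system call number */
	unsigned int arch; /* AUDIT_ARCH_* style identifier */
};

/* Simplified 32-bit (compat) view: the pointer becomes a 32-bit integer. */
struct model_compat_sigsys {
	uint32_t call_addr;
	int32_t nr;
	uint32_t arch;
};

static void model_copy_sigsys(struct model_compat_sigsys *to,
			      const struct model_sigsys *from)
{
	/* A compat task's addresses fit in 32 bits, so the narrowing is lossless there. */
	to->call_addr = (uint32_t)(uintptr_t)from->call_addr;
	to->nr = from->nr;
	to->arch = from->arch;
}
```

Without such a case the payload would go through the default branch, which copies pid/uid fields instead, so a compat SIGSYS handler would read unrelated values.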
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
SIGSYS is primarily used in secure computing to notify tracer. This patch allows signal handler on compat task to get correct information with SA_SYSINFO specified when this signal is delivered.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
I'm unable to test this myself, but if you've got the test suite passing in compat mode, then this patch must be correct. :)
Reviewed-by: Kees Cook keescook@chromium.org
-Kees
On 08/22/2014 02:54 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
SIGSYS is primarily used in secure computing to notify tracer. This patch allows signal handler on compat task to get correct information with SA_SYSINFO specified when this signal is delivered.
typo: SA_SIGINFO
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
I'm unable to test this myself, but if you've got the test suite passing in compat mode, then this patch must be correct. :)
Thanks. Actually, I found this bug when I ran your test programs, TRAP.handler, on a 32-bit userland.
-Takahiro AKASHI
Reviewed-by: Kees Cook keescook@chromium.org
-Kees
On Thu, Aug 21, 2014 at 09:56:44AM +0100, AKASHI Takahiro wrote:
SIGSYS is primarily used in secure computing to notify tracer. This patch allows signal handler on compat task to get correct information with SA_SYSINFO specified when this signal is delivered.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
 arch/arm64/include/asm/compat.h | 7 +++++++
 arch/arm64/kernel/signal32.c    | 8 ++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index 253e33b..c877915 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -205,6 +205,13 @@ typedef struct compat_siginfo {
 			compat_long_t _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGSYS */
+		struct {
+			compat_uptr_t _call_addr; /* calling user insn */
+			int _syscall;	/* triggering system call number */
+			unsigned int _arch;	/* AUDIT_ARCH_* of syscall */
+		} _sigsys;
 	} _sifields;
 } compat_siginfo_t;

diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index 1b9ad02..aa550d6 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -186,6 +186,14 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
 		err |= __put_user(from->si_uid, &to->si_uid);
 		err |= __put_user((compat_uptr_t)(unsigned long)from->si_ptr, &to->si_ptr);
 		break;
+#ifdef __ARCH_SIGSYS
+	case __SI_SYS:
+		err |= __put_user((compat_uptr_t)(unsigned long)
+				from->si_call_addr, &to->si_call_addr);
+		err |= __put_user(from->si_syscall, &to->si_syscall);
+		err |= __put_user(from->si_arch, &to->si_arch);
+		break;
+#endif
I think you should drop this #ifdef. We care about whether arch/arm/ defines __ARCH_SIGSYS, not whether arm64 defines it (they both happen to define it anyway).
Will
On 08/27/2014 02:55 AM, Will Deacon wrote:
On Thu, Aug 21, 2014 at 09:56:44AM +0100, AKASHI Takahiro wrote:
SIGSYS is primarily used in secure computing to notify tracer. This patch allows signal handler on compat task to get correct information with SA_SYSINFO specified when this signal is delivered.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
 arch/arm64/include/asm/compat.h | 7 +++++++
 arch/arm64/kernel/signal32.c    | 8 ++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index 253e33b..c877915 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -205,6 +205,13 @@ typedef struct compat_siginfo {
 			compat_long_t _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGSYS */
+		struct {
+			compat_uptr_t _call_addr; /* calling user insn */
+			int _syscall;	/* triggering system call number */
+			unsigned int _arch;	/* AUDIT_ARCH_* of syscall */
+		} _sigsys;
 	} _sifields;
 } compat_siginfo_t;

diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index 1b9ad02..aa550d6 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -186,6 +186,14 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
 		err |= __put_user(from->si_uid, &to->si_uid);
 		err |= __put_user((compat_uptr_t)(unsigned long)from->si_ptr, &to->si_ptr);
 		break;
+#ifdef __ARCH_SIGSYS
+	case __SI_SYS:
+		err |= __put_user((compat_uptr_t)(unsigned long)
+				from->si_call_addr, &to->si_call_addr);
+		err |= __put_user(from->si_syscall, &to->si_syscall);
+		err |= __put_user(from->si_arch, &to->si_arch);
+		break;
+#endif
I think you should drop this #ifdef. We care about whether arch/arm/ defines __ARCH_SIGSYS, not whether arm64 defines it (they both happen to define it anyway).
Thanks. Done
-Takahiro AKASHI
Will
secure_computing() is called first in syscall_trace_enter() so that, if seccomp rules deny a system call, it is aborted quickly without any of the subsequent syscall tracing, unlike the other skip cases.
For compat tasks, the numbers of the system calls allowed in seccomp mode 1 differ from those for native tasks, so the __NR_seccomp_xxx_32 macros need to be redefined.
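
To illustrate why the 32-bit numbers need their own definitions, here is a small userspace sketch of a mode 1 allowlist check. It is a model only: model_mode1_allows() and the MODEL_NR_* names are hypothetical, the compat numbers are the ones added to unistd.h by this patch, and the native numbers are assumed from the asm-generic syscall table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Compat (AArch32) numbers, as defined in the unistd.h hunk of this patch. */
#define MODEL_NR_COMPAT_EXIT		1
#define MODEL_NR_COMPAT_READ		3
#define MODEL_NR_COMPAT_WRITE		4
#define MODEL_NR_COMPAT_RT_SIGRETURN	173

/* Native arm64 numbers: an assumption taken from the asm-generic table. */
#define MODEL_NR_READ			63
#define MODEL_NR_WRITE			64
#define MODEL_NR_EXIT			93
#define MODEL_NR_RT_SIGRETURN		139

/* Returns true if syscall nr is on the mode 1 allowlist for this task type. */
static bool model_mode1_allows(bool is_compat, int nr)
{
	static const int native_ok[] = {
		MODEL_NR_READ, MODEL_NR_WRITE,
		MODEL_NR_EXIT, MODEL_NR_RT_SIGRETURN,
	};
	static const int compat_ok[] = {
		MODEL_NR_COMPAT_READ, MODEL_NR_COMPAT_WRITE,
		MODEL_NR_COMPAT_EXIT, MODEL_NR_COMPAT_RT_SIGRETURN,
	};
	const int *list = is_compat ? compat_ok : native_ok;
	size_t n = sizeof(native_ok) / sizeof(native_ok[0]); /* both lists have 4 entries */

	for (size_t i = 0; i < n; i++)
		if (list[i] == nr)
			return true;
	return false;
}
```

In this model, number 3 is read for a compat task but is rejected for a native task, which is why a single shared list would not work.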
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
---
 arch/arm64/Kconfig               | 14 ++++++++++++++
 arch/arm64/include/asm/ptrace.h  |  1 +
 arch/arm64/include/asm/seccomp.h | 25 +++++++++++++++++++++++++
 arch/arm64/include/asm/unistd.h  |  3 +++
 arch/arm64/kernel/entry.S        |  2 ++
 arch/arm64/kernel/ptrace.c       |  5 +++++
 6 files changed, 50 insertions(+)
 create mode 100644 arch/arm64/include/asm/seccomp.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fd4e81a..d6dc436 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -34,6 +34,7 @@ config ARM64
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_KGDB
+	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_CC_STACKPROTECTOR
@@ -312,6 +313,19 @@ config ARCH_HAS_CACHE_LINE_SIZE

 source "mm/Kconfig"

+config SECCOMP
+	bool "Enable seccomp to safely compute untrusted bytecode"
+	---help---
+	  This kernel feature is useful for number crunching applications
+	  that may need to compute untrusted bytecode during their
+	  execution. By using pipes or other transports made available to
+	  the process as file descriptors supporting the read/write
+	  syscalls, it's possible to isolate those applications in
+	  their own address space using seccomp. Once seccomp is
+	  enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
+	  and the task is only allowed to execute a few safe syscalls
+	  defined by each seccomp mode.
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index a58cf62..a844d06 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -71,6 +71,7 @@
  * with ptrace(PTRACE_SET_SYSCALL)
  */
 #define RET_SKIP_SYSCALL	-1
+#define RET_SKIP_SYSCALL_TRACE	-2
 #define IS_SKIP_SYSCALL(no)	((int)(no & 0xffffffff) == -1)

 #ifndef __ASSEMBLY__
diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
new file mode 100644
index 0000000..c76fac9
--- /dev/null
+++ b/arch/arm64/include/asm/seccomp.h
@@ -0,0 +1,25 @@
+/*
+ * arch/arm64/include/asm/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro takahiro.akashi@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_SECCOMP_H
+#define _ASM_SECCOMP_H
+
+#include <asm/unistd.h>
+
+#ifdef CONFIG_COMPAT
+#define __NR_seccomp_read_32		__NR_compat_read
+#define __NR_seccomp_write_32		__NR_compat_write
+#define __NR_seccomp_exit_32		__NR_compat_exit
+#define __NR_seccomp_sigreturn_32	__NR_compat_rt_sigreturn
+#endif /* CONFIG_COMPAT */
+
+#include <asm-generic/seccomp.h>
+
+#endif /* _ASM_SECCOMP_H */
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index cf6ee31..7c73059 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -31,6 +31,9 @@
  * Compat syscall numbers used by the AArch64 kernel.
  */
 #define __NR_compat_restart_syscall	0
+#define __NR_compat_exit		1
+#define __NR_compat_read		3
+#define __NR_compat_write		4
 #define __NR_compat_sigreturn		119
 #define __NR_compat_rt_sigreturn	173

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index fdd6eae..d5eb447 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -672,6 +672,8 @@ ENDPROC(el0_svc)
 __sys_trace:
 	mov	x0, sp
 	bl	syscall_trace_enter
+	cmp	w0, #RET_SKIP_SYSCALL_TRACE	// skip syscall and tracing?
+	b.eq	ret_to_user
 	cmp	w0, #RET_SKIP_SYSCALL		// skip syscall?
 	b.eq	__sys_trace_return_skipped
 	adr	lr, __sys_trace_return		// return address
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index c54dbcc..4287d68 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -27,6 +27,7 @@
 #include <linux/smp.h>
 #include <linux/ptrace.h>
 #include <linux/user.h>
+#include <linux/seccomp.h>
 #include <linux/security.h>
 #include <linux/init.h>
 #include <linux/signal.h>
@@ -1123,6 +1124,10 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
 	unsigned int saved_syscallno = regs->syscallno;

+	/* Do the secure computing check first; failures should be fast. */
+	if (secure_computing(regs->syscallno) == -1)
+		return RET_SKIP_SYSCALL_TRACE;
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
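
The control flow introduced by the entry.S and ptrace.c hunks can be sketched in plain C as follows. This is a simplified userspace model under stated assumptions, not the kernel code: struct model_task and the model_* functions are hypothetical, a boolean stands in for secure_computing() returning -1, the TIF_SYSCALL_TRACE test is folded into an unconditional counter, and the assumption that the function otherwise returns the (possibly tracer-modified) syscall number is mine.

```c
#include <assert.h>
#include <stdbool.h>

#define RET_SKIP_SYSCALL	(-1)	/* values from the ptrace.h hunk */
#define RET_SKIP_SYSCALL_TRACE	(-2)

/* Which label the model branches to after syscall_trace_enter. */
enum model_path {
	MODEL_RUN_SYSCALL,		/* continue to the syscall handler */
	MODEL_TRACE_RETURN_SKIPPED,	/* b.eq __sys_trace_return_skipped */
	MODEL_RET_TO_USER,		/* b.eq ret_to_user */
};

struct model_task {
	bool seccomp_denies;	/* stands in for secure_computing() == -1 */
	bool tracer_skips;	/* tracer rewrote the syscall number to -1 */
	int  syscallno;
	int  trace_reports;	/* number of entry-trace reports made */
};

static int model_syscall_trace_enter(struct model_task *t)
{
	assert(t);
	/* The seccomp check comes first, so a denial never reaches tracing. */
	if (t->seccomp_denies)
		return RET_SKIP_SYSCALL_TRACE;

	t->trace_reports++;	/* models tracehook_report_syscall(ENTER) */
	if (t->tracer_skips)
		t->syscallno = RET_SKIP_SYSCALL;
	return t->syscallno;
}

/* Models the two compare-and-branch pairs in __sys_trace. */
static enum model_path model_sys_trace(struct model_task *t)
{
	int ret = model_syscall_trace_enter(t);

	if (ret == RET_SKIP_SYSCALL_TRACE)
		return MODEL_RET_TO_USER;
	if (ret == RET_SKIP_SYSCALL)
		return MODEL_TRACE_RETURN_SKIPPED;
	return MODEL_RUN_SYSCALL;
}
```

The point of the separate -2 value is visible in the model: a seccomp denial returns before any trace report is made, while a tracer-requested skip still goes through the traced path.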