# Sorry for updating the patch so slowly.
This patch series enables secure computing (system call filtering) on arm64, and contains related enhancements and bug fixes.
NOTE: This versions contain a workaround against possible BUG_ON() failure at audit_syscall_exit(), but doesn't contain an extra optimization, as I submitted for arm, of excluding syscall enter/exit tracing against invalid system calls due to an issue that I reported in: http://lists.infradead.org/pipermail/linux-arm-kernel/2014-October/292170.ht...
The code was tested on ARMv8 fast model with 64-bit/32-bit userspace using: * libseccomp v2.1.1 with modifications for arm64, especially its "live" tests: No.20, 21 and 24. * modified version of Kees' seccomp test for 'changing/skipping a syscall' and seccomp() system call * in-house tests for 'changing/skipping a system call' by tracing with ptrace(PTRACE_SYSCALL) (that is, not via seccomp filter)' with and without audit tracing.
Changes v6 -> v7: * simplified the condition of checking for user-issued syscall(-1) at syscall_trace_enter() [2/6] * defines __NR_seccomp_sigreturn only if arch-specific def doesn't exist. As Kees suggests, this is necessary for x86 and others. [3/6] * removed "#ifdef __ARCH_SIGSYS" which is always true on arm64. [5/6] * changed to call syscall_trace_exit() even if secure_computing fails. [6/6] In v6, syscall_trace_enter() returns RET_SYSCALL_SKIP_TRACE (== -2) and skips syscall_trace_exit() to minimize the overhead, but this case can be easily confused with user-issued (and invalid) syscall(-2). Anyway, this is now a consistent behavior with arm and other archs.
Changes v5 -> v6: * rebased to v3.17-rc * changed the interface of changing/skipping a system call from re-writing x8 register [v5 1/3] to using dedicated PTRACE_SET_SYSCALL command [1/6, 2/6] Patch [1/6] contains a checkpatch error around a switch statement, but it won't be fixed as in compat_arch_ptrace(). * added a new system call, seccomp(), for compat task [4/6] * added SIGSYS siginfo for compat task [5/6] * changed to always execute audit exit tracing to avoid OOPs [2/6, 6/6]
Changes v4 -> v5: * rebased to v3.16-rc * add patch [1/3] to allow ptrace to change a system call (please note that this patch should be applied even without seccomp.)
Changes v3 -> v4: * removed the following patch and moved it to "arm64: prerequisites for audit and ftrace" patchset since it is required for audit and ftrace in case of !COMPAT, too. "arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h"
Changes v2 -> v3: * removed unnecessary 'type cast' operations [2/3] * check for a return value (-1) of secure_computing() explicitly [2/3] * aligned with the patch, "arm64: split syscall_trace() into separate functions for enter/exit" [2/3] * changed default of CONFIG_SECCOMP to n [2/3]
Changes v1 -> v2: * added generic seccomp.h for arm64 to utilize it [1,2/3] * changed syscall_trace() to return more meaningful value (-EPERM) on seccomp failure case [2/3] * aligned with the change in "arm64: make a single hook to syscall_trace() for all syscall features" v2 [2/3] * removed is_compat_task() definition from compat.h [3/3]
AKASHI Takahiro (6): arm64: ptrace: add PTRACE_SET_SYSCALL arm64: ptrace: allow tracer to skip a system call asm-generic: add generic seccomp.h for secure computing mode 1 arm64: add seccomp syscall for compat task arm64: add SIGSYS siginfo for compat task arm64: add seccomp support
arch/arm64/Kconfig | 14 ++++++++++++ arch/arm64/include/asm/compat.h | 7 ++++++ arch/arm64/include/asm/ptrace.h | 8 +++++++ arch/arm64/include/asm/seccomp.h | 25 ++++++++++++++++++++ arch/arm64/include/asm/unistd.h | 3 +++ arch/arm64/include/asm/unistd32.h | 3 ++- arch/arm64/include/uapi/asm/ptrace.h | 1 + arch/arm64/kernel/entry.S | 4 ++++ arch/arm64/kernel/ptrace.c | 42 ++++++++++++++++++++++++++++++++-- arch/arm64/kernel/signal32.c | 6 +++++ include/asm-generic/seccomp.h | 30 ++++++++++++++++++++++++ 11 files changed, 140 insertions(+), 3 deletions(-) create mode 100644 arch/arm64/include/asm/seccomp.h create mode 100644 include/asm-generic/seccomp.h
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org --- arch/arm64/include/uapi/asm/ptrace.h | 1 + arch/arm64/kernel/ptrace.c | 14 +++++++++++++- 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h index 6913643..49c6174 100644 --- a/arch/arm64/include/uapi/asm/ptrace.h +++ b/arch/arm64/include/uapi/asm/ptrace.h @@ -23,6 +23,7 @@
#include <asm/hwcap.h>
+#define PTRACE_SET_SYSCALL 23
/* * PSR bits diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index fe63ac5..2842f9f 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1082,7 +1082,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task) long arch_ptrace(struct task_struct *child, long request, unsigned long addr, unsigned long data) { - return ptrace_request(child, request, addr, data); + int ret; + + switch (request) { + case PTRACE_SET_SYSCALL: + task_pt_regs(child)->syscallno = data; + ret = 0; + break; + default: + ret = ptrace_request(child, request, addr, data); + break; + } + + return ret; }
enum ptrace_syscall_dir {
Hi Akashi,
On Thu, Oct 02, 2014 at 10:46:11AM +0100, AKASHI Takahiro wrote:
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/uapi/asm/ptrace.h | 1 + arch/arm64/kernel/ptrace.c | 14 +++++++++++++- 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h index 6913643..49c6174 100644 --- a/arch/arm64/include/uapi/asm/ptrace.h +++ b/arch/arm64/include/uapi/asm/ptrace.h @@ -23,6 +23,7 @@ #include <asm/hwcap.h> +#define PTRACE_SET_SYSCALL 23 /*
- PSR bits
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index fe63ac5..2842f9f 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1082,7 +1082,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task) long arch_ptrace(struct task_struct *child, long request, unsigned long addr, unsigned long data) {
- return ptrace_request(child, request, addr, data);
- int ret;
- switch (request) {
case PTRACE_SET_SYSCALL:
task_pt_regs(child)->syscallno = data;
ret = 0;
break;
default:
ret = ptrace_request(child, request, addr, data);
break;
- }
- return ret;
}
I still don't understand why this needs to be in arch-specific code. Can't we implement this in generic code and get architectures to implement something like syscall_set_nr if they want the generic interface?
Will
On Wed, Oct 8, 2014 at 9:13 AM, Will Deacon will.deacon@arm.com wrote:
Hi Akashi,
On Thu, Oct 02, 2014 at 10:46:11AM +0100, AKASHI Takahiro wrote:
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/uapi/asm/ptrace.h | 1 + arch/arm64/kernel/ptrace.c | 14 +++++++++++++- 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h index 6913643..49c6174 100644 --- a/arch/arm64/include/uapi/asm/ptrace.h +++ b/arch/arm64/include/uapi/asm/ptrace.h @@ -23,6 +23,7 @@
#include <asm/hwcap.h>
+#define PTRACE_SET_SYSCALL 23
/*
- PSR bits
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index fe63ac5..2842f9f 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1082,7 +1082,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task) long arch_ptrace(struct task_struct *child, long request, unsigned long addr, unsigned long data) {
return ptrace_request(child, request, addr, data);
int ret;
switch (request) {
case PTRACE_SET_SYSCALL:
task_pt_regs(child)->syscallno = data;
ret = 0;
break;
default:
ret = ptrace_request(child, request, addr, data);
break;
}
return ret;
}
I still don't understand why this needs to be in arch-specific code. Can't we implement this in generic code and get architectures to implement something like syscall_set_nr if they want the generic interface?
Personally, I'd rather see this land as-is in the arm64 tree, and then later optimize PTRACE_SET_SYSCALL out of arm/ and arm64/, especially since only these architectures implement this at the moment.
This is my plan for the asm-generic seccomp.h too -- I'd rather avoid touching other architectures in this series, as it's easier to review this way. Then we can optimize the code in a separate series, which will have those changes isolated, etc.
-Kees
On 10/09/2014 12:30 AM, Kees Cook wrote:
On Wed, Oct 8, 2014 at 9:13 AM, Will Deacon will.deacon@arm.com wrote:
Hi Akashi,
On Thu, Oct 02, 2014 at 10:46:11AM +0100, AKASHI Takahiro wrote:
To allow tracer to be able to change/skip a system call by re-writing a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case later on in syscall_trace_enter(), or (2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to tracer as well as that secure_computing() expects a changed syscall number to be visible, especially case of -1, before this function returns in syscall_trace_enter(), we'd better take (2).
Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/uapi/asm/ptrace.h | 1 + arch/arm64/kernel/ptrace.c | 14 +++++++++++++- 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h index 6913643..49c6174 100644 --- a/arch/arm64/include/uapi/asm/ptrace.h +++ b/arch/arm64/include/uapi/asm/ptrace.h @@ -23,6 +23,7 @@
#include <asm/hwcap.h>
+#define PTRACE_SET_SYSCALL 23
/*
- PSR bits
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index fe63ac5..2842f9f 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1082,7 +1082,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task) long arch_ptrace(struct task_struct *child, long request, unsigned long addr, unsigned long data) {
return ptrace_request(child, request, addr, data);
int ret;
switch (request) {
case PTRACE_SET_SYSCALL:
task_pt_regs(child)->syscallno = data;
ret = 0;
break;
default:
ret = ptrace_request(child, request, addr, data);
break;
}
}return ret;
I still don't understand why this needs to be in arch-specific code. Can't we implement this in generic code and get architectures to implement something like syscall_set_nr if they want the generic interface?
Personally, I'd rather see this land as-is in the arm64 tree, and then later optimize PTRACE_SET_SYSCALL out of arm/ and arm64/, especially since only these architectures implement this at the moment.
+1 :)
-Takahiro AKASHI
This is my plan for the asm-generic seccomp.h too -- I'd rather avoid touching other architectures in this series, as it's easier to review this way. Then we can optimize the code in a separate series, which will have those changes isolated, etc.
-Kees
On Wed, Oct 08, 2014 at 04:30:18PM +0100, Kees Cook wrote:
On Wed, Oct 8, 2014 at 9:13 AM, Will Deacon will.deacon@arm.com wrote:
On Thu, Oct 02, 2014 at 10:46:11AM +0100, AKASHI Takahiro wrote:
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index fe63ac5..2842f9f 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1082,7 +1082,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task) long arch_ptrace(struct task_struct *child, long request, unsigned long addr, unsigned long data) {
return ptrace_request(child, request, addr, data);
int ret;
switch (request) {
case PTRACE_SET_SYSCALL:
task_pt_regs(child)->syscallno = data;
ret = 0;
break;
default:
ret = ptrace_request(child, request, addr, data);
break;
}
return ret;
}
I still don't understand why this needs to be in arch-specific code. Can't we implement this in generic code and get architectures to implement something like syscall_set_nr if they want the generic interface?
Personally, I'd rather see this land as-is in the arm64 tree, and then later optimize PTRACE_SET_SYSCALL out of arm/ and arm64/, especially since only these architectures implement this at the moment.
Why? It should be really straightforward to do this in core code from the get-go and experience shows that, if we don't do it now, it will never happen.
This is my plan for the asm-generic seccomp.h too -- I'd rather avoid touching other architectures in this series, as it's easier to review this way. Then we can optimize the code in a separate series, which will have those changes isolated, etc.
But this doesn't need to touch any other architectures...
Will
Hi Will, Kees
#Sorry for this late ping,
On 10/09/2014 06:23 PM, Will Deacon wrote:
On Wed, Oct 08, 2014 at 04:30:18PM +0100, Kees Cook wrote:
On Wed, Oct 8, 2014 at 9:13 AM, Will Deacon will.deacon@arm.com wrote:
On Thu, Oct 02, 2014 at 10:46:11AM +0100, AKASHI Takahiro wrote:
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index fe63ac5..2842f9f 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1082,7 +1082,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task) long arch_ptrace(struct task_struct *child, long request, unsigned long addr, unsigned long data) {
return ptrace_request(child, request, addr, data);
int ret;
switch (request) {
case PTRACE_SET_SYSCALL:
task_pt_regs(child)->syscallno = data;
ret = 0;
break;
default:
ret = ptrace_request(child, request, addr, data);
break;
}
}return ret;
I still don't understand why this needs to be in arch-specific code. Can't we implement this in generic code and get architectures to implement something like syscall_set_nr if they want the generic interface?
Personally, I'd rather see this land as-is in the arm64 tree, and then later optimize PTRACE_SET_SYSCALL out of arm/ and arm64/, especially since only these architectures implement this at the moment.
Why? It should be really straightforward to do this in core code from the get-go and experience shows that, if we don't do it now, it will never happen.
How should I deal with this issue? I would be able to go either way.
Other than that, I will submit v8 patch series with a few very minor updates: - use compat_uint_t in struct compat_siginfo - use a new call interface of secure_computing(void) - modify and clarify comments in syscall_trace_enter()
Thanks, -Takahiro AKASHI
This is my plan for the asm-generic seccomp.h too -- I'd rather avoid touching other architectures in this series, as it's easier to review this way. Then we can optimize the code in a separate series, which will have those changes isolated, etc.
But this doesn't need to touch any other architectures...
Will
On Wed, Nov 5, 2014 at 6:40 PM, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
Hi Will, Kees
#Sorry for this late ping,
On 10/09/2014 06:23 PM, Will Deacon wrote:
On Wed, Oct 08, 2014 at 04:30:18PM +0100, Kees Cook wrote:
On Wed, Oct 8, 2014 at 9:13 AM, Will Deacon will.deacon@arm.com wrote:
On Thu, Oct 02, 2014 at 10:46:11AM +0100, AKASHI Takahiro wrote:
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index fe63ac5..2842f9f 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1082,7 +1082,19 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task) long arch_ptrace(struct task_struct *child, long request, unsigned long addr, unsigned long data) {
return ptrace_request(child, request, addr, data);
int ret;
switch (request) {
case PTRACE_SET_SYSCALL:
task_pt_regs(child)->syscallno = data;
ret = 0;
break;
default:
ret = ptrace_request(child, request, addr, data);
break;
}
}return ret;
I still don't understand why this needs to be in arch-specific code. Can't we implement this in generic code and get architectures to implement something like syscall_set_nr if they want the generic interface?
Personally, I'd rather see this land as-is in the arm64 tree, and then later optimize PTRACE_SET_SYSCALL out of arm/ and arm64/, especially since only these architectures implement this at the moment.
Why? It should be really straightforward to do this in core code from the get-go and experience shows that, if we don't do it now, it will never happen.
How should I deal with this issue? I would be able to go either way.
It sounds like Will would be happiest with PTRACE_SET_SYSCALL being extracted from arm/ and arm64/ so I'd recommend doing that. It could maybe be its own patch series, too.
Other than that, I will submit v8 patch series with a few very minor updates:
- use compat_uint_t in struct compat_siginfo
- use a new call interface of secure_computing(void)
- modify and clarify comments in syscall_trace_enter()
Sounds great, thank you!
-Kees
Thanks, -Takahiro AKASHI
This is my plan for the asm-generic seccomp.h too -- I'd rather avoid touching other architectures in this series, as it's easier to review this way. Then we can optimize the code in a separate series, which will have those changes isolated, etc.
But this doesn't need to touch any other architectures...
Will
If tracer specifies -1 as a syscall number, this traced system call should be skipped with a value in x0 used as a return value. This patch implements this semantics, but there is one restriction here:
when syscall(-1) is issued by user, tracer cannot skip this system call and modify a return value at syscall entry.
In order to ease this flavor, we need to take whatever value x0 has as a return value, but this might result in a bogus value being returned, especially when tracer doesn't do anything against this syscall. So we always return ENOSYS instead, while we still have another chance to change a return value at syscall exit.
Please also note: * syscall entry tracing and syscall exit tracing (ftrace tracepoint and audit) are always executed, if enabled, even when skipping a system call (that is, -1). In this way, we can avoid a potential bug where audit_syscall_entry() might be called without audit_syscall_exit() at the previous system call being called, that would cause OOPs in audit_syscall_entry().
* syscallno may also be set to -1 if a fatal signal (SIGKILL) is detected in tracehook_report_syscall_entry(), but since a value set to x0 (ENOSYS) is not used in this case, we may neglect the case.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org --- arch/arm64/include/asm/ptrace.h | 8 ++++++++ arch/arm64/kernel/entry.S | 4 ++++ arch/arm64/kernel/ptrace.c | 23 ++++++++++++++++++++++- 3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index 41ed9e1..736ebc3 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -65,6 +65,14 @@ #define COMPAT_PT_TEXT_ADDR 0x10000 #define COMPAT_PT_DATA_ADDR 0x10004 #define COMPAT_PT_TEXT_END_ADDR 0x10008 + +/* + * System call will be skipped if a syscall number is changed to -1 + * with ptrace(PTRACE_SET_SYSCALL). + * Upper 32-bit should be ignored for safe check. + */ +#define IS_SKIP_SYSCALL(no) ((int)(no & 0xffffffff) == -1) + #ifndef __ASSEMBLY__
/* sizeof(struct user) for AArch32 */ diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index f0b5e51..b53a1c5 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -25,6 +25,7 @@ #include <asm/asm-offsets.h> #include <asm/errno.h> #include <asm/esr.h> +#include <asm/ptrace.h> #include <asm/thread_info.h> #include <asm/unistd.h>
@@ -671,6 +672,8 @@ ENDPROC(el0_svc) __sys_trace: mov x0, sp bl syscall_trace_enter + cmp w0, #-1 // skip the syscall? + b.eq __sys_trace_return_skipped adr lr, __sys_trace_return // return address uxtw scno, w0 // syscall number (possibly new) mov x1, sp // pointer to regs @@ -685,6 +688,7 @@ __sys_trace:
__sys_trace_return: str x0, [sp] // save returned x0 +__sys_trace_return_skipped: mov x0, sp bl syscall_trace_exit b ret_to_user diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index 2842f9f..6b11c6a 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1126,6 +1126,8 @@ static void tracehook_report_syscall(struct pt_regs *regs,
asmlinkage int syscall_trace_enter(struct pt_regs *regs) { + unsigned int orig_syscallno = regs->syscallno; + if (test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
@@ -1133,7 +1135,26 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs) trace_sys_enter(regs, regs->syscallno);
audit_syscall_entry(syscall_get_arch(), regs->syscallno, - regs->orig_x0, regs->regs[1], regs->regs[2], regs->regs[3]); + regs->orig_x0, regs->regs[1], + regs->regs[2], regs->regs[3]); + + if (IS_SKIP_SYSCALL(regs->syscallno) && + IS_SKIP_SYSCALL(orig_syscallno)) { + /* + * For compatibility, we handles user-issued syscall(-1). + * + * RESTRICTION: we can't modify a return value here in this + * specific case. In order to ease this flavor, we have to + * take whatever value x0 has as a return value, but this + * might result in a bogus value being returned. + * + * NOTE: syscallno may also be set to -1 if fatal signal + * is detected in tracehook_report_syscall(ENTRY), + * but since a value set to x0 here is not used in this + * case, we may neglect the case. + */ + regs->regs[0] = -ENOSYS; + }
return regs->syscallno; }
On Thu, Oct 02, 2014 at 10:46:12AM +0100, AKASHI Takahiro wrote:
If tracer specifies -1 as a syscall number, this traced system call should be skipped with a value in x0 used as a return value. This patch implements this semantics, but there is one restriction here:
when syscall(-1) is issued by user, tracer cannot skip this system call and modify a return value at syscall entry.
In order to ease this flavor, we need to take whatever value x0 has as a return value, but this might result in a bogus value being returned, especially when tracer doesn't do anything against this syscall. So we always return ENOSYS instead, while we still have another chance to change a return value at syscall exit.
Please also note:
syscall entry tracing and syscall exit tracing (ftrace tracepoint and audit) are always executed, if enabled, even when skipping a system call (that is, -1). In this way, we can avoid a potential bug where audit_syscall_entry() might be called without audit_syscall_exit() at the previous system call being called, that would cause OOPs in audit_syscall_entry().
syscallno may also be set to -1 if a fatal signal (SIGKILL) is detected in tracehook_report_syscall_entry(), but since a value set to x0 (ENOSYS) is not used in this case, we may neglect the case.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/asm/ptrace.h | 8 ++++++++ arch/arm64/kernel/entry.S | 4 ++++ arch/arm64/kernel/ptrace.c | 23 ++++++++++++++++++++++- 3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index 41ed9e1..736ebc3 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -65,6 +65,14 @@ #define COMPAT_PT_TEXT_ADDR 0x10000 #define COMPAT_PT_DATA_ADDR 0x10004 #define COMPAT_PT_TEXT_END_ADDR 0x10008
+/*
- System call will be skipped if a syscall number is changed to -1
- with ptrace(PTRACE_SET_SYSCALL).
- Upper 32-bit should be ignored for safe check.
- */
+#define IS_SKIP_SYSCALL(no) ((int)(no & 0xffffffff) == -1)
I don't think this macro is very useful, especially considering that we already use ~0UL explicitly in other places. Just move the comment into syscall_trace_enter and be done with it. I also don't think you need the mask (the cast is enough).
#ifndef __ASSEMBLY__ /* sizeof(struct user) for AArch32 */ diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index f0b5e51..b53a1c5 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -25,6 +25,7 @@ #include <asm/asm-offsets.h> #include <asm/errno.h> #include <asm/esr.h> +#include <asm/ptrace.h> #include <asm/thread_info.h> #include <asm/unistd.h> @@ -671,6 +672,8 @@ ENDPROC(el0_svc) __sys_trace: mov x0, sp bl syscall_trace_enter
- cmp w0, #-1 // skip the syscall?
- b.eq __sys_trace_return_skipped adr lr, __sys_trace_return // return address uxtw scno, w0 // syscall number (possibly new) mov x1, sp // pointer to regs
@@ -685,6 +688,7 @@ __sys_trace: __sys_trace_return: str x0, [sp] // save returned x0 +__sys_trace_return_skipped: mov x0, sp bl syscall_trace_exit b ret_to_user diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index 2842f9f..6b11c6a 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1126,6 +1126,8 @@ static void tracehook_report_syscall(struct pt_regs *regs, asmlinkage int syscall_trace_enter(struct pt_regs *regs) {
- unsigned int orig_syscallno = regs->syscallno;
- if (test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
@@ -1133,7 +1135,26 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs) trace_sys_enter(regs, regs->syscallno); audit_syscall_entry(syscall_get_arch(), regs->syscallno,
regs->orig_x0, regs->regs[1], regs->regs[2], regs->regs[3]);
regs->orig_x0, regs->regs[1],
regs->regs[2], regs->regs[3]);
- if (IS_SKIP_SYSCALL(regs->syscallno) &&
IS_SKIP_SYSCALL(orig_syscallno)) {
/*
* For compatibility, we handles user-issued syscall(-1).
Compatibility with what? arch/arm/?
*
* RESTRICTION: we can't modify a return value here in this
* specific case. In order to ease this flavor, we have to
* take whatever value x0 has as a return value, but this
* might result in a bogus value being returned.
This comment isn't helping me. Are we returning a bogus value or not? If so, why is that acceptable?
* NOTE: syscallno may also be set to -1 if fatal signal
* is detected in tracehook_report_syscall(ENTRY),
* but since a value set to x0 here is not used in this
* case, we may neglect the case.
*/
I think can you remove thise NOTE, it's not very informative.
Will
On 10/08/2014 11:23 PM, Will Deacon wrote:
On Thu, Oct 02, 2014 at 10:46:12AM +0100, AKASHI Takahiro wrote:
If tracer specifies -1 as a syscall number, this traced system call should be skipped with a value in x0 used as a return value. This patch implements this semantics, but there is one restriction here:
when syscall(-1) is issued by user, tracer cannot skip this system call and modify a return value at syscall entry.
In order to ease this flavor, we need to take whatever value x0 has as a return value, but this might result in a bogus value being returned, especially when tracer doesn't do anything against this syscall. So we always return ENOSYS instead, while we still have another chance to change a return value at syscall exit.
Please also note:
syscall entry tracing and syscall exit tracing (ftrace tracepoint and audit) are always executed, if enabled, even when skipping a system call (that is, -1). In this way, we can avoid a potential bug where audit_syscall_entry() might be called without audit_syscall_exit() at the previous system call being called, that would cause OOPs in audit_syscall_entry().
syscallno may also be set to -1 if a fatal signal (SIGKILL) is detected in tracehook_report_syscall_entry(), but since a value set to x0 (ENOSYS) is not used in this case, we may neglect the case.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/asm/ptrace.h | 8 ++++++++ arch/arm64/kernel/entry.S | 4 ++++ arch/arm64/kernel/ptrace.c | 23 ++++++++++++++++++++++- 3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index 41ed9e1..736ebc3 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -65,6 +65,14 @@ #define COMPAT_PT_TEXT_ADDR 0x10000 #define COMPAT_PT_DATA_ADDR 0x10004 #define COMPAT_PT_TEXT_END_ADDR 0x10008
+/*
- System call will be skipped if a syscall number is changed to -1
- with ptrace(PTRACE_SET_SYSCALL).
- Upper 32-bit should be ignored for safe check.
- */
+#define IS_SKIP_SYSCALL(no) ((int)(no & 0xffffffff) == -1)
I don't think this macro is very useful, especially considering that we already use ~0UL explicitly in other places. Just move the comment into syscall_trace_enter and be done with it. I also don't think you need the mask (the cast is enough).
I remember it was necessary for compat PTRACE_SET_SYSCALL, but will double-check it anyway.
#ifndef __ASSEMBLY__
/* sizeof(struct user) for AArch32 */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index f0b5e51..b53a1c5 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -25,6 +25,7 @@ #include <asm/asm-offsets.h> #include <asm/errno.h> #include <asm/esr.h> +#include <asm/ptrace.h> #include <asm/thread_info.h> #include <asm/unistd.h>
@@ -671,6 +672,8 @@ ENDPROC(el0_svc) __sys_trace: mov x0, sp bl syscall_trace_enter
- cmp w0, #-1 // skip the syscall?
- b.eq __sys_trace_return_skipped adr lr, __sys_trace_return // return address uxtw scno, w0 // syscall number (possibly new) mov x1, sp // pointer to regs
@@ -685,6 +688,7 @@ __sys_trace:
__sys_trace_return: str x0, [sp] // save returned x0 +__sys_trace_return_skipped: mov x0, sp bl syscall_trace_exit b ret_to_user diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index 2842f9f..6b11c6a 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1126,6 +1126,8 @@ static void tracehook_report_syscall(struct pt_regs *regs,
asmlinkage int syscall_trace_enter(struct pt_regs *regs) {
- unsigned int orig_syscallno = regs->syscallno;
- if (test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
@@ -1133,7 +1135,26 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs) trace_sys_enter(regs, regs->syscallno);
audit_syscall_entry(syscall_get_arch(), regs->syscallno,
regs->orig_x0, regs->regs[1], regs->regs[2], regs->regs[3]);
regs->orig_x0, regs->regs[1],
regs->regs[2], regs->regs[3]);
- if (IS_SKIP_SYSCALL(regs->syscallno) &&
IS_SKIP_SYSCALL(orig_syscallno)) {
/*
* For compatibility, we handles user-issued syscall(-1).
Compatibility with what? arch/arm/?
with the case where a process is *not* traced (including audit).
*
* RESTRICTION: we can't modify a return value here in this
* specific case. In order to ease this flavor, we have to
* take whatever value x0 has as a return value, but this
* might result in a bogus value being returned.
This comment isn't helping me. Are we returning a bogus value or not? If so, why is that acceptable?
I mean that syscall(-1) always returns -1 with ENOSYS.
Let's think about the case that we didn't have this 'if' statement. If a debugger catches an user-issued syscall(-1), but let it go without doing anything (especially changing a value in x0), this syscall will return an original value in x0, which is the first argument of syscall(-1). I mentioned this as "bogus." In this way, a traced process would see a different behavior of syscall(-1). (On arm, this doesn't happen because syscall(-1) is supposed to raise SIGILL.) (On x86, this doesn't happen, probably, because syscall arguments are passed via a stack and we can set a default return value in a register to ENOSYS.)
To avoid this incompatibility, there is no way but to always return -1 in this path because the kernel doesn't know whether a debugger let x0 unchanged on purpose or not. This is also the reason why I wanted to have a dedicated ptrace command to set a return value in skipping a system call.
If we don't care about such erroneous (and exceptional) behaviors, we don't need this 'if' statement.
Did I make it clear?
* NOTE: syscallno may also be set to -1 if fatal signal
* is detected in tracehook_report_syscall(ENTRY),
* but since a value set to x0 here is not used in this
* case, we may neglect the case.
*/
I think can you remove thise NOTE, it's not very informative.
Okey. Also remove descriptions from a commit message.
-Takahiro AKASHI
Will
On Thu, Oct 09, 2014 at 05:29:33AM +0100, AKASHI Takahiro wrote:
On 10/08/2014 11:23 PM, Will Deacon wrote:
On Thu, Oct 02, 2014 at 10:46:12AM +0100, AKASHI Takahiro wrote:
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index 41ed9e1..736ebc3 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -65,6 +65,14 @@ #define COMPAT_PT_TEXT_ADDR 0x10000 #define COMPAT_PT_DATA_ADDR 0x10004 #define COMPAT_PT_TEXT_END_ADDR 0x10008
+/*
- System call will be skipped if a syscall number is changed to -1
- with ptrace(PTRACE_SET_SYSCALL).
- Upper 32-bit should be ignored for safe check.
- */
+#define IS_SKIP_SYSCALL(no) ((int)(no & 0xffffffff) == -1)
I don't think this macro is very useful, especially considering that we already use ~0UL explicitly in other places. Just move the comment into syscall_trace_enter and be done with it. I also don't think you need the mask (the cast is enough).
I remember it was necessary for compat PTRACE_SET_SYSCALL, but will double-check it anyway.
Ok, thanks.
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index 2842f9f..6b11c6a 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -1126,6 +1126,8 @@ static void tracehook_report_syscall(struct pt_regs *regs,
asmlinkage int syscall_trace_enter(struct pt_regs *regs) {
- unsigned int orig_syscallno = regs->syscallno;
- if (test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
@@ -1133,7 +1135,26 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs) trace_sys_enter(regs, regs->syscallno);
audit_syscall_entry(syscall_get_arch(), regs->syscallno,
regs->orig_x0, regs->regs[1], regs->regs[2], regs->regs[3]);
regs->orig_x0, regs->regs[1],
regs->regs[2], regs->regs[3]);
- if (IS_SKIP_SYSCALL(regs->syscallno) &&
IS_SKIP_SYSCALL(orig_syscallno)) {
/*
* For compatibility, we handles user-issued syscall(-1).
Compatibility with what? arch/arm/?
with the case where a process is *not* traced (including audit).
Ok, please make that explicit in the comment.
*
* RESTRICTION: we can't modify a return value here in this
* specific case. In order to ease this flavor, we have to
* take whatever value x0 has as a return value, but this
* might result in a bogus value being returned.
This comment isn't helping me. Are we returning a bogus value or not? If so, why is that acceptable?
I mean that syscall(-1) always returns -1 with ENOSYS.
Let's think about the case that we didn't have this 'if' statement. If a debugger catches an user-issued syscall(-1), but let it go without doing anything (especially changing a value in x0), this syscall will return an original value in x0, which is the first argument of syscall(-1). I mentioned this as "bogus." In this way, a traced process would see a different behavior of syscall(-1). (On arm, this doesn't happen because syscall(-1) is supposed to raise SIGILL.) (On x86, this doesn't happen, probably, because syscall arguments are passed via a stack and we can set a default return value in a register to ENOSYS.)
In which case, it's worth mentioning this in the comment and being explicit that this only applies to -1, as that's the same value we use to indicate that the syscall should be skipped. syscall(-2), for example, doesn't have an issue and will always return -ENOSYS.
Will
Those values (__NR_seccomp_*) are used solely in secure_computing() to identify mode 1 system calls. If compat system calls have different syscall numbers, asm/seccomp.h may override them.
Acked-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org --- include/asm-generic/seccomp.h | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 include/asm-generic/seccomp.h
diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h new file mode 100644 index 0000000..9fa1f65 --- /dev/null +++ b/include/asm-generic/seccomp.h @@ -0,0 +1,30 @@ +/* + * include/asm-generic/seccomp.h + * + * Copyright (C) 2014 Linaro Limited + * Author: AKASHI Takahiro takahiro.akashi@linaro.org + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#ifndef _ASM_GENERIC_SECCOMP_H +#define _ASM_GENERIC_SECCOMP_H + +#include <linux/unistd.h> + +#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32) +#define __NR_seccomp_read_32 __NR_read +#define __NR_seccomp_write_32 __NR_write +#define __NR_seccomp_exit_32 __NR_exit +#define __NR_seccomp_sigreturn_32 __NR_rt_sigreturn +#endif /* CONFIG_COMPAT && ! already defined */ + +#define __NR_seccomp_read __NR_read +#define __NR_seccomp_write __NR_write +#define __NR_seccomp_exit __NR_exit +#ifndef __NR_seccomp_sigreturn +#define __NR_seccomp_sigreturn __NR_rt_sigreturn +#endif + +#endif /* _ASM_GENERIC_SECCOMP_H */
This patch allows compat task to issue seccomp() system call.
Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org --- arch/arm64/include/asm/unistd32.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index da1f06b..812f192 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -787,7 +787,8 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr) __SYSCALL(__NR_sched_getattr, sys_sched_getattr) #define __NR_renameat2 382 __SYSCALL(__NR_renameat2, sys_renameat2) - /* 383 for seccomp */ +#define __NR_seccomp 383 +__SYSCALL(__NR_seccomp, sys_seccomp) #define __NR_getrandom 384 __SYSCALL(__NR_getrandom, sys_getrandom) #define __NR_memfd_create 385
SIGSYS is primarily used in secure computing to notify tracer. This patch allows signal handler on compat task to get correct information with SA_SIGINFO specified when this signal is delivered.
Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org --- arch/arm64/include/asm/compat.h | 7 +++++++ arch/arm64/kernel/signal32.c | 6 ++++++ 2 files changed, 13 insertions(+)
diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h index 253e33b..c877915 100644 --- a/arch/arm64/include/asm/compat.h +++ b/arch/arm64/include/asm/compat.h @@ -205,6 +205,13 @@ typedef struct compat_siginfo { compat_long_t _band; /* POLL_IN, POLL_OUT, POLL_MSG */ int _fd; } _sigpoll; + + /* SIGSYS */ + struct { + compat_uptr_t _call_addr; /* calling user insn */ + int _syscall; /* triggering system call number */ + unsigned int _arch; /* AUDIT_ARCH_* of syscall */ + } _sigsys; } _sifields; } compat_siginfo_t;
diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c index 1b9ad02..5a1ba6e 100644 --- a/arch/arm64/kernel/signal32.c +++ b/arch/arm64/kernel/signal32.c @@ -186,6 +186,12 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from) err |= __put_user(from->si_uid, &to->si_uid); err |= __put_user((compat_uptr_t)(unsigned long)from->si_ptr, &to->si_ptr); break; + case __SI_SYS: + err |= __put_user((compat_uptr_t)(unsigned long) + from->si_call_addr, &to->si_call_addr); + err |= __put_user(from->si_syscall, &to->si_syscall); + err |= __put_user(from->si_arch, &to->si_arch); + break; default: /* this is just in case for now ... */ err |= __put_user(from->si_pid, &to->si_pid); err |= __put_user(from->si_uid, &to->si_uid);
On Thu, Oct 02, 2014 at 10:46:15AM +0100, AKASHI Takahiro wrote:
SIGSYS is primarily used in secure computing to notify tracer. This patch allows signal handler on compat task to get correct information with SA_SIGINFO specified when this signal is delivered.
Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/asm/compat.h | 7 +++++++ arch/arm64/kernel/signal32.c | 6 ++++++ 2 files changed, 13 insertions(+)
diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h index 253e33b..c877915 100644 --- a/arch/arm64/include/asm/compat.h +++ b/arch/arm64/include/asm/compat.h @@ -205,6 +205,13 @@ typedef struct compat_siginfo { compat_long_t _band; /* POLL_IN, POLL_OUT, POLL_MSG */ int _fd; } _sigpoll;
/* SIGSYS */
struct {
compat_uptr_t _call_addr; /* calling user insn */
int _syscall; /* triggering system call number */
unsigned int _arch; /* AUDIT_ARCH_* of syscall */
nit, but compat_uint_t looks better here (I have no idea why I didn't do this for the signed int types, but hey).
Will
On 10/08/2014 11:30 PM, Will Deacon wrote:
On Thu, Oct 02, 2014 at 10:46:15AM +0100, AKASHI Takahiro wrote:
SIGSYS is primarily used in secure computing to notify tracer. This patch allows signal handler on compat task to get correct information with SA_SIGINFO specified when this signal is delivered.
Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
arch/arm64/include/asm/compat.h | 7 +++++++ arch/arm64/kernel/signal32.c | 6 ++++++ 2 files changed, 13 insertions(+)
diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h index 253e33b..c877915 100644 --- a/arch/arm64/include/asm/compat.h +++ b/arch/arm64/include/asm/compat.h @@ -205,6 +205,13 @@ typedef struct compat_siginfo { compat_long_t _band; /* POLL_IN, POLL_OUT, POLL_MSG */ int _fd; } _sigpoll;
/* SIGSYS */
struct {
compat_uptr_t _call_addr; /* calling user insn */
int _syscall; /* triggering system call number */
unsigned int _arch; /* AUDIT_ARCH_* of syscall */
nit, but compat_uint_t looks better here (I have no idea why I didn't do this for the signed int types, but hey).
I will fix it.
-Takahiro AKASHI
Will
secure_computing() is called first in syscall_trace_enter() so that a system call will be aborted quickly without doing succeeding syscall tracing if seccomp rules want to deny that system call.
On compat task, syscall numbers for system calls allowed in seccomp mode 1 are different from those on normal tasks, and so _NR_seccomp_xxx_32's need to be redefined.
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org --- arch/arm64/Kconfig | 14 ++++++++++++++ arch/arm64/include/asm/seccomp.h | 25 +++++++++++++++++++++++++ arch/arm64/include/asm/unistd.h | 3 +++ arch/arm64/kernel/ptrace.c | 5 +++++ 4 files changed, 47 insertions(+) create mode 100644 arch/arm64/include/asm/seccomp.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index fd4e81a..d6dc436 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -34,6 +34,7 @@ config ARM64 select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_KGDB + select HAVE_ARCH_SECCOMP_FILTER select HAVE_ARCH_TRACEHOOK select HAVE_C_RECORDMCOUNT select HAVE_CC_STACKPROTECTOR @@ -312,6 +313,19 @@ config ARCH_HAS_CACHE_LINE_SIZE
source "mm/Kconfig"
+config SECCOMP + bool "Enable seccomp to safely compute untrusted bytecode" + ---help--- + This kernel feature is useful for number crunching applications + that may need to compute untrusted bytecode during their + execution. By using pipes or other transports made available to + the process as file descriptors supporting the read/write + syscalls, it's possible to isolate those applications in + their own address space using seccomp. Once seccomp is + enabled via prctl(PR_SET_SECCOMP), it cannot be disabled + and the task is only allowed to execute a few safe syscalls + defined by each seccomp mode. + config XEN_DOM0 def_bool y depends on XEN diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h new file mode 100644 index 0000000..c76fac9 --- /dev/null +++ b/arch/arm64/include/asm/seccomp.h @@ -0,0 +1,25 @@ +/* + * arch/arm64/include/asm/seccomp.h + * + * Copyright (C) 2014 Linaro Limited + * Author: AKASHI Takahiro takahiro.akashi@linaro.org + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#ifndef _ASM_SECCOMP_H +#define _ASM_SECCOMP_H + +#include <asm/unistd.h> + +#ifdef CONFIG_COMPAT +#define __NR_seccomp_read_32 __NR_compat_read +#define __NR_seccomp_write_32 __NR_compat_write +#define __NR_seccomp_exit_32 __NR_compat_exit +#define __NR_seccomp_sigreturn_32 __NR_compat_rt_sigreturn +#endif /* CONFIG_COMPAT */ + +#include <asm-generic/seccomp.h> + +#endif /* _ASM_SECCOMP_H */ diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 6d2bf41..49c9aef 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -31,6 +31,9 @@ * Compat syscall numbers used by the AArch64 kernel. */ #define __NR_compat_restart_syscall 0 +#define __NR_compat_exit 1 +#define __NR_compat_read 3 +#define __NR_compat_write 4 #define __NR_compat_sigreturn 119 #define __NR_compat_rt_sigreturn 173
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index 6b11c6a..7eef857 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -27,6 +27,7 @@ #include <linux/smp.h> #include <linux/ptrace.h> #include <linux/user.h> +#include <linux/seccomp.h> #include <linux/security.h> #include <linux/init.h> #include <linux/signal.h> @@ -1128,6 +1129,10 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs) { unsigned int orig_syscallno = regs->syscallno;
+ /* Do the secure computing check first; failures should be fast. */ + if (secure_computing(regs->syscallno) == -1) + return -1; + if (test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
linaro-kernel@lists.linaro.org