Re: [PATCH v6 6/9] kernel: entry: Support Syscall User Dispatch for common syscall entry

7 Sep 2020


On Mon, Sep 7, 2020 at 7:25 AM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
...
On Mon, Sep 07, 2020 at 07:15:52AM -0700, Andy Lutomirski wrote:
...
...
On Sep 7, 2020, at 3:15 AM, Christian Brauner <christian.brauner@ubuntu.com> wrote:
On Fri, Sep 04, 2020 at 04:31:44PM -0400, Gabriel Krisman Bertazi wrote:
...
Syscall User Dispatch (SUD) must take precedence over seccomp, since the
use case is emulation (it can be invoked with a different ABI) such that
seccomp filtering by syscall number doesn't make sense in the first
place.  In addition, either the syscall is dispatched back to userspace,
in which case there is no resource for seccomp to protect, or the
Tbh, I'm torn here. I'm not a super clever attacker but it feels to me
that this is still at least a clever way to circumvent a seccomp
sandbox.
If I were confined by a seccomp profile that would cause me to be
SIGKILLed when I try to do open(), I could prctl() myself to do user
dispatch to prevent that from happening, no?
Not really, I think. The idea is that you didn’t actually do open().
You did a SYSCALL instruction which meant something else, and the
syscall dispatch correctly prevented the kernel from misinterpreting
it as open().
Right, for the case where you're e.g. emulating windows syscalls that's
true. I was thinking when you're running natively on Linux: couldn't I
first load a seccomp profile "kill me if someone does an open()", then
I exec() the target binary and that binary is set up to do
prctl(USER_DISPATCH) first thing. I guess it's ok because, as far as I
had time to read it, this is an all-or-nothing mechanism, i.e. _all_
system calls are re-routed, in contrast to e.g. seccomp where I could do
this per-syscall. So for user-dispatch it wouldn't make sense to use it
on Linux per se. Still makes me a little uneasy. :)
There's an escape hatch, so processes using this can still make syscalls.
Maybe think about it another way: a process using user dispatch should
definitely *not* trigger seccomp user notifiers, errno returns, or
ptrace events, since they'll all do the wrong thing.  IMO RET_KILL is
the same.
Barring some very severe defect, there's no way a program can use user
dispatch to escape seccomp -- a program could use user dispatch to
allow them to do:
    mov $__NR_open, %rax
    syscall
without dying despite the presence of a filter that would kill the
process if it tried to do open(), but this doesn't bypass the filter
at all.  The process could just as easily have done:
    mov $__NR_open, %rax
    jmp magic_stub(%rip)
without tripping the filter, since no system call actually happens here.
--Andy
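
The ordering argued for above (user dispatch is consulted first, and seccomp only sees syscalls that user dispatch lets through) can be sketched as a small userspace model. This is only an illustration, not the patch's code: every name below is hypothetical, the seccomp filter is reduced to a single "kill on this number" rule, and the escape hatch is assumed to be a userspace-controlled selector plus an exempt code range, a detail this thread does not spell out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome labels for this model; not kernel identifiers. */
enum outcome {
    OUT_SIGSYS_TO_USER, /* user dispatch intercepted: handled in userspace */
    OUT_SECCOMP_KILL,   /* the seccomp kill rule matched */
    OUT_EXECUTED        /* the kernel ran the syscall */
};

/* Simplified per-task state. All fields are assumptions made for the sketch. */
struct task_model {
    bool      sud_enabled;     /* user dispatch was turned on */
    bool      selector_block;  /* assumed escape hatch: false lets syscalls through */
    uintptr_t allow_start;     /* assumed code range that is never intercepted */
    uintptr_t allow_len;
    long      seccomp_kill_nr; /* syscall number the filter kills on, -1 for none */
};

/* True when user dispatch would bounce this syscall back to userspace. */
bool sud_intercepts(const struct task_model *t, uintptr_t ip)
{
    if (!t->sud_enabled)
        return false;
    /* A syscall instruction inside the exempt range is never intercepted. */
    if (ip >= t->allow_start && ip - t->allow_start < t->allow_len)
        return false;
    return t->selector_block;
}

/* Entry-path model: user dispatch is checked before seccomp. */
enum outcome syscall_entry(const struct task_model *t, long nr, uintptr_t ip)
{
    /* Intercepted: nr may not even be a Linux syscall number, so no
     * Linux syscall happens and seccomp has nothing to judge. */
    if (sud_intercepts(t, ip))
        return OUT_SIGSYS_TO_USER;

    /* Anything that is really going to run still passes through seccomp. */
    if (nr == t->seccomp_kill_nr)
        return OUT_SECCOMP_KILL;

    return OUT_EXECUTED;
}
```

In this model, with dispatch enabled and the selector set to block, a syscall instruction carrying the filtered number from outside the exempt range yields OUT_SIGSYS_TO_USER, while the same number issued from inside the exempt range, or with the selector cleared, yields OUT_SECCOMP_KILL. That matches the point made above: the only requests that avoid the filter are the ones that never become syscalls.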
