On Wed, Aug 07, 2024 at 01:39:27PM +0100, Mark Brown wrote:
On Tue, Aug 06, 2024 at 10:08:44PM -0700, Kees Cook wrote:
On Tue, Aug 06, 2024 at 04:10:02PM +0100, Mark Brown wrote:
# Running test 'Shadow stack with no token'
It took me a while to figure out where a thread switches shstk (even without this series):
kernel_clone, copy_process, copy_thread, fpu_clone, update_fpu_shstk (and shstk_alloc_thread_stack is called just before update_fpu_shstk).
I don't understand the token consumption in arch_shstk_post_fork(). This wasn't needed before with the fixed-size new shstk, why is it needed now?
Concerns were raised in earlier rounds of review that, since we are now using a previously allocated shadow stack instead of allocating one as part of creating the new thread, someone could use this as part of an exploit. You could just jump on top of any existing shadow stack and cause writes to it.
Anyway, my attempt to trace the shstk changes for the test:
write(1, "TAP version 13\n", 15) = 15
write(1, "1..2\n", 5) = 5
clone3({flags=0, exit_signal=18446744073709551615, stack=NULL, stack_size=0}, 104) = -1 EINVAL (Invalid argument)
write(1, "# clone3() syscall supported\n", 29) = 29
map_shadow_stack(NULL, 4096, 0) = 125837480497152
write(1, "# Shadow stack supportd\n", 24) = 24
write(1, "# Running test 'Shadow stack wit"..., 44) = 44
getpid() = 4943
write(1, "# [4943] Trying clone3() with fl"..., 51) = 51
map_shadow_stack(NULL, 4096, 0) = 125837480488960
clone3({flags=CLONE_VM, exit_signal=SIGCHLD, stack=NULL, stack_size=0, /* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"} => {/* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"}, 104) = 4944
getpid() = 4943
write(1, "# I am the parent (4943). My chi"..., 49strace: Process 4944 attached
) = 49
[pid 4944] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_CPERR, si_addr=NULL} ---
[pid 4943] wait4(-1, <unfinished ...>
[pid 4944] +++ killed by SIGSEGV (core dumped) +++
So we created the thread, then before we get to the wait4() in the parent we start delivering a SEGV_CPERR to the child. The flow for the child is as expected.
<... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 4944
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=4944, si_uid=0, si_status=SIGSEGV, si_utime=0, si_stime=0} ---
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7272d21fffe8} ---
+++ killed by SIGSEGV (core dumped) +++
Then the parent gets an ordinary segfault, not a shadow stack specific one, like some memory got deallocated underneath it or a pointer got corrupted.
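For anyone reproducing this outside strace, the wait status shown above can be checked with the standard wait macros. This is only a minimal userspace sketch, not code from the selftest: child_term_signal(), die_by_sigkill() and exit_cleanly() are hypothetical helpers made up for illustration, and SIGKILL stands in for the SIGSEGV seen in the trace.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child, run fn() in it, and return the
 * signal that killed the child, 0 if it exited normally, or -1 on error.
 */
static int child_term_signal(void (*fn)(void))
{
	int status;
	pid_t pid;

	assert(fn != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		fn();		/* runs in the child only */
		_exit(0);	/* reached only if fn() returns */
	}

	if (waitpid(pid, &status, 0) != pid)
		return -1;

	/* Same check strace prints as WIFSIGNALED(s) && WTERMSIG(s) == ... */
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Example child bodies, used to exercise the helper. */
static void die_by_sigkill(void) { raise(SIGKILL); }
static void exit_cleanly(void) { }
```

Note that the wait status only reports the signal number. Telling SEGV_CPERR from SEGV_MAPERR needs the si_code from the signal itself, which is why the strace output is more informative here.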
[ 569.153288] shstk_setup: clone3[4943] ssp:7272d2200000
[ 569.153998] process: copy_thread: clone3[4943] new_ssp:7272d2530000
[ 569.154002] update_fpu_shstk: clone3[4943] ssp:7272d2530000
[ 569.154008] shstk_post_fork: clone3[4944]
[ 569.154011] shstk_post_fork: clone3[4944] sending SIGSEGV post fork
I don't see an update_fpu_shstk for 4944? Should I with this test?
I'd only expect to see one update, my understanding is that that update is for the child but happening in the context of the parent as the child is not yet started.
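As a cross-check on the addresses, the raw bytes 88..103 that strace prints for the second clone3() call can be decoded by hand. The sketch below assumes (it is not stated in the trace) that the first u64 is shadow_stack and the second is shadow_stack_size, which is at least consistent with the map_shadow_stack() return value just before it. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode 8 little-endian bytes (x86 byte order) into a 64-bit value. */
static uint64_t le64(const unsigned char *b)
{
	uint64_t v = 0;

	assert(b != NULL);
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/* Bytes 88..103 of the clone3() arguments, copied from the strace output. */
static const unsigned char clone3_tail[16] = {
	0x00, 0xf0, 0x52, 0xd2, 0x72, 0x72, 0x00, 0x00,
	0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Assumption: field order is shadow_stack, then shadow_stack_size. */
static uint64_t shstk_base(void) { return le64(clone3_tail); }
static uint64_t shstk_size(void) { return le64(clone3_tail + 8); }

/* Address one past the end of the mapping. */
static uint64_t shstk_top(void) { return shstk_base() + shstk_size(); }
```

Under that assumption the base is 0x7272d252f000 (125837480488960, the second map_shadow_stack() result), the size is 0x1000, and base plus size is 0x7272d2530000, which matches the new_ssp value in the log.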
What's weird here that I don't understand is that the parent is 4943, so this report makes sense:
[ 569.153288] shstk_setup: clone3[4943] ssp:7272d2200000
The child is 4944, yet I see:
[ 569.153998] process: copy_thread: clone3[4943] new_ssp:7272d2530000
[ 569.154002] update_fpu_shstk: clone3[4943] ssp:7272d2530000
These map to my logging:
copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
...
	new_ssp = shstk_alloc_thread_stack(p, args);
	pr_err("%s: %s[%d] new_ssp:%lx\n", __func__, p->comm, task_pid_nr(p), new_ssp);
and
update_fpu_shstk(struct task_struct *dst, unsigned long ssp)
...
	xstate->user_ssp = (u64)ssp;
	pr_err("%s: %s[%d] ssp:%lx\n", __func__, dst->comm, task_pid_nr(dst), ssp);
The child should be "p" (and "dst") here -- stuff is being copied from current to p, but p is reporting itself as 4943 here? (Oh, this is reporting pid, not tid... I bet that's what I've got wrong.)
Does this help:
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 27acbdf44c5f..d7005974aff5 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -258,6 +258,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
 	if (args->shadow_stack) {
 		addr = args->shadow_stack;
 		size = args->shadow_stack_size;
+		shstk->base = 0;
+		shstk->size = 0;
 	} else {
 		/*
 		 * For CLONE_VFORK the child will share the parents
I'll fix my reporting and give this patch a try too. Thanks!
-Kees