On Thu, Aug 15, 2024 at 04:05:32PM +0100, Mark Brown wrote:
On Thu, Aug 15, 2024 at 03:01:21PM +0100, Dave Martin wrote:
My thought was that if libc knows about shadow stacks, it is probably going to be built to use them too and so would enable shadow stack during startup to protect its own code.
(Technically those would be independent decisions, but it seems a good idea to use a hardening feature if you know about it and it is present.)
If so, shadow stacks might always get turned on before the main program gets a look-in.
Or is that not the expectation?
The expectation (at least for arm64) is that the main program will only have shadow stacks if everything says it can support them. If the dynamic linker turns them on during startup prior to parsing the main executable, this means it should turn them off before actually starting the executable, taking care to consider any locking of features.
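(As a concrete sketch of that flow, using the PR_SET_SHADOW_STACK_STATUS interface from this series -- the constant names come from the posted patches, so treat the details as illustrative rather than final uapi:)

    /*
     * Sketch of the expected dynamic linker flow: enable GCS early to
     * protect the linker's own code, then drop it again before handing
     * control to a main executable that isn't marked GCS-compatible.
     */
    #include <errno.h>
    #include <stdbool.h>
    #include <sys/prctl.h>      /* PR_* constants from this series */

    static int ld_so_gcs_setup(bool exe_supports_gcs)
    {
            /* Protect the dynamic linker itself. */
            if (prctl(PR_SET_SHADOW_STACK_STATUS, PR_SHADOW_STACK_ENABLE,
                      0, 0, 0))
                    return -errno;

            /* ... relocation processing etc. runs with GCS enabled ... */

            /*
             * The main executable (or a dependency) doesn't support GCS:
             * turn it off again before transferring control.  This fails
             * if the feature has been locked enabled via
             * PR_LOCK_SHADOW_STACK_STATUS -- the "locking of features"
             * caveat above.
             */
            if (!exe_supports_gcs &&
                prctl(PR_SET_SHADOW_STACK_STATUS, 0, 0, 0, 0))
                    return -errno;

            return 0;
    }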
Hmm, so we really do get a clear "enable shadow stack" call to the kernel, which we can reasonably expect won't happen for ancient software?
If so, I think dumping the GCS state in the sigframe could be made conditional on that without problems (?)
(We could always make it unconditional later if it turns out that this approach breaks something.)
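(For illustration, the kernel-side check could be as simple as the sketch below; the helper names are made up in the style of this series, not the actual code:)

    /*
     * Illustrative kernel-side shape of the conditional dump: only
     * emit a GCS record into the signal frame for tasks that have
     * enabled GCS, so software that never issues the enabling prctl()
     * never sees it.
     */
    static int setup_gcs_record(struct rt_sigframe_user_layout *user)
    {
            if (!task_gcs_el0_enabled(current))
                    return 0;       /* legacy software: no record */

            return preserve_gcs_context(user); /* write the GCS record */
    }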
Is there any scenario where it is legitimate for the signal handler to change the shadow stack mode or to return with an altered GCSPR_EL0?
If userspace can rewrite the stack pointer on return (e.g., to return to a different context as part of userspace threading) then it will need to be able to also update GCSPR_EL0 to something consistent, otherwise attempting to return from the interrupted context isn't going to go well.
Do we know of code that actually does that? IIUC, trying to do this is totally broken on most arches nowadays; making it work requires a reentrant C library and/or logic to defer signals around critical sections in userspace.
"Real" threading makes this pretty pointless for the most part.
Related question: does shadow stack work with ucontext-based coroutines? Per-context stacks need to be allocated by the program for that.
Yes, ucontext-based coroutines are the sort of thing I meant when I was talking about returning to a different context.
Ah, right. Doing this asynchronously on the back of a signal (instead of doing a sigreturn) is the bad thing. setcontext() officially doesn't work for this any more, and doing it by hacking or rebuilding the sigframe is extremely hairy and probably a terrible idea for the reasons I gave.
Changing the mode is a bit more exotic, as it is in general. Including GCSPR_EL0 in the signal frame is as much to provide information to the signal handler as anything else.
Note, the way sigcontext (a.k.a. mcontext).__reserved[] is used by glibc for the ucontext API is inspired by the way the kernel uses it, but not guaranteed to be compatible. For the ucontext API glibc doesn't try to store/restore asynchronous contexts (which is why setcontext() from a signal handler is totally broken), so there is no need to store SVE/SME state and hence there is lots of free space; this probably is supportable with shadow stacks -- if there's a way to allocate them. This series would be unaffected either way.
(IIRC, the contents of mcontext.__reserved[] are totally incompatible with what the kernel puts in there, and don't have the same record structure.)
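(For contrast, on a genuine signal frame a handler can locate the kernel's GCS record by walking the __reserved[] records in the usual way. A sketch, with the magic value taken from this series' proposed uapi -- check the final headers before relying on the exact layout:)

    #include <signal.h>
    #include <ucontext.h>
    #include <asm/sigcontext.h>     /* struct _aarch64_ctx */

    /* 0x47435300 ("GCS\0") is the GCS_MAGIC value from this series. */
    #define GCS_RECORD_MAGIC        0x47435300U

    static void *find_gcs_record(mcontext_t *mc)
    {
            struct _aarch64_ctx *ctx =
                    (struct _aarch64_ctx *)mc->__reserved;

            while (ctx->magic) {    /* a zero magic terminates the list */
                    if (ctx->magic == GCS_RECORD_MAGIC)
                            return ctx; /* holds GCSPR_EL0 + feature flags */
                    ctx = (struct _aarch64_ctx *)((char *)ctx + ctx->size);
            }

            return NULL;    /* no record: GCS not enabled (or old kernel) */
    }

    static void handler(int sig, siginfo_t *si, void *ucv)
    {
            ucontext_t *uc = ucv;

            if (find_gcs_record(&uc->uc_mcontext)) {
                    /* GCS state of the interrupted context is available. */
            }
    }

(That only applies to real signal frames; as noted above, a getcontext() context's __reserved[] doesn't follow the kernel's record structure.)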
I'm not sure that we should always put information in the signal frame that the signal handler can't obtain directly.
I guess it's harmless to include this, though.
If we don't include it and different ucontexts have different GCS features enabled, then we run into trouble on context switch.
As outlined above, nowadays you can only use setcontext() on a context obtained from getcontext(). Using setcontext() on a context obtained from a sigframe works by accident or not at all, but in any case coroutines always switch synchronously and don't rely on doing this.
(See where setcontext deals with the FPSIMD regs: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux... )
So, overall I think making ucontext coroutines work with GCS is purely a libc matter that is "interesting" here, but not something we need to worry about.
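(For reference, the synchronous pattern that does still work -- and that a GCS-aware libc would have to allocate matching shadow stacks for -- is plain POSIX makecontext()/swapcontext(), nothing GCS-specific:)

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, co_ctx;

    static void coroutine(void)
    {
            puts("in coroutine");
            swapcontext(&co_ctx, &main_ctx);        /* yield */
            puts("coroutine resumed");
    }

    int main(void)
    {
            /* The program allocates the context's stack itself, which
             * is exactly why the question of who allocates the matching
             * shadow stack arises. */
            static char stack[64 * 1024] __attribute__((aligned(16)));

            getcontext(&co_ctx);
            co_ctx.uc_stack.ss_sp = stack;
            co_ctx.uc_stack.ss_size = sizeof(stack);
            co_ctx.uc_link = &main_ctx;     /* return here when done */
            makecontext(&co_ctx, coroutine, 0);

            swapcontext(&main_ctx, &co_ctx);        /* enter coroutine */
            puts("back in main");
            swapcontext(&main_ctx, &co_ctx);        /* resume it */
            puts("done");
            return 0;
    }

(Every switch here is synchronous, via getcontext()/swapcontext() pairs; no contexts are ever rebuilt from a sigframe.)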
Is the guarded stack considered necessary (or at least beneficial) for backtracing, or is the regular stack sufficient?
It's potentially beneficial, being less vulnerable to corruption and simpler to parse if all you're interested in is return addresses. Profiling in particular was mentioned: grabbing a linear block of memory will hopefully incur less overhead than chasing down the stack. The regular stack should generally be sufficient, though.
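(To make the profiling point concrete, a per-thread snapshot could be as simple as the sketch below. S3_3_C2_C5_1 is the sysreg encoding for GCSPR_EL0, which is EL0-readable once GCS is enabled; a real walker would bound the copy by the cap token at the top of the stack rather than a caller-supplied limit:)

    #include <stddef.h>
    #include <stdint.h>

    static inline const uint64_t *current_gcspr(void)
    {
            uint64_t gcspr;

            asm volatile("mrs %0, S3_3_C2_C5_1" : "=r"(gcspr));
            return (const uint64_t *)gcspr;
    }

    static size_t gcs_snapshot(uint64_t *buf, size_t max)
    {
            /* GCSPR_EL0 points at the most recently pushed entry;
             * older return addresses sit at higher addresses. */
            const uint64_t *entry = current_gcspr();
            size_t n;

            for (n = 0; n < max; n++)
                    buf[n] = entry[n];      /* one return address per slot */

            return n;
    }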
I guess we can't really argue that the shadow stack pointer is redundant here though. The whole point of shadow stacks is to make things more robust...
Just kicking the tyres on the question of whether we need it here, but I guess it's hard to make a good case for saying "no".
Indeed. The general model here is that we don't break userspace that relies on parsing the normal stack (so the GCS is never *necessary*), but clearly you want to have it.
Agreed, but perhaps not in programs that haven't enabled shadow stack?
Cheers ---Dave