The 08/01/2023 16:09, Mark Brown wrote:
> On Tue, Aug 01, 2023 at 03:13:20PM +0100, Will Deacon wrote:
> > On Mon, Jul 31, 2023 at 02:43:09PM +0100, Mark Brown wrote:
> > > The arm64 Guarded Control Stack (GCS) feature provides support for hardware protected stacks of return addresses, intended to provide hardening against return oriented programming (ROP) attacks and to make it easier to gather call stacks for applications such as profiling.
> > Why is this better than Clang's software shadow stack implementation? It would be nice to see some justification behind adding all this, rather than it being an architectural tick-box exercise.
> Mainly that it's hardware enforced (as the quoted paragraph says). This makes it harder to attack, and hopefully it's also a bit faster (how measurable that might be will be an open question, but even NOPs in function entry/exit tend to get noticed).
clang shadowstack seems to use x18. this is only valid on a platform like android that can reserve x18, so it is not widely deployable on linux distros.
with gcs the same binary works with gcs enabled or disabled. and it can support disabling gcs at runtime. this is important for incremental deployment or with late detection of incompatibility. clang shadowstack cannot do this. (and there is no abi marking so it is easy to create broken binaries.)
android uses a fixed 16k shadowstack; controlling this size from userspace is missing from the current gcs abi patches. the default gcs size can be huge so this may be an actual issue for gcs on android where RLIMIT_AS, RLIMIT_DATA etc are often set i think. but the fixed size has its problems too (e.g. there are libraries, such as boehm gc, that recursively call a function until segfault to detect stack bounds).
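
to make the rlimit concern concrete, here is a rough sketch (not taken from the gcs patches) that checks whether a shadow stack of a given size would even fit under a resource limit. fits_under_rlimit and FIXED_SHSTK_SIZE are names made up for illustration, and the check ignores address space that is already in use, so it is optimistic:

```c
#include <sys/resource.h>

/* the fixed 16k shadow stack size mentioned above for android;
   the macro name is made up for this sketch */
#define FIXED_SHSTK_SIZE (16ULL * 1024)

/*
 * hypothetical helper: returns 1 if a mapping of 'size' bytes is not
 * larger than the soft limit of 'resource' (e.g. RLIMIT_AS), 0 if it
 * is larger, and -1 if the limit cannot be read.
 * it does not account for memory that is already mapped.
 */
int fits_under_rlimit(int resource, unsigned long long size)
{
	struct rlimit rl;

	if (getrlimit(resource, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;
	return size <= (unsigned long long)rl.rlim_cur;
}
```

e.g. fits_under_rlimit(RLIMIT_AS, FIXED_SHSTK_SIZE) is expected to pass almost everywhere, while the same call with a large default gcs size can fail on a system with a tight RLIMIT_AS.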
i think the clang shadowstack design does not allow safely switching between shadow stacks. bionic has no makecontext so code that does userspace task scheduling presumably has to do custom things which would need modifications and likely introduce a security weakness where x18 is set. (this also means sigaltstack would have the same limitations as the current gcs patches: shadow stack overflow cannot be handled if the signal handler itself wants to use the same shadow stack. one advantage of the weaker software solution is that it can be disabled per function, however a signal handler may indirectly call many other functions so i'm not sure if this helps in practice.)
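
for reference, the kind of userspace task switching meant here looks roughly like the sketch below, using the ucontext api where libc provides it (e.g. glibc). the function and variable names are made up for illustration. the code only switches the normal stack; the point is that every swapcontext (and the uc_link return) is a place where a shadow stack scheme would also have to switch its shadow stack pointer:

```c
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];	/* normal stack for the task */
static int steps;

static void task(void)
{
	steps++;				/* first run of the task */
	swapcontext(&task_ctx, &main_ctx);	/* yield to the scheduler */
	steps++;				/* resumed by the scheduler */
	/* returning from here continues at uc_link (main_ctx) */
}

/* runs the task to completion with one yield in between;
   returns the number of steps the task made, or -1 on error */
int run_task(void)
{
	steps = 0;
	if (getcontext(&task_ctx) != 0)
		return -1;
	task_ctx.uc_stack.ss_sp = task_stack;
	task_ctx.uc_stack.ss_size = sizeof task_stack;
	task_ctx.uc_link = &main_ctx;
	makecontext(&task_ctx, task, 0);

	/* stack switch: run the task until its first yield */
	if (swapcontext(&main_ctx, &task_ctx) != 0)
		return -1;
	/* stack switch: resume the task, it returns via uc_link */
	if (swapcontext(&main_ctx, &task_ctx) != 0)
		return -1;
	return steps;
}
```

without makecontext in libc, code like this has to be written by hand (often in asm), which is where setting x18 directly would come in.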
as usual with these sanitizers we cannot recommend them to users in general: they only work in a narrow context. to be fair shstk and gcs are only a little bit better in this case.