On Wed, Feb 21, 2024 at 11:22 AM Edgecombe, Rick P <rick.p.edgecombe@intel.com> wrote:
On Wed, 2024-02-21 at 14:06 -0500, dalias@libc.org wrote:
Due to arbitrarily nestable signal frames, no, this does not suffice. An interrupted operation using the lock could be arbitrarily delayed, even never execute again, making any call to dlopen deadlock.
Doh! Yep, it is not robust to this. The only thing that could be done would be a timeout in dlopen(), which would make the whole thing merely better than nothing.
It's fine to turn RDSSP into an actual emulated read of the SSP, or at least an emulated load of zero so that uninitialized data is not left in the target register.
We can't intercept RDSSP, but it becomes a NOP by default. (Disclaimer: x86-only knowledge.)
If doing the latter, code working with the shadow stack just needs to be prepared for the possibility that it could be async-disabled, and check the return value.
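To make the "check the return value" idea concrete, here is a minimal sketch for x86-64 only. It relies on the behaviour noted above: with shadow stacks disabled, RDSSP executes as a NOP and leaves its destination register untouched, so the register is zeroed first and a result of zero is read as "disabled". The helper names read_ssp_or_zero() and shadow_stack_active() are hypothetical, not an existing libc or kernel interface, and the inline assembly assumes an assembler that knows the rdsspq mnemonic.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the current shadow stack pointer, or 0 if
 * shadow stacks are not active. The register is pre-zeroed because RDSSP
 * is a NOP when the feature is off and would otherwise leave stale data
 * in the destination. Non-x86-64 builds simply report 0.
 */
static inline uintptr_t read_ssp_or_zero(void)
{
#if defined(__x86_64__)
    uintptr_t ssp = 0;                          /* value seen if RDSSP is a NOP */
    __asm__ volatile("rdsspq %0" : "+r"(ssp));  /* overwritten only when SHSTK is on */
    return ssp;
#else
    return 0;
#endif
}

/* Hypothetical helper: nonzero if a shadow stack appears to be active. */
static inline int shadow_stack_active(void)
{
    return read_ssp_or_zero() != 0;
}
```

A caller would test the result before touching the shadow stack, for example `if (!shadow_stack_active()) { /* take the non-SHSTK path */ }`. The check is only a snapshot: if the shadow stack can be disabled asynchronously, it may go away right after the read, so this alone does not make later shadow stack instructions safe.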
I have not looked at all the instructions that become #UD, but I suspect they all have reasonably trivial ways to implement a "disabled" version of them that userspace can act upon.
This would have to be thought through functionally and performance-wise. I'm not opposed if we can come up with a fully fleshed out plan. How serious are you about pursuing musl support, if we had something like this?
HJ, any thoughts on whether glibc would use this as well?
Assuming that we are talking about permissive mode, if the kernel can suppress #UD, we don't need to disable SHSTK. Glibc can enable ARCH_SHSTK_SUPPRESS_UD instead.
It is probably worth mentioning that from the security side (as Mark mentioned, there is always tension in the tradeoffs on these features), permissive mode is seen by some as something that weakens security too much. Apps could call dlopen() on a known unsupported DSO before doing ROP. I don't know if you have any musl users with specific shadow stack use cases to ask about this.