On Wed, Feb 21, 2024 at 05:36:12PM +0000, Mark Brown wrote:
On Wed, Feb 21, 2024 at 09:58:01AM -0500, dalias@libc.org wrote:
On Wed, Feb 21, 2024 at 01:53:10PM +0000, Mark Brown wrote:
On Tue, Feb 20, 2024 at 08:27:37PM -0500, dalias@libc.org wrote:
On Wed, Feb 21, 2024 at 12:35:48AM +0000, Edgecombe, Rick P wrote:
(INCSSP, RSTORSSP, etc). These are a collection of instructions that allow limited control of the SSP. When shadow stack gets disabled, these suddenly turn into #UD generating instructions. So any other threads executing those instructions when shadow stack got disabled would be in for a nasty surprise.
This is the kernel's problem if that's happening. It should be trapping these and returning immediately like a NOP if shadow stack has been disabled, not generating SIGILL.
I'm not sure that's going to work out well, all it takes is some code that's looking at the shadow stack and expecting something to happen as a result of the instructions it's executing and we run into trouble.
I said NOP but there's no reason it strictly needs to be a NOP. It could instead do something reasonable to convey the state of racing with shadow stack being disabled.
This feels like it's getting complicated and I fear it may be an uphill struggle to get such code merged, at least for arm64. My instinct is that it's going to be much more robust and generally tractable to let things run to some suitable synchronisation point and then disable there, but if we're going to do that then userspace can hopefully arrange to do the disabling itself through the standard disable interface anyway. Presumably it'll want to notice things being disabled at some point anyway? TBH that's been how all the prior proposals for process wide disable I've seen were done.
If it's possible to disable per-thread rather than per-process, some things are easier. Disabling on account of using alt stacks only needs to be done on the threads using those stacks. However, for dlopen purposes you need a way to disable shadow stack for the whole process. Initially this is only needed for the thread that called dlopen, but it needs to have propagated to any thread that synchronizes with completion of the call to dlopen by the time that synchronization occurs, and since that synchronization can happen in lots of different ways that are purely userspace (thanks to futexes being userspace in the uncontended case), I don't see any way to make it work without extremely invasive, high-cost checks.
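To make the propagation problem concrete, here is a minimal sketch in C. It is only a model under stated assumptions: `ss_enabled` is a hypothetical per-thread flag standing in for the real shadow stack state, and the "dlopen" is simulated by flipping that flag on the calling thread only. None of these names are real kernel or libc interfaces. The observer thread synchronizes with completion of the call through an acquire load alone, which is roughly what an uncontended futex-based primitive does, so it never enters the kernel and nothing gets a chance to disable its shadow stack.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not real kernel or libc APIs. */
static atomic_bool dlopen_done;               /* published by the dlopen caller */
static _Thread_local bool ss_enabled = true;  /* modelled per-thread SS state */
static bool observer_still_had_ss;            /* read only after pthread_join */

static void *observer(void *arg)
{
    (void)arg;
    /* Purely userspace synchronization: an acquire load, no syscall. */
    while (!atomic_load_explicit(&dlopen_done, memory_order_acquire))
        ;
    /* From here the thread may call into the newly loaded library,
     * yet nothing has touched its own shadow stack state. */
    observer_still_had_ss = ss_enabled;
    return NULL;
}

/* Returns 1 if the observer synchronized with "dlopen" completion while
 * its (modelled) shadow stack was still enabled, 0 otherwise. */
int run_demo(void)
{
    pthread_t t;
    atomic_store(&dlopen_done, false);

    int rc = pthread_create(&t, NULL, observer, NULL);
    assert(rc == 0);
    (void)rc;

    /* Simulated dlopen of a non-SS library: per-thread disable only. */
    ss_enabled = false;
    atomic_store_explicit(&dlopen_done, true, memory_order_release);

    pthread_join(t, NULL);
    return observer_still_had_ss ? 1 : 0;
}
```

In this model the observer always ends up past the synchronization point with its flag still set, which is the case a per-thread disable cannot cover unless every such acquire path in userspace grows an extra check.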
If folks on the kernel side are not going to be amenable to doing the things that are easy for the kernel to make it work without breaking compatibility with existing interfaces, but that are impossible or near-impossible for userspace to do, this seems like a dead-end. And I suspect an operation to "disable shadow stack, but without making threads still in SS-critical sections crash" is going to be necessary.
Rich