On Mon, Oct 30, 2023 at 11:39:17AM +0000, Szabolcs.Nagy@arm.com wrote:
The 10/27/2023 16:24, Deepak Gupta wrote:
On Fri, Oct 27, 2023 at 12:49:59PM +0100, Szabolcs.Nagy@arm.com wrote:
no. the lifetime is the issue: a stack in principle can outlive a thread and be resumed even after the original thread exited. for that to work the shadow stack has to outlive the thread too.
I understand an application can pre-allocate a pool of stacks and re-use them whenever it's spawning new threads using the clone3 system call.
However, once a new thread has been spawned how can it resume?
a thread can getcontext then exit. later another thread can setcontext and execute on the stack of the exited thread and return to a previous stack frame there.
(unlikely to work on runtimes where tls or thread id is exposed and thus may be cached on the stack. so not for posix.. but e.g. a go runtime could do this)
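The getcontext-then-exit scenario described above can be sketched in C roughly as follows. This is only an illustrative sketch, not code from any existing runtime: the names `coro_body`, `thread_a`, `resume_after_thread_exit` and `CORO_STACK_SIZE` are made up for the example, it assumes glibc's ucontext implementation with no shadow stack enabled, and it deliberately avoids TLS in the resumed code for the reason given in the parenthetical above.

```c
#include <pthread.h>
#include <stdlib.h>
#include <ucontext.h>

#define CORO_STACK_SIZE (64 * 1024)   /* hypothetical size for the example */

static ucontext_t coro_ctx;    /* saved state of the code running on the heap stack */
static ucontext_t *resumer;    /* context of whichever thread currently drives it */
static volatile int steps;     /* how far coro_body has progressed */

static void coro_body(void)
{
    steps = 1;                         /* first half: executed by thread A */
    swapcontext(&coro_ctx, resumer);   /* park; this frame stays on the heap stack */
    steps = 2;                         /* second half: executed by the resuming thread */
    setcontext(resumer);               /* hand control back; does not return */
}

static void *thread_a(void *stack)
{
    ucontext_t here;

    if (getcontext(&coro_ctx) != 0)
        return NULL;
    coro_ctx.uc_stack.ss_sp = stack;
    coro_ctx.uc_stack.ss_size = CORO_STACK_SIZE;
    coro_ctx.uc_link = NULL;           /* coro_body never returns normally */
    makecontext(&coro_ctx, coro_body, 0);

    resumer = &here;
    swapcontext(&here, &coro_ctx);     /* comes back once coro_body parks */
    return NULL;                       /* thread A exits; the heap stack lives on */
}

/* Returns 2 if the parked frame was resumed after thread A exited, -1 on error. */
int resume_after_thread_exit(void)
{
    pthread_t a;
    ucontext_t here;
    void *stack = malloc(CORO_STACK_SIZE);

    if (stack == NULL)
        return -1;
    steps = 0;
    if (pthread_create(&a, NULL, thread_a, stack) != 0) {
        free(stack);
        return -1;
    }
    pthread_join(a, NULL);             /* thread A is gone from here on */
    if (steps != 1) {
        free(stack);
        return -1;
    }

    resumer = &here;
    swapcontext(&here, &coro_ctx);     /* continue the frame thread A left behind */
    free(stack);                       /* nothing is running on the heap stack now */
    return steps;
}
```

With a shadow stack in the picture, the second `swapcontext` would also need the shadow stack that was in use when `coro_body` parked, which is why that shadow stack would have to outlive thread A.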
Aah, then as you mentioned, we basically need clear lifetime rules around their creation and deletion, because `getcontext/swapcontext/setcontext` can be updated to save the shadow stack token on the stack itself and use that to resume. It's just the lifetime that needs to be managed.
By resume I mean consume the callstack context from an earlier thread. Or did you mean something else by `resume` here?
Can you give an example of such an application or runtime where a newly created thread consumes callstack context created by a thread that is going away?
my claim was not that existing runtimes are doing this, but that the linux interface contract allows this and tying the stack lifetime to the thread is a change of contract.
(or the other way around: a stack can be freed before the thread exits, if the thread pivots away from that stack.)
This is simply a thread saying that it is moving to a different stack. Again, I'm interested in learning why a thread would do that. If I had to speculate on reasons, I could think of a user runtime managing its own pool of worker items (some people call them green threads), or the current stack becoming too small.
switching stack is common, freeing the original stack may not be, but there is nothing that prevents this and then the corresponding shadow stack is clearly leaked if the kernel manages it. the amount of leak is proportional to the number of live threads and the sum of their original stack sizes which can be big.
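A rough C sketch of a thread pivoting away from one stack and then freeing it while the thread keeps running. It is an illustration under assumptions, not anyone's actual runtime: both stacks are plain `malloc` allocations standing in for the "original" and "new" stacks, and the names `on_first_stack`, `on_second_stack`, `pivot_and_free_original` and `STACK_SIZE` are invented for the example.

```c
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size for the example */

static ucontext_t main_ctx, first_ctx, second_ctx;
static void *first_stack, *second_stack;
static volatile int freed_first;

static void on_second_stack(void)
{
    /* Nothing executes on the first stack any more, so it can be released
     * even though the thread itself is still alive. */
    free(first_stack);
    first_stack = NULL;
    freed_first = 1;
    /* Returning follows uc_link back to main_ctx. */
}

static void on_first_stack(void)
{
    /* Pivot away for good: setcontext does not save the current context,
     * so this frame can never be resumed. */
    setcontext(&second_ctx);
}

/* Returns 1 if the first stack was freed while the thread kept running,
 * -1 on setup failure. */
int pivot_and_free_original(void)
{
    first_stack = malloc(STACK_SIZE);
    second_stack = malloc(STACK_SIZE);
    if (first_stack == NULL || second_stack == NULL ||
        getcontext(&first_ctx) != 0 || getcontext(&second_ctx) != 0) {
        free(first_stack);
        free(second_stack);
        return -1;
    }
    freed_first = 0;

    first_ctx.uc_stack.ss_sp = first_stack;
    first_ctx.uc_stack.ss_size = STACK_SIZE;
    first_ctx.uc_link = NULL;          /* on_first_stack never returns */
    makecontext(&first_ctx, on_first_stack, 0);

    second_ctx.uc_stack.ss_sp = second_stack;
    second_ctx.uc_stack.ss_size = STACK_SIZE;
    second_ctx.uc_link = &main_ctx;    /* resume the caller when done */
    makecontext(&second_ctx, on_second_stack, 0);

    swapcontext(&main_ctx, &first_ctx);

    free(second_stack);                /* back on the caller's stack here */
    return freed_first;
}
```

In this model userspace can release `first_stack` the moment it pivots away, whereas a kernel-managed shadow stack paired with a thread's original stack would presumably stay allocated until the thread exits.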
but as i said i think this lifetime issue is minor compared to other shadow stack issues, so it is ok if the shadow stack is kernel managed.