Re: [musl] Re: [PATCH v8 00/38] arm64/gcs: Provide support for GCS in userspace

20 Feb 2024

      On Tue, Feb 20, 2024 at 06:41:05PM +0000, Edgecombe, Rick P wrote:
...
Hi,
I worked on the x86 kernel shadow stack support. I think it is an
interesting suggestion. Some questions below, and I will think more on
it.
On Tue, 2024-02-20 at 11:36 -0500, Stefan O'Rear wrote:
...
While discussing the ABI implications of shadow stacks in the context
of Zicfiss and musl a few days ago, I had the following idea for how
to solve the source compatibility problems with shadow stacks in
POSIX.1-2004 and POSIX.1-2017:

Introduce a "flexible shadow stack handling" option.  For what
follows, it doesn't matter if this is system-wide, per-mm, or per-vma.

Shadow stack faults on non-shadow stack pages, if flexible shadow
stack handling is in effect, cause the affected page to become a
shadow stack page.  When this happens, the page is filled with invalid
address tokens.
Hmm, could the shadow stack underflow onto the real stack then? Not
sure how bad that is. INCSSP (incrementing the SSP register on x86)
loops are not rare so it seems like something that could happen.
Shadow stack underflow should fault on attempt to access
non-shadow-stack memory as shadow-stack, no?
...
...
Faults from non-shadow-stack accesses to a shadow-stack page which was
created by the previous paragraph will cause the page to revert to
non-shadow-stack usage, with or without clearing.
Won't this prevent catching stack overflows when they happen? An
overflow will just turn the shadow stack into normal stack and only get
detected when the shadow stack unwinds?
I don't think that's as big a problem as it sounds like. It might make
pinpointing the spot at which things went wrong take a little bit more
work, but it should not admit any wrong-execution.
...
A related question would be how to handle the expanding nature of the
initial stack. I guess the initial stack could be special and have a
separate shadow stack.
That seems fine.
...
...
Important: a shadow stack operation can only load a valid address from
a page if that page has been in continuous shadow stack use since the
address was written by another shadow stack operation; the flexibility
delays error reporting in cases of stray writes but it never allows
for corruption of shadow stack operation.
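
The page rules described above can be modelled as a small state
machine. The following is only a toy sketch in C, not a kernel or libc
interface: `struct page`, `ss_push`, `ss_pop`, `normal_store`, the
8-slot page, and the choice of 0 as the invalid address token are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 8                            /* toy page: 8 address-sized slots */
#define INVALID_TOKEN ((uintptr_t)0)       /* assumed encoding of "invalid" */

struct page {
    bool shadow;                           /* currently in shadow stack use? */
    uintptr_t slot[SLOTS];
};

/* Shadow stack fault on a normal page: convert it and fill it with
 * invalid tokens, so no earlier ordinary data can pass as an address. */
static void make_shadow(struct page *p)
{
    p->shadow = true;
    for (size_t i = 0; i < SLOTS; i++)
        p->slot[i] = INVALID_TOKEN;
}

/* Shadow stack write (for example, a call pushing a return address). */
void ss_push(struct page *p, size_t i, uintptr_t addr)
{
    assert(i < SLOTS && addr != INVALID_TOKEN);
    if (!p->shadow)
        make_shadow(p);
    p->slot[i] = addr;
}

/* Shadow stack read. Returns false where a real system would report a
 * shadow stack fault because the slot holds an invalid token. */
bool ss_pop(struct page *p, size_t i, uintptr_t *out)
{
    assert(i < SLOTS);
    if (!p->shadow)
        make_shadow(p);
    if (p->slot[i] == INVALID_TOKEN)
        return false;
    *out = p->slot[i];
    return true;
}

/* Ordinary (non-shadow-stack) store: the page reverts to normal use.
 * This variant reverts without clearing. */
void normal_store(struct page *p, size_t i, uintptr_t v)
{
    assert(i < SLOTS);
    p->shadow = false;
    p->slot[i] = v;
}
```

In this model, pushing an address, making a stray ordinary store
anywhere on the same page, and then popping produces a fault rather
than the saved address or the stray value. The error shows up later
than the stray write, but the pop never yields a corrupted address.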
Shadow stacks currently have automatic guard gaps to try to prevent one
thread from overflowing onto another thread's shadow stack. This would
somewhat open that up, as the stack guard gaps are usually maintained
by userspace for new threads. It would have to be thought through if
these could still be enforced with checking at additional spots.
I would think the existing guard pages would already do that if a
thread's shadow stack is contiguous with its own data stack.
...
...

Standards-defined operations which use a user-provided stack
(makecontext, sigaltstack, pthread_attr_setstack) use a subrange of
the provided stack for shadow stack storage.  I propose to use a
shadow stack size of 1/32 of the provided stack size, rounded up to a
positive integer number of pages, and place the shadow stack
allocation at the lowest page-aligned address inside the provided
stack region.

Since page usage is flexible, no change in page permissions is
immediately needed; this merely sets the initial shadow stack pointer
for the new context.

If the shadow stack grew in the opposite direction to the
architectural stack, it would not be necessary to pick a fixed
direction.

SIGSTKSZ and MINSIGSTKSZ are increased by 2 pages to provide
sufficient space for a minimum-sized shadow stack region and worst
case alignment.
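
The sizing and placement rule proposed above can be written out as a
short C sketch. This is one possible reading of the proposal, not an
existing interface: `struct ss_region` and `ss_carve` are hypothetical
names, the page size is passed in by the caller, and the sketch
ignores address wrap-around at the very top of the address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result: where the shadow stack would sit in the stack. */
struct ss_region {
    uintptr_t base;   /* lowest page-aligned address inside the stack */
    size_t    size;   /* bytes set aside; 0 if the stack is too small */
};

struct ss_region ss_carve(uintptr_t stack_base, size_t stack_size,
                          size_t page_size)
{
    struct ss_region r = { 0, 0 };

    /* page_size must be a nonzero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* 1/32 of the provided size, rounded up to whole pages, with a
     * minimum of one page. */
    size_t unit = 32 * page_size;
    size_t pages = stack_size / unit + (stack_size % unit != 0);
    if (pages == 0)
        pages = 1;
    size_t ss_size = pages * page_size;

    /* Lowest page-aligned address at or above the start of the stack. */
    uintptr_t aligned = (stack_base + page_size - 1)
                        & ~(uintptr_t)(page_size - 1);
    size_t slack = (size_t)(aligned - stack_base);

    /* Give up if alignment slack plus shadow stack does not fit. */
    if (slack > stack_size || ss_size > stack_size - slack)
        return r;

    r.base = aligned;
    r.size = ss_size;
    return r;
}
```

With 4096-byte pages, a one-page stack that starts one byte past a
page boundary has no room for a shadow stack page in this sketch,
while a two-page stack at the same address does. That is one way to
read the "minimum-sized shadow stack region and worst case alignment"
justification for the two extra pages.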
Do all makecontext() callers ensure the size is greater than this?
I guess glibc's makecontext() could do this scheme to prevent leaking
without any changes to the kernel. Basically steal a little of the
stack address range and overwrite it with a shadow stack mapping. But
only if the apps leave enough room. If they need to be updated, then
they could be updated to manage their own shadow stacks too I think.
From the musl side, I have always looked at the entirety of shadow
stack stuff with very heavy skepticism, and anything that breaks
existing interface contracts, introduces places where apps can get
auto-killed because a late resource allocation fails, or requires
applications to code around the existence of something that should be
an implementation detail, is a non-starter. To even consider shadow
stack support, it must truly be fully non-breaking.
...
...
_Without_ doing this, sigaltstack cannot be used to recover from
stack
overflows if the shadow stack limit is reached first, and makecontext
cannot be supported without memory leaks and unreportable error
conditions.
FWIW, I think the makecontext() shadow stack leaking is a bad idea. I
would prefer the existing makecontext() interface just didn't support
shadow stack, rather than the leaking solution glibc does today.
AIUI the proposal by Stefan makes it non-leaking because it's just
using normal memory that reverts to normal usage on any
non-shadow-stack access.
Rich
