Re: [PATCH 01/35] prctl: arch-agnostic prctl for shadow stack

18 Jul 2023

On Tue, Jul 18, 2023 at 05:45:01PM +0000, Edgecombe, Rick P wrote:
...
On Sun, 2023-07-16 at 22:50 +0100, Mark Brown wrote:
...
...
Three architectures (x86, aarch64, riscv) have announced support for
shadow stack.  This patch adds arch-agnostic prctl support to
enable/disable/get/set status of shadow stack and forward control
(landing pad) flow cfi statuses.
...
What is this about forward control flow? Seems to be just about shadow
stack.
Sorry, that's the original commit message - the original version of this
also had support for controlling landing pads but I don't need that and
cut them out of the series.  I forgot to update that bit of the commit
message.
...
...
[Rebased onto current kernels, renumbering to track other allocations
 already upstream, dropping indirect LP, updating to pass arg to set
 by value, fix missing prototypes for weak functions and update title.
 -- broonie]
...

PR_SET_SHADOW_STACK_STATUS seems like a strange name for the thing
actually doing the whole enabling of the feature which involves
allocating memory, etc. And in the future a growing array of different
things (enabling push, write, etc).
I have no strong opinion on naming here.  _MODE?  I didn't find any
discussions around this in the
...

x86 only allows one enabling/disabling operation at a time. So you
can't enable shadow stack AND WRSS with one syscall, for example. This
is to make it so it's clear which operation failed. Also, since some
features depend on others (WRSS), there would need to be some ordering
and rollback logic. There was some discussion about a batch enabling
arch_prctl() that could report failures independently, but it was
deemed premature optimization.
I did see that the x86 implementation required a call per flag, the
logic wasn't hugely obvious there - it didn't seem super helpful.
There's nothing stopping userspace turning on one flag at a time if it
wants to, we just don't require it.  I wasn't overly concerned about the
rollback logic since I was anticipating that the main complexity is the
base enable and allocate, everything else would just be storing a mode.
We can implement things with the one bit per call approach, I just
didn't see much upside to it.  Perhaps I'm missing some case though.
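
To illustrate the "set the whole mode in one call" model discussed
above, here is a minimal userspace sketch. It is not code from the
series: the SS_* bit names, struct ss_state and ss_set_status() are
hypothetical, and the rule that write/push need the base enable is an
assumption made for the example. The request is validated in full
before anything is stored, so no rollback is needed in this model.

```c
#include <errno.h>

/* Hypothetical feature bits for illustration only; not kernel ABI values. */
#define SS_ENABLE (1UL << 0)
#define SS_WRITE  (1UL << 1)
#define SS_PUSH   (1UL << 2)
#define SS_ALL    (SS_ENABLE | SS_WRITE | SS_PUSH)

/* Hypothetical per-task state: just the currently enabled feature bits. */
struct ss_state {
	unsigned long mode;
};

/*
 * Replace the whole mode in one call. Everything is checked before the
 * state is touched, so a failed call leaves the old mode in place.
 */
static int ss_set_status(struct ss_state *st, unsigned long mode)
{
	/* Reject bits this model does not know about. */
	if (mode & ~SS_ALL)
		return -EINVAL;

	/* Assumption: the sub-features depend on the base feature. */
	if ((mode & (SS_WRITE | SS_PUSH)) && !(mode & SS_ENABLE))
		return -EINVAL;

	/* Allocating the stack on first enable is omitted in this sketch. */
	st->mode = mode;
	return 0;
}
```

A caller that prefers one flag per call can still pass a single bit at
a time; the batch form just does not force it.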
...

It only allows you to lock the whole feature, and not individual
subfeatures. For things like WRSS, it came up that there might be an
elf bit, like the shadow stack one, but that works a bit differently.
Instead of only enabling shadow stack when ALL DSOs support the
feature, it would want to be enabled if ANY DSOs require it. So
userspace might want to do something like lock shadow stack, but leave
WRSS unlocked in case a dlopen() call came across a WRSS-requiring DSO.
We could add either a second argument with the lock or a separate lock
prctl() and matching query which takes the same bitmask, being able
to lock per feature does give more flexibility to userspace in how we do
the locking and isn't hugely more costly to implement.  My model for
locking had been that there would be a final decision on what the
features should be, I was modelling "can enable" as equivalent access to
"is enabled" when it came to what was locked.
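
The per-feature lock idea above can be sketched roughly as follows.
Again this is a userspace model under stated assumptions, not the
series' implementation: the SS_* bits, struct ss_lock_state, both
helper functions and the choice of -EBUSY for a refused change are all
hypothetical. The lock takes the same bitmask as the mode, and a mode
change is refused only if it would flip a locked bit.

```c
#include <errno.h>

/* Hypothetical feature bits for illustration only; not kernel ABI values. */
#define SS_ENABLE (1UL << 0)
#define SS_WRITE  (1UL << 1)
#define SS_PUSH   (1UL << 2)
#define SS_ALL    (SS_ENABLE | SS_WRITE | SS_PUSH)

/* Hypothetical per-task state: enabled bits plus a mask of locked bits. */
struct ss_lock_state {
	unsigned long mode;
	unsigned long locked;
};

/* Lock the given features; locks accumulate and are never dropped here. */
static int ss_lock_features(struct ss_lock_state *st, unsigned long mask)
{
	if (mask & ~SS_ALL)
		return -EINVAL;
	st->locked |= mask;
	return 0;
}

/* Change the mode, refusing any change that touches a locked bit. */
static int ss_change_mode(struct ss_lock_state *st, unsigned long mode)
{
	if (mode & ~SS_ALL)
		return -EINVAL;

	/* Bits that differ between old and new mode must all be unlocked. */
	if ((mode ^ st->mode) & st->locked)
		return -EBUSY;

	st->mode = mode;
	return 0;
}
```

With this shape userspace could lock SS_ENABLE while leaving SS_WRITE
unlocked, so a later dlopen() of a DSO needing write access could still
turn it on while the base feature stays pinned.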
...

To support CRIU, there needed to be a ptrace-only unlock feature.
The arch_prctl() has a special ptrace route to enforce that this unlock
is only coming from ptrace. Is there some way to do this with a regular
prctl()?
For arm64 we need to add a regset to expose the GCS pointer anyway so
the GCS mode is in there; at the minute we prevent any changes at all
via that mechanism, but it could be implemented later.  I'm not aware of
any way for prctl() to tell if it is being invoked via ptrace so that'd
need to be dealt with somehow.
...

I see in the next patch there is hinted support for write and push
as well (although I can't find the implementation in the patches, am I
missing it?). X86 has something close enough to write, but not push.
What is the idea for when the features don't exactly match?
The implementation is in "arm64/gcs: Implement shadow stack prctl()
interface", it just boils down to turning on or off a register bit.
...
I think when Deepak originally brought up this unified prctl-based
interface, it seemed far away before we could tell if it *could* be
unified. Do either of you have any thoughts on whether the above points
could be incorporated?
Other than the issue with CRIU I don't see any huge difficulty.
