The 08/22/2023 18:53, Mark Brown wrote:
On Tue, Aug 22, 2023 at 05:49:51PM +0100, Catalin Marinas wrote:
On Fri, Aug 18, 2023 at 08:38:02PM +0100, Mark Brown wrote:
stack and pass the pointer/size to clone3()? It saves us from having to guess what the right size we'd need. struct clone_args is extensible.
I can't recall or locate the specific reasoning there right now, perhaps Rick or someone else can? I'd guess there would be compat concerns for things that don't go via libc which would complicate the story with identifying and marking things as GCS/SS safe, it's going to be more robust to just supply a GCS if the process is using it. That said having a default doesn't preclude us using the extensibility to allow userspace directly to control the GCS size, I would certainly be in favour of adding support for that.
It would be good if someone provided a summary of the x86 decision (I'll get to those threads but most likely in September). I think we concluded that we can't deploy GCS entirely transparently, so we need a libc change (apart from the ELF annotations). Since libc is opting in to GCS,
Right, we need changes for setjmp()/longjmp() for example.
we could also update the pthread_create() etc. to allocate the shadow together with the standard stack.
Anyway, that's my preference but maybe there were good reasons not to do this.
Yeah, it'd be good to understand. I've been through quite a lot of old versions of the x86 series (I've not found them all, there's 30 versions or something of the old series plus the current one is on v9) and the code always appears to have been this way with changelogs that explain the what but not the why. For example roughly the current behaviour was already in place in v10 of the original series:
https://lore.kernel.org/lkml/20200429220732.31602-26-yu-cheng.yu@intel.com/
well the original shstk patches predate clone3 so no surprise there. e.g. v6 is from 2018 and clone3 is 2019 linux 5.3 https://lore.kernel.org/lkml/20181119214809.6086-1-yu-cheng.yu@intel.com/
I do worry about the story for users calling the underlying clone3() API (or legacy clone() for that matter) directly, and we would also need to handle the initial GCS enable via prctl() - that's not insurmountable, we could add a size argument there that only gets interpreted during the initial enable for example.
musl and bionic currently use plain clone for threads.
and there is user code doing raw clone threads (such threads are technically not allowed to call into libc) it's not immediately clear to me if having gcs in those threads is better or worse.
glibc can use clone3 args for gcs, i'd expect the unmap to be more annoying than the allocation, but possible (it is certainly more work than leaving everything to the kernel).
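to illustrate why extending clone3 args is cheap for old callers, here is a rough userspace sketch of the size-versioned struct copy that extensible syscalls rely on. the struct layouts and the gcs/gcs_size field names are made up for illustration (they are not the real uapi struct clone_args), and copy_args only mimics the usual kernel rule: a shorter struct from old userspace is zero-extended, a longer struct from new userspace is accepted only if the unknown tail is all zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* hypothetical layouts, for illustration only: not the real struct clone_args */
struct args_v0 {
	uint64_t flags;
	uint64_t stack;
	uint64_t stack_size;
};

struct args_v1 {
	uint64_t flags;
	uint64_t stack;
	uint64_t stack_size;
	uint64_t gcs;      /* assumed new field: gcs base, 0 = kernel default */
	uint64_t gcs_size; /* assumed new field: gcs size, 0 = kernel default */
};

/*
 * copy a caller-supplied struct of usize bytes into the newest layout.
 * returns 0 on success, -E2BIG if the caller passed a larger struct
 * with non-zero bytes in the part this side does not understand.
 */
static int copy_args(struct args_v1 *dst, const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t ksize = sizeof(*dst);
	size_t n = usize < ksize ? usize : ksize;

	assert(dst && src);

	/* newer caller: unknown trailing fields must be zero */
	for (size_t i = ksize; i < usize; i++)
		if (p[i])
			return -E2BIG;

	/* older caller: fields it does not know about read as zero */
	memset(dst, 0, ksize);
	memcpy(dst, src, n);
	return 0;
}
```

with this scheme a caller built against the old layout gets gcs == 0 and gcs_size == 0, which the kernel could treat as "allocate a default gcs", so a default and an explicit size can coexist.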
one difference is that userspace can then set gcspr of a new thread and e.g. two threads can have overlapping gcs, however i don't think this impacts security much since if clone3 is attacker controlled then likely all bets are off.
and yes the main thread gcs can also be libc allocated given we have to deal with the prctl anyway.
if gcs size logic is in libc it can depend on env vars and can be changed more easily (and adapted to android vs musl vs glibc requirements).
sigaltstack with alt gcs was a case where i thought the kernel doing it transparently is better (the libc cannot do the same as it cannot wrap signal handlers currently so does not know when a handler returns or the current alt stack state), but others seem to want an explicit sigaltgcs syscall and expose it to users. in any case we have no unwinder solution for alt gcs nor longjmp solution when the thread gcs is overflowed so this is not an issue for now.
My sense is that the deployment story is going to be smoother with defaults being provided since it avoids dealing with the issue of what to do if userspace creates a thread without a GCS in a GCS enabled process but like I say I'd be totally happy to extend clone3(). I will put some patches together for that (probably once the x86 stuff lands). Given the size of this series it might be better split out for manageability if nothing else.
i would make thread without gcs to implicitly disable gcs, since that's what's bw compat with clones outside of libc (the libc can guarantee gcs allocation when gcs is enabled).