On Thu, Aug 10, 2023 at 10:40:16AM +0100, Will Deacon wrote:
On Tue, Aug 08, 2023 at 09:25:11PM +0100, Mark Brown wrote:
I'm not sure that your assumption that the only people who would consider deploying this are those who have deployed SCS is a valid one. SCS users are definitely part of the mix but GCS is expected to be much more broadly applicable. As you say SCS is very invasive, requires a rebuild of everything with different code generated and as Szabolcs outlined has ABI challenges for general distros. Any code built (or JITed) with anything other than clang is going to require some explicit support to do SCS (eg, the kernel's SCS support does nothing for assembly code) and there's a bunch of runtime support. It's very much a specialist feature, mainly practical in well controlled somewhat vertical systems - I've not seen any suggestion that general purpose distros are considering using it.
I've also seen no suggestion that general purpose distros are considering GCS -- that's what I'm asking about here, and also saying that we shouldn't rush in an ABI without confidence that it actually works beyond unit tests (although it's great that you wrote selftests!).
It definitely works substantially beyond selftests. For the actual distros there's definitely interest out there, gated on upstreaming.
In contrast, in the case of GCS one of the nice features is that for most code it's very much non-invasive, much less so than things like PAC/BTI and SCS, which means that the audience is much wider than it is for SCS - it's a *much* easier sell for general purpose distros to enable GCS than to enable SCS.
This sounds compelling, but has anybody tried running significant parts of a distribution (e.g. running Debian source package tests, booting Android, using a browser, running QEMU) with GCS enabled? I can well imagine non-trivial applications violating both assumptions of the architecture and the ABI.
Android is the main full userspace that people have been working with, we've not run into anything ABI related yet that I'm aware of - there is one thing that's being chased down but we're fairly confident that is a bug somewhere rather than the ABI being unsuitable.
If not, why are we bothering? If so, how much of that distribution has been brought up and how does the "dynamic linker or other startup code" decide what to do?
There is active interest in the x86 shadow stack support from distros, GCS is a lot earlier on in the process but isn't fundamentally different so it is expected that this will translate. There is also a chicken and egg thing where upstream support gates a lot of people's interest, what people will consider carrying out of tree is different to what they'll enable.
I'm not saying we should wait until distros are committed, but Arm should be able to do that work on a fork, exactly like we did for the arm64 bringup. We have the fastmodel, so running interesting stuff with GCS enabled should be dead easy, no?
Right, this is happening but your pushback seemed to be "why would anyone even consider deploying this?" rather than "could anyone deploy this?", tests on forks can help a bit with the first question but your concern seemed more at the level of even getting people to look at the work rather than just rejecting it out of hand.
The majority of the full distro work at this point is on the x86 side given the hardware availability, we are looking at that within Arm of course. I'm not aware of any huge blockers we have encountered thus far.
Ok, so it sounds like you've started something then? How far have you got?
I'd say thus far text mode embedded/server type stuff is looking pretty good, especially for C stuff - setjmp/longjmp and an unwinder cover a *lot*. We do need to do more here, especially GUI stuff, but it's progressing well thus far.
While we'd be daft not to look at what the x86 folks are doing, I don't think we should rely solely on them to inform the design for arm64 when it should be relatively straightforward to prototype the distro work on the model. There's also no rush to land the kernel changes given that GCS hardware doesn't exist.
Sure, but we're also in the position where there's only been the very beginnings of kernel review. Obviously that's very important too and there's often really substantial lead times on that, plus the potential need to redo all the testing if there's issues identified. I'd hope to at least be able to get to a point where the major concern people have is testing. Another goal here is to feed any concerns we do have into what's happening with x86 and RISC-V so that we have as much alignment as possible in how this is supposed to work on Linux, that'll make everyone's life easier.
In terms of timescales, given that users with generic distros are a big part of the expected audience, while we're well in advance of where it's actually going to be used we do need to be mindful of lead times in getting support into the software users are likely to want to run, so they've got something they can use when they do get hardware. We don't need to rush into anything, but we should probably use that time for careful consideration.