On Mon, Aug 19, 2024 at 01:04:18PM +0100, Catalin Marinas wrote:
On Thu, Aug 01, 2024 at 01:06:47PM +0100, Mark Brown wrote:
+static int copy_thread_gcs(struct task_struct *p,
+			   const struct kernel_clone_args *args)
+{
+	unsigned long gcs;
+
+	gcs = gcs_alloc_thread_stack(p, args);
+	if (IS_ERR_VALUE(gcs))
+		return PTR_ERR((void *)gcs);
Is 0 an OK value here? I can see further down that gcs_alloc_thread_stack() may return 0.
Yes, it's fine for a thread not to have a GCS.
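To illustrate why 0 passes the check quoted above, here is a minimal userspace sketch of the kernel's IS_ERR_VALUE() convention, in which only the top MAX_ERRNO (4095) values of the unsigned long range are treated as errors. The helper names is_err_value() and copy_thread_gcs_result() are hypothetical; they are re-declared here only so the snippet builds outside the kernel and are not the patch's code.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's MAX_ERRNO (include/linux/err.h). */
#define MAX_ERRNO 4095UL

/* Hypothetical equivalent of IS_ERR_VALUE(): errors are encoded as -errno,
 * i.e. the last 4095 values of the unsigned long range. */
static bool is_err_value(unsigned long x)
{
	return x >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical mirror of the check at the top of copy_thread_gcs(). */
static int copy_thread_gcs_result(unsigned long gcs)
{
	if (is_err_value(gcs))
		return (int)(long)gcs;	/* propagate -errno */

	return 0;	/* 0 (no GCS) and real addresses both succeed */
}
```

With this convention a return of 0 from the allocator is not mistaken for an error, which matches the "fine for a thread not to have a GCS" answer.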
+	p->thread.gcs_el0_mode = current->thread.gcs_el0_mode;
+	p->thread.gcs_el0_locked = current->thread.gcs_el0_locked;
+
+	/* Ensure the current state of the GCS is seen by CoW */
+	gcsb_dsync();
I don't get this barrier. What does it have to do with CoW, which memory effects is it trying to order?
Yeah, I can't remember what that's supposed to be protecting.
+	/* Allocate RLIMIT_STACK/2 with limits of PAGE_SIZE..2G */
+	size = PAGE_ALIGN(min_t(unsigned long long,
+				rlimit(RLIMIT_STACK) / 2, SZ_2G));
+	return max(PAGE_SIZE, size);
+}
So we still have RLIMIT_STACK/2. I thought we got rid of that and just went with RLIMIT_STACK (or I misremember).
I honestly can't remember either way; it's quite possible it's changed multiple times. I don't have super strong feelings on the particular value here.
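For reference, the arithmetic in the quoted hunk can be sketched in userspace as below. This is only an illustration of the clamping behaviour: gcs_size_from_rlimit() and page_align() are hypothetical names, the page size is passed in as a parameter instead of using PAGE_SIZE, and the rlimit is taken as a plain integer where the kernel calls rlimit(RLIMIT_STACK).

```c
#include <stdint.h>

#define SZ_2G (2ULL * 1024 * 1024 * 1024)

/* Round x up to a multiple of page_size (assumed to be a power of two). */
static uint64_t page_align(uint64_t x, uint64_t page_size)
{
	return (x + page_size - 1) & ~(page_size - 1);
}

/* Hypothetical mirror of the quoted sizing: RLIMIT_STACK/2, capped at 2G,
 * page aligned, and never smaller than one page. */
static uint64_t gcs_size_from_rlimit(uint64_t stack_rlimit, uint64_t page_size)
{
	uint64_t half = stack_rlimit / 2;
	uint64_t size = page_align(half < SZ_2G ? half : SZ_2G, page_size);

	return size > page_size ? size : page_size;
}
```

Under these assumptions an 8 MiB stack limit with 4 KiB pages gives a 4 MiB GCS, and an unlimited stack gives 2G. Moving to plain RLIMIT_STACK would only mean dropping the division by two.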
+static bool gcs_consume_token(struct mm_struct *mm, unsigned long user_addr)
+{
As per the clone3() thread, I think we should try to use get_user_page_vma_remote() and do a cmpxchg() directly.
I've left this as is for now, mainly because it keeps the code in line with x86 and I can't directly test the x86 code. IIRC we can't just do a standard userspace cmpxchg, since that would access the page as though we were at EL0, and EL0 doesn't have standard write permission for the page.
How does the user write the initial token? Do we need any barriers before/after consuming the token?
The token is created by map_shadow_stack() or as part of a GCS pivot. A sync beforehand is probably safer; with the current code we'll have one when we switch to the task.
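As a rough illustration of the "consume the token with a cmpxchg" idea discussed above, here is a userspace sketch using C11 atomics. It is not the kernel implementation: it does not model get_user_page_vma_remote(), GCS page permissions, or the architecture's barriers. The function name consume_token() is hypothetical, the expected token value is supplied by the caller because its encoding is not spelled out in this thread, and clearing the slot to 0 on success is an assumption made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical token consumption: atomically replace the token with 0,
 * but only if the slot still holds the value we expect. A second attempt
 * on the same slot fails, so a token can be consumed only once.
 */
static bool consume_token(_Atomic uint64_t *slot, uint64_t expected_token)
{
	uint64_t expected = expected_token;

	/* Default (sequentially consistent) ordering, chosen for simplicity;
	 * it says nothing about which barriers the GCS case needs. */
	return atomic_compare_exchange_strong(slot, &expected, 0);
}
```

The point of the compare-and-exchange is that two threads racing to use the same token cannot both succeed, which a separate read followed by a write would not guarantee.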