On Fri, Oct 11, 2024 at 7:55 PM Liam R. Howlett Liam.Howlett@oracle.com wrote:
- Jann Horn jannh@google.com [241011 11:51]:
As explained in the comment block this change adds, we can't tell what userspace's intent is when the stack grows towards an inaccessible VMA.
We should ensure that, as long as code is compiled with something like -fstack-check, a stack overflow in this code can never cause the main stack to overflow into adjacent heap memory - so the bottom of a stack should never be directly adjacent to an accessible VMA.
[...]
diff --git a/mm/mmap.c b/mm/mmap.c index dd4b35a25aeb..937361be3c48 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -359,6 +359,20 @@ unsigned long do_mmap(struct file *file, unsigned long addr, return -EEXIST; }
/*
* This does two things:
*
* 1. Disallow MAP_FIXED replacing a PROT_NONE VMA adjacent to a stack
* with an accessible VMA.
* 2. Disallow MAP_FIXED_NOREPLACE creating a new accessible VMA
* adjacent to a stack.
*/
if ((flags & (MAP_FIXED_NOREPLACE | MAP_FIXED)) &&
(prot & (PROT_READ | PROT_WRITE | PROT_EXEC)) &&
!(vm_flags & (VM_GROWSUP|VM_GROWSDOWN)) &&
overlaps_stack_gap(mm, addr, len))
return (flags & MAP_FIXED) ? -ENOMEM : -EEXIST;
This is probably going to impact performance for allocators by causing two walks of the tree any time they protect a portion of mmaped area.
Well, it's one extra walk except on parisc, thanks to the "if (!IS_ENABLED(CONFIG_STACK_GROWSUP))" bailout - but point taken, it would be better to avoid that.
In the mmap_region() code, there is a place we know next/prev on MAP_FIXED, and next for MAP_FIXED_NOREPLACE - which has a vma iterator that would be lower cost than a tree walk. That area may be a better place to check these requirements. Unfortunately, it may cause a vma split in the vms_gather_munmap_vmas() call prior to this check, but considering the rarity it may not be that big of a deal?
Hmm, yeah, that sounds fine to me.
[...]
diff --git a/mm/mprotect.c b/mm/mprotect.c index 0c5d6d06107d..2300e2eff956 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -772,6 +772,12 @@ static int do_mprotect_pkey(unsigned long start, size_t len, } }
error = -ENOMEM;
if ((prot & (PROT_READ | PROT_WRITE | PROT_EXEC)) &&
!(vma->vm_flags & (VM_GROWSUP|VM_GROWSDOWN)) &&
overlaps_stack_gap(current->mm, start, end - start))
goto out;
We have prev just below your call here, so we could reuse that. Getting the vma after the mprotect range doesn't seem that easy. I guess we need to make the loop even more complicated and find the next vma (and remember the fixup can merge). This isn't as straight forward as what you have, but would be faster.
For mprotect, maybe one option would be to do it inside the loop? Something like this:
``` diff --git a/mm/mprotect.c b/mm/mprotect.c index d0e3ebfadef8..2873cc254eaf 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -790,6 +790,24 @@ static int do_mprotect_pkey(unsigned long start, size_t len, break; }
+ if (IS_ENABLED(CONFIG_STACK_GROWSUP) && vma->vm_start == start) { + /* just do an extra lookup here, we do this only on parisc */ + if (overlaps_stack_gap_growsup([...])) { + error = -ENOMEM; + break; + } + } + if (vma->vm_end == end) { + /* peek ahead */ + struct vma_iterator vmi_peek = vmi; + struct vm_area_struct *next = vma_next(&vmi_peek); + + if (next && overlaps_stack_gap_growsdown([...], next)) { + error = -ENOMEM; + break; + } + } + /* Does the application expect PROT_READ to imply PROT_EXEC */ if (rier && (vma->vm_flags & VM_MAYEXEC)) prot |= PROT_EXEC; ```
Assuming that well-behaved userspace only calls mprotect() ranges that are fully covered by VMAs, that should be good enough?
(I don't know how you feel about the idea of peeking ahead from a VMA iterator by copying the iterator, I imagine you might have a better way to do that...)
prev = vma_prev(&vmi); if (start > vma->vm_start) prev = vma;