Re: [PATCH v10 0/5] Introduce mseal

21 May 2024

TL;DR for Andrew (and to save his page down key):
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
* Jeff Xu <jeffxu@chromium.org> [240515 20:59]:
...
On Wed, May 15, 2024 at 3:19 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
...

* Jeff Xu <jeffxu@chromium.org> [240515 13:18]:

...
...
The current mseal patch does up-front checking in two different situations:
1. When applying mseal():
   checking for unallocated memory in the given memory range.
2. When checking the mseal flag during mprotect/unmap/remap/mmap:
   the mseal flag check is placed ahead of the main business logic and
   treated the same as the input argument checks.
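
For illustration only (this is not code from the patch set): the first up-front check, looking for unallocated memory in the requested range, can be sketched in userspace C as a walk over a sorted list of mappings. The names `toy_range` and `range_fully_mapped` are hypothetical, and the sorted array is an assumption standing in for the kernel's VMA tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mapping: the half-open range [start, end). */
struct toy_range {
    unsigned long start;
    unsigned long end;
};

/*
 * Return true only if [start, end) is covered by the mappings with no hole.
 * Assumes maps[] is sorted by start address and does not overlap.
 */
static bool range_fully_mapped(const struct toy_range *maps, size_t n,
                               unsigned long start, unsigned long end)
{
    unsigned long next = start; /* first address not yet known to be mapped */

    assert(start <= end);
    for (size_t i = 0; i < n && next < end; i++) {
        if (maps[i].end <= next)
            continue;           /* mapping lies entirely before the cursor */
        if (maps[i].start > next)
            return false;       /* hole between the cursor and this mapping */
        next = maps[i].end;     /* covered up to the end of this mapping */
    }
    return next >= end;         /* false if the mappings ran out early */
}
```

Because the walk finishes before anything is modified, a request that spans a hole can be rejected as a whole, in the same way as a bad input argument.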
...
Either we are planning to clean this up and do what we can up-front, or
just move the mseal check with the rest.  Otherwise we are making a
larger mess with more technical debt for a single user, and I think this
is not an acceptable trade-off.
The sealing use case is different from the regular mm API, and this
didn't create additional technical debt.  Please allow me to explain
those separately.
The main use case and threat model is that an attacker exploits a
vulnerability and has arbitrary write access to the process, and can
manipulate some arguments to syscalls from some threads. Placing the
checking of mseal flag ahead of mprotect main business logic is
stricter compared with doing it in-place. It is meant to be harder for
the attacker, e.g. blocking an opportunistic munmap attempt made
by modifying the size argument.
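
For illustration only (a userspace toy model, not kernel code and not from the patch set): the difference between the two placements can be sketched as below. `toy_vma`, `unmap_with_precheck` and `unmap_in_place` are hypothetical names, and the array stands in for the VMAs covered by an munmap whose size argument an attacker has enlarged to reach a sealed mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: only the two flags the sketch needs. */
struct toy_vma {
    bool sealed;
    bool mapped;
};

/* Up-front check: refuse the whole request if any VMA in range is sealed. */
static int unmap_with_precheck(struct toy_vma *v, size_t n)
{
    assert(v != NULL);
    for (size_t i = 0; i < n; i++)
        if (v[i].sealed)
            return -EPERM;      /* nothing has been modified yet */
    for (size_t i = 0; i < n; i++)
        v[i].mapped = false;
    return 0;
}

/* In-place check: test each VMA only when the loop reaches it. */
static int unmap_in_place(struct toy_vma *v, size_t n)
{
    assert(v != NULL);
    for (size_t i = 0; i < n; i++) {
        if (v[i].sealed)
            return -EPERM;      /* earlier VMAs are already unmapped */
        v[i].mapped = false;
    }
    return 0;
}

/* Count the VMAs that are still mapped after an attempt. */
static size_t count_mapped(const struct toy_vma *v, size_t n)
{
    size_t mapped = 0;

    for (size_t i = 0; i < n; i++)
        if (v[i].mapped)
            mapped++;
    return mapped;
}
```

With two unsealed VMAs followed by a sealed one, both variants return -EPERM, but the pre-check variant leaves all three mappings intact while the in-place variant has already unmapped the first two.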
If you can manipulate some arguments to syscalls, couldn't it avoid
having the VMA mseal'ed?
The mm sealing can be applied in advance. This type of approach is
common in sandboxers, e.g. setting up restrictive environments in advance.
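
For illustration only (a minimal userspace sketch, not from the thread): sealing a mapping in advance and then observing that a later mprotect() is refused might look like the following. It assumes a kernel that provides mseal() and that the syscall number is 462 where the libc headers do not define `__NR_mseal`; both are assumptions, and the helper name `seal_then_try_mprotect` is hypothetical. If the call is unavailable, the sketch reports that instead of failing.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_mseal
#define __NR_mseal 462 /* assumption: number from the generic syscall table */
#endif

/*
 * Returns 1 if a sealed page rejected mprotect() with EPERM,
 * 0 if mseal() is not usable in this environment,
 * -1 on any unexpected result.
 */
static int seal_then_try_mprotect(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;
    int ret;

    assert(page > 0);
    p = mmap(NULL, (size_t)page, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Seal early, before any untrusted input is processed. */
    if (syscall(__NR_mseal, p, (size_t)page, 0UL) != 0) {
        munmap(p, (size_t)page); /* not sealed, so unmapping still works */
        return 0;                /* older kernel, 32-bit, or filtered */
    }

    /* A later attempt to make the sealed page writable should fail. */
    errno = 0;
    ret = mprotect(p, (size_t)page, PROT_READ | PROT_WRITE);
    return (ret == -1 && errno == EPERM) ? 1 : -1;
}
```

Once the seal succeeds the page is deliberately left mapped, since a sealed mapping cannot be unmapped for the lifetime of the process.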
Thanks, this detail slipped my mind.
...
...
Again I don't care where the check goes - but having it happen alone is
pointless.
...
The legit app code won't call mprotect/munmap on sealed memory.  It is
irrelevant for both precheck and in-place check approaches, from a
legit app code point of view.
So let's do them together.
For the use case I describe in the threat model, precheck is a better
approach. Legit code doesn't care.
This is the case for other checks as well, but they're all done
together.
...
...
...
...
About tech debt: code-wise, placing the pre-check ahead of the main
business logic of the mprotect/munmap APIs reduces the size of the code
change and is easy to carry from release to release, or to backport.
It sounds like the other changes to the looping code in recent kernels
are going to mess up the backporting if we do this with the rest of the
checks.
What other changes do you refer to?
I backported V9 to 5.10 when I ran the performance test at your
request, and the backporting to 5.10 is relatively straightforward:
the mseal flag check is placed after the input argument checks and before
the main business logic.
The changes to the later looping code would complicate your backport.
94d7d9233951 ("mm: abstract the vma_merge()/split_vma() pattern for
mprotect() et al."), for example.
...
...
...
But let's compare the alternatives - doing it in-place without precheck.

munmap

munmap calls arch_unmap(mm, start, end) ahead of the main business logic,
so the sealing flag check would need to be architecture-specific. In
addition, if arch_unmap fails due to sealing, the code should
still proceed until the main business logic fails again.
You are going to mseal the vdso?
How is that relevant?
This is generally what arch_unmap() is checking, that's why I was
wondering if it would be affected.
...
To answer your question: I don't know at this moment.
The initial scope of libc change is sealing the RO/RX part during elf
loading, e.g. .text and .RELO
Right, this is for chrome in your usecase.
...
...
...

mremap/mmap

The check of sealing would be scattered, e.g. checking the src address
range in-place, dest range in-place, unmap in-place, etc. The code
is complex and prone to error.

mprotect/madvise

Easy to change to in-place.

mseal

mseal() checks for unallocated memory in the given memory range in the
pre-check. Easy to change to in-place (same as mprotect).
The situation in munmap and mremap/mmap makes in-place checks less desirable imo.
...
Considering the benchmarks that were provided, performance arguments
seem like they are not a concern.
Yes. Performance is not a factor in making a design choice on this.
...
I want to know if we are planning to sort and move existing checks if we
proceed with this change?
I would argue that we should not change the existing mm code. mseal is
new and has no backward-compatibility problem. That is not the case for
mprotect and other mm APIs. E.g. if we were to change mprotect to add a
precheck for memory gaps, some badly written applications might break.
This is a weak argument. Your new function may break these badly written
applications *if* gcc adds support.  If you're not checking the return
type then it doesn't really matter - the application will run into
issues rather quickly anyways.  The only thing that you could argue is
the speed - but you've proven that false.
The point I raised here is that there is a risk in modifying the mm API's
established behavior. The kernel doesn't usually make this kind of
behavior change.
Sure, but we have security checks happening later and they can fail 1/2
way through.  Although, depending on the 1/2 success is an application
bug and means the application is not portable.  This was my main reason
for requesting this check be placed with the rest, as we are now
treating mseal() as a special case among even security features.
Some of the existing checks add unnecessary complications to keep them
together, unfortunately.  Your addition of a loop prior to making the
changes means we can probably simplify some of these checks by
generalizing the loop in future patches.
...
mm sealing is new functionality, so I think applications will need to
opt in, e.g. allow the dynamic linker to seal .text.
...
...
The 'atomic' approach is also really difficult to enforce across the whole
MM area, and mseal() doesn't claim to be atomic. Most regular mm APIs might
go deeper into mm data structures to update page tables and HW, etc. Handling
those error cases requires rollback and has a performance cost. I'm not
sure if the benefit is worth the cost. However, atomicity is another
topic to discuss, unrelated to mm sealing.  The current design of mm
sealing is due to its use case and practical coding reasons.
"best effort" is what I'm saying.  It's actually not really difficult to
do atomic, but no one cares besides Theo.
OK, if you strongly believe in 'atomic' or 'best effort atomic',
whatever it is, consider sending a patch and getting feedback from the
community?
Sounds good.  This will probably happen over time.
...
...
How hard is it to put userfaultfd into your loop and avoid having that
horrible userfaultfd in munmap?  For years people see horrible failure
paths and just dump in a huge comment saying "but it's okay because it's
probably not going to happen".  But now we're putting this test up
front, and doing it alone - Why?
As a summary of why:

- The use case: it makes it harder for attackers to modify memory
  opportunistically.
- Code: less and simpler code change.

Fair enough.  Thank you for providing the arguments for each up-front
check vs embedding them. I didn't want to hold up your feature for so
long and I appreciate you taking the time to respond to my questions on
your decisions.  Apologies for kicking the hornet's nest on this one.
I think, in the future, we can use your forward loop to clean up some of
the design decisions of the past - ideally by choice and not by CVE
forced changes.  Hopefully having both pre and inter-loop checks won't
mean one will be missed when altering these code paths.
Thanks,
Liam
