Jeff Xu <jeffxu@google.com> wrote:
> May I ask, for BSD's implementation of immutable(), do you cover things such as mlock() and madvise(), or just the protection bits (WRX) + remap() + unmap()?
It only prevents removal of the mapping, placement of a replacement mapping, or changing the existing permissions. If one page in the existing sub-region is marked immutable, the whole operation fails with EPERM.
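To make the "whole operation fails" behaviour concrete, here is a minimal user-space sketch in C. It is not OpenBSD code: struct toy_entry, toy_range_immutable() and toy_protect() are hypothetical names invented for illustration, and the toy changes whole entries rather than splitting them at the range boundaries. It only assumes what is said above: the range is checked first, and one immutable piece is enough to reject the request with EPERM.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a VA sub-region; not the OpenBSD structure. */
struct toy_entry {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	int prot;		/* toy permission bits */
	int immutable;		/* nonzero once locked */
};

/* Return 1 if any entry overlapping [start, end) is immutable. */
static int
toy_range_immutable(const struct toy_entry *map, size_t n,
    unsigned long start, unsigned long end)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (map[i].start < end && start < map[i].end &&
		    map[i].immutable)
			return 1;
	}
	return 0;
}

/*
 * Toy permission change.  The whole range is checked before anything
 * is modified, so a failure leaves every entry untouched.
 * Returns 0 on success or EPERM.
 */
static int
toy_protect(struct toy_entry *map, size_t n,
    unsigned long start, unsigned long end, int prot)
{
	size_t i;

	assert(start < end);
	if (toy_range_immutable(map, n, start, end))
		return EPERM;
	for (i = 0; i < n; i++) {
		if (map[i].start < end && start < map[i].end)
			map[i].prot = prot;
	}
	return 0;
}
```

With two adjacent entries where only the second is immutable, a request covering both returns EPERM and the first entry keeps its old permissions; a request covering only the first entry succeeds.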
Those are the only user-visible aspects that an attacker would care to utilize in this area.
mlock() and madvise() deal with the physical memory handling underneath the VA. They have nothing to do with how attack code might manipulate the VA address space inside a program to convert a series of dead-end approaches into a successful escalation strategy.
[It would be a very long conversation to explain where and how this has been utilized to make an attack successful]
> In other words: Is BSD's definition of immutable equivalent to MM_SEAL_MPROTECT|MM_SEAL_MUNMAP|MM_SEAL_MREMAP|MM_SEAL_MMAP of this patch set?
I can't compare it to your subsystem, because I completely fail to understand the cause or benefit of all the complexity.
I think I've explained what mimmutable() is in extremely simple terms.
And I don't understand why you are trying to do anything beyond what mimmutable() offers. It seems like this is inventing additional solutions without proof that any of them are necessary to solve the specific problem that is known.
> I hesitate to introduce the concept of immutable into Linux because I don't know all the scenarios present in Linux where a VMA's metadata can be modified.
Good grief. It seems obvious: if you want to lock the change-behaviour of an object (the object in this case being a VA sub-region; there is a data structure for that, in OpenBSD it is called an "entry"), then you put a flag in that object's data structure and you simply check the flag every time a change-operation is attempted. It is a flag which gets set, and checked. Nothing ever clears it (except address space teardown).
This flag must be put on the data structure that manages VA sub-ranges.
In our case when a prot/mapping operation reaches low-level code that will want to change an "entry", we notice it is not allowed and simply percolate EPERM up through the layers.
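As a rough illustration of "a flag which gets set, and checked", and of the error percolating up, here is a small user-space sketch in C. Everything in it (struct sketch_entry, SKETCH_IMMUTABLE, the sketch_* functions and the three-layer split) is hypothetical and invented for this example; it is not the OpenBSD implementation, only a model of the shape described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_IMMUTABLE 0x1u

/* Hypothetical stand-in for the structure managing a VA sub-range. */
struct sketch_entry {
	int prot;
	unsigned int flags;
};

/* Set the flag.  There is deliberately no function that clears it. */
static void
sketch_mark_immutable(struct sketch_entry *e)
{
	e->flags |= SKETCH_IMMUTABLE;
}

/* Low-level layer: the only place that changes an entry. */
static int
sketch_entry_change(struct sketch_entry *e, int prot)
{
	if (e->flags & SKETCH_IMMUTABLE)
		return EPERM;
	e->prot = prot;
	return 0;
}

/* Middle layer: passes the error up unchanged. */
static int
sketch_map_protect(struct sketch_entry *e, int prot)
{
	return sketch_entry_change(e, prot);
}

/* Top layer, shaped like a syscall: -1 and errno on failure. */
static int
sketch_sys_mprotect(struct sketch_entry *e, int prot)
{
	int error;

	assert(e != NULL);
	error = sketch_map_protect(e, prot);
	if (error != 0) {
		errno = error;
		return -1;
	}
	return 0;
}
```

The point of the layering is that only the lowest function knows about the flag; the callers above it need no special cases, they just return whatever error they were handed.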
> There could be quite a few things we still need to deal with, to completely block the possibility, e.g. malicious code attempting to write to RO memory
What?! Writes to RO memory are blocked by the permission bits.
> or change RW memory to RWX.
In our case that is blocked by W^X policy.
But if the region is marked mimmutable, then that's another reason you cannot change RW to RWX. It seems so off-topic, to talk about writes to RO memory. I get a feeling you are a bit lost.
mimmutable() is not about permissions, but about locking permissions.
 - You can't change the permissions of the address space region.
 - You cannot map a replacement object at the location instead (especially with different permission).
 - You cannot unmap at that location (which you would do if you wanted to map a new object, with a different permission).
All 3 of these scenarios are identical. No regular code performs these 3 operations on regions of the address space which we mark immutable.
There is nothing more to mimmutable in the VM layer. The hard work is writing code in execve() and ld.so which will decide which objects can be marked immutable automatically, so that programs don't do this to themselves.
I'm aware of where this simple piece fits in. It does not solve all problems, it is a very narrow change to impact a problem which only high-value targets will ever face (like chrome).
But I think you don't understand the purpose of this mechanism.
> If, as part of immutable, I also block madvise() and mlock(), which also update a VMA's metadata, then by definition I could. But what if the user wants the features in madvise() and, at the same time, also wants their .text protected?
I have no idea what you are talking about. None of those things relate to the access permission of the memory the user sees, and therefore none of them are in the attack surface profile which is being prevented.
Meaning, we allow madvise() and mlock() and mphysicalquantummemory() because those relate to the physical storage and not the VA permission model.
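The split between the VA permission model and the physical storage underneath can be sketched like this in C. It is a toy model only: struct demo_entry and the demo_* functions are hypothetical names made up for illustration, and the fields are a simplification. The single assumption is the one stated above, that the immutable check guards the permission-changing path and not the paths that affect physical backing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical entry: one field the program sees, two it does not. */
struct demo_entry {
	int prot;	/* VA permissions, visible to the program */
	int advice;	/* paging hint for the physical backing */
	int wired;	/* nonzero if the pages are locked in memory */
	int immutable;	/* nonzero once the entry is locked */
};

/* Permission change: guarded by the immutable check. */
static int
demo_protect(struct demo_entry *e, int prot)
{
	assert(e != NULL);
	if (e->immutable)
		return EPERM;
	e->prot = prot;
	return 0;
}

/* madvise()-like hint: no immutable check, prot is never touched. */
static int
demo_advise(struct demo_entry *e, int advice)
{
	assert(e != NULL);
	e->advice = advice;
	return 0;
}

/* mlock()-like wiring: no immutable check, prot is never touched. */
static int
demo_wire(struct demo_entry *e)
{
	assert(e != NULL);
	e->wired = 1;
	return 0;
}
```

On an immutable entry, demo_advise() and demo_wire() still succeed while demo_protect() returns EPERM, and in every case the permissions the program sees stay exactly as they were.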
> Also, if Linux introduces a new syscall that depends on new VMA metadata, say msecret() (for discussion purposes), should immutable automatically support that?
How about the future makingexcuses() system call?
I don't think you understand the problem space well enough to come up with your own solution for it. I spent a year on this, and ship a complete system using it. You are asking such simplistic questions above that it shocks me.
Maybe read the LWN article;
https://lwn.net/Articles/915640/
> Without those questions answered, I couldn't choose the route of immutable() yet.
"... so I can clearly not choose the wine in front of you."
If you don't understand what this thing is for, and cannot minimize the complexity of this thing, then Linux doesn't need it at all.
I should warn everyone the hard work is not in the VM layer, but in ld.so -- deciding which parts of the image to make immutable, and when. It is also possible to make some segments immutable directly in execve() -- but in both cases you better have a really good grasp on RELRO executable layout or you will make too many pieces immutable...
I am pretty sure Linux will never get as far as we got. Even our main stacks are marked immutable, but in Linux that would conflict with glibc ld.so mprotecting RWX the stack if you dlopen() a shared library with GNUSTACK, a very bad idea which needs a different fight...