Stephen Röttger <sroettger@google.com> wrote:
I do like us starting with just "mimmutable()", since it already exists. Particularly if chrome already knows how to use it.
Maybe add a flag field (require it to be zero initially) just to allow any future expansion. Maybe the chrome team has *wanted* to have some finer granularity thing and currently doesn't use mimmutable() in some case?
Yes, we do have a use case in Chrome to split the sealing into unmap and mprotect which will allow us to seal additional pages that we can't seal with pure mimmutable().
For example, we have pkey-tagged RWX memory that we want to seal. Since the memory is already RWX and the pkey controls write access, we don't care about permission changes but sometimes we do need to mprotect data only pages.
Let me try to decompose this statement.
This is clearly for the JIT. You can pivot a JIT-generated code mapping between RW and RX (or X-only); the object pivots between W and X to satisfy a W^X policy for safety.
I think you are talking about a RWX MAP_ANON object.
Then you use pkey_alloc() to get a PKEY. pkey_mprotect() sets the PKEY on the region. I argue you can then make it entirely immutable / sealed.
Let's say it is fully immutable / sealed.
After which, you can change the in-processor PKU register (using pkey_set) to toggle the Write-Inhibit and eXecute-Inhibit bits on that key.
The immutable object has a dangerous RWX permission, but the specific PKEY makes it either RX (or X-only) or RW depending upon your context. The mapping is never exposed as RWX. The PKU model reduces the accessible permissions of the object below the immutable permission level.
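To make the sequence above concrete, here is a minimal sketch using the glibc pkey wrappers on Linux. The helper names (jit_region_create, jit_begin_write, jit_end_write) are made up for illustration and are not Chrome's. The sealing step is left out because its interface is what this thread is still debating. Note that the rights Linux exposes per key are PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE, so this sketch only toggles write access.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes RWX, tag them with a fresh pkey,
 * and start with writes disabled for that key.
 * Returns the pkey (>= 0) on success, or -1 if the mapping or the pkey
 * could not be set up (for example, no PKU support on this machine).
 */
int jit_region_create(size_t len, void **out)
{
    assert(out != NULL);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* New key, initially write-disabled for the calling thread. */
    int pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
    if (pkey < 0) {
        munmap(p, len);
        return -1;
    }

    /* The page permissions stay RWX; only the key tag is attached. */
    if (pkey_mprotect(p, len, PROT_READ | PROT_WRITE | PROT_EXEC, pkey) != 0) {
        pkey_free(pkey);
        munmap(p, len);
        return -1;
    }

    /* A seal of [p, p + len) would go here, once such a call exists. */
    *out = p;
    return pkey;
}

/* Open a write window for the current thread: clear the key's disable bits. */
int jit_begin_write(int pkey)
{
    return pkey_set(pkey, 0);
}

/* Close the write window again. */
int jit_end_write(int pkey)
{
    return pkey_set(pkey, PKEY_DISABLE_WRITE);
}
```

A JIT would call jit_begin_write() before emitting code and jit_end_write() right after. The mapping itself is never touched by mprotect() in this scheme, which is why it could be sealed.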
The security depends on the PKEY WI/XI bits being difficult to control.
SADLY on x86, this is managed with a PKRU userland register which is changeable without any supervisor control -- yes, it seems quite dangerous. Changing it requires a complicated MSR dance. It is unfortunate that the pkey_set() library function is easily reachable in the PLT via ROP methods. On non-x86 CPUs that have similar functionality, the register is privileged, but operating systems supporting it generally change it and return immediately.
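For reference, a sketch of the arithmetic that a pkey_set()-style function performs on x86, as I understand the register layout: PKRU holds two bits per key, with key i occupying bits 2i and 2i+1. The macro and function names here are my own. The actual register write (the unprivileged WRPKRU instruction) is deliberately left out, since it faults on CPUs without PKU enabled.

```c
#include <assert.h>
#include <stdint.h>

/* Per-key bits, named after the Linux rights PKEY_DISABLE_ACCESS/WRITE. */
#define PKRU_AD 0x1u  /* access-disable: blocks data reads and writes */
#define PKRU_WD 0x2u  /* write-disable: blocks data writes only */

/*
 * Hypothetical helper: return `pkru` with the 2-bit field of `key`
 * replaced by `rights`. All other keys are left untouched.
 */
uint32_t pkru_with_rights(uint32_t pkru, unsigned key, uint32_t rights)
{
    assert(key < 16);                    /* x86 has 16 protection keys */
    unsigned shift = 2u * key;
    pkru &= ~(UINT32_C(3) << shift);     /* clear the old rights */
    pkru |= (rights & UINT32_C(3)) << shift;
    return pkru;
}
```

The point is how little is involved: anything that can compute this value and reach the register write can reopen the write window.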
The problem you seem to have with fully locked mseal() in chrome seems to be here:
about permission changes but sometimes we do need to mprotect data only pages.
Does that data have to be in the same region? Can your allocator not put the non-code pieces of the JIT elsewhere, with a different permission, fully immutable / msealed -- and perhaps even managed with a different PKEY if necessary?
Maybe that requires a huge rearchitecture. But isn't the root problem here that data and code are being handled in the same object with a shared permission model?
But the munmap sealing will provide protection against implicit changes of the pkey in this case which would happen if a page gets unmapped and another mapped in its place.
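The "another mapped in its place" case can be shown without any pkey at all. This is a small sketch (the function name is mine): an mmap() with MAP_FIXED over an existing page silently discards the old mapping and installs a fresh one at the same address. On Linux the fresh mapping starts out with default attributes, so as far as I know a pkey tag set earlier with pkey_mprotect() would not carry over.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo: write a marker into a page, replace the page in
 * place with MAP_FIXED, and return the first byte seen afterwards.
 * Returns -1 if a mapping could not be created.
 */
int replace_mapping_demo(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    p[0] = 0xAA;                         /* marker in the old mapping */

    /* Same address, new mapping: the old page is implicitly unmapped. */
    unsigned char *q = mmap(p, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (q == MAP_FAILED) {
        munmap(p, len);
        return -1;
    }
    assert(q == p);                      /* MAP_FIXED keeps the address */

    int first = q[0];                    /* fresh anonymous page: zero-filled */
    munmap(q, len);
    return first;
}
```

The marker is gone after the second mmap(), with no munmap() or mprotect() call by the program. A seal against unmapping is meant to make that second call fail.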
That primitive feels so weird, I have a difficult time believing it will remain unattackable in the long term.
But what if you could replace mprotect() with pkey_mprotect() upon a different key.. ?
--
A few more notes on what OpenBSD has done compared to Linux:
In OpenBSD, we do not have the pkey library. We have stolen one of the PKEYs and use it for kernel support of xonly for kernel code and userland code. On x86 we recognize that userland can flip the permission by whacking the PKRU register -- which would make xonly code readable. (The chrome data you are trying to guard faces the same problem.)
To prevent that, a majority of traps in the kernel (page faults, interrupts, etc) check whether the PKRU register has been modified and, if so, kill the process. It is statistically strong.
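A toy model of that check, to show what "statistically" means here. This is not OpenBSD source; the function name and the idea of passing the PKRU values seen at successive traps as an array are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: `pkru_at_trap[i]` is the PKRU value the kernel
 * observes on entry to the i-th trap. Returns the index of the first
 * trap at which the value differs from `expected` (the point where the
 * process would be killed), or -1 if no trap sees a modified value.
 */
long first_detecting_trap(const uint32_t *pkru_at_trap, size_t n,
                          uint32_t expected)
{
    assert(pkru_at_trap != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pkru_at_trap[i] != expected)
            return (long)i;
    }
    return -1;
}
```

In this model a modification that is made and undone entirely between two traps is never observed, so detection depends on a trap landing inside the window.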
We are not making pkey available as a userland feature, but if we later do so we would still have 15 pkeys to play with. We would probably make the pkey_set() operation a system call, so the trap handler can also observe PKRU register modifications by the instruction.
Above, I mentioned pivoting between "RW or RX (or X-only)". On OpenBSD, chrome would be able to pivot between RW and X-only.
When it comes to pkey utilization, we've ended up in a very different place than Linux.