On Wed, May 17, 2023 at 12:41 AM Dave Hansen <dave.hansen@intel.com> wrote:
On 5/16/23 00:06, Stephen Röttger wrote:
On Mon, May 15, 2023 at 4:28 PM Dave Hansen <dave.hansen@intel.com> wrote:
On 5/15/23 06:05, jeffxu@chromium.org wrote:
We're using PKU for in-process isolation to enforce control-flow integrity for a JIT compiler. In our threat model, an attacker exploits a vulnerability and has arbitrary read/write access to the whole process address space while other threads keep executing concurrently. This attacker can manipulate some arguments to syscalls from some threads.
This all sounds like it hinges on the contents of PKRU in the attacker thread.
Could you talk a bit about how the attacker is prevented from running WRPKRU, XRSTOR or compelling the kernel to write to PKRU like at sigreturn?
Since we're using the feature for control-flow integrity, we assume the control-flow is still intact at this point. I.e. the attacker thread can't run arbitrary instructions.
Can't run arbitrary instructions, but can make (pretty) arbitrary syscalls?
The threat model is that the attacker has arbitrary read/write, while other threads run in parallel. So whenever a regular thread performs a syscall and takes a syscall argument from memory, we assume that argument can be attacker controlled. Unfortunately, the line is a bit blurry as to which syscalls / syscall arguments we need to assume to be attacker controlled. We're trying to approach this by roughly categorizing syscalls+args:
* how commonly used is the syscall
* do we expect the argument to be taken from writable memory
* can we restrict the syscall+args with seccomp
* how difficult is it to restrict the syscall in userspace vs kernel
* does the syscall affect our protections (e.g. change control-flow or pkey)
Using munmap as an example:
* it's a very common syscall (nearly every seccomp filter will allow munmap)
* the addr argument will come from memory
* unmapping pkey-tagged pages breaks our assumptions
* it's hard to restrict in userspace since we'd need to keep track of all address ranges that are unsafe to unmap and hook the syscall to perform the validation on every call in the codebase
* it's easy to validate in kernel with this patch
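To make the userspace option in the munmap example concrete, here is a minimal sketch of the bookkeeping it implies. Everything in it (the `protected_ranges` table, `protect_range`, `overlaps_protected`, `checked_munmap`) is a hypothetical illustration, not code from the patch or from any existing project. It assumes a small fixed-size table and ignores locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical registry of ranges that must never be unmapped. */
#define MAX_PROTECTED 16

struct range {
    uintptr_t start;
    size_t len;
};

static struct range protected_ranges[MAX_PROTECTED];
static size_t protected_count;

/* Record a pkey-tagged range; returns 0 on success, -1 if the table is full. */
static int protect_range(uintptr_t start, size_t len)
{
    assert(len > 0);
    if (protected_count == MAX_PROTECTED)
        return -1;
    protected_ranges[protected_count].start = start;
    protected_ranges[protected_count].len = len;
    protected_count++;
    return 0;
}

/* Return 1 if [addr, addr + len) overlaps any recorded range, else 0. */
static int overlaps_protected(uintptr_t addr, size_t len)
{
    for (size_t i = 0; i < protected_count; i++) {
        uintptr_t start = protected_ranges[i].start;
        uintptr_t end = start + protected_ranges[i].len;
        if (addr < end && start < addr + len)
            return 1;
    }
    return 0;
}

/* Wrapper that every munmap call site in the codebase would have to use. */
static int checked_munmap(void *addr, size_t len)
{
    if (overlaps_protected((uintptr_t)addr, len)) {
        errno = EPERM;
        return -1;
    }
    return munmap(addr, len);
}
```

Even in this simplified form the difficulty is visible: the table itself would have to live in memory the attacker cannot write, the check and the syscall are not atomic, and any call site that bypasses the wrapper defeats it.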
Most other syscalls either don't affect the control-flow, are easy to avoid and block with seccomp, or can be validated in userspace (e.g. by only installing signal handlers at program startup).
- For JIT code, we're going to scan it for wrpkru instructions before writing it to executable memory
... and XRSTOR, right?
Right. We’ll just have a list of allowed instructions that the JIT compiler can emit.
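As an illustration of the scanning step, the sketch below rejects a code buffer if any byte offset could decode as WRPKRU (`0F 01 EF`) or as XRSTOR (`0F AE /5` with a memory operand). The function name `contains_pkru_write` is hypothetical, and this is a deny-list byte scan rather than the allow-list of emitted instructions described above; it only covers these two encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if the buffer contains bytes that could decode as WRPKRU or
 * XRSTOR starting at any offset, 0 otherwise. Scanning every offset is
 * deliberately conservative: x86 code can be entered mid-instruction.
 */
static int contains_pkru_write(const uint8_t *code, size_t len)
{
    assert(code != NULL || len == 0);
    for (size_t i = 0; i + 2 < len; i++) {
        if (code[i] != 0x0f)
            continue;
        /* WRPKRU: 0F 01 EF */
        if (code[i + 1] == 0x01 && code[i + 2] == 0xef)
            return 1;
        /* XRSTOR: 0F AE with ModRM.reg == 5 and a memory operand.
         * ModRM.mod == 3 with reg == 5 is LFENCE, which is harmless. */
        if (code[i + 1] == 0xae) {
            uint8_t modrm = code[i + 2];
            uint8_t mod = modrm >> 6;
            uint8_t reg = (modrm >> 3) & 7;
            if (mod != 3 && reg == 5)
                return 1;
        }
    }
    return 0;
}
```

A real check would also need to consider sequences that span the boundary between separately written chunks, since the scan only sees one buffer at a time.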
- For regular code, we only use wrpkru around short critical sections to temporarily enable write access
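For reference, the PKRU values involved in such a critical section can be sketched as plain bit arithmetic. PKRU holds two bits per key: bit 2k is access-disable (AD) and bit 2k+1 is write-disable (WD). The helper names below are hypothetical; actually loading the value requires WRPKRU (glibc's `pkey_set` is one existing wrapper), which this sketch deliberately leaves out so that it runs on hardware without PKU.

```c
#include <assert.h>
#include <stdint.h>

/* PKRU layout: two bits per protection key, 16 keys in total. */
#define PKRU_AD(key) (UINT32_C(1) << (2 * (key)))     /* access disable */
#define PKRU_WD(key) (UINT32_C(1) << (2 * (key) + 1)) /* write disable */

/* Value to load when entering the critical section: full access to key. */
static uint32_t pkru_allow_write(uint32_t pkru, int key)
{
    assert(key >= 0 && key < 16);
    return pkru & ~(PKRU_AD(key) | PKRU_WD(key));
}

/* Value to load when leaving it: reads stay allowed, writes are denied. */
static uint32_t pkru_deny_write(uint32_t pkru, int key)
{
    assert(key >= 0 && key < 16);
    return pkru | PKRU_WD(key);
}
```

The shorter the window between the two loads, the less time another thread's corrupted data has to be written through the temporarily writable mapping by the thread holding the permission.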
Sigreturn is a separate problem that we hope to solve by adding pkey support to sigaltstack
What kind of support were you planning to add?
We’d like to allow registering pkey-tagged memory as a sigaltstack. This would allow the signal handler to run isolated from other threads. Right now, the main reason this doesn’t work is that the kernel would need to change the PKRU state before storing the register state on the stack.
I was thinking that an attacker with arbitrary write access would wait until PKRU was on the userspace stack and *JUST* before the kernel sigreturn code restores it to write a malicious value. It could presumably do this with some asynchronous mechanism so that even if there was only one attacker thread, it could change its own value.
I’m not sure I follow the details, can you give an example of an asynchronous mechanism to do this? E.g. would this be the kernel writing to the memory in a syscall for example?
Also, the kernel side respect for PKRU is ... well ... rather weak. It's a best effort and if we *happen* to be in a kernel context where PKRU is relevant, we can try to respect PKRU. But there are a whole bunch of things like get_user_pages_remote() that just plain don't have PKRU available and can't respect it at all.
I think io_uring also greatly expanded how common "remote" access to process memory is.
So, overall, I'm thrilled to see another potential user for pkeys. It sounds like there's an actual user lined up here, which would be wonderful. But, I also want to make sure we don't go to the trouble to build something that doesn't actually present meaningful, durable obstacles to an attacker.
I also haven't more than glanced at the code.