On Thu, May 18, 2023 at 2:43 PM Dave Hansen dave.hansen@intel.com wrote:
On 5/15/23 06:05, jeffxu@chromium.org wrote:
+static inline int __arch_check_vma_pkey_for_write(struct vm_area_struct *vma) +{
int pkey = vma_pkey(vma);
if (mm_pkey_enforce_api(vma->vm_mm, pkey)) {
if (!__pkru_allows_write(read_pkru(), pkey))
return -EACCES;
}
return 0;
+}
Please think very carefully about what I'm about to say:
What connects vma->vm_mm to read_pkru() here?
Now think about what happens when we have kthread_use_mm() or a ptrace() doing get_task_mm() and working on another process's mm or VMA.
Look at arch_vma_access_permitted() and notice how it avoids read_pkru() for 'foreign' aka. 'remote' accesses:
static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write, bool execute, bool foreign) {
...
if (foreign || vma_is_foreign(vma)) return true; return // check read_pkru()
}
In other words, it lets all remote accesses right through. That's because there is *NOTHING* that fundamentally and tightly connects the PKRU value in this context to the VMA or the context that initiated this operation.
If your security model depends on PKRU protection, this 'remote' disconnection is problematic. The PKRU enforcement inside the kernel is best-effort. That usually doesn't map into the security space very well.
Do you have a solid handle on all call paths that will reach __arch_check_vma_pkey_for_write() and can you ensure they are all non-remote?
Is this about the attack scenario where the attacker uses ptrace() into the chrome process ? if so it is not in our threat model, and that is more related to sandboxing on the host.
Or is this about io_uring? Yes, io_uring kernel thread breaks our expectations of PKRU & user space threads, however I thought the break is not just for this - any syscall involved in memory operation will break after into io_uring ?
Other than those, yes, I try to ensure the check is only used at the beginning of syscall entry in all cases, which should be non-remote I hope.
Thanks -Jeff