Hi, Jiaqi,
On Fri, May 19, 2023 at 08:04:09AM -0700, Jiaqi Yan wrote:
> I don't think CAP_ADMIN is something we can work around: a VMM must be a good citizen to avoid introducing any vulnerability to the host or guest.
>
> On the other hand, "Userfaults allow the implementation of on-demand paging from userland and more generally they allow userland to take control of various memory page faults, something otherwise only the kernel code could do." [3]. I am not familiar with the UFFD internals, but our use case seems to match what UFFD wants to provide: without affecting the whole world, give a specific userspace (without CAP_ADMIN) the ability to handle page faults (indirectly emulate a HWPOISON page (in my mind I treat it as SetHWPOISON(page) + TestHWPOISON(page) operation in kernel's PF code)). So is it fair to say what Axel provided here is "provide !ADMIN somehow"?
The keyword in userfault is "user", IMHO. We don't strictly need userfault to resolve anything regarding CAP_ADMIN problems. MADV_DONTNEED doesn't need CAP_ADMIN either, and the same would hold for any new madvise() if we wanted to make it useful for injecting poisoned ptes with !ADMIN, as long as it is limited to current->mm.
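To illustrate the MADV_DONTNEED point, here is a minimal sketch, assuming Linux and a private anonymous mapping. The helper name dontneed_drops_private_page() is made up for this example. It only shows that an ordinary process can drop pages of its own mm with no special capability; it does not inject a poisoned pte, which is what a new madvise() would have to do.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (name is an assumption, not from the thread).
 * Returns 0 if an unprivileged MADV_DONTNEED dropped a page of the
 * caller's own mm, -1 on any failure.
 */
int dontneed_drops_private_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t len = (size_t)page_size;

    /* Private anonymous memory: only this process's mm is affected. */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 0xab; /* fault the page in and dirty it */

    int ret = -1;
    /*
     * No capability is needed here. On Linux, the next access to a
     * private anonymous page after MADV_DONTNEED sees a zero-filled page.
     */
    if (madvise(p, len, MADV_DONTNEED) == 0 && p[0] == 0)
        ret = 0;

    munmap(p, len);
    return ret;
}
```

Calling dontneed_drops_private_page() from an unprivileged process should return 0, since the operation never reaches outside current->mm.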
But I think you're right that userfaultfd has always tried to avoid requiring ADMIN and to keep everything within its own scope of permissions.
So again, I have no objection at all to making it uffd-specific for now if you are all happy with it. Just to be clear, though: to me that is mostly about avoiding another WAKE, and afaics it is not really about solving the ADMIN issue here.
Thanks,