On Fri, Nov 9, 2018 at 9:41 PM Andy Lutomirski luto@amacapital.net wrote:
On Nov 9, 2018, at 1:06 PM, Jann Horn jannh@google.com wrote:
+linux-api for API addition +hughd as FYI since this is somewhat related to mm/shmem
On Fri, Nov 9, 2018 at 9:46 PM Joel Fernandes (Google) joel@joelfernandes.org wrote:
Android uses ashmem for sharing memory regions. We are looking forward to migrating all use cases of ashmem to memfd so that we can possibly remove the ashmem driver from staging in the future, while also benefiting from using memfd and contributing to it. Note that staging drivers are also not ABI and can generally be removed at any time.
One of the main use cases Android has is the ability to create a region and mmap it as writeable, then add protection against making any "future" writes while keeping the existing, already mmap'ed writeable region active. This allows us to implement a use case where receivers of the shared memory buffer can get a read-only view, while the sender continues to write to the buffer.
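To illustrate why the existing seals do not cover this use case, here is a minimal sketch (not part of the patch) showing that F_SEAL_WRITE is refused while a shared writable mapping is still alive. The function name and the 4096-byte size are illustrative assumptions; the syscall and fcntl commands are the existing memfd sealing API.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif

/*
 * Hypothetical helper: tries to add F_SEAL_WRITE while the creator still
 * holds a shared writable mapping.
 * Returns the errno reported by fcntl(), 0 if the seal was accepted,
 * or -1 if the setup itself failed.
 */
int seal_write_with_live_mapping(void)
{
    int fd = syscall(SYS_memfd_create, "demo", MFD_ALLOW_SEALING);
    if (fd == -1)
        return -1;

    /* 4096 is an arbitrary illustrative size. */
    if (ftruncate(fd, 4096) == -1) {
        close(fd);
        return -1;
    }

    /* The sender's long-lived writable view of the buffer. */
    void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    int result = 0;
    if (fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) == -1)
        result = errno; /* EBUSY is expected: a writable mapping exists */

    munmap(p, 4096);
    close(fd);
    return result;
}
```

In other words, with the current seals the sender has to give up its own writable mapping before it can seal, which is exactly what this use case wants to avoid.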
Oh I remember trying this years ago with a new seal, F_SEAL_WRITE_PEER, or something like that.
So you're fiddling around with the file, but not the inode? How are you preventing code like the following from re-opening the file as writable?
$ cat memfd.c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <printf.h>
#include <fcntl.h>
#include <err.h>
#include <stdio.h>

int main(void) {
  int fd = syscall(__NR_memfd_create, "testfd", 0);
  if (fd == -1) err(1, "memfd");
  char path[100];
  sprintf(path, "/proc/self/fd/%d", fd);
  int fd2 = open(path, O_RDWR);
  if (fd2 == -1) err(1, "reopen");
  printf("reopen successful: %d\n", fd2);
}
$ gcc -o memfd memfd.c
$ ./memfd
reopen successful: 4
$
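For contrast, the existing seals are stored on the inode, so a reopened fd does not bypass them. A minimal sketch of that behaviour, assuming /proc is mounted; the helper name is hypothetical and this is only an illustration of the current F_SEAL_WRITE semantics, not of the proposed change:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif

/*
 * Hypothetical helper: seals the memfd through the original fd, reopens it
 * O_RDWR through /proc/self/fd, and tries to write through the new fd.
 * Returns the errno of that write(), 0 if the write succeeded,
 * or -1 if the setup itself failed.
 */
int write_via_reopened_fd_after_seal(void)
{
    int fd = syscall(SYS_memfd_create, "demo", MFD_ALLOW_SEALING);
    if (fd == -1)
        return -1;

    /* No writable mappings exist yet, so the seal is accepted. */
    if (fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) == -1) {
        close(fd);
        return -1;
    }

    char path[64];
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);

    /* The reopen itself still succeeds, as in the test program above. */
    int fd2 = open(path, O_RDWR);
    if (fd2 == -1) {
        close(fd);
        return -1;
    }

    int result = 0;
    if (write(fd2, "x", 1) == -1)
        result = errno; /* EPERM is expected: the seal lives on the inode */

    close(fd2);
    close(fd);
    return result;
}
```

A protection that lives only on the struct file would not have this property, which is the concern behind the question above.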
The race condition between memfd_create and applying seals via fcntl? I think it would be possible to block new write mappings from peer processes if there were a new memfd_create API that accepts seals, allowing the caller to set a seal like the one I proposed years ago, but in a race-free manner. We would then also need to consider how to properly handle blocking inherited +W mappings through clone/fork. Maybe I'm forgetting some other pitfalls?
That aside: I wonder whether a better API would be something that allows you to create a new readonly file descriptor, instead of fiddling with the writability of an existing fd.
Every now and then I try to write a patch to prevent using /proc to reopen a file with greater permissions than the original open.
I like your idea to have a clean way to reopen a memfd with reduced permissions. But I would make it a syscall instead and maybe make it only work for memfd at first. And the proc issue would need to be fixed, too.
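A rough sketch of why both pieces are needed, assuming /proc is mounted and the behaviour of the kernels discussed in this thread: a read-only fd can already be obtained by reopening through /proc, but the holder of that fd can upgrade it again the same way. The helper name is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: derives a read-only fd from a memfd, then tries to
 * turn that read-only fd back into a writable one through /proc/self/fd.
 * Returns 1 if the read-only fd rejects writes with EBADF and yet can be
 * reopened O_RDWR and written to, 0 otherwise, -1 if the setup failed.
 */
int readonly_reopen_can_be_upgraded(void)
{
    int fd = syscall(SYS_memfd_create, "demo", 0);
    if (fd == -1)
        return -1;

    char path[64];
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);

    /* Step 1: a new file description with reduced permissions. */
    int ro = open(path, O_RDONLY);
    if (ro == -1) {
        close(fd);
        return -1;
    }
    int ro_rejects_write = (write(ro, "x", 1) == -1 && errno == EBADF);

    /* Step 2: the receiver reopens the read-only fd with more permissions. */
    snprintf(path, sizeof(path), "/proc/self/fd/%d", ro);
    int rw = open(path, O_RDWR);
    int upgraded = (rw != -1 && write(rw, "x", 1) == 1);

    if (rw != -1)
        close(rw);
    close(ro);
    close(fd);
    return ro_rejects_write && upgraded;
}
```

So a reduced-permission reopen only gives the receiver a real read-only view once the /proc reopen path is closed off as well.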
IMO the best solution would handle the issue at memfd creation time by removing the race condition.