On Wed, Aug 31, 2022 at 05:24:39PM +0300, Kirill A . Shutemov wrote:
On Sat, Aug 20, 2022 at 10:15:32PM -0700, Hugh Dickins wrote:
I will try next week to rework it as a shim on top of shmem. Does that work for you?
Yes, please do, thanks. It's a compromise between us: the initial TDX case has no justification to use shmem at all, but doing it that way will help you with some of the infrastructure, and will probably be easiest for KVM to extend to other more relaxed fd cases later.
Okay, below is my take on the shim approach.
I don't hate how it turned out. It is easier to understand without the callback exchange thing.
The only caveat is that I had to introduce an external lock to protect against a race between lookup and truncate. Otherwise, it looks pretty reasonable to me.
I did very limited testing. It also lacks integration with KVM, but the API has not changed substantially, so it should be easy to adopt.
I have integrated this patch with the other KVM patches and verified that the functionality works well in a TDX environment, with the minor fix below.
Any comments?
...
diff --git a/mm/memfd.c b/mm/memfd.c
index 08f5f8304746..1853a90f49ff 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -261,7 +261,8 @@ long memfd_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
 #define MFD_NAME_PREFIX_LEN (sizeof(MFD_NAME_PREFIX) - 1)
 #define MFD_NAME_MAX_LEN (NAME_MAX - MFD_NAME_PREFIX_LEN)
 
-#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB)
+#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | \
+		       MFD_INACCESSIBLE)
 
 SYSCALL_DEFINE2(memfd_create,
 		const char __user *, uname,
@@ -283,6 +284,14 @@ SYSCALL_DEFINE2(memfd_create,
 			return -EINVAL;
 	}
 
+	/* Disallow sealing when MFD_INACCESSIBLE is set. */
+	if ((flags & MFD_INACCESSIBLE) && (flags & MFD_ALLOW_SEALING))
+		return -EINVAL;
+
+	/* TODO: add hugetlb support */
+	if ((flags & MFD_INACCESSIBLE) && (flags & MFD_HUGETLB))
+		return -EINVAL;
+
 	/* length includes terminating zero */
 	len = strnlen_user(uname, MFD_NAME_MAX_LEN + 1);
 	if (len <= 0)
@@ -331,10 +340,24 @@ SYSCALL_DEFINE2(memfd_create,
 		*file_seals &= ~F_SEAL_SEAL;
 	}
 
+	if (flags & MFD_INACCESSIBLE) {
+		struct file *inaccessible_file;
+
+		inaccessible_file = memfd_mkinaccessible(file);
+		if (IS_ERR(inaccessible_file)) {
+			error = PTR_ERR(inaccessible_file);
+			goto err_file;
+		}
The new file should also be marked as O_LARGEFILE; otherwise, setting an initial size greater than 2^31 on the fd will be refused by ftruncate().
+		inaccessible_file->f_flags |= O_LARGEFILE;
+
+		file = inaccessible_file;
+	}
+
 	fd_install(fd, file);
 	kfree(name);
 	return fd;
 
+err_file:
+	fput(file);
 err_fd:
 	put_unused_fd(fd);
 err_name:
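
To illustrate the O_LARGEFILE remark above, here is a rough userspace model of the size check that ftruncate() applies, as far as I understand it: the plain ftruncate() entry point treats the file as "small" unless O_LARGEFILE is set on it, and a "small" file cannot be truncated beyond 2^31 - 1 bytes. This is only a sketch under that assumption, not the kernel code. MODEL_O_LARGEFILE, MODEL_MAX_NON_LFS and model_ftruncate_size_ok() are made-up names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define MODEL_O_LARGEFILE 0x1u
#define MODEL_MAX_NON_LFS ((INT64_C(1) << 31) - 1)

/*
 * Assumed model of the ftruncate() size check: `small` starts out true
 * for the plain ftruncate() entry point and is cleared when the file
 * carries the large-file flag. Returns true if the new length passes.
 */
static bool model_ftruncate_size_ok(unsigned int f_flags, int64_t length,
				    bool small)
{
	if (f_flags & MODEL_O_LARGEFILE)
		small = false;

	/* Without large-file support, sizes above 2^31 - 1 are refused. */
	if (small && length > MODEL_MAX_NON_LFS)
		return false;

	return true;
}

/* Usage: a 4 GiB truncate fails without the flag and passes with it. */
static void model_examples(void)
{
	const int64_t four_gib = INT64_C(1) << 32;

	assert(!model_ftruncate_size_ok(0, four_gib, true));
	assert(model_ftruncate_size_ok(MODEL_O_LARGEFILE, four_gib, true));
}
```

If that model is right, a file returned by memfd_mkinaccessible() without the flag would behave like the first case, which matches the failure seen when setting a large initial size.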