When sharing a dma-buf between components of different trust levels, the allocator may need to hand a consumer a read-only view of a buffer it holds with read-write access. An example is a camera pipeline where the capture component writes frames into a buffer and needs to pass a read-only handle to a downstream processing component that should not be able to modify the data.
However, no such mechanism exists today. The access mode of a dma-buf file descriptor is fixed at export time, and the standard POSIX interfaces for duplicating or changing file descriptors (i.e., dup(2), dup3(2), and fcntl(F_SETFL)) cannot alter the read/write access mode of the copy.
One natural candidate would be reopening via /proc/self/fd/<N> with O_RDONLY, which works for regular files. For dma-buf this would fail (that is, if we were to add a new handler for open f_op) with ENXIO because the dmabuf pseudo-filesystem carries SB_NOUSER, which prevents the VFS from opening its files through path-based resolution from userspace.
Alternatively, exporting the buffer twice would produce two independent dma_buf instances, which breaks fence synchronization.
Therefore we add a new DMA_BUF_IOCTL_DERIVE ioctl, which produces a new file descriptor for an existing dma-buf with a caller-specified subset of the original permissions:
``` struct dma_buf_derive { __u32 flags; __s32 fd; };
struct dma_buf_derive req = { .flags = O_RDONLY | O_CLOEXEC }; ioctl(rw_fd, DMA_BUF_IOCTL_DERIVE, &req); /* req.fd is now a read-only alias of the same buffer */ ```
Permission escalation is rejected with -EACCES. The new fd aliases the same struct dma_buf as the original, same dma_resv, same exporter ops, same underlying memory; so importers attaching to either fd see the same fence timeline and operate on the same object. Access control for which components may receive or pass on restricted descriptors can be layered on top via SELinux file:read and file:write permissions.
A shared writable mapping (PROT_WRITE | MAP_SHARED) on the read-only fd is rejected with -EACCES in dma_buf_mmap_internal().
Two small internal adjustments accompany the ioctl: - __dma_buf_list_del() is moved to dma_buf_release() so it fires exactly once on dentry destruction rather than on every file close. - dma_buf_file_release() is updated to call dma_buf_put() only for files that are not the primary dma-buf file.
This may not be the best approach, but after considering different options and alternatives (as described above), we decided to raise the discussion upstream. Thus, we welcome any alternative proposal or ideas.
The series is structured as: - Patch 1 adds the new ioctl implementation. - Patch 2 adds selftests covering the new ioctl.
Signed-off-by: Albert Esteve aesteve@redhat.com --- Albert Esteve (2): dma-buf: add DMA_BUF_IOCTL_DERIVE for reduced-permission aliases selftests: dma-buf: add DERIVE ioctl tests
drivers/dma-buf/dma-buf.c | 58 ++++++++++- include/uapi/linux/dma-buf.h | 28 +++++ tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 114 ++++++++++++++++++++- 3 files changed, 198 insertions(+), 2 deletions(-) --- base-commit: ab5fce87a778cb780a05984a2ca448f2b41aafbf change-id: 20260520-dmabuf-limit-access-73261353841a
Best regards,
When sharing a dma-buf between components of different trust levels, the allocator may need to hand out a read-only view of a buffer it holds with read-write access. Currently there is no mechanism to do this: the file flags set at allocation time are fixed for the lifetime of the dma-buf, and dup(2) and dup3(2) cannot change the access mode of the new fd.
Add DMA_BUF_IOCTL_DERIVE, which takes a struct dma_buf_derive carrying the desired access flags and returns a new file descriptor for the same buffer with those flags applied. Permission escalation is rejected with EACCES.
The new fd aliases the same struct dma_buf, same dma_resv, same exporter ops, same underlying memory. Importers that attach to either fd operate on the same object and observe the same fence timeline.
To support multiple struct file instances sharing one struct dma_buf, two small internal adjustments are required. First, move __dma_buf_list_del() to dma_buf_release() so that list removal fires exactly once when the dentry is destroyed. Second, update dma_buf_file_release() to call dma_buf_put() only for the files that are not primary dmabuf files, leaving the primary fd's refcount managed by the normal dentry lifecycle.
Finally, enforce the access restriction in dma_buf_mmap_internal(): a shared writable mapping (MAP_SHARED + PROT_WRITE) on a read-only fd is rejected with -EACCES. Without this check, O_RDONLY on a dma-buf fd would be cosmetic, as the VFS does not enforce f_mode for writable mmap on anonymous inodes.
Signed-off-by: Albert Esteve aesteve@redhat.com --- drivers/dma-buf/dma-buf.c | 58 +++++++++++++++++++++++++++++++++++++++++++- include/uapi/linux/dma-buf.h | 28 +++++++++++++++++++++ 2 files changed, 85 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 71f37544a5c61..34a3872365730 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -180,6 +180,7 @@ static void dma_buf_release(struct dentry *dentry) */ BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
+ __dma_buf_list_del(dmabuf); dmabuf->ops->release(dmabuf);
if (dmabuf->resv == (struct dma_resv *)&dmabuf[1]) @@ -193,10 +194,13 @@ static void dma_buf_release(struct dentry *dentry)
static int dma_buf_file_release(struct inode *inode, struct file *file) { + struct dma_buf *dmabuf = file->private_data; + if (!is_dma_buf_file(file)) return -EINVAL;
- __dma_buf_list_del(file->private_data); + if (file != dmabuf->file) + dma_buf_put(dmabuf);
return 0; } @@ -232,6 +236,11 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma) if (!is_dma_buf_file(file)) return -EINVAL;
+ if ((vma->vm_flags & VM_WRITE) && + (vma->vm_flags & VM_SHARED) && + !(file->f_mode & FMODE_WRITE)) + return -EACCES; + dmabuf = file->private_data;
/* check if buffer supports mmap */ @@ -537,6 +546,50 @@ static long dma_buf_import_sync_file(struct dma_buf *dmabuf, } #endif
+static const struct file_operations dma_buf_fops; + +static int dma_buf_ioctl_derive(struct dma_buf *dmabuf, struct file *file, + void __user *udata) +{ + struct dma_buf_derive params; + struct file *new_file; + int new_fd; + + if (copy_from_user(¶ms, udata, sizeof(params))) + return -EFAULT; + + if (params.flags & ~(O_ACCMODE | O_CLOEXEC)) + return -EINVAL; + + /* Escalating permissions is not allowed. */ + if ((params.flags & O_ACCMODE) == O_RDWR && + !(file->f_mode & FMODE_WRITE)) + return -EACCES; + + new_file = alloc_file_clone(dmabuf->file, params.flags, &dma_buf_fops); + if (IS_ERR(new_file)) + return PTR_ERR(new_file); + + get_dma_buf(dmabuf); + new_file->private_data = dmabuf; + + new_fd = get_unused_fd_flags(params.flags & O_CLOEXEC ? O_CLOEXEC : 0); + if (new_fd < 0) { + fput(new_file); + return new_fd; + } + + params.fd = new_fd; + if (copy_to_user(udata, ¶ms, sizeof(params))) { + put_unused_fd(new_fd); + fput(new_file); + return -EFAULT; + } + + fd_install(new_fd, new_file); + return 0; +} + static long dma_buf_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -587,6 +640,9 @@ static long dma_buf_ioctl(struct file *file, return dma_buf_import_sync_file(dmabuf, (const void __user *)arg); #endif
+ case DMA_BUF_IOCTL_DERIVE: + return dma_buf_ioctl_derive(dmabuf, file, (void __user *)arg); + default: return -ENOTTY; } diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h index e827c9d20c5d3..d0cf616228e55 100644 --- a/include/uapi/linux/dma-buf.h +++ b/include/uapi/linux/dma-buf.h @@ -168,6 +168,33 @@ struct dma_buf_import_sync_file { __s32 fd; };
+/** + * struct dma_buf_derive - Obtain a dma-buf fd with reduced access permissions + * + * Userspace can perform a DMA_BUF_IOCTL_DERIVE to obtain a second file + * descriptor for the same dma-buf with a subset of the calling fd's + * permissions. This allows a producer holding read-write access to hand a + * read-only view to a less-privileged consumer without giving up its own + * write access or allocating a separate buffer. + * + * Unlike first-export ioctls, the new fd is not a re-export. It shares the + * same reservation object, exporter ops, and underlying memory as the + * original. + * + * The requested permissions must not exceed those of the calling fd. + */ +struct dma_buf_derive { + /** + * @flags: Requested access flags. + * + * Accepts O_RDONLY or O_RDWR, optionally combined with O_CLOEXEC. + * All other bits must be zero. + */ + __u32 flags; + /** @fd: Returned file descriptor with the requested permissions */ + __s32 fd; +}; + #define DMA_BUF_BASE 'b' #define DMA_BUF_IOCTL_SYNC _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
@@ -179,5 +206,6 @@ struct dma_buf_import_sync_file { #define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, __u64) #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2, struct dma_buf_export_sync_file) #define DMA_BUF_IOCTL_IMPORT_SYNC_FILE _IOW(DMA_BUF_BASE, 3, struct dma_buf_import_sync_file) +#define DMA_BUF_IOCTL_DERIVE _IOWR(DMA_BUF_BASE, 4, struct dma_buf_derive)
#endif
On 5/21/26 11:10, Albert Esteve wrote:
When sharing a dma-buf between components of different trust levels, the allocator may need to hand out a read-only view of a buffer it holds with read-write access. Currently there is no mechanism to do this: the file flags set at allocation time are fixed for the lifetime of the dma-buf, and dup(2) and dup3(2) cannot change the access mode of the new fd.
Add DMA_BUF_IOCTL_DERIVE, which takes a struct dma_buf_derive carrying the desired access flags and returns a new file descriptor for the same buffer with those flags applied. Permission escalation is rejected with EACCES.
The new fd aliases the same struct dma_buf, same dma_resv, same exporter ops, same underlying memory. Importers that attach to either fd operate on the same object and observe the same fence timeline.
To support multiple struct file instances sharing one struct dma_buf, two small internal adjustments are required. First, move __dma_buf_list_del() to dma_buf_release() so that list removal fires exactly once when the dentry is destroyed. Second, update dma_buf_file_release() to call dma_buf_put() only for the files that are not primary dmabuf files, leaving the primary fd's refcount managed by the normal dentry lifecycle.
Finally, enforce the access restriction in dma_buf_mmap_internal(): a shared writable mapping (MAP_SHARED + PROT_WRITE) on a read-only fd is rejected with -EACCES. Without this check, O_RDONLY on a dma-buf fd would be cosmetic, as the VFS does not enforce f_mode for writable mmap on anonymous inodes.
Clear NAK to that since that would break the existing uAPI.
Regards, Christian.
Signed-off-by: Albert Esteve aesteve@redhat.com
drivers/dma-buf/dma-buf.c | 58 +++++++++++++++++++++++++++++++++++++++++++- include/uapi/linux/dma-buf.h | 28 +++++++++++++++++++++ 2 files changed, 85 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 71f37544a5c61..34a3872365730 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -180,6 +180,7 @@ static void dma_buf_release(struct dentry *dentry) */ BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
- __dma_buf_list_del(dmabuf); dmabuf->ops->release(dmabuf);
if (dmabuf->resv == (struct dma_resv *)&dmabuf[1]) @@ -193,10 +194,13 @@ static void dma_buf_release(struct dentry *dentry) static int dma_buf_file_release(struct inode *inode, struct file *file) {
- struct dma_buf *dmabuf = file->private_data;
- if (!is_dma_buf_file(file)) return -EINVAL;
- __dma_buf_list_del(file->private_data);
- if (file != dmabuf->file)
dma_buf_put(dmabuf);return 0; } @@ -232,6 +236,11 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma) if (!is_dma_buf_file(file)) return -EINVAL;
- if ((vma->vm_flags & VM_WRITE) &&
(vma->vm_flags & VM_SHARED) &&!(file->f_mode & FMODE_WRITE))return -EACCES;- dmabuf = file->private_data;
/* check if buffer supports mmap */ @@ -537,6 +546,50 @@ static long dma_buf_import_sync_file(struct dma_buf *dmabuf, } #endif +static const struct file_operations dma_buf_fops;
+static int dma_buf_ioctl_derive(struct dma_buf *dmabuf, struct file *file,
void __user *udata)+{
- struct dma_buf_derive params;
- struct file *new_file;
- int new_fd;
- if (copy_from_user(¶ms, udata, sizeof(params)))
return -EFAULT;- if (params.flags & ~(O_ACCMODE | O_CLOEXEC))
return -EINVAL;- /* Escalating permissions is not allowed. */
- if ((params.flags & O_ACCMODE) == O_RDWR &&
!(file->f_mode & FMODE_WRITE))return -EACCES;- new_file = alloc_file_clone(dmabuf->file, params.flags, &dma_buf_fops);
- if (IS_ERR(new_file))
return PTR_ERR(new_file);- get_dma_buf(dmabuf);
- new_file->private_data = dmabuf;
- new_fd = get_unused_fd_flags(params.flags & O_CLOEXEC ? O_CLOEXEC : 0);
- if (new_fd < 0) {
fput(new_file);return new_fd;- }
- params.fd = new_fd;
- if (copy_to_user(udata, ¶ms, sizeof(params))) {
put_unused_fd(new_fd);fput(new_file);return -EFAULT;- }
- fd_install(new_fd, new_file);
- return 0;
+}
static long dma_buf_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -587,6 +640,9 @@ static long dma_buf_ioctl(struct file *file, return dma_buf_import_sync_file(dmabuf, (const void __user *)arg); #endif
- case DMA_BUF_IOCTL_DERIVE:
return dma_buf_ioctl_derive(dmabuf, file, (void __user *)arg);- default: return -ENOTTY; }
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h index e827c9d20c5d3..d0cf616228e55 100644 --- a/include/uapi/linux/dma-buf.h +++ b/include/uapi/linux/dma-buf.h @@ -168,6 +168,33 @@ struct dma_buf_import_sync_file { __s32 fd; }; +/**
- struct dma_buf_derive - Obtain a dma-buf fd with reduced access permissions
- Userspace can perform a DMA_BUF_IOCTL_DERIVE to obtain a second file
- descriptor for the same dma-buf with a subset of the calling fd's
- permissions. This allows a producer holding read-write access to hand a
- read-only view to a less-privileged consumer without giving up its own
- write access or allocating a separate buffer.
- Unlike first-export ioctls, the new fd is not a re-export. It shares the
- same reservation object, exporter ops, and underlying memory as the
- original.
- The requested permissions must not exceed those of the calling fd.
- */
+struct dma_buf_derive {
- /**
* @flags: Requested access flags.** Accepts O_RDONLY or O_RDWR, optionally combined with O_CLOEXEC.* All other bits must be zero.*/- __u32 flags;
- /** @fd: Returned file descriptor with the requested permissions */
- __s32 fd;
+};
#define DMA_BUF_BASE 'b' #define DMA_BUF_IOCTL_SYNC _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync) @@ -179,5 +206,6 @@ struct dma_buf_import_sync_file { #define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, __u64) #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2, struct dma_buf_export_sync_file) #define DMA_BUF_IOCTL_IMPORT_SYNC_FILE _IOW(DMA_BUF_BASE, 3, struct dma_buf_import_sync_file) +#define DMA_BUF_IOCTL_DERIVE _IOWR(DMA_BUF_BASE, 4, struct dma_buf_derive) #endif
dma-buf now supports aliasing an existing file descriptor with reduced access permissions via DMA_BUF_IOCTL_DERIVE ioctl, and enforces that a shared writable mapping cannot be created through a read-only file descriptor.
Add two tests to the dmabuf-heaps selftest to exercise this behaviour. The positive test allocates a buffer, derives its file descriptor as O_RDONLY through the ioctl, confirms that a read-only shared mapping succeeds, and verifies that data written through the original read-write file descriptor is visible through the derived one.
The negative test confirms that attempting a DMA_BUF_IOCTL_DERIVE ioctl call with RW flags on a read-only file descriptor is rejected with EACCES. Same for mmap() escalation attempt.
Signed-off-by: Albert Esteve aesteve@redhat.com --- tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 114 ++++++++++++++++++++- 1 file changed, 113 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c index fc9694fc4e89e..c3856189200be 100644 --- a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c +++ b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c @@ -390,6 +390,116 @@ static void test_alloc_errors(char *heap_name) close(heap_fd); }
+static int setup_ro_derive(int heap_fd, int *dmabuf_fd, int *ro_fd) +{ + struct dma_buf_derive params = { + .flags = O_RDONLY | O_CLOEXEC, + }; + int ret; + + ret = dmabuf_heap_alloc(heap_fd, ONE_MEG, 0, dmabuf_fd); + ksft_test_result(!ret, "Allocate RW buffer\n"); + if (ret) + return -1; + + ret = ioctl(*dmabuf_fd, DMA_BUF_IOCTL_DERIVE, ¶ms); + ksft_test_result(!ret, "Derive as O_RDONLY %s\n", + ret < 0 ? strerror(errno) : "OK"); + if (ret < 0) { + close(*dmabuf_fd); + *dmabuf_fd = -1; + return -1; + } + + *ro_fd = params.fd; + return 0; +} + +static void test_ro_derive(char *heap_name) +{ + int heap_fd = -1, dmabuf_fd = -1, ro_fd = -1; + void *rw_map = MAP_FAILED, *ro_map = MAP_FAILED; + int ret; + + heap_fd = dmabuf_heap_open(heap_name); + + ksft_print_msg("Testing read-only derive with mmap:\n"); + + if (setup_ro_derive(heap_fd, &dmabuf_fd, &ro_fd)) + goto out; + + rw_map = mmap(NULL, ONE_MEG, PROT_READ | PROT_WRITE, MAP_SHARED, + dmabuf_fd, 0); + ksft_test_result(rw_map != MAP_FAILED, "RW mmap on RW fd %s\n", + rw_map == MAP_FAILED ? strerror(errno) : "OK"); + if (rw_map == MAP_FAILED) + goto out; + + dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_START); + memset(rw_map, 0xab, ONE_MEG); + dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_END); + + ro_map = mmap(NULL, ONE_MEG, PROT_READ, MAP_SHARED, ro_fd, 0); + ksft_test_result(ro_map != MAP_FAILED, "RO mmap on RO fd %s\n", + ro_map == MAP_FAILED ? strerror(errno) : "OK"); + if (ro_map == MAP_FAILED) + goto out; + + dmabuf_sync(ro_fd, DMA_BUF_SYNC_START); + ret = memcmp(rw_map, ro_map, ONE_MEG); + dmabuf_sync(ro_fd, DMA_BUF_SYNC_END); + ksft_test_result(!ret, "Data written via RW fd visible through RO fd\n"); + +out: + if (ro_map != MAP_FAILED) + munmap(ro_map, ONE_MEG); + if (rw_map != MAP_FAILED) + munmap(rw_map, ONE_MEG); + if (ro_fd >= 0) + close(ro_fd); + if (dmabuf_fd >= 0) + close(dmabuf_fd); + close(heap_fd); +} + +static void test_ro_derive_escalation(char *heap_name) +{ + struct dma_buf_derive params = { + .flags = O_RDWR | O_CLOEXEC, + }; + int heap_fd = -1, dmabuf_fd = -1, ro_fd = -1, ret = 0; + void *bad_map; + + heap_fd = dmabuf_heap_open(heap_name); + + ksft_print_msg("Testing read-only derive with escalation attempt:\n"); + + if (setup_ro_derive(heap_fd, &dmabuf_fd, &ro_fd)) + goto out; + + ret = ioctl(ro_fd, DMA_BUF_IOCTL_DERIVE, ¶ms); + ksft_test_result(ret < 0 && errno == EACCES, + "O_RDWR derive on RO fd correctly rejected (errno=%d)\n", + errno); + if (!ret) + close(params.fd); + + bad_map = mmap(NULL, ONE_MEG, PROT_READ | PROT_WRITE, MAP_SHARED, + ro_fd, 0); + ksft_test_result(bad_map == MAP_FAILED && errno == EACCES, + "RW shared mmap on RO fd correctly rejected (errno=%d)\n", + errno); + if (bad_map != MAP_FAILED) + munmap(bad_map, ONE_MEG); + +out: + if (ro_fd >= 0) + close(ro_fd); + if (dmabuf_fd >= 0) + close(dmabuf_fd); + close(heap_fd); +} + static int numer_of_heaps(void) { DIR *d = opendir(DEVPATH); @@ -420,7 +530,7 @@ int main(void) return KSFT_SKIP; }
- ksft_set_plan(11 * numer_of_heaps()); + ksft_set_plan(20 * numer_of_heaps());
while ((dir = readdir(d))) { if (!strncmp(dir->d_name, ".", 2)) @@ -435,6 +545,8 @@ int main(void) test_alloc_zeroed(dir->d_name, ONE_MEG); test_alloc_compat(dir->d_name); test_alloc_errors(dir->d_name); + test_ro_derive(dir->d_name); + test_ro_derive_escalation(dir->d_name); } closedir(d);
On 5/21/26 11:10, Albert Esteve wrote:
When sharing a dma-buf between components of different trust levels, the allocator may need to hand a consumer a read-only view of a buffer it holds with read-write access. An example is a camera pipeline where the capture component writes frames into a buffer and needs to pass a read-only handle to a downstream processing component that should not be able to modify the data.
However, no such mechanism exists today. The access mode of a dma-buf file descriptor is fixed at export time, and the standard POSIX interfaces for duplicating or changing file descriptors (i.e., dup(2), dup3(2), and fcntl(F_SETFL)) cannot alter the read/write access mode of the copy.
One natural candidate would be reopening via /proc/self/fd/<N> with O_RDONLY, which works for regular files. For dma-buf this would fail (that is, if we were to add a new handler for open f_op) with ENXIO because the dmabuf pseudo-filesystem carries SB_NOUSER, which prevents the VFS from opening its files through path-based resolution from userspace.
OH MY GOD! This is the like the sixth time I had to clarify that in the last few weeks, I'm really wondering where that is suddenly coming from.
Creating the DMA-buf with O_RDONLY does *NOT* make the DMA-buf itself read only!
That's a really common misconception. The flag only controls if mmap() can be done read/write or read-only to handle cache coherency issues.
It is still perfectly possible for a device to write into a DMA-buf created with O_RDONLY with DMA!
So long story short there is not such feature as a read only DMA-buf, and putting read-only pages into a DMA-buf and then expecting that nobody can write to them is an absolutely clear No-Go.
If we would want to implement a read-only DMA-buf feature we would need to go over all the different DMA-buf importers in the kernel and add security checks.
Regards, Christian.
Alternatively, exporting the buffer twice would produce two independent dma_buf instances, which breaks fence synchronization.
Therefore we add a new DMA_BUF_IOCTL_DERIVE ioctl, which produces a new file descriptor for an existing dma-buf with a caller-specified subset of the original permissions:
struct dma_buf_derive { __u32 flags; __s32 fd; }; struct dma_buf_derive req = { .flags = O_RDONLY | O_CLOEXEC }; ioctl(rw_fd, DMA_BUF_IOCTL_DERIVE, &req); /* req.fd is now a read-only alias of the same buffer */Permission escalation is rejected with -EACCES. The new fd aliases the same struct dma_buf as the original, same dma_resv, same exporter ops, same underlying memory; so importers attaching to either fd see the same fence timeline and operate on the same object. Access control for which components may receive or pass on restricted descriptors can be layered on top via SELinux file:read and file:write permissions.
A shared writable mapping (PROT_WRITE | MAP_SHARED) on the read-only fd is rejected with -EACCES in dma_buf_mmap_internal().
Two small internal adjustments accompany the ioctl:
- __dma_buf_list_del() is moved to dma_buf_release() so it fires exactly once on dentry destruction rather than on every file close.
- dma_buf_file_release() is updated to call dma_buf_put() only for files that are not the primary dma-buf file.
This may not be the best approach, but after considering different options and alternatives (as described above), we decided to raise the discussion upstream. Thus, we welcome any alternative proposal or ideas.
The series is structured as:
- Patch 1 adds the new ioctl implementation.
- Patch 2 adds selftests covering the new ioctl.
Signed-off-by: Albert Esteve aesteve@redhat.com
Albert Esteve (2): dma-buf: add DMA_BUF_IOCTL_DERIVE for reduced-permission aliases selftests: dma-buf: add DERIVE ioctl tests
drivers/dma-buf/dma-buf.c | 58 ++++++++++- include/uapi/linux/dma-buf.h | 28 +++++ tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 114 ++++++++++++++++++++- 3 files changed, 198 insertions(+), 2 deletions(-)
base-commit: ab5fce87a778cb780a05984a2ca448f2b41aafbf change-id: 20260520-dmabuf-limit-access-73261353841a
Best regards,
Hi Christian,
On Thu, May 21, 2026 at 2:28 PM Christian König christian.koenig@amd.com wrote:
On 5/21/26 11:10, Albert Esteve wrote:
When sharing a dma-buf between components of different trust levels, the allocator may need to hand a consumer a read-only view of a buffer it holds with read-write access. An example is a camera pipeline where the capture component writes frames into a buffer and needs to pass a read-only handle to a downstream processing component that should not be able to modify the data.
However, no such mechanism exists today. The access mode of a dma-buf file descriptor is fixed at export time, and the standard POSIX interfaces for duplicating or changing file descriptors (i.e., dup(2), dup3(2), and fcntl(F_SETFL)) cannot alter the read/write access mode of the copy.
One natural candidate would be reopening via /proc/self/fd/<N> with O_RDONLY, which works for regular files. For dma-buf this would fail (that is, if we were to add a new handler for open f_op) with ENXIO because the dmabuf pseudo-filesystem carries SB_NOUSER, which prevents the VFS from opening its files through path-based resolution from userspace.
OH MY GOD! This is the like the sixth time I had to clarify that in the last few weeks, I'm really wondering where that is suddenly coming from.
Sorry! I do not know where others came from. But my interest comes from automotive, safety, and mixed criticality scenarios. I kind of hinted at that in the opening when referring to "different trust levels".
Creating the DMA-buf with O_RDONLY does *NOT* make the DMA-buf itself read only!
That's a really common misconception. The flag only controls if mmap() can be done read/write or read-only to handle cache coherency issues.
It is still perfectly possible for a device to write into a DMA-buf created with O_RDONLY with DMA!
So long story short there is not such feature as a read only DMA-buf, and putting read-only pages into a DMA-buf and then expecting that nobody can write to them is an absolutely clear No-Go.
If we would want to implement a read-only DMA-buf feature we would need to go over all the different DMA-buf importers in the kernel and add security checks.
This clarifies a lot. Too bad, but it makes sense. I will abandon the series then.
Thanks for the review and the explanation!
BR, Albert
Regards, Christian.
Alternatively, exporting the buffer twice would produce two independent dma_buf instances, which breaks fence synchronization.
Therefore we add a new DMA_BUF_IOCTL_DERIVE ioctl, which produces a new file descriptor for an existing dma-buf with a caller-specified subset of the original permissions:
struct dma_buf_derive { __u32 flags; __s32 fd; }; struct dma_buf_derive req = { .flags = O_RDONLY | O_CLOEXEC }; ioctl(rw_fd, DMA_BUF_IOCTL_DERIVE, &req); /* req.fd is now a read-only alias of the same buffer */Permission escalation is rejected with -EACCES. The new fd aliases the same struct dma_buf as the original, same dma_resv, same exporter ops, same underlying memory; so importers attaching to either fd see the same fence timeline and operate on the same object. Access control for which components may receive or pass on restricted descriptors can be layered on top via SELinux file:read and file:write permissions.
A shared writable mapping (PROT_WRITE | MAP_SHARED) on the read-only fd is rejected with -EACCES in dma_buf_mmap_internal().
Two small internal adjustments accompany the ioctl:
- __dma_buf_list_del() is moved to dma_buf_release() so it fires exactly once on dentry destruction rather than on every file close.
- dma_buf_file_release() is updated to call dma_buf_put() only for files that are not the primary dma-buf file.
This may not be the best approach, but after considering different options and alternatives (as described above), we decided to raise the discussion upstream. Thus, we welcome any alternative proposal or ideas.
The series is structured as:
- Patch 1 adds the new ioctl implementation.
- Patch 2 adds selftests covering the new ioctl.
Signed-off-by: Albert Esteve aesteve@redhat.com
Albert Esteve (2): dma-buf: add DMA_BUF_IOCTL_DERIVE for reduced-permission aliases selftests: dma-buf: add DERIVE ioctl tests
drivers/dma-buf/dma-buf.c | 58 ++++++++++- include/uapi/linux/dma-buf.h | 28 +++++ tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 114 ++++++++++++++++++++- 3 files changed, 198 insertions(+), 2 deletions(-)
base-commit: ab5fce87a778cb780a05984a2ca448f2b41aafbf change-id: 20260520-dmabuf-limit-access-73261353841a
Best regards,
On 5/21/26 15:01, Albert Esteve wrote:
Hi Christian,
On Thu, May 21, 2026 at 2:28 PM Christian König christian.koenig@amd.com wrote:
On 5/21/26 11:10, Albert Esteve wrote:
When sharing a dma-buf between components of different trust levels, the allocator may need to hand a consumer a read-only view of a buffer it holds with read-write access. An example is a camera pipeline where the capture component writes frames into a buffer and needs to pass a read-only handle to a downstream processing component that should not be able to modify the data.
However, no such mechanism exists today. The access mode of a dma-buf file descriptor is fixed at export time, and the standard POSIX interfaces for duplicating or changing file descriptors (i.e., dup(2), dup3(2), and fcntl(F_SETFL)) cannot alter the read/write access mode of the copy.
One natural candidate would be reopening via /proc/self/fd/<N> with O_RDONLY, which works for regular files. For dma-buf this would fail (that is, if we were to add a new handler for open f_op) with ENXIO because the dmabuf pseudo-filesystem carries SB_NOUSER, which prevents the VFS from opening its files through path-based resolution from userspace.
OH MY GOD! This is the like the sixth time I had to clarify that in the last few weeks, I'm really wondering where that is suddenly coming from.
Sorry! I do not know where others came from. But my interest comes from automotive, safety, and mixed criticality scenarios. I kind of hinted at that in the opening when referring to "different trust levels".
AH! Yeah, automotive is most likely the common topic in all those requests.
Creating the DMA-buf with O_RDONLY does *NOT* make the DMA-buf itself read only!
That's a really common misconception. The flag only controls if mmap() can be done read/write or read-only to handle cache coherency issues.
It is still perfectly possible for a device to write into a DMA-buf created with O_RDONLY with DMA!
So long story short there is not such feature as a read only DMA-buf, and putting read-only pages into a DMA-buf and then expecting that nobody can write to them is an absolutely clear No-Go.
If we would want to implement a read-only DMA-buf feature we would need to go over all the different DMA-buf importers in the kernel and add security checks.
This clarifies a lot. Too bad, but it makes sense. I will abandon the series then.
Thanks for the review and the explanation!
No problem, I was just really surprised that this came up once more.
Just for completeness: What some exporters do is to reject read/write mmap calls with O_RDONLY and map_dma_buf() callbacks with DMA_BIDIRECTIONAL or DMA_TO_DEVICE.
But as I said this is just to catch cache coherency issues and not access control.
If I'm not completely mistaken some HW actually can't even guarantee read only mappings, in other words even if you say don't write to that buffer in the kernel you could submit shader or DMA commands from userspace which does exactly that and it works.
So I don't really see a chance for that feature to fly as general DMA-buf thing. Maybe between two specific exporters/importers could work, but yeah...
Regards, Christian.
BR, Albert
Regards, Christian.
Alternatively, exporting the buffer twice would produce two independent dma_buf instances, which breaks fence synchronization.
Therefore we add a new DMA_BUF_IOCTL_DERIVE ioctl, which produces a new file descriptor for an existing dma-buf with a caller-specified subset of the original permissions:
struct dma_buf_derive { __u32 flags; __s32 fd; }; struct dma_buf_derive req = { .flags = O_RDONLY | O_CLOEXEC }; ioctl(rw_fd, DMA_BUF_IOCTL_DERIVE, &req); /* req.fd is now a read-only alias of the same buffer */Permission escalation is rejected with -EACCES. The new fd aliases the same struct dma_buf as the original, same dma_resv, same exporter ops, same underlying memory; so importers attaching to either fd see the same fence timeline and operate on the same object. Access control for which components may receive or pass on restricted descriptors can be layered on top via SELinux file:read and file:write permissions.
A shared writable mapping (PROT_WRITE | MAP_SHARED) on the read-only fd is rejected with -EACCES in dma_buf_mmap_internal().
Two small internal adjustments accompany the ioctl:
- __dma_buf_list_del() is moved to dma_buf_release() so it fires exactly once on dentry destruction rather than on every file close.
- dma_buf_file_release() is updated to call dma_buf_put() only for files that are not the primary dma-buf file.
This may not be the best approach, but after considering different options and alternatives (as described above), we decided to raise the discussion upstream. Thus, we welcome any alternative proposal or ideas.
The series is structured as:
- Patch 1 adds the new ioctl implementation.
- Patch 2 adds selftests covering the new ioctl.
Signed-off-by: Albert Esteve aesteve@redhat.com
Albert Esteve (2): dma-buf: add DMA_BUF_IOCTL_DERIVE for reduced-permission aliases selftests: dma-buf: add DERIVE ioctl tests
drivers/dma-buf/dma-buf.c | 58 ++++++++++- include/uapi/linux/dma-buf.h | 28 +++++ tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 114 ++++++++++++++++++++- 3 files changed, 198 insertions(+), 2 deletions(-)
base-commit: ab5fce87a778cb780a05984a2ca448f2b41aafbf change-id: 20260520-dmabuf-limit-access-73261353841a
Best regards,
linaro-mm-sig@lists.linaro.org