Until CONFIG_DMABUF_SYSFS_STATS was added [1], per-buffer accounting was only possible with debugfs, which is not suitable for production environments. Eventually we discovered that the overhead of per-buffer sysfs file creation/removal was significantly impacting allocation and free times, and that it exacerbated kernfs lock contention [2]. dma_buf_stats_setup() is responsible for 39% of single-page buffer creation duration, or 74% of single-page dma_buf_export() duration, when stressing dmabuf allocations and frees.
I prototyped a change from per-buffer to per-exporter statistics with an RCU-protected list of exporter allocations that accommodates most (but not all) of our use-cases and avoids almost all of the sysfs overhead. While that adds less overhead than per-buffer sysfs, and less even than the maintenance of the dmabuf debugfs_list, it's still *additional* overhead on top of the debugfs_list and doesn't give us per-buffer info.
This series uses the existing dmabuf debugfs_list to implement a BPF dmabuf iterator, which adds no overhead to buffer allocation/free and provides per-buffer info. While the kernel must have CONFIG_DEBUG_FS for the dmabuf_iter to be available, debugfs does not need to be mounted. The BPF program that userspace loads to extract per-buffer information defines its own interface, which sidesteps the lack of ABI stability with debugfs (even if it were mounted).
As this is a replacement for our use of CONFIG_DMABUF_SYSFS_STATS, the last patch is an RFC for removing it from the kernel. Please see my suggestion there regarding the timeline for that.
[1] https://lore.kernel.org/linux-media/20201210044400.1080308-1-hridya@google.c... [2] https://lore.kernel.org/all/20220516171315.2400578-1-tjmercier@google.com/
T.J. Mercier (4):
  dma-buf: Rename and expose debugfs symbols
  bpf: Add dmabuf iterator
  selftests/bpf: Add test for dmabuf_iter
  RFC: dma-buf: Remove DMA-BUF statistics

 .../ABI/testing/sysfs-kernel-dmabuf-buffers   |  24 ---
 Documentation/driver-api/dma-buf.rst          |   5 -
 drivers/dma-buf/Kconfig                       |  15 --
 drivers/dma-buf/Makefile                      |   1 -
 drivers/dma-buf/dma-buf-sysfs-stats.c         | 202 ------------------
 drivers/dma-buf/dma-buf-sysfs-stats.h         |  35 ---
 drivers/dma-buf/dma-buf.c                     |  40 +---
 include/linux/btf_ids.h                       |   1 +
 include/linux/dma-buf.h                       |   6 +
 kernel/bpf/Makefile                           |   3 +
 kernel/bpf/dmabuf_iter.c                      | 130 +++++++++++
 tools/testing/selftests/bpf/config            |   1 +
 .../selftests/bpf/prog_tests/dmabuf_iter.c    | 116 ++++++++++
 .../testing/selftests/bpf/progs/dmabuf_iter.c |  31 +++
 14 files changed, 299 insertions(+), 311 deletions(-)
 delete mode 100644 Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers
 delete mode 100644 drivers/dma-buf/dma-buf-sysfs-stats.c
 delete mode 100644 drivers/dma-buf/dma-buf-sysfs-stats.h
 create mode 100644 kernel/bpf/dmabuf_iter.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/dmabuf_iter.c
Expose the debugfs list and mutex so they are usable for the creation of a BPF iterator for dmabufs. Rename the symbols so it's clear they contain dmabufs and not some other type.
Signed-off-by: T.J. Mercier <tjmercier@google.com>
---
 drivers/dma-buf/dma-buf.c | 22 +++++++++++-----------
 include/linux/dma-buf.h   |  6 ++++++
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 5baa83b85515..affb47eb8629 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -36,14 +36,14 @@
 static inline int is_dma_buf_file(struct file *);

 #if IS_ENABLED(CONFIG_DEBUG_FS)
-static DEFINE_MUTEX(debugfs_list_mutex);
-static LIST_HEAD(debugfs_list);
+DEFINE_MUTEX(dmabuf_debugfs_list_mutex);
+LIST_HEAD(dmabuf_debugfs_list);

 static void __dma_buf_debugfs_list_add(struct dma_buf *dmabuf)
 {
-	mutex_lock(&debugfs_list_mutex);
-	list_add(&dmabuf->list_node, &debugfs_list);
-	mutex_unlock(&debugfs_list_mutex);
+	mutex_lock(&dmabuf_debugfs_list_mutex);
+	list_add(&dmabuf->list_node, &dmabuf_debugfs_list);
+	mutex_unlock(&dmabuf_debugfs_list_mutex);
 }

 static void __dma_buf_debugfs_list_del(struct dma_buf *dmabuf)
@@ -51,9 +51,9 @@ static void __dma_buf_debugfs_list_del(struct dma_buf *dmabuf)
 	if (!dmabuf)
 		return;

-	mutex_lock(&debugfs_list_mutex);
+	mutex_lock(&dmabuf_debugfs_list_mutex);
 	list_del(&dmabuf->list_node);
-	mutex_unlock(&debugfs_list_mutex);
+	mutex_unlock(&dmabuf_debugfs_list_mutex);
 }
 #else
 static void __dma_buf_debugfs_list_add(struct dma_buf *dmabuf)
@@ -1630,7 +1630,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 	size_t size = 0;
 	int ret;

-	ret = mutex_lock_interruptible(&debugfs_list_mutex);
+	ret = mutex_lock_interruptible(&dmabuf_debugfs_list_mutex);

 	if (ret)
 		return ret;
@@ -1639,7 +1639,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 	seq_printf(s, "%-8s\t%-8s\t%-8s\t%-8s\texp_name\t%-8s\tname\n",
 		   "size", "flags", "mode", "count", "ino");

-	list_for_each_entry(buf_obj, &debugfs_list, list_node) {
+	list_for_each_entry(buf_obj, &dmabuf_debugfs_list, list_node) {

 		ret = dma_resv_lock_interruptible(buf_obj->resv, NULL);
 		if (ret)
@@ -1676,11 +1676,11 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)

 	seq_printf(s, "\nTotal %d objects, %zu bytes\n", count, size);

-	mutex_unlock(&debugfs_list_mutex);
+	mutex_unlock(&dmabuf_debugfs_list_mutex);
 	return 0;

 error_unlock:
-	mutex_unlock(&debugfs_list_mutex);
+	mutex_unlock(&dmabuf_debugfs_list_mutex);
 	return ret;
 }

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 36216d28d8bd..7754608453dc 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -18,6 +18,7 @@
 #include <linux/err.h>
 #include <linux/scatterlist.h>
 #include <linux/list.h>
+#include <linux/mutex.h>
 #include <linux/dma-mapping.h>
 #include <linux/fs.h>
 #include <linux/dma-fence.h>
@@ -556,6 +557,11 @@ struct dma_buf_export_info {
 	struct dma_buf_export_info name = { .exp_name = KBUILD_MODNAME, \
 					 .owner = THIS_MODULE }

+#if IS_ENABLED(CONFIG_DEBUG_FS)
+extern struct list_head dmabuf_debugfs_list;
+extern struct mutex dmabuf_debugfs_list_mutex;
+#endif
+
 /**
  * get_dma_buf - convenience wrapper for get_file.
  * @dmabuf:	[in]	pointer to dma_buf
The dmabuf iterator traverses the list of all DMA buffers. The list is maintained only when CONFIG_DEBUG_FS is enabled.
DMA buffers are refcounted through their associated struct file. A reference is taken on each buffer as the list is iterated to ensure each buffer persists for the duration of the bpf program execution without holding the list mutex.
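The reference-taking walk described above can be modeled in userspace. What follows is only a minimal sketch under stated assumptions, not the kernel implementation: struct buf, buf_get_unless_zero(), buf_put() and buf_iter_next() are hypothetical names, a pthread mutex stands in for the list mutex, and a C11 atomic counter stands in for the file refcount. It shows the idea of skipping entries whose count has already reached zero and returning the next live entry with a reference held, so the caller can use it after the lock is dropped. Dropping the previous reference after unlocking is a choice made for this model only.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_buf; not the kernel type. */
struct buf {
	atomic_long refs;	/* 0 means the buffer is being torn down */
	struct buf *next;	/* singly linked for brevity */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct buf *list_head;

/* Take a reference only if the count is still non-zero. */
static bool buf_get_unless_zero(struct buf *b)
{
	long old = atomic_load(&b->refs);

	while (old > 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&b->refs, &old, old + 1))
			return true;
	}
	return false;
}

static void buf_put(struct buf *b)
{
	atomic_fetch_sub(&b->refs, 1);
}

/*
 * Return the first live buffer after prev (or from the list head when
 * prev is NULL) with a reference held, then drop the reference on prev.
 * The lock is held only while walking, never while the caller uses the
 * returned buffer.
 */
static struct buf *buf_iter_next(struct buf *prev)
{
	struct buf *b, *ret = NULL;

	pthread_mutex_lock(&list_lock);
	for (b = prev ? prev->next : list_head; b; b = b->next) {
		if (buf_get_unless_zero(b)) {
			ret = b;
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);

	if (prev)
		buf_put(prev);
	return ret;
}
```

A caller would start with buf_iter_next(NULL) and keep passing the previous result back in until NULL is returned, at which point no references remain held by the walk.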
Signed-off-by: T.J. Mercier <tjmercier@google.com>
---
 include/linux/btf_ids.h  |   1 +
 kernel/bpf/Makefile      |   3 +
 kernel/bpf/dmabuf_iter.c | 130 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 134 insertions(+)
 create mode 100644 kernel/bpf/dmabuf_iter.c

diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index 139bdececdcf..899ead57d89d 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -284,5 +284,6 @@ extern u32 bpf_cgroup_btf_id[];
 extern u32 bpf_local_storage_map_btf_id[];
 extern u32 btf_bpf_map_id[];
 extern u32 bpf_kmem_cache_btf_id[];
+extern u32 bpf_dmabuf_btf_id[];

 #endif
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 70502f038b92..5b30d37ef055 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -53,6 +53,9 @@ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
 obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
+ifeq ($(CONFIG_DEBUG_FS),y)
+obj-$(CONFIG_BPF_SYSCALL) += dmabuf_iter.o
+endif

 CFLAGS_REMOVE_percpu_freelist.o = $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_bpf_lru_list.o = $(CC_FLAGS_FTRACE)
diff --git a/kernel/bpf/dmabuf_iter.c b/kernel/bpf/dmabuf_iter.c
new file mode 100644
index 000000000000..b4b8be1d6aa4
--- /dev/null
+++ b/kernel/bpf/dmabuf_iter.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2025 Google LLC */
+#include <linux/bpf.h>
+#include <linux/btf_ids.h>
+#include <linux/dma-buf.h>
+#include <linux/kernel.h>
+#include <linux/seq_file.h>
+
+BTF_ID_LIST_GLOBAL_SINGLE(bpf_dmabuf_btf_id, struct, dma_buf)
+DEFINE_BPF_ITER_FUNC(dmabuf, struct bpf_iter_meta *meta, struct dma_buf *dmabuf)
+
+static void *dmabuf_iter_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	struct dma_buf *dmabuf, *ret = NULL;
+
+	if (*pos) {
+		*pos = 0;
+		return NULL;
+	}
+	/* Look for the first buffer we can obtain a reference to.
+	 * The list mutex does not protect a dmabuf's refcount, so it can be
+	 * zeroed while we are iterating. Therefore we cannot call get_dma_buf()
+	 * since the caller of this program may not already own a reference to
+	 * the buffer.
+	 */
+	mutex_lock(&dmabuf_debugfs_list_mutex);
+	list_for_each_entry(dmabuf, &dmabuf_debugfs_list, list_node) {
+		if (file_ref_get(&dmabuf->file->f_ref)) {
+			ret = dmabuf;
+			break;
+		}
+	}
+	mutex_unlock(&dmabuf_debugfs_list_mutex);
+
+	return ret;
+}
+
+static void *dmabuf_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct dma_buf *dmabuf = v, *ret = NULL;
+
+	++*pos;
+
+	mutex_lock(&dmabuf_debugfs_list_mutex);
+	dma_buf_put(dmabuf);
+	while (!list_is_last(&dmabuf->list_node, &dmabuf_debugfs_list)) {
+		dmabuf = list_next_entry(dmabuf, list_node);
+		if (file_ref_get(&dmabuf->file->f_ref)) {
+			ret = dmabuf;
+			break;
+		}
+	}
+	mutex_unlock(&dmabuf_debugfs_list_mutex);
+
+	return ret;
+}
+
+struct bpf_iter__dmabuf {
+	__bpf_md_ptr(struct bpf_iter_meta *, meta);
+	__bpf_md_ptr(struct dma_buf *, dmabuf);
+};
+
+static int __dmabuf_seq_show(struct seq_file *seq, void *v, bool in_stop)
+{
+	struct bpf_iter_meta meta = {
+		.seq = seq,
+	};
+	struct bpf_iter__dmabuf ctx = {
+		.meta = &meta,
+		.dmabuf = v,
+	};
+	struct bpf_prog *prog = bpf_iter_get_info(&meta, in_stop);
+
+	if (prog)
+		return bpf_iter_run_prog(prog, &ctx);
+
+	return 0;
+}
+
+static int dmabuf_iter_seq_show(struct seq_file *seq, void *v)
+{
+	return __dmabuf_seq_show(seq, v, false);
+}
+
+static void dmabuf_iter_seq_stop(struct seq_file *seq, void *v)
+{
+	struct dma_buf *dmabuf = v;
+
+	if (dmabuf)
+		dma_buf_put(dmabuf);
+}
+
+static const struct seq_operations dmabuf_iter_seq_ops = {
+	.start = dmabuf_iter_seq_start,
+	.next = dmabuf_iter_seq_next,
+	.stop = dmabuf_iter_seq_stop,
+	.show = dmabuf_iter_seq_show,
+};
+
+static void bpf_iter_dmabuf_show_fdinfo(const struct bpf_iter_aux_info *aux,
+					struct seq_file *seq)
+{
+	seq_puts(seq, "dmabuf iter\n");
+}
+
+static const struct bpf_iter_seq_info dmabuf_iter_seq_info = {
+	.seq_ops = &dmabuf_iter_seq_ops,
+	.init_seq_private = NULL,
+	.fini_seq_private = NULL,
+	.seq_priv_size = 0,
+};
+
+static struct bpf_iter_reg bpf_dmabuf_reg_info = {
+	.target = "dmabuf",
+	.show_fdinfo = bpf_iter_dmabuf_show_fdinfo,
+	.ctx_arg_info_size = 1,
+	.ctx_arg_info = {
+		{ offsetof(struct bpf_iter__dmabuf, dmabuf),
+		  PTR_TO_BTF_ID_OR_NULL },
+	},
+	.seq_info = &dmabuf_iter_seq_info,
+};
+
+static int __init dmabuf_iter_init(void)
+{
+	bpf_dmabuf_reg_info.ctx_arg_info[0].btf_id = bpf_dmabuf_btf_id[0];
+	return bpf_iter_reg_target(&bpf_dmabuf_reg_info);
+}
+
+late_initcall(dmabuf_iter_init);
Hi Mercier,
kernel test robot noticed the following build errors:
[auto build test ERROR on bpf-next/net]
[also build test ERROR on bpf-next/master bpf/master linus/master v6.15-rc2 next-20250415]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/T-J-Mercier/dma-buf-Rename-an...
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git net
patch link:    https://lore.kernel.org/r/20250414225227.3642618-3-tjmercier%40google.com
patch subject: [PATCH 2/4] bpf: Add dmabuf iterator
config: i386-buildonly-randconfig-005-20250416
compiler: clang version 20.1.2 (https://github.com/llvm/llvm-project 58df0ef89dd64126512e4ee27b4ac3fd8ddf6247)
reproduce (this is a W=1 build):

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202504161015.x2XLaha2-lkp@intel.com/

All errors (new ones prefixed by >>):

ld.lld: error: undefined symbol: dmabuf_debugfs_list_mutex
   referenced by dmabuf_iter.c:44 (kernel/bpf/dmabuf_iter.c:44)
                 vmlinux.o:(dmabuf_iter_seq_next)
   referenced by dmabuf_iter.c:53 (kernel/bpf/dmabuf_iter.c:53)
                 vmlinux.o:(dmabuf_iter_seq_next)
   referenced by dmabuf_iter.c:26 (kernel/bpf/dmabuf_iter.c:26)
                 vmlinux.o:(dmabuf_iter_seq_start)
   referenced 1 more times
--
ld.lld: error: undefined symbol: dma_buf_put
   referenced by dmabuf_iter.c:45 (kernel/bpf/dmabuf_iter.c:45)
                 vmlinux.o:(dmabuf_iter_seq_next)
   referenced by dmabuf_iter.c:90 (kernel/bpf/dmabuf_iter.c:90)
                 vmlinux.o:(dmabuf_iter_seq_stop)
--
ld.lld: error: undefined symbol: dmabuf_debugfs_list
   referenced by list.h:354 (include/linux/list.h:354)
                 vmlinux.o:(dmabuf_iter_seq_next)
   referenced by dmabuf_iter.c:0 (kernel/bpf/dmabuf_iter.c:0)
                 vmlinux.o:(dmabuf_iter_seq_start)
   referenced by list.h:364 (include/linux/list.h:364)
                 vmlinux.o:(dmabuf_iter_seq_start)
On Tue, Apr 15, 2025 at 9:43 PM kernel test robot lkp@intel.com wrote:
-- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
This happens when CONFIG_DMA_SHARED_BUFFER is not enabled, so dma-buf.c (which defines these symbols) is not built. Fixed by:
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -53,7 +53,7 @@ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
 obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
-ifeq ($(CONFIG_DEBUG_FS),y)
+ifeq ($(CONFIG_DMA_SHARED_BUFFER)$(CONFIG_DEBUG_FS),yy)
 obj-$(CONFIG_BPF_SYSCALL) += dmabuf_iter.o
 endif
This test creates a udmabuf and uses a BPF program that prints dmabuf metadata with the new dmabuf_iter to verify it can be found.
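For readers who want to consume the same text format outside the selftest, here is a minimal userspace sketch of parsing one line of iterator output. It assumes the line format "ino:%lu size:%llu name:%s exp_name:%s" printed by the BPF program in this patch; struct dmabuf_record, REC_NAME_LEN and parse_dmabuf_line() are hypothetical names introduced only for illustration. REC_NAME_LEN is assumed to match DMA_BUF_NAME_LEN (32). The field widths in the format string bound the copies into the fixed-size arrays. One limitation of this simple approach: a buffer with an empty name does not parse, because %s cannot match an empty field.

```c
#include <stdbool.h>
#include <stdio.h>

#define REC_NAME_LEN 32	/* assumption: same value as DMA_BUF_NAME_LEN */

/* Hypothetical record type for one line of iterator output. */
struct dmabuf_record {
	unsigned long ino;
	unsigned long long size;
	char name[REC_NAME_LEN];
	char exporter[REC_NAME_LEN];
};

/*
 * Parse a line such as
 *   "ino:42 size:40960 name:my_buffer exp_name:udmabuf\n"
 * Returns true only when all four fields were converted. The %31s
 * widths leave room for the terminating NUL in the 32-byte arrays.
 */
static bool parse_dmabuf_line(const char *line, struct dmabuf_record *rec)
{
	int n = sscanf(line, "ino:%lu size:%llu name:%31s exp_name:%31s",
		       &rec->ino, &rec->size, rec->name, rec->exporter);

	return n == 4;
}
```

Since the BPF program defines the output format, a production collector would likely choose a format that is easier to parse unambiguously (for example one with explicit delimiters) rather than rely on whitespace-separated %s fields.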
Signed-off-by: T.J. Mercier <tjmercier@google.com>
---
 tools/testing/selftests/bpf/config            |   1 +
 .../selftests/bpf/prog_tests/dmabuf_iter.c    | 116 ++++++++++++++++++
 .../testing/selftests/bpf/progs/dmabuf_iter.c |  31 +++++
 3 files changed, 148 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/dmabuf_iter.c

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index c378d5d07e02..a791c60813df 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -106,6 +106,7 @@ CONFIG_SECURITY=y
 CONFIG_SECURITYFS=y
 CONFIG_SYN_COOKIES=y
 CONFIG_TEST_BPF=m
+CONFIG_UDMABUF=y
 CONFIG_USERFAULTFD=y
 CONFIG_VSOCKETS=y
 CONFIG_VXLAN=y
diff --git a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
new file mode 100644
index 000000000000..af215a4e0520
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
@@ -0,0 +1,116 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Google */
+
+#include <test_progs.h>
+#include <bpf/libbpf.h>
+#include <bpf/btf.h>
+#include "dmabuf_iter.skel.h"
+
+#include <fcntl.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <unistd.h>
+
+#include <linux/dma-buf.h>
+#include <linux/udmabuf.h>
+
+
+static void subtest_dmabuf_iter_check_udmabuf(struct dmabuf_iter *skel)
+{
+	static const char test_buffer_name[] = "udmabuf_test_buffer_for_iter";
+	const size_t test_buffer_size = 10 * getpagesize();
+
+	ASSERT_LE(sizeof(test_buffer_name), DMA_BUF_NAME_LEN, "NAMETOOLONG");
+
+	int memfd = memfd_create("memfd_test", MFD_ALLOW_SEALING);
+	ASSERT_OK_FD(memfd, "memfd_create");
+
+	ASSERT_OK(ftruncate(memfd, test_buffer_size), "ftruncate");
+
+	ASSERT_OK(fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK), "seal");
+
+	int dev_udmabuf = open("/dev/udmabuf", O_RDONLY);
+	ASSERT_OK_FD(dev_udmabuf, "open udmabuf");
+
+	struct udmabuf_create create;
+	create.memfd = memfd;
+	create.flags = UDMABUF_FLAGS_CLOEXEC;
+	create.offset = 0;
+	create.size = test_buffer_size;
+
+	int udmabuf = ioctl(dev_udmabuf, UDMABUF_CREATE, &create);
+	close(dev_udmabuf);
+	ASSERT_OK_FD(udmabuf, "udmabuf_create");
+
+	ASSERT_OK(ioctl(udmabuf, DMA_BUF_SET_NAME_B, test_buffer_name), "name");
+
+	int iter_fd = bpf_iter_create(bpf_link__fd(skel->links.dmabuf_collector));
+	ASSERT_OK_FD(iter_fd, "iter_create");
+
+	FILE *iter_file = fdopen(iter_fd, "r");
+	ASSERT_OK_PTR(iter_file, "fdopen");
+
+	char *line = NULL;
+	size_t linesize = 0;
+	bool found_test_udmabuf = false;
+	while (getline(&line, &linesize, iter_file) != -1) {
+		long inode, size;
+		char name[DMA_BUF_NAME_LEN], exporter[32];
+
+		int nelements = sscanf(line, "ino:%ld size:%ld name:%s exp_name:%s",
+				       &inode, &size, name, exporter);
+
+		if (nelements == 4 && size == test_buffer_size &&
+		    !strcmp(name, test_buffer_name) &&
+		    !strcmp(exporter, "udmabuf")) {
+			found_test_udmabuf = true;
+			break;
+		}
+	}
+
+	ASSERT_TRUE(found_test_udmabuf, "found_test_buffer");
+
+	free(line);
+	fclose(iter_file);
+	close(iter_fd);
+	close(udmabuf);
+	close(memfd);
+}
+
+void test_dmabuf_iter(void)
+{
+	struct dmabuf_iter *skel = NULL;
+	char buf[256];
+	int iter_fd;
+
+	skel = dmabuf_iter__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "dmabuf_iter__open_and_load"))
+		return;
+
+	if (!ASSERT_OK(dmabuf_iter__attach(skel), "skel_attach"))
+		goto destroy;
+
+	iter_fd = bpf_iter_create(bpf_link__fd(skel->links.dmabuf_collector));
+	if (!ASSERT_GE(iter_fd, 0, "iter_create"))
+		goto destroy;
+
+	memset(buf, 0, sizeof(buf));
+	while (read(iter_fd, buf, sizeof(buf)) > 0) {
+		/* Read out all contents */
+	}
+
+	/* Next reads should return 0 */
+	ASSERT_EQ(read(iter_fd, buf, sizeof(buf)), 0, "read");
+
+	if (test__start_subtest("check_udmabuf"))
+		subtest_dmabuf_iter_check_udmabuf(skel);
+
+	close(iter_fd);
+
+destroy:
+	dmabuf_iter__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/dmabuf_iter.c b/tools/testing/selftests/bpf/progs/dmabuf_iter.c
new file mode 100644
index 000000000000..b2af14ceb609
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/dmabuf_iter.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Google LLC */
+#include <vmlinux.h>
+#include <bpf/bpf_core_read.h>
+#include <bpf/bpf_helpers.h>
+
+char _license[] SEC("license") = "GPL";
+
+SEC("iter/dmabuf")
+int dmabuf_collector(struct bpf_iter__dmabuf *ctx)
+{
+	struct seq_file *seq = ctx->meta->seq;
+	const struct dma_buf *dmabuf = ctx->dmabuf;
+
+	if (dmabuf) {
+		size_t size;
+		unsigned long inode;
+		const char *name, *exp_name;
+
+		if (bpf_core_read(&size, sizeof(size), &dmabuf->size) ||
+		    BPF_CORE_READ_INTO(&inode, dmabuf, file, f_inode, i_ino) ||
+		    bpf_core_read(&name, sizeof(name), &dmabuf->name) ||
+		    bpf_core_read(&exp_name, sizeof(exp_name), &dmabuf->exp_name))
+			return 1;
+
+		BPF_SEQ_PRINTF(seq, "ino:%lu size:%llu name:%s exp_name:%s\n",
+			       inode, size, name ? name : "", exp_name ? exp_name : "");
+	}
+
+	return 0;
+}
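When draining an iterator fd until EOF, the comparison against zero belongs outside the read() call. Writing `sizeof(buf) > 0` as the length argument would request a single byte per call and test read()'s return value only for being non-zero. A minimal sketch of a drain helper follows. drain_fd() is a hypothetical name, not part of the selftest, and retrying on EINTR is an addition made for this sketch.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Read and discard everything available on fd until EOF.
 * Returns the total number of bytes consumed, or -1 on a read error.
 * Hypothetical helper for illustration only.
 */
static ssize_t drain_fd(int fd)
{
	char buf[256];
	ssize_t n, total = 0;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)		/* EOF */
			return total;
		if (errno == EINTR)	/* interrupted, try again */
			continue;
		return -1;
	}
}
```

After a successful drain, a further read() on the same fd is expected to return 0, which is the property the selftest asserts.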
I think Android is probably the only remaining user of the dmabuf sysfs files. The BPF infrastructure added earlier in this series will allow us to get the same information much more cheaply.
This patch is an RFC because I'd like to keep this for at least one more longterm stable release (6.18?) before actually removing it, so that we can have one kernel version that supports both options to facilitate a transition from the sysfs files to a BPF program.
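On a kernel that supports both options, userspace could pick its data source at startup. The following is only a sketch of one possible policy, not something this series provides: the enum, pick_stats_source(), and the idea of passing in whether the BPF iterator could be attached are all assumptions made for illustration. The only detail taken from the existing ABI is that the sysfs statistics live in a directory (/sys/kernel/dmabuf/buffers).

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical source selection for a userspace collector. */
enum dmabuf_stats_source {
	DMABUF_STATS_NONE,
	DMABUF_STATS_SYSFS,
	DMABUF_STATS_BPF_ITER,
};

/*
 * Prefer the BPF iterator when the caller managed to attach it;
 * otherwise fall back to the sysfs directory if it exists.
 * sysfs_dir would normally be "/sys/kernel/dmabuf/buffers".
 */
static enum dmabuf_stats_source
pick_stats_source(const char *sysfs_dir, bool bpf_iter_attached)
{
	struct stat st;

	if (bpf_iter_attached)
		return DMABUF_STATS_BPF_ITER;
	if (stat(sysfs_dir, &st) == 0 && S_ISDIR(st.st_mode))
		return DMABUF_STATS_SYSFS;
	return DMABUF_STATS_NONE;
}
```

With a policy like this, the same collector binary could run on kernels from before, during and after the transition window.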
Signed-off-by: T.J. Mercier <tjmercier@google.com>
---
 .../ABI/testing/sysfs-kernel-dmabuf-buffers |  24 ---
 Documentation/driver-api/dma-buf.rst        |   5 -
 drivers/dma-buf/Kconfig                     |  15 --
 drivers/dma-buf/Makefile                    |   1 -
 drivers/dma-buf/dma-buf-sysfs-stats.c       | 202 ------------------
 drivers/dma-buf/dma-buf-sysfs-stats.h       |  35 ---
 drivers/dma-buf/dma-buf.c                   |  18 --
 7 files changed, 300 deletions(-)
 delete mode 100644 Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers
 delete mode 100644 drivers/dma-buf/dma-buf-sysfs-stats.c
 delete mode 100644 drivers/dma-buf/dma-buf-sysfs-stats.h
diff --git a/Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers b/Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers deleted file mode 100644 index 5d3bc997dc64..000000000000 --- a/Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers +++ /dev/null @@ -1,24 +0,0 @@ -What: /sys/kernel/dmabuf/buffers -Date: May 2021 -KernelVersion: v5.13 -Contact: Hridya Valsaraju hridya@google.com -Description: The /sys/kernel/dmabuf/buffers directory contains a - snapshot of the internal state of every DMA-BUF. - /sys/kernel/dmabuf/buffers/<inode_number> will contain the - statistics for the DMA-BUF with the unique inode number - <inode_number> -Users: kernel memory tuning/debugging tools - -What: /sys/kernel/dmabuf/buffers/<inode_number>/exporter_name -Date: May 2021 -KernelVersion: v5.13 -Contact: Hridya Valsaraju hridya@google.com -Description: This file is read-only and contains the name of the exporter of - the DMA-BUF. - -What: /sys/kernel/dmabuf/buffers/<inode_number>/size -Date: May 2021 -KernelVersion: v5.13 -Contact: Hridya Valsaraju hridya@google.com -Description: This file is read-only and specifies the size of the DMA-BUF in - bytes. diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 29abf1eebf9f..2f36c21d9948 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -125,11 +125,6 @@ Implicit Fence Poll Support .. kernel-doc:: drivers/dma-buf/dma-buf.c :doc: implicit fence polling
-DMA-BUF statistics -~~~~~~~~~~~~~~~~~~ -.. kernel-doc:: drivers/dma-buf/dma-buf-sysfs-stats.c - :doc: overview - DMA Buffer ioctls ~~~~~~~~~~~~~~~~~
diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig index fee04fdb0822..03e38c0d1fff 100644 --- a/drivers/dma-buf/Kconfig +++ b/drivers/dma-buf/Kconfig @@ -76,21 +76,6 @@ menuconfig DMABUF_HEAPS allows userspace to allocate dma-bufs that can be shared between drivers.
-menuconfig DMABUF_SYSFS_STATS - bool "DMA-BUF sysfs statistics (DEPRECATED)" - depends on DMA_SHARED_BUFFER - help - Choose this option to enable DMA-BUF sysfs statistics - in location /sys/kernel/dmabuf/buffers. - - /sys/kernel/dmabuf/buffers/<inode_number> will contain - statistics for the DMA-BUF with the unique inode number - <inode_number>. - - This option is deprecated and should sooner or later be removed. - Android is the only user of this and it turned out that this resulted - in quite some performance problems. - source "drivers/dma-buf/heaps/Kconfig"
endmenu diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 70ec901edf2c..8ab2bfecb1c9 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -6,7 +6,6 @@ obj-$(CONFIG_DMABUF_HEAPS) += heaps/ obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o obj-$(CONFIG_UDMABUF) += udmabuf.o -obj-$(CONFIG_DMABUF_SYSFS_STATS) += dma-buf-sysfs-stats.o
dmabuf_selftests-y := \ selftest.o \ diff --git a/drivers/dma-buf/dma-buf-sysfs-stats.c b/drivers/dma-buf/dma-buf-sysfs-stats.c deleted file mode 100644 index b5b62e40ccc1..000000000000 --- a/drivers/dma-buf/dma-buf-sysfs-stats.c +++ /dev/null @@ -1,202 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * DMA-BUF sysfs statistics. - * - * Copyright (C) 2021 Google LLC. - */ - -#include <linux/dma-buf.h> -#include <linux/dma-resv.h> -#include <linux/kobject.h> -#include <linux/printk.h> -#include <linux/slab.h> -#include <linux/sysfs.h> - -#include "dma-buf-sysfs-stats.h" - -#define to_dma_buf_entry_from_kobj(x) container_of(x, struct dma_buf_sysfs_entry, kobj) - -/** - * DOC: overview - * - * ``/sys/kernel/debug/dma_buf/bufinfo`` provides an overview of every DMA-BUF - * in the system. However, since debugfs is not safe to be mounted in - * production, procfs and sysfs can be used to gather DMA-BUF statistics on - * production systems. - * - * The ``/proc/<pid>/fdinfo/<fd>`` files in procfs can be used to gather - * information about DMA-BUF fds. Detailed documentation about the interface - * is present in Documentation/filesystems/proc.rst. - * - * Unfortunately, the existing procfs interfaces can only provide information - * about the DMA-BUFs for which processes hold fds or have the buffers mmapped - * into their address space. This necessitated the creation of the DMA-BUF sysfs - * statistics interface to provide per-buffer information on production systems. - * - * The interface at ``/sys/kernel/dmabuf/buffers`` exposes information about - * every DMA-BUF when ``CONFIG_DMABUF_SYSFS_STATS`` is enabled. - * - * The following stats are exposed by the interface: - * - * * ``/sys/kernel/dmabuf/buffers/<inode_number>/exporter_name`` - * * ``/sys/kernel/dmabuf/buffers/<inode_number>/size`` - * - * The information in the interface can also be used to derive per-exporter - * statistics. 
The data from the interface can be gathered on error conditions - * or other important events to provide a snapshot of DMA-BUF usage. - * It can also be collected periodically by telemetry to monitor various metrics. - * - * Detailed documentation about the interface is present in - * Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers. - */ - -struct dma_buf_stats_attribute { - struct attribute attr; - ssize_t (*show)(struct dma_buf *dmabuf, - struct dma_buf_stats_attribute *attr, char *buf); -}; -#define to_dma_buf_stats_attr(x) container_of(x, struct dma_buf_stats_attribute, attr) - -static ssize_t dma_buf_stats_attribute_show(struct kobject *kobj, - struct attribute *attr, - char *buf) -{ - struct dma_buf_stats_attribute *attribute; - struct dma_buf_sysfs_entry *sysfs_entry; - struct dma_buf *dmabuf; - - attribute = to_dma_buf_stats_attr(attr); - sysfs_entry = to_dma_buf_entry_from_kobj(kobj); - dmabuf = sysfs_entry->dmabuf; - - if (!dmabuf || !attribute->show) - return -EIO; - - return attribute->show(dmabuf, attribute, buf); -} - -static const struct sysfs_ops dma_buf_stats_sysfs_ops = { - .show = dma_buf_stats_attribute_show, -}; - -static ssize_t exporter_name_show(struct dma_buf *dmabuf, - struct dma_buf_stats_attribute *attr, - char *buf) -{ - return sysfs_emit(buf, "%s\n", dmabuf->exp_name); -} - -static ssize_t size_show(struct dma_buf *dmabuf, - struct dma_buf_stats_attribute *attr, - char *buf) -{ - return sysfs_emit(buf, "%zu\n", dmabuf->size); -} - -static struct dma_buf_stats_attribute exporter_name_attribute = - __ATTR_RO(exporter_name); -static struct dma_buf_stats_attribute size_attribute = __ATTR_RO(size); - -static struct attribute *dma_buf_stats_default_attrs[] = { - &exporter_name_attribute.attr, - &size_attribute.attr, - NULL, -}; -ATTRIBUTE_GROUPS(dma_buf_stats_default); - -static void dma_buf_sysfs_release(struct kobject *kobj) -{ - struct dma_buf_sysfs_entry *sysfs_entry; - - sysfs_entry = to_dma_buf_entry_from_kobj(kobj); - 
	kfree(sysfs_entry);
-}
-
-static const struct kobj_type dma_buf_ktype = {
-	.sysfs_ops = &dma_buf_stats_sysfs_ops,
-	.release = dma_buf_sysfs_release,
-	.default_groups = dma_buf_stats_default_groups,
-};
-
-void dma_buf_stats_teardown(struct dma_buf *dmabuf)
-{
-	struct dma_buf_sysfs_entry *sysfs_entry;
-
-	sysfs_entry = dmabuf->sysfs_entry;
-	if (!sysfs_entry)
-		return;
-
-	kobject_del(&sysfs_entry->kobj);
-	kobject_put(&sysfs_entry->kobj);
-}
-
-
-/* Statistics files do not need to send uevents. */
-static int dmabuf_sysfs_uevent_filter(const struct kobject *kobj)
-{
-	return 0;
-}
-
-static const struct kset_uevent_ops dmabuf_sysfs_no_uevent_ops = {
-	.filter = dmabuf_sysfs_uevent_filter,
-};
-
-static struct kset *dma_buf_stats_kset;
-static struct kset *dma_buf_per_buffer_stats_kset;
-int dma_buf_init_sysfs_statistics(void)
-{
-	dma_buf_stats_kset = kset_create_and_add("dmabuf",
-						 &dmabuf_sysfs_no_uevent_ops,
-						 kernel_kobj);
-	if (!dma_buf_stats_kset)
-		return -ENOMEM;
-
-	dma_buf_per_buffer_stats_kset = kset_create_and_add("buffers",
-						&dmabuf_sysfs_no_uevent_ops,
-						&dma_buf_stats_kset->kobj);
-	if (!dma_buf_per_buffer_stats_kset) {
-		kset_unregister(dma_buf_stats_kset);
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
-void dma_buf_uninit_sysfs_statistics(void)
-{
-	kset_unregister(dma_buf_per_buffer_stats_kset);
-	kset_unregister(dma_buf_stats_kset);
-}
-
-int dma_buf_stats_setup(struct dma_buf *dmabuf, struct file *file)
-{
-	struct dma_buf_sysfs_entry *sysfs_entry;
-	int ret;
-
-	if (!dmabuf->exp_name) {
-		pr_err("exporter name must not be empty if stats needed\n");
-		return -EINVAL;
-	}
-
-	sysfs_entry = kzalloc(sizeof(struct dma_buf_sysfs_entry), GFP_KERNEL);
-	if (!sysfs_entry)
-		return -ENOMEM;
-
-	sysfs_entry->kobj.kset = dma_buf_per_buffer_stats_kset;
-	sysfs_entry->dmabuf = dmabuf;
-
-	dmabuf->sysfs_entry = sysfs_entry;
-
-	/* create the directory for buffer stats */
-	ret = kobject_init_and_add(&sysfs_entry->kobj, &dma_buf_ktype, NULL,
-				   "%lu", file_inode(file)->i_ino);
-	if (ret)
-		goto err_sysfs_dmabuf;
-
-	return 0;
-
-err_sysfs_dmabuf:
-	kobject_put(&sysfs_entry->kobj);
-	dmabuf->sysfs_entry = NULL;
-	return ret;
-}
diff --git a/drivers/dma-buf/dma-buf-sysfs-stats.h b/drivers/dma-buf/dma-buf-sysfs-stats.h
deleted file mode 100644
index 7a8a995b75ba..000000000000
--- a/drivers/dma-buf/dma-buf-sysfs-stats.h
+++ /dev/null
@@ -1,35 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * DMA-BUF sysfs statistics.
- *
- * Copyright (C) 2021 Google LLC.
- */
-
-#ifndef _DMA_BUF_SYSFS_STATS_H
-#define _DMA_BUF_SYSFS_STATS_H
-
-#ifdef CONFIG_DMABUF_SYSFS_STATS
-
-int dma_buf_init_sysfs_statistics(void);
-void dma_buf_uninit_sysfs_statistics(void);
-
-int dma_buf_stats_setup(struct dma_buf *dmabuf, struct file *file);
-
-void dma_buf_stats_teardown(struct dma_buf *dmabuf);
-#else
-
-static inline int dma_buf_init_sysfs_statistics(void)
-{
-	return 0;
-}
-
-static inline void dma_buf_uninit_sysfs_statistics(void) {}
-
-static inline int dma_buf_stats_setup(struct dma_buf *dmabuf, struct file *file)
-{
-	return 0;
-}
-
-static inline void dma_buf_stats_teardown(struct dma_buf *dmabuf) {}
-#endif
-#endif // _DMA_BUF_SYSFS_STATS_H
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index affb47eb8629..c51967c6cf85 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -31,8 +31,6 @@
 #include <uapi/linux/dma-buf.h>
 #include <uapi/linux/magic.h>
 
-#include "dma-buf-sysfs-stats.h"
-
 static inline int is_dma_buf_file(struct file *);
 
 #if IS_ENABLED(CONFIG_DEBUG_FS)
@@ -98,7 +96,6 @@ static void dma_buf_release(struct dentry *dentry)
 	 */
 	BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
 
-	dma_buf_stats_teardown(dmabuf);
 	dmabuf->ops->release(dmabuf);
 
 	if (dmabuf->resv == (struct dma_resv *)&dmabuf[1])
@@ -681,10 +678,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 		dmabuf->resv = resv;
 	}
 
-	ret = dma_buf_stats_setup(dmabuf, file);
-	if (ret)
-		goto err_dmabuf;
-
 	file->private_data = dmabuf;
 	file->f_path.dentry->d_fsdata = dmabuf;
 	dmabuf->file = file;
@@ -693,10 +686,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 
 	return dmabuf;
 
-err_dmabuf:
-	if (!resv)
-		dma_resv_fini(dmabuf->resv);
-	kfree(dmabuf);
 err_file:
 	fput(file);
 err_module:
@@ -1727,12 +1716,6 @@ static inline void dma_buf_uninit_debugfs(void)
 
 static int __init dma_buf_init(void)
 {
-	int ret;
-
-	ret = dma_buf_init_sysfs_statistics();
-	if (ret)
-		return ret;
-
 	dma_buf_mnt = kern_mount(&dma_buf_fs_type);
 	if (IS_ERR(dma_buf_mnt))
 		return PTR_ERR(dma_buf_mnt);
@@ -1746,6 +1729,5 @@ static void __exit dma_buf_deinit(void)
 {
 	dma_buf_uninit_debugfs();
 	kern_unmount(dma_buf_mnt);
-	dma_buf_uninit_sysfs_statistics();
 }
 __exitcall(dma_buf_deinit);
On 15.04.25 at 00:52, T.J. Mercier wrote:
Until CONFIG_DMABUF_SYSFS_STATS was added [1] it was only possible to perform per-buffer accounting with debugfs which is not suitable for production environments. Eventually we discovered the overhead with per-buffer sysfs file creation/removal was significantly impacting allocation and free times, and exacerbated kernfs lock contention. [2] dma_buf_stats_setup() is responsible for 39% of single-page buffer creation duration, or 74% of single-page dma_buf_export() duration when stressing dmabuf allocations and frees.
I prototyped a change from per-buffer to per-exporter statistics with a RCU protected list of exporter allocations that accommodates most (but not all) of our use-cases and avoids almost all of the sysfs overhead. While that adds less overhead than per-buffer sysfs, and less even than the maintenance of the dmabuf debugfs_list, it's still *additional* overhead on top of the debugfs_list and doesn't give us per-buffer info.
This series uses the existing dmabuf debugfs_list to implement a BPF dmabuf iterator, which adds no overhead to buffer allocation/free and provides per-buffer info.
Really interesting suggestion. I was expecting something like cgroups, but bpf is certainly an option as well.
How do you then use BPF to account for the buffers? For example, are you interacting with cgroups, or do you have a sysfs procedure to expose the list? How does that work?
In addition to that, why use DMA-buf for accounting in the first place? DMA-buf is for sharing buffers, and usually only a minimal fraction of buffers needs to be shared. Everything else is just massive overhead.
While the kernel must have CONFIG_DEBUG_FS for the dmabuf_iter to be available, debugfs does not need to be mounted. The BPF program loaded by userspace that extracts per-buffer information gets to define its own interface which avoids the lack of ABI stability with debugfs (even if it were mounted).
I think we can make the buffer list independent of CONFIG_DEBUG_FS.
As this is a replacement for our use of CONFIG_DMABUF_SYSFS_STATS, the last patch is a RFC for removing it from the kernel. Please see my suggestion there regarding the timeline for that.
Oh, yes please!
Regards, Christian.
[1] https://lore.kernel.org/linux-media/20201210044400.1080308-1-hridya@google.c... [2] https://lore.kernel.org/all/20220516171315.2400578-1-tjmercier@google.com/
On Tue, Apr 15, 2025 at 2:03 AM Christian König <christian.koenig@amd.com> wrote:
On 15.04.25 at 00:52, T.J. Mercier wrote:
Until CONFIG_DMABUF_SYSFS_STATS was added [1] it was only possible to perform per-buffer accounting with debugfs which is not suitable for production environments. Eventually we discovered the overhead with per-buffer sysfs file creation/removal was significantly impacting allocation and free times, and exacerbated kernfs lock contention. [2] dma_buf_stats_setup() is responsible for 39% of single-page buffer creation duration, or 74% of single-page dma_buf_export() duration when stressing dmabuf allocations and frees.
I prototyped a change from per-buffer to per-exporter statistics with a RCU protected list of exporter allocations that accommodates most (but not all) of our use-cases and avoids almost all of the sysfs overhead. While that adds less overhead than per-buffer sysfs, and less even than the maintenance of the dmabuf debugfs_list, it's still *additional* overhead on top of the debugfs_list and doesn't give us per-buffer info.
This series uses the existing dmabuf debugfs_list to implement a BPF dmabuf iterator, which adds no overhead to buffer allocation/free and provides per-buffer info.
Really interesting suggestion. I was expecting something like cgroups, but bpf is certainly an option as well.
How do you then use BPF to account for the buffers? For example, are you interacting with cgroups, or do you have a sysfs procedure to expose the list? How does that work?
Currently we read through all of /sys/kernel/dmabuf/buffers/. With this we can load or pin a BPF program (like tools/testing/selftests/bpf/progs/dmabuf_iter.c) and then just cat (and parse) /sys/fs/bpf/dmabufs to get all per-buffer info in one go.
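To illustrate the "cat and parse" step, here is a minimal userspace sketch. It assumes a hypothetical record layout of "<inode> <size> <exporter>" per line; the real layout is whatever the loaded BPF program chooses to print, so the struct, the function names and the format string are all illustrative and not part of this series.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record: one line of iterator output, "<inode> <size> <exporter>". */
struct dmabuf_rec {
	unsigned long ino;		/* inode number of the dmabuf file */
	unsigned long long size;	/* buffer size in bytes */
	char exporter[32];		/* exporter name, no spaces assumed */
};

/* Parse one line; returns 0 on success, -1 if the line is malformed. */
static int parse_dmabuf_line(const char *line, struct dmabuf_rec *rec)
{
	if (sscanf(line, "%lu %llu %31s",
		   &rec->ino, &rec->size, rec->exporter) != 3)
		return -1;
	return 0;
}

/*
 * Sum the sizes of all well-formed records in a stream and store the
 * number of records in *count. Malformed lines are skipped.
 */
static unsigned long long sum_dmabuf_sizes(FILE *f, size_t *count)
{
	char line[256];
	struct dmabuf_rec rec;
	unsigned long long total = 0;

	*count = 0;
	while (fgets(line, sizeof(line), f)) {
		if (parse_dmabuf_line(line, &rec) == 0) {
			total += rec.size;
			(*count)++;
		}
	}
	return total;
}
```

A caller would fopen() the pinned iterator path (/sys/fs/bpf/dmabufs in the description above) and pass the stream to sum_dmabuf_sizes() to get the total dmabuf memory in a single read.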
The attribution of buffers to processes is currently done by looking through procfs for fd and map references to dmabufs. That part is still slow, and provides no limitation on who can allocate how much, so I think cgroups is still the main potential tool for that. We have a program that does all the scanning work which is called on-demand for some use cases, and also manually by users: https://cs.android.com/android/platform/superproject/main/+/main:system/memo...
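As a rough sketch of the procfs side, the helper below inspects one line of /proc/<pid>/maps and reports whether it maps a dmabuf. It assumes that dmabuf mappings appear with a pathname starting with "/dmabuf"; the function and struct names are hypothetical and this is not the code of the tool linked above.

```c
#include <stdio.h>
#include <string.h>

struct dmabuf_map {
	unsigned long ino;		/* inode number of the mapped dmabuf */
	unsigned long long len;		/* bytes covered by this mapping only */
};

/*
 * Returns 1 and fills *out if the maps line refers to a dmabuf, else 0.
 * Line format: start-end perms offset dev inode pathname
 */
static int maps_line_is_dmabuf(const char *line, struct dmabuf_map *out)
{
	unsigned long long start, end;
	unsigned long ino;
	char path[128] = "";

	/* Anonymous mappings have no pathname, so only 3 fields convert. */
	if (sscanf(line, "%llx-%llx %*s %*s %*s %lu %127s",
		   &start, &end, &ino, path) < 3)
		return 0;
	/* Assumption: dmabuf mappings are named "/dmabuf:<name>". */
	if (strncmp(path, "/dmabuf", 7) != 0)
		return 0;

	out->ino = ino;
	out->len = end - start;
	return 1;
}
```

Note that len is the length of the mapping, not of the buffer. A partially mapped buffer reports less than its real size here, which is one reason a per-buffer source keyed by inode is still needed.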
The per-buffer information is used for accounting kernel-only buffers that don't show up in procfs, and for partially mapped buffers without fd references where the total buffer size isn't otherwise known. Also sometimes (manual debugging or bugreports) it's useful just to know how much memory in total is tied up in dmabufs regardless of who allocated it, because it can be gigabytes due to bugs or crazy program behaviors. The per-buffer info is a faster way to get that than reading through all of procfs, even if you assume everything is viewable in procfs.
In addition to that, why use DMA-buf for accounting in the first place? DMA-buf is for sharing buffers, and usually only a minimal fraction of buffers needs to be shared. Everything else is just massive overhead.
Well we need some way to account all DMA-buf memory because it consumes a significant portion of total device memory. Even more so lately where they're used to store >1G AI models for execution on accelerator hardware. I've attached an example of dmabuf_dump output to give you an idea of how many buffers we're talking about, and most of those are (or will be, when an app goes to foreground) shared among multiple processes and/or drivers.
While the kernel must have CONFIG_DEBUG_FS for the dmabuf_iter to be available, debugfs does not need to be mounted. The BPF program loaded by userspace that extracts per-buffer information gets to define its own interface which avoids the lack of ABI stability with debugfs (even if it were mounted).
I think we can make the buffer list independent of CONFIG_DEBUG_FS.
This would be nice. It's a fairly small overhead, and we can make it less with RCU too. (__dma_buf_debugfs_list_add.png)
As this is a replacement for our use of CONFIG_DMABUF_SYSFS_STATS, the last patch is a RFC for removing it from the kernel. Please see my suggestion there regarding the timeline for that.
Oh, yes please!
I thought you might be happy about this. :)
Regards, Christian.