On Wed, May 13, 2026 at 2:54 AM T.J. Mercier <tjmercier@google.com> wrote:
On Tue, May 12, 2026 at 3:14 AM Christian König <christian.koenig@amd.com> wrote:
On 5/12/26 11:10, Albert Esteve wrote:
On embedded platforms a central process often allocates dma-buf memory on behalf of client applications. Without a way to attribute the charge to the requesting client's cgroup, the cost lands on the allocator, making per-cgroup memory limits ineffective for the actual consumers.
Add charge_pid_fd to struct dma_heap_allocation_data. When set to a valid pidfd, DMA_HEAP_IOCTL_ALLOC resolves the target task's memcg and charges the buffer there via mem_cgroup_charge_dmabuf() inside dma_heap_buffer_alloc(). Without charge_pid_fd, and with the mem_accounting module parameter enabled, the buffer is charged to the allocator's own cgroup.
Additionally, commit 3c227be90659 ("dma-buf: system_heap: account for system heap allocation in memcg") adds __GFP_ACCOUNT to system-heap page allocations. Keeping __GFP_ACCOUNT would charge the same pages twice (once to kmem, once to MEMCG_DMABUF), so remove it and route all accounting through a single MEMCG_DMABUF path.
Usage examples:
Central allocator charging to a client at allocation time. The allocator knows the client's PID (e.g., from binder's sender_pid) and uses pidfd to attribute the charge:
    pid_t client_pid = txn->sender_pid;
    int pidfd = pidfd_open(client_pid, 0);

    struct dma_heap_allocation_data alloc = {
        .len = buffer_size,
        .fd_flags = O_RDWR | O_CLOEXEC,
        .charge_pid_fd = pidfd,
    };
    ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &alloc);
    close(pidfd);
    /* alloc.fd is now charged to client's cgroup */
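The snippet above omits error handling. A minimal sketch of the pidfd step with checks follows, assuming a kernel that provides pidfd_open(2) (Linux 5.3 or later). open_charge_pidfd() is a hypothetical helper name, not part of the patch, and the raw syscall is used only because older C libraries do not ship a pidfd_open() wrapper:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): open a pidfd for the client
 * so that it can be stored in charge_pid_fd.
 *
 * Returns 0 and stores the fd in *pidfd_out on success. Returns a negative
 * errno and leaves *pidfd_out untouched on failure.
 */
static int open_charge_pidfd(pid_t client_pid, int *pidfd_out)
{
    assert(pidfd_out != NULL);

    /* pidfd_open(2) rejects non-positive pids; fail early without a syscall. */
    if (client_pid <= 0)
        return -EINVAL;

    /* Raw syscall; the kernel sets close-on-exec on the returned fd. */
    long fd = syscall(SYS_pidfd_open, client_pid, 0);
    if (fd < 0)
        return -errno; /* e.g. ESRCH if the client has already exited */

    *pidfd_out = (int)fd;
    return 0;
}
```

On failure the allocator can either reject the request or leave charge_pid_fd unset, in which case (with mem_accounting enabled) the charge lands on the allocator's own cgroup as described above.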
Default allocation (no pidfd, mem_accounting=1). When charge_pid_fd is not set and the mem_accounting module parameter is enabled, the buffer is charged to the allocator's own cgroup:
    struct dma_heap_allocation_data alloc = {
        .len = buffer_size,
        .fd_flags = O_RDWR | O_CLOEXEC,
    };
    ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &alloc);
    /* charged to current process's cgroup */
Current limitations:
- Single-owner model: a dma-buf carries one memcg charge regardless of how many processes share it. This means that only the first owner (and exporter) of the shared buffer bears the charge.
- Only memcg accounting is supported. While this makes sense for system heap buffers, other heaps (e.g., CMA heaps) will also require selective charging against the dmem controller.
Well, that doesn't look so bad; it at least seems to tackle the problem at hand for Android and some other embedded use cases.
Yeah I think this might work. I know of 3 cases, and it trivially solves the first two. The third requires some work on our end to extend our userspace interfaces to include the pidfd but it seems doable. I'm checking with our graphics folks.
- Direct allocation from user (e.g. app -> allocation ioctl on /dev/dma_heap/foo)
  No changes required to userspace. mem_accounting=1 charges the app.
- Single hop remote allocation (e.g. app -> AHardwareBuffer_allocate -> gralloc)
  gralloc has the caller's pid as described in the commit message. Open a pidfd and pass it in the dma_heap_allocation_data.
- Double hop remote allocation (e.g. app -> dequeueBuffer -> SurfaceFlinger -> gralloc)
  In this case gralloc knows SurfaceFlinger's pid, but not the app's. So we need to add the app's pidfd to the SurfaceFlinger -> gralloc interface, or transfer the memcg charge from SurfaceFlinger to the app after the allocation. It'd be nice to avoid the charge transfer option entirely, but if we need it that doesn't seem so bad in this case because it's a bulk charge for the entire dmabuf rather than per-page. So the exporter doesn't need to get involved (we wouldn't need a new dma_buf_op) and we wouldn't have to worry about looping and locking for each page.
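To make the three cases concrete, here is a rough sketch of how an allocator service might pick the process to charge. Everything in it is illustrative: the enum, the struct, and both function names are hypothetical and do not exist in Android or the kernel. A return value of 0 means "leave charge_pid_fd unset".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of the three allocation paths discussed above. */
enum alloc_path {
    ALLOC_DIRECT,     /* app -> allocation ioctl on the heap device */
    ALLOC_SINGLE_HOP, /* app -> gralloc */
    ALLOC_DOUBLE_HOP, /* app -> SurfaceFlinger -> gralloc */
};

struct alloc_request {
    enum alloc_path path;
    pid_t ipc_sender;    /* immediate IPC caller seen by the allocator, 0 if none */
    pid_t forwarded_app; /* app pid forwarded by the intermediary, 0 if absent */
};

/* Pid whose pidfd should be passed in charge_pid_fd, or 0 to leave it unset. */
static pid_t charge_target(const struct alloc_request *req)
{
    assert(req != NULL);

    switch (req->path) {
    case ALLOC_DIRECT:
        /* The app allocates for itself; mem_accounting=1 charges it. */
        return 0;
    case ALLOC_SINGLE_HOP:
        /* The IPC sender is the app. */
        return req->ipc_sender;
    case ALLOC_DOUBLE_HOP:
        /* Prefer the forwarded app pid; otherwise only the intermediary is known. */
        return req->forwarded_app ? req->forwarded_app : req->ipc_sender;
    }
    return 0;
}

/* True when the charge would land on the intermediary and need a later transfer. */
static bool needs_charge_transfer(const struct alloc_request *req)
{
    assert(req != NULL);
    return req->path == ALLOC_DOUBLE_HOP && req->forwarded_app == 0;
}
```

In the double hop case without a forwarded pid, the sketch falls back to charging the intermediary and flags the buffer for a later bulk transfer, which is the second option described above.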
Hi T.J.,
Your description of the three different cases sounds very interesting. It helps me understand how difficult it can be to correctly charge dma-buf in the current user scenarios.
I’m wondering where I can find Android userspace code that transfers the PID of RPC callers. Do we have any existing sample code in Android for this?
I'm just not sure if this is future-proof and will work for all use cases, e.g. cloud gaming, native context for automotive, etc.
Thanks
Barry