On 5/19/25 06:08, wangtao wrote:
-----Original Message-----
From: Christian König christian.koenig@amd.com
Sent: Friday, May 16, 2025 6:29 PM
To: wangtao tao.wangtao@honor.com; sumit.semwal@linaro.org; benjamin.gaignard@collabora.com; Brian.Starkey@arm.com; jstultz@google.com; tjmercier@google.com
Cc: linux-media@vger.kernel.org; dri-devel@lists.freedesktop.org; linaro-mm-sig@lists.linaro.org; linux-kernel@vger.kernel.org; wangbintian(BintianWang) bintian.wang@honor.com; yipengxiang yipengxiang@honor.com; liulu liulu.liu@honor.com; hanfeng feng.han@honor.com
Subject: Re: [PATCH 2/2] dmabuf/heaps: implement DMA_BUF_IOCTL_RW_FILE for system_heap
On 5/16/25 11:49, wangtao wrote:
Please try using udmabuf with sendfile() as confirmed to be working by
T.J.
[wangtao] Using buffered I/O for dmabuf file read/write requires one memory copy. Direct I/O removes this copy and enables zero-copy. The sendfile system call reduces memory copies from two (read/write) to one, but with udmabuf, sendfile still keeps at least one copy, so zero-copy is not achieved.
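(For illustration only: a minimal userspace sketch of the buffered path described above, assuming an already exported dma-buf fd; the helper name, file path and error handling are made up, only the uapi calls are real. The read() lands in the page cache first and is then copied into the dma-buf pages by the CPU, which is the extra copy in question.)

/* Hypothetical sketch: buffered read() into an mmap'd dma-buf.
 * The kernel DMAs the file data into the page cache, then copies
 * it into the dma-buf pages with the CPU. Error handling omitted. */
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/dma-buf.h>

static int read_into_dmabuf(int dmabuf_fd, const char *path, size_t len)
{
        struct dma_buf_sync sync = { 0 };
        void *vaddr;
        int file_fd;

        vaddr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                     dmabuf_fd, 0);
        if (vaddr == MAP_FAILED)
                return -1;

        file_fd = open(path, O_RDONLY);   /* buffered, no O_DIRECT */

        /* bracket CPU access as required for dma-buf mmap */
        sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
        ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);

        read(file_fd, vaddr, len);        /* page cache -> dma-buf: CPU copy */

        sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
        ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);

        close(file_fd);
        munmap(vaddr, len);
        return 0;
}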
Then please work on fixing this.
[wangtao] What needs fixing? Does sendfile achieve zero-copy? sendfile reduces memory copies from two to one for network sockets, but it still requires one copy and cannot achieve zero-copy.
Well, why not? sendfile() is the designated Linux uAPI for moving data between two files; splice() may also be appropriate.
The memory file descriptor and your destination file are both files, so those uAPIs apply.
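(As a point of reference, a minimal sketch of that sendfile() path between two file descriptors; the helper name is hypothetical and error handling is trimmed.)

/* Hypothetical sketch: move a whole source file to a destination fd
 * (which can be a memfd / udmabuf backing file) with sendfile().
 * Error handling is omitted for brevity. */
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

static int copy_with_sendfile(int dst_fd, const char *src_path)
{
        struct stat st;
        off_t off = 0;
        int src_fd = open(src_path, O_RDONLY);

        if (src_fd < 0 || fstat(src_fd, &st) < 0)
                return -1;

        while (off < st.st_size)
                if (sendfile(dst_fd, src_fd, &off, st.st_size - off) <= 0)
                        break;

        close(src_fd);
        return off == st.st_size ? 0 : -1;
}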
[wangtao] I realize our disagreement lies here: You believe sendfile enables zero-copy for regular file → socket/file:
No, what I mean is that it should be possible to solve this using sendfile() or splice() and not come up with a hacky IOCTL to bypass well-tested and agreed-upon system calls.
sendfile(dst_socket, src_disk): [disk] --DMA--> [page buffer] --DMA--> [NIC]
sendfile(dst_disk, src_disk):   [disk] --DMA--> [page buffer] --DMA--> [DISK]
But for regular file → memory file (e.g., tmpfs/shmem), a CPU copy is unavoidable:
sendfile(dst_memfile, src_disk): [disk] --DMA--> [page buffer] --CPU copy--> [memory file]
Without memory-to-memory DMA, this wastes CPU cycles and power, which is critical for embedded devices.
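(A hypothetical way to exercise exactly this direction from userspace, using splice() through a pipe into a memfd; the helper name and buffer names are made up, and the comment marks where the CPU copy described above would occur.)

/* Hypothetical sketch: regular file -> memfd via splice() through a
 * pipe. The final step still fills the shmem pages with a CPU copy,
 * the step marked "CPU copy" in the diagram above. Error handling
 * is omitted for brevity. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

static int file_to_memfd(const char *src_path, size_t len)
{
        int src = open(src_path, O_RDONLY);
        int dst = memfd_create("staging", MFD_CLOEXEC);
        int pipefd[2];
        size_t done = 0;

        if (src < 0 || dst < 0 || pipe(pipefd) < 0)
                return -1;

        while (done < len) {
                ssize_t n = splice(src, NULL, pipefd[1], NULL,
                                   len - done, SPLICE_F_MORE);
                if (n <= 0)
                        break;
                /* pipe -> shmem file: this is where the CPU copy happens */
                splice(pipefd[0], NULL, dst, NULL, n, SPLICE_F_MOVE);
                done += n;
        }
        close(pipefd[0]);
        close(pipefd[1]);
        close(src);
        return dst; /* the memfd could then back a udmabuf, etc. */
}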
Now what you suggest is to add a new IOCTL to do this in a very specific manner just for the system DMA-buf heap. And as far as I can see that is in general a complete no-go.
I mean I understand why you do this. Instead of improving the existing functionality you're just hacking something together because it is simple for you.
It might be possible to implement that generically for DMA-buf heaps if the udmabuf allocation overhead can't be reduced, but that is then just the second step.
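(For context, a minimal sketch of the udmabuf allocation path being discussed, wrapping a sealed memfd into a dma-buf via /dev/udmabuf; the helper name is hypothetical and error handling is omitted.)

/* Hypothetical sketch: wrap a memfd into a DMA-buf via udmabuf.
 * This is the allocation path whose overhead is discussed above.
 * size must be page-aligned; error handling is omitted. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/udmabuf.h>

static int memfd_to_dmabuf(size_t size)
{
        struct udmabuf_create create = { 0 };
        int memfd, devfd, dmabuf_fd;

        memfd = memfd_create("udmabuf-backing", MFD_ALLOW_SEALING);
        ftruncate(memfd, size);
        /* udmabuf requires the shrink seal on the backing memfd */
        fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

        devfd = open("/dev/udmabuf", O_RDWR);
        create.memfd  = memfd;
        create.flags  = UDMABUF_FLAGS_CLOEXEC;
        create.offset = 0;
        create.size   = size;
        dmabuf_fd = ioctl(devfd, UDMABUF_CREATE, &create);

        close(devfd);
        close(memfd);
        return dmabuf_fd; /* a regular dma-buf fd on success */
}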
[wangtao] On dmabuf:
- DMABUF lacks Direct I/O support, hence our proposal.
- memfd supports Direct I/O but doesn’t fit our use case.
- udmabuf via memfd works but needs systemic changes (low ROI) and has slow allocation.
Your objections:
- Adding an IOCTL? This targets dmabuf specifically, and our fix is simple. sendfile doesn’t resolve it.
- Accessing sgtable pages in the exporter? As the dmabuf creator, the exporter fully controls sgtable/page data. We can restrict access to cases with no external users.
Could you clarify which point you oppose?
Both. I might be repeating myself, but I think what you do here is a no-go: it reimplements core system call functionality in a way which we certainly shouldn't allow.
T.J.'s testing shows that sendfile() seems to work at least in one direction. The other use case can certainly be optimized. So if you want to improve this, work on that instead.
Regards, Christian.