 
Hi Pavel,
On Thursday, 10.07.2025 at 10:24 +0200, Pavel Machek wrote:
> Hi!
> It seems that DMA-BUFs are always uncached on arm64... which is a problem.
> I'm trying to get useful camera support on Librem 5, and that includes recording videos (and taking photos).
> memcpy() from normal memory is about 2msec/1MB. Unfortunately, for DMA-BUFs it is 20msec/1MB, and that basically means I can't easily do 760p video recording. Plus, copying a full-resolution photo buffer takes more than 200msec!
> There's a possibility to do some processing on the GPU, and it's implemented here:
> https://gitlab.com/tui/tui/-/tree/master/icam?ref_type=heads
> but that hits the same problem in the end -- data is in a DMA-BUF, uncached, and takes way too long to copy out.
> And that's ... wrong. DMA ended seconds ago, a complete cache flush would be way cheaper than copying a single frame out, and I still have to deal with uncached frames.
> So I have two questions:
> - Is my analysis correct that, no matter how I get the frame from V4L and process it on the GPU, I'll have to copy it from uncached memory in the end?
If you need to touch the buffers using the CPU then you are either stuck with uncached memory or you need to implement bracketed access to do the necessary cache maintenance. Be aware that completely flushing the cache is not really an option, as that would impact other workloads, so you have to flush the cache by walking the virtual address space of the buffer, which may take a significant amount of CPU time.
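To illustrate what bracketed access looks like from userspace, here is a minimal sketch using DMA_BUF_IOCTL_SYNC from <linux/dma-buf.h>. The helper names dmabuf_sync() and dmabuf_copy_out() are made up for this example, and it assumes the buffer has already been mmap()ed. Note that the ioctl only triggers whatever cache maintenance the exporter implements; it does not change the mapping attributes, so it only helps if the exporter hands out a cached CPU mapping in the first place.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Issue DMA_BUF_IOCTL_SYNC on a dma-buf fd, retrying if a signal interrupts it.
 * Returns 0 on success, -1 with errno set on failure. */
static int dmabuf_sync(int dmabuf_fd, uint64_t flags)
{
    struct dma_buf_sync sync = { .flags = flags };
    int ret;

    do {
        ret = ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/* Hypothetical helper: copy len bytes out of an already mmap()ed dma-buf,
 * bracketing the CPU read with SYNC_START / SYNC_END. */
static int dmabuf_copy_out(int dmabuf_fd, const void *map, void *dst, size_t len)
{
    if (dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ) == -1)
        return -1;

    memcpy(dst, map, len);

    return dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ);
}
```

The START/END pair should enclose every CPU access to the mapping, and the cost of the maintenance scales with the buffer size, which is the CPU time mentioned above.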
However, if you are only going to use the buffer with the GPU I see no reason to touch it from the CPU side. Why would you even need to copy the content? After all dma-bufs are meant to enable zero-copy between DMA capable accelerators. You can simply import the V4L2 buffer into a GL texture using EGL_EXT_image_dma_buf_import. Using this path you don't need to bother with the cache at all, as the GPU will directly read the video buffers from RAM.
Regards, Lucas
> - Does anyone have patches / ideas / roadmap how to solve that? It makes GPU unusable for computing, and camera basically unusable for video.
> Best regards, Pavel