On 03.11.22 at 23:16, Nicolas Dufresne wrote:
[SNIP]
We already had numerous projects where we reported this practice as bugs to the GStreamer and FFMPEG projects because it won't work on x86 with dGPUs.
Links? Remember that I read every single bug and email around the GStreamer project. I maintain the older and newer V4L2 support in there. I also contributed a lot to the mechanism GStreamer has in place to reverse the allocation. In fact, it's implemented; the problem is that on generic Linux the receiver elements, like the GL element and the display sink, don't have any API they can rely on to allocate memory. Thus, they don't implement what we call the allocation offer in GStreamer terms. Very often though, on other modern OSes, or with APIs like VA, the memory offer is replaced by a context. So the allocation is done from a "context" which is neither an importer nor an exporter. This is mostly found on macOS and Windows.
Were there APIs suggested to actually make it manageable for userland to allocate from the GPU? Yes, this is what the Linux Device Allocator idea is for. Is that API ready? No.
Well, that stuff is absolutely ready: https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/heaps/system_... What do you think I'm talking about all the time?
DMA-buf has a lengthy section about CPU access to buffers and clearly documents how all of that is supposed to work: https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-buf.c#L11... This includes bracketing of CPU access with dma_buf_begin_cpu_access() and dma_buf_end_cpu_access(), as well as transaction management between devices and the CPU and even implicit synchronization.
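For readers following along from the userspace side, here is a minimal sketch of that bracketing. It assumes the buffer is accessed through an mmap() of the DMA-buf file descriptor and uses DMA_BUF_IOCTL_SYNC from <linux/dma-buf.h>, which is the userspace entry point that ends up in dma_buf_begin_cpu_access() and dma_buf_end_cpu_access() in the kernel. The helper names dmabuf_sync() and fill_dmabuf() are made up for this illustration and are not part of any existing library.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Issue one DMA_BUF_IOCTL_SYNC call, retrying if it was interrupted. */
static int dmabuf_sync(int dmabuf_fd, uint64_t flags)
{
    struct dma_buf_sync sync = { .flags = flags };
    int ret;

    do {
        ret = ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

    return ret;
}

/*
 * Hypothetical helper: fill an already mmap()ed DMA-buf from the CPU.
 * The write is bracketed by SYNC_START and SYNC_END so the exporter can
 * do whatever cache maintenance or migration it needs around it.
 */
static int fill_dmabuf(int dmabuf_fd, void *map, size_t len, int value)
{
    if (dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE) < 0)
        return -1; /* do not touch the mapping if the bracket failed */

    memset(map, value, len);

    return dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE);
}
```

The point of the sketch is only the ordering: start bracket, CPU access, end bracket. An application that skips the brackets may appear to work on cache-coherent hardware and fail elsewhere.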
This specification is then implemented by the different drivers including V4L2: https://elixir.bootlin.com/linux/latest/source/drivers/media/common/videobuf...
As well as the different DRM drivers: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/i915/gem/i915... https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/am...
This design was then used by us with various media players on different customer projects, including QNAP https://www.qnap.com/en/product/ts-877 as well as the newest Tesla https://www.amd.com/en/products/embedded-automotive-solutions
I won't go into the details here, but we are using exactly the approach I've outlined to let userspace control the DMA between the different devices in question. I'm one of the main designers of that, and our multimedia and Mesa teams have upstreamed quite a number of changes for this project.
I'm not that familiar with the different ARM-based solutions, because we are only recently getting results showing that this starts to work with AMD GPUs, but I'm pretty sure that the design should be able to handle that as well.
So we have clearly proven that this design works, even with special requirements which are way more complex than what we are discussing here. We had cases where we used GStreamer to feed DMA-buf handles into multiple devices with different format requirements and that seems to work fine.
-----
But enough of this rant. As I wrote to Lucas as well, this doesn't help us any further in the technical discussion.
The only technical argument I have is that if some userspace applications fail to use the provided UAPI while others use it correctly, then this is clearly not a good reason to change the UAPI, but rather an argument to change the applications.
If the application should be kept simple and device independent, then allocating the buffer from the device-independent DMA heaps would be enough as well, because that provider implements the necessary handling for dma_buf_begin_cpu_access() and dma_buf_end_cpu_access().
I'm a bit surprised that we are arguing about stuff like this, because we spent a lot of effort trying to document it. Daniel gave me the job to fix this documentation, but after reading through it multiple times now I can't seem to find where the design and the desired behavior are unclear.
What is clearly a bug in the kernel is that we don't reject things which won't work correctly, and this is what this patch here addresses. What we could talk about is backward compatibility for this patch, because it might look like it breaks things which previously used to work at least partially.
Regards, Christian.