On Tue, 7 May 2024 at 18:15, Bryan O'Donoghue <bryan.odonoghue@linaro.org> wrote:
> On 07/05/2024 16:09, Dmitry Baryshkov wrote:
> > Ah, I see. Then why do you require the DMA-able buffer at all? If you are providing data to the VPU or DRM, then you should be able to get the buffer from the data-consuming device.
>
> Because we don't necessarily know what the consuming device is, if any.
>
> Could be the VPU, could be Zoom/Hangouts via PipeWire, could for argument's sake be the GPU or a DSP.
>
> Also, if we introduce a dependency on another device to allocate the output buffers - say, always taking the output buffer from the GPU - then we've added another dependency which is more difficult to guarantee across different arches.

Yes. And it should be expected. It's the consumer who knows the restrictions on the buffer. As I wrote, Zoom/Hangouts should not require a DMA buffer at all. Applications should be able to allocate the buffer out of generic memory. GPUs might also have different requirements. Consider GPUs with VRAM. It might be beneficial to allocate a buffer out of VRAM rather than generic DMA memory.

On Tue, May 07, 2024 at 06:19:18PM +0300, Dmitry Baryshkov wrote:
> On Tue, 7 May 2024 at 18:15, Bryan O'Donoghue wrote:
> > On 07/05/2024 16:09, Dmitry Baryshkov wrote:
> > > Ah, I see. Then why do you require the DMA-able buffer at all? If you are providing data to the VPU or DRM, then you should be able to get the buffer from the data-consuming device.
> >
> > Because we don't necessarily know what the consuming device is, if any.
> >
> > Could be the VPU, could be Zoom/Hangouts via PipeWire, could for argument's sake be the GPU or a DSP.
> >
> > Also, if we introduce a dependency on another device to allocate the output buffers - say, always taking the output buffer from the GPU - then we've added another dependency which is more difficult to guarantee across different arches.
>
> Yes. And it should be expected. It's the consumer who knows the restrictions on the buffer. As I wrote, Zoom/Hangouts should not require a DMA buffer at all.

Why not? If you want to capture to a buffer that you then compose on the screen without copying data, dma-buf is the way to go. That's the Linux solution for buffer sharing.
> Applications should be able to allocate the buffer out of generic memory.

If applications really want to copy data and degrade performance, they are free to shoot themselves in the foot, of course. Applications (or compositors) need to support copying as a fallback in the worst case, but all components should at least aim for the zero-copy case.

> GPUs might also have different requirements. Consider GPUs with VRAM. It might be beneficial to allocate a buffer out of VRAM rather than generic DMA memory.

Absolutely. For that we need a centralized device memory allocator in userspace. An effort was started by James Jones in 2016, see [1]. It has unfortunately stalled. If I didn't have a camera framework to develop, I would try to tackle that issue :-)

[1] https://www.x.org/wiki/Events/XDC2016/Program/Unix_Device_Memory_Allocation....

On Tue, 7 May 2024 at 21:40, Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote:
> On Tue, May 07, 2024 at 06:19:18PM +0300, Dmitry Baryshkov wrote:
> > On Tue, 7 May 2024 at 18:15, Bryan O'Donoghue wrote:
> > > On 07/05/2024 16:09, Dmitry Baryshkov wrote:
> > > > Ah, I see. Then why do you require the DMA-able buffer at all? If you are providing data to the VPU or DRM, then you should be able to get the buffer from the data-consuming device.
> > >
> > > Because we don't necessarily know what the consuming device is, if any.
> > >
> > > Could be the VPU, could be Zoom/Hangouts via PipeWire, could for argument's sake be the GPU or a DSP.
> > >
> > > Also, if we introduce a dependency on another device to allocate the output buffers - say, always taking the output buffer from the GPU - then we've added another dependency which is more difficult to guarantee across different arches.
> >
> > Yes. And it should be expected. It's the consumer who knows the restrictions on the buffer. As I wrote, Zoom/Hangouts should not require a DMA buffer at all.
>
> Why not? If you want to capture to a buffer that you then compose on the screen without copying data, dma-buf is the way to go. That's the Linux solution for buffer sharing.

Yes. But it should be allocated by the DRM driver. As Sima wrote, there is no guarantee that a buffer allocated from dma-heaps is accessible to the GPU.

> > Applications should be able to allocate the buffer out of generic memory.
>
> If applications really want to copy data and degrade performance, they are free to shoot themselves in the foot, of course. Applications (or compositors) need to support copying as a fallback in the worst case, but all components should at least aim for the zero-copy case.

I'd say that they should aim for the optimal case. It might include either zero-copy access from another DMA master or simple software processing of some kind.
> > GPUs might also have different requirements. Consider GPUs with VRAM. It might be beneficial to allocate a buffer out of VRAM rather than generic DMA memory.
>
> Absolutely. For that we need a centralized device memory allocator in userspace. An effort was started by James Jones in 2016, see [1]. It has unfortunately stalled. If I didn't have a camera framework to develop, I would try to tackle that issue :-)

I'll review the talk. However, the fact that the effort has stalled most likely means that the 'one size fits all' approach didn't really fly. We have too many use cases.

> [1] https://www.x.org/wiki/Events/XDC2016/Program/Unix_Device_Memory_Allocation....

On Tue, May 07, 2024 at 10:59:42PM +0300, Dmitry Baryshkov wrote:
> On Tue, 7 May 2024 at 21:40, Laurent Pinchart wrote:
> > On Tue, May 07, 2024 at 06:19:18PM +0300, Dmitry Baryshkov wrote:
> > > On Tue, 7 May 2024 at 18:15, Bryan O'Donoghue wrote:
> > > > On 07/05/2024 16:09, Dmitry Baryshkov wrote:
> > > > > Ah, I see. Then why do you require the DMA-able buffer at all? If you are providing data to the VPU or DRM, then you should be able to get the buffer from the data-consuming device.
> > > >
> > > > Because we don't necessarily know what the consuming device is, if any.
> > > >
> > > > Could be the VPU, could be Zoom/Hangouts via PipeWire, could for argument's sake be the GPU or a DSP.
> > > >
> > > > Also, if we introduce a dependency on another device to allocate the output buffers - say, always taking the output buffer from the GPU - then we've added another dependency which is more difficult to guarantee across different arches.
> > >
> > > Yes. And it should be expected. It's the consumer who knows the restrictions on the buffer. As I wrote, Zoom/Hangouts should not require a DMA buffer at all.
> >
> > Why not? If you want to capture to a buffer that you then compose on the screen without copying data, dma-buf is the way to go. That's the Linux solution for buffer sharing.
>
> Yes. But it should be allocated by the DRM driver. As Sima wrote, there is no guarantee that a buffer allocated from dma-heaps is accessible to the GPU.

No disagreement there. From a libcamera point of view, we want to import buffers allocated externally. It's for use cases where applications can't easily provide dma-buf instances that we need to allocate them through the libcamera buffer allocator helper. That helper has to a) provide dma-buf fds, and b) make a best effort to allocate buffers that will have a decent chance of being usable by other devices. We're open to exploring solutions other than dma-heaps, although I wonder what the dma-heaps are for if nobody enables them :-)
> > > Applications should be able to allocate the buffer out of generic memory.
> >
> > If applications really want to copy data and degrade performance, they are free to shoot themselves in the foot, of course. Applications (or compositors) need to support copying as a fallback in the worst case, but all components should at least aim for the zero-copy case.
>
> I'd say that they should aim for the optimal case. It might include either zero-copy access from another DMA master or simple software processing of some kind.
>
> > > GPUs might also have different requirements. Consider GPUs with VRAM. It might be beneficial to allocate a buffer out of VRAM rather than generic DMA memory.
> >
> > Absolutely. For that we need a centralized device memory allocator in userspace. An effort was started by James Jones in 2016, see [1]. It has unfortunately stalled. If I didn't have a camera framework to develop, I would try to tackle that issue :-)
>
> I'll review the talk. However, the fact that the effort has stalled most likely means that the 'one size fits all' approach didn't really fly. We have too many use cases.

> > [1] https://www.x.org/wiki/Events/XDC2016/Program/Unix_Device_Memory_Allocation....

On Tue, May 07, 2024 at 10:59:42PM +0300, Dmitry Baryshkov wrote:
> On Tue, 7 May 2024 at 21:40, Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote:
> > On Tue, May 07, 2024 at 06:19:18PM +0300, Dmitry Baryshkov wrote:
> > > On Tue, 7 May 2024 at 18:15, Bryan O'Donoghue wrote:
> > > > On 07/05/2024 16:09, Dmitry Baryshkov wrote:
> > > > > Ah, I see. Then why do you require the DMA-able buffer at all? If you are providing data to the VPU or DRM, then you should be able to get the buffer from the data-consuming device.
> > > >
> > > > Because we don't necessarily know what the consuming device is, if any.
> > > >
> > > > Could be the VPU, could be Zoom/Hangouts via PipeWire, could for argument's sake be the GPU or a DSP.
> > > >
> > > > Also, if we introduce a dependency on another device to allocate the output buffers - say, always taking the output buffer from the GPU - then we've added another dependency which is more difficult to guarantee across different arches.
> > >
> > > Yes. And it should be expected. It's the consumer who knows the restrictions on the buffer. As I wrote, Zoom/Hangouts should not require a DMA buffer at all.
> >
> > Why not? If you want to capture to a buffer that you then compose on the screen without copying data, dma-buf is the way to go. That's the Linux solution for buffer sharing.
>
> Yes. But it should be allocated by the DRM driver. As Sima wrote, there is no guarantee that a buffer allocated from dma-heaps is accessible to the GPU.
>
> > > Applications should be able to allocate the buffer out of generic memory.
> >
> > If applications really want to copy data and degrade performance, they are free to shoot themselves in the foot, of course. Applications (or compositors) need to support copying as a fallback in the worst case, but all components should at least aim for the zero-copy case.
>
> I'd say that they should aim for the optimal case. It might include either zero-copy access from another DMA master or simple software processing of some kind.
>
> > > GPUs might also have different requirements. Consider GPUs with VRAM. It might be beneficial to allocate a buffer out of VRAM rather than generic DMA memory.
> >
> > Absolutely. For that we need a centralized device memory allocator in userspace. An effort was started by James Jones in 2016, see [1]. It has unfortunately stalled. If I didn't have a camera framework to develop, I would try to tackle that issue :-)
>
> I'll review the talk. However, the fact that the effort has stalled most likely means that the 'one size fits all' approach didn't really fly. We have too many use cases.

I think there are two reasons:

- It's a really hard problem with many aspects. Where you need to allocate the buffer is just one of a myriad of issues a common allocator needs to solve.

- Every Linux-based OS has its own solution for these, and the one that suffers most has an entirely different one from everyone else: Android uses binder services to allow apps to make these allocations, keep track of them, and make sure there's no abuse. And if there is, it can just nuke the app.

Cheers,
Sima

On Wed, May 08, 2024 at 10:39:58AM +0200, Daniel Vetter wrote:
> On Tue, May 07, 2024 at 10:59:42PM +0300, Dmitry Baryshkov wrote:
> > On Tue, 7 May 2024 at 21:40, Laurent Pinchart wrote:
> > > On Tue, May 07, 2024 at 06:19:18PM +0300, Dmitry Baryshkov wrote:
> > > > On Tue, 7 May 2024 at 18:15, Bryan O'Donoghue wrote:
> > > > > On 07/05/2024 16:09, Dmitry Baryshkov wrote:
> > > > > > Ah, I see. Then why do you require the DMA-able buffer at all? If you are providing data to the VPU or DRM, then you should be able to get the buffer from the data-consuming device.
> > > > >
> > > > > Because we don't necessarily know what the consuming device is, if any.
> > > > >
> > > > > Could be the VPU, could be Zoom/Hangouts via PipeWire, could for argument's sake be the GPU or a DSP.
> > > > >
> > > > > Also, if we introduce a dependency on another device to allocate the output buffers - say, always taking the output buffer from the GPU - then we've added another dependency which is more difficult to guarantee across different arches.
> > > >
> > > > Yes. And it should be expected. It's the consumer who knows the restrictions on the buffer. As I wrote, Zoom/Hangouts should not require a DMA buffer at all.
> > >
> > > Why not? If you want to capture to a buffer that you then compose on the screen without copying data, dma-buf is the way to go. That's the Linux solution for buffer sharing.
> >
> > Yes. But it should be allocated by the DRM driver. As Sima wrote, there is no guarantee that a buffer allocated from dma-heaps is accessible to the GPU.
> >
> > > > Applications should be able to allocate the buffer out of generic memory.
> > >
> > > If applications really want to copy data and degrade performance, they are free to shoot themselves in the foot, of course. Applications (or compositors) need to support copying as a fallback in the worst case, but all components should at least aim for the zero-copy case.
> >
> > I'd say that they should aim for the optimal case. It might include either zero-copy access from another DMA master or simple software processing of some kind.
> >
> > > > GPUs might also have different requirements. Consider GPUs with VRAM. It might be beneficial to allocate a buffer out of VRAM rather than generic DMA memory.
> > >
> > > Absolutely. For that we need a centralized device memory allocator in userspace. An effort was started by James Jones in 2016, see [1]. It has unfortunately stalled. If I didn't have a camera framework to develop, I would try to tackle that issue :-)
> >
> > I'll review the talk. However, the fact that the effort has stalled most likely means that the 'one size fits all' approach didn't really fly. We have too many use cases.
>
> I think there are two reasons:
>
> - It's a really hard problem with many aspects. Where you need to allocate the buffer is just one of a myriad of issues a common allocator needs to solve.

The other large problem is picking an optimal pixel format. I wonder if that could be decoupled from the allocation. That could help in moving forward.

> - Every Linux-based OS has its own solution for these, and the one that suffers most has an entirely different one from everyone else: Android uses binder services to allow apps to make these allocations, keep track of them, and make sure there's no abuse. And if there is, it can just nuke the app.
linaro-mm-sig@lists.linaro.org