On Mon, 9 Jun 2014 14:06:33 +0300 Pekka Paalanen pekka.paalanen@collabora.co.uk wrote:
On Mon, 9 Jun 2014 11:00:04 +0200 Benjamin Gaignard benjamin.gaignard@linaro.org wrote:
On my hardware the patches you have (+ this one on gstwaylandsink https://bugzilla.gnome.org/show_bug.cgi?id=711155) allow me to do zero copy between the hardware video decoder and the display engine. I haven't implemented the GPU path yet, because my hardware is able to compose a few video overlay planes and that was enough for my tests.
Right.
What I have been thinking is that the compositor must be able to use the new wl_buffer, and we need to guarantee that beforehand. If the compositor fails to use a wl_buffer when the client has already attached it to a wl_surface and it is time to repaint, it is too late and the user will see a glitch. Recovering from that requires asking the client to provide a new wl_buffer of a different kind, which might take time. Or a very rude compositor would just send a protocol error, and then we'd get bug reports like "the video player just disappears when I try to play (and ps. I have an old kernel that doesn't support importing whatever)".
I believe we must allow the compositor to test the wl_buffer before the client starts using it. That is the reason for the round-trippy design of the proposal below.
Because we do not even try to communicate all the possible restrictions to the client for it to match, we can leave the validation strictly as a kernel-internal issue. Buffer migration inside the kernel might even magically solve some of the mismatches. It does leave open the question of what the client can do if it doesn't fulfil all the requirements for the compositor to be able to import the dmabufs. But which restrictions other than color format we can or should communicate, and where user space gets them in the first place... *hand-waving*
But this also leaves it up to the compositor to choose how and where it wants to import the dmabufs. If a compositor usually composites with GL, it will try to import with EGL on whatever GPU it is using. If the compositor uses a software renderer, it can try to mmap the dmabufs (or try that as a fallback if the EGL import fails). If the compositor is absolutely sure it can rely on the hardware display engine to composite these buffers (note, buffers! You don't know which surfaces these buffers will be attached to), it can import directly with DRM as FB objects, or with V4L, or whatever. A compositor with the fullscreen shell extension but without the sub-surface extension comes to mind.
In summary, the compositor must be able to use the wl_buffer in its default/fallback compositing path. If the wl_buffer is also suitable for direct scanout, e.g. on an overlay, that is "just" a bonus.
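To illustrate what I mean by the compositor being able to use the buffers, here is a rough, untested sketch of such a check for a single-plane dmabuf, assuming EGL_EXT_image_dma_buf_import for the GL path and a plain mmap for the software renderer fallback. The dmabuf_attributes struct and the function name are made up for this example, and the eglCreateImageKHR/eglDestroyImageKHR pointers are assumed to have been resolved elsewhere with eglGetProcAddress():

/* Untested sketch: can the compositor use this single-plane dmabuf in its
 * default/fallback path? The struct and function names are made up. */
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <sys/mman.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

extern PFNEGLCREATEIMAGEKHRPROC eglCreateImageKHR;	/* via eglGetProcAddress() */
extern PFNEGLDESTROYIMAGEKHRPROC eglDestroyImageKHR;

struct dmabuf_attributes {
	int fd;
	uint32_t width, height;
	uint32_t offset, stride;
	uint32_t drm_fourcc;	/* DRM_FORMAT_* code */
	size_t size;		/* total size, for the mmap fallback */
};

static bool
compositor_can_import(EGLDisplay dpy, const struct dmabuf_attributes *a)
{
	const EGLint attribs[] = {
		EGL_WIDTH, (EGLint)a->width,
		EGL_HEIGHT, (EGLint)a->height,
		EGL_LINUX_DRM_FOURCC_EXT, (EGLint)a->drm_fourcc,
		EGL_DMA_BUF_PLANE0_FD_EXT, a->fd,
		EGL_DMA_BUF_PLANE0_OFFSET_EXT, (EGLint)a->offset,
		EGL_DMA_BUF_PLANE0_PITCH_EXT, (EGLint)a->stride,
		EGL_NONE
	};

	/* Preferred path: import as an EGLImage for GL compositing. */
	EGLImageKHR image = eglCreateImageKHR(dpy, EGL_NO_CONTEXT,
					      EGL_LINUX_DMA_BUF_EXT,
					      NULL, attribs);
	if (image != EGL_NO_IMAGE_KHR) {
		eglDestroyImageKHR(dpy, image);
		return true;
	}

	/* Software renderer fallback: see if the dmabuf can be mmapped. */
	void *p = mmap(NULL, a->size, PROT_READ, MAP_SHARED, a->fd, 0);
	if (p != MAP_FAILED) {
		munmap(p, a->size);
		return true;
	}

	return false;
}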
With the round-trippy design, I am assuming that you can export-pass-import a set of dmabufs once, and then reuse them as long as you don't need to e.g. resize them. Is this a reasonable assumption? Are there any, for instance, hardware video decoders that just insist on exporting a new buffer for every frame?
I am tracking the proposal in http://cgit.collabora.com/git/user/pq/weston.git/log/?h=linux_dmabuf
So far I added back the event to advertise the supported drm_fourcc formats, since that is probably quite crucial.
Yeah, about that...
https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_im... provides no way for the compositor to query which formats the EGL implementation might support for importing dmabufs. I'm not sure GBM has that yet either.
So there is no way a compositor could advertise the set of supported formats, since it has no way of knowing, has it?
Any suggested solutions for this? Or would probing (export dmabuf, send to compositor, wait for compositor to ack/reject) for suitable formats be enough?
Thanks, pq
2014-06-06 17:30 GMT+02:00 Pekka Paalanen pekka.paalanen@collabora.co.uk:
Hi,
the previous attempt at introducing a generic wl_dmabuf protocol to Wayland didn't end too well: http://lists.freedesktop.org/archives/wayland-devel/2013-December/012390.htm... http://lists.freedesktop.org/archives/wayland-devel/2013-December/012455.htm... http://lists.freedesktop.org/archives/wayland-devel/2013-December/012566.htm... http://lists.freedesktop.org/archives/wayland-devel/2014-January/012727.html
We are again interested in this, and I did a quick Friday evening draft to open the discussion again. The base of the draft was a quick look at https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_im...
The basic idea is that a client has one or more dmabufs it wants to share with the compositor, making up a single logical buffer (a single image). The client chooses where and how to export those dmabufs. The dmabuf fds and metadata are sent to the compositor, which assembles them and tries to import them. If the import succeeds, a wl_buffer object is created. If the import fails, the client is notified that the compositor can't use these and that it would be better to try something else.
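Roughly, the client side of that flow might look something like the hand-wavy sketch below. Every example_dmabuf_* name and the created/failed events are placeholders, not the names used in the draft; their bindings would come from wayland-scanner generated headers.

/* Hand-wavy client-side sketch; example_dmabuf_* names are placeholders. */
#include <wayland-client.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

struct pending_buffer {
	struct wl_buffer *buffer;	/* set by the created event */
	bool failed;			/* set by the failed event */
};

static void
created(void *data, struct example_dmabuf_params *params,
	struct wl_buffer *buffer)
{
	/* The compositor imported the dmabufs; this wl_buffer can now be
	 * attached to a wl_surface and reused like any other buffer. */
	struct pending_buffer *pb = data;
	pb->buffer = buffer;
}

static void
failed(void *data, struct example_dmabuf_params *params)
{
	/* The compositor cannot use these dmabufs; try something else
	 * (another allocator, wl_shm, ...) instead of being disconnected. */
	struct pending_buffer *pb = data;
	pb->failed = true;
}

static const struct example_dmabuf_params_listener params_listener = {
	created,
	failed,
};

/* One logical image may be made up of several dmabuf fds (planes). */
static struct wl_buffer *
share_dmabufs(struct wl_display *display, struct example_dmabuf *dmabuf,
	      int fds[], uint32_t offsets[], uint32_t strides[], int planes,
	      uint32_t width, uint32_t height, uint32_t drm_fourcc)
{
	struct pending_buffer pb = { NULL, false };
	struct example_dmabuf_params *params;
	int i;

	params = example_dmabuf_create_params(dmabuf);
	example_dmabuf_params_add_listener(params, &params_listener, &pb);

	for (i = 0; i < planes; i++)
		example_dmabuf_params_add(params, fds[i], i,
					  offsets[i], strides[i]);

	example_dmabuf_params_create(params, width, height, drm_fourcc);

	/* The roundtrip: wait until the compositor has either created the
	 * wl_buffer or signalled that it cannot use these dmabufs. */
	while (pb.buffer == NULL && !pb.failed)
		if (wl_display_roundtrip(display) < 0)
			break;

	example_dmabuf_params_destroy(params);
	return pb.buffer;	/* NULL means "try something else" */
}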
I assume that if the "import" succeeds, the compositor is able to use the buffers, e.g. at least turn them into a GL texture or mmap them, if not also scan them out or put them on a hw overlay. This could be any kind of checking to verify that the buffers are usable. Finding out that it won't work after the client is already using the wl_buffer must not happen, as we have no way to recover from it: the client will get disconnected. So the point is knowing in advance that the buffers are usable on both sides, preferably before the client has filled them with data, but I suppose in the usual case the buffer is already filled.
As creating a dmabuf-based wl_buffer requires a roundtrip in this scheme, I assume it only needs to be done rarely, and the same buffer can be re-used many times with proper synchronization.
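By "proper synchronization" I mean just the ordinary wl_buffer.release mechanism; a minimal sketch of the reuse side, with made-up names, assuming the listener is added right after the import:

/* Sketch of reusing one imported wl_buffer for many frames: the client only
 * writes into the dmabufs while the compositor has released the buffer. */
#include <wayland-client.h>
#include <stdbool.h>
#include <stdint.h>

struct reusable_buffer {
	struct wl_buffer *buffer;	/* created once via the dmabuf import */
	bool busy;			/* true while the compositor holds it */
};

static void
buffer_release(void *data, struct wl_buffer *buffer)
{
	struct reusable_buffer *b = data;
	b->busy = false;		/* safe to write the next frame into it */
}

static const struct wl_buffer_listener buffer_listener = {
	buffer_release,
};

static void
submit_frame(struct wl_surface *surface, struct reusable_buffer *b)
{
	/* buffer_listener is assumed to have been registered with
	 * wl_buffer_add_listener() right after the wl_buffer was created. */
	wl_surface_attach(surface, b->buffer, 0, 0);
	wl_surface_damage(surface, 0, 0, INT32_MAX, INT32_MAX);
	wl_surface_commit(surface);
	b->busy = true;
}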
The crude draft is below. Some questions:
- Does this sound sane to you?
- What other metadata would we need? Thierry had some issues with tiling formats I think.
- This "check if the dmabuf is really usable" is needed, right? We can't just assume that any dmabuf will work?
- Do we need anything for fences here, or is the dmabuf fd enough?
- Does someone already have something similar running?
On Fri, Jun 13, 2014 at 7:04 AM, Pekka Paalanen ppaalanen@gmail.com wrote:
So there is no way a compositor could advertise the set of supported formats, since it has no way of knowing, has it?
It wouldn't be too hard, I don't think, to add something at gbm level.
Any suggested solutions for this? Or would probing (export dmabuf, send to compositor, wait for compositor to ack/reject) for suitable formats be enough?
Well, the compositor could, I suppose, build up a supported formats list by getting a dummy buffer somehow (either from the display or the gpu device, it shouldn't really matter), and then iterate through importing that one buffer as different formats. Seems easier than doing it on the client side with a round trip to the compositor each time. And, well, I'm pretty out of date on wl proto stuff, but it seems pretty reasonable that the compositor could tell the client what formats it supports..
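Something along these lines, maybe (completely untested sketch, single-plane formats only; probe_formats() and the candidate list are just for illustration, and the eglCreateImageKHR/eglDestroyImageKHR pointers are assumed to be resolved via eglGetProcAddress()):

/* Untested sketch of a start-up probe: take one dummy dmabuf fd and try
 * importing it under a list of candidate single-plane drm_fourcc formats,
 * recording which ones the EGL implementation accepts. */
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <drm_fourcc.h>
#include <stdint.h>

extern PFNEGLCREATEIMAGEKHRPROC eglCreateImageKHR;	/* via eglGetProcAddress() */
extern PFNEGLDESTROYIMAGEKHRPROC eglDestroyImageKHR;

static const uint32_t candidates[] = {
	DRM_FORMAT_ARGB8888,
	DRM_FORMAT_XRGB8888,
	DRM_FORMAT_RGB565,
	DRM_FORMAT_YUYV,
	/* ... whatever else the compositor cares about ... */
};

static int
probe_formats(EGLDisplay dpy, int dummy_fd, uint32_t width, uint32_t height,
	      uint32_t stride, uint32_t *supported, int max)
{
	int n = 0;
	unsigned i;

	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		const EGLint attribs[] = {
			EGL_WIDTH, (EGLint)width,
			EGL_HEIGHT, (EGLint)height,
			EGL_LINUX_DRM_FOURCC_EXT, (EGLint)candidates[i],
			EGL_DMA_BUF_PLANE0_FD_EXT, dummy_fd,
			EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
			EGL_DMA_BUF_PLANE0_PITCH_EXT, (EGLint)stride,
			EGL_NONE
		};
		EGLImageKHR img;

		img = eglCreateImageKHR(dpy, EGL_NO_CONTEXT,
					EGL_LINUX_DMA_BUF_EXT, NULL, attribs);
		if (img == EGL_NO_IMAGE_KHR)
			continue;

		eglDestroyImageKHR(dpy, img);
		if (n < max)
			supported[n++] = candidates[i];
	}

	/* The result could then be advertised to clients with a
	 * wl_shm-style format event. */
	return n;
}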
BR, -R
On Fri, 13 Jun 2014 07:47:54 -0400 Rob Clark robdclark@gmail.com wrote:
Well, the compositor could, I suppose, build up a supported formats list by getting a dummy buffer somehow (either from the display or the gpu device, it shouldn't really matter), and then iterate through importing that one buffer as different formats. Seems easier than doing it on the client side with a round trip to the compositor each time. And, well, I'm pretty out of date on wl proto stuff, but it seems pretty reasonable that the compositor could tell the client what formats it supports..
Oh right, the compositor could probe it on its own during start-up. That sounds good, at least as a temporary measure, or even permanently if the EGL extension never exposes the formats directly. Defining the EGL extension so that it has to support the formats GBM advertises feels a bit awkward.
The basic design principle in Wayland protocol is that you tell the client all the restrictions beforehand, and if the client still breaks them, it gets kicked out, no questions asked. The answer to all failures is to just kill the client, which makes failure handling pretty easy protocol-wise. Both wl_shm and wl_drm pixel format support work like that, AFAIK.
Doing the same for dmabuf is trickier, since there is a lot more to it than just the pixel format, which is why I thought we might need an explicit protocol for graceful failure ahead of time.
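On the compositor side, "graceful failure ahead of time" would mean answering the import attempt with a created or failed event instead of wl_resource_post_error(). A hand-wavy sketch with made-up names (the example_dmabuf_params_send_* functions, try_import_dmabufs(), buffer_impl and the structs are all placeholders):

/* Hand-wavy server-side sketch: the request handler answers with a
 * created or failed event instead of a fatal protocol error. */
#include <wayland-server.h>

static void
params_create(struct wl_client *client, struct wl_resource *params_resource,
	      int32_t width, int32_t height, uint32_t drm_fourcc)
{
	struct params *params = wl_resource_get_user_data(params_resource);
	struct compositor_buffer *buf;
	struct wl_resource *buffer;

	buf = try_import_dmabufs(params, width, height, drm_fourcc);
	if (!buf) {
		/* Graceful failure: tell the client to try something else
		 * (a different allocator, wl_shm, ...) instead of killing
		 * the connection. */
		example_dmabuf_params_send_failed(params_resource);
		return;
	}

	/* Success: create the wl_buffer resource and hand it to the client. */
	buffer = wl_resource_create(client, &wl_buffer_interface, 1, 0);
	if (!buffer) {
		wl_client_post_no_memory(client);
		return;
	}
	wl_resource_set_implementation(buffer, &buffer_impl, buf, NULL);
	example_dmabuf_params_send_created(params_resource, buffer);
}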
Thanks, pq