Hello everyone, This patchset adds support for DMABUF [2] importing to V4L2 stack. The support for DMABUF exporting was moved to separate patchset due to dependency on patches for DMA mapping redesign by Marek Szyprowski [4].
v4: - rebased on mainline 3.4-rc2 - included missing importing support for s5p-fimc and s5p-tv - added patch for changing map/unmap for importers - fixes to Documentation part - coding style fixes - pairing {map/unmap}_dmabuf in vb2-core - fixing variable types and semantic of arguments in videobufb2-dma-contig.c
v3: - rebased on mainline 3.4-rc1 - split 'code refactor' patch to multiple smaller patches - squashed fixes to Sumit's patches - patchset is no longer dependant on 'DMA mapping redesign' - separated path for handling IO and non-IO mappings - add documentation for DMABUF importing to V4L - removed all DMABUF exporter related code - removed usage of dma_get_pages extension
v2: - extended VIDIOC_EXPBUF argument from integer memoffset to struct v4l2_exportbuffer - added patch that breaks DMABUF spec on (un)map_atachment callcacks but allows to work with existing implementation of DMABUF prime in DRM - all dma-contig code refactoring patches were squashed - bugfixes
v1: List of changes since [1]. - support for DMA api extension dma_get_pages, the function is used to retrieve pages used to create DMA mapping. - small fixes/code cleanup to videobuf2 - added prepare and finish callbacks to vb2 allocators, it is used keep consistency between dma-cpu acess to the memory (by Marek Szyprowski) - support for exporting of DMABUF buffer in V4L2 and Videobuf2, originated from [3]. - support for dma-buf exporting in vb2-dma-contig allocator - support for DMABUF for s5p-tv and s5p-fimc (capture interface) drivers, originated from [3] - changed handling for userptr buffers (by Marek Szyprowski, Andrzej Pietrasiewicz) - let mmap method to use dma_mmap_writecombine call (by Marek Szyprowski)
[1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/42966... [2] https://lkml.org/lkml/2011/12/26/29 [3] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/36354... [4] http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819
Andrzej Pietrasiewicz (1): v4l: vb2-dma-contig: add support for scatterlist in userptr mode
Laurent Pinchart (2): v4l: vb2-dma-contig: Shorten vb2_dma_contig prefix to vb2_dc v4l: vb2-dma-contig: Reorder functions
Marek Szyprowski (2): v4l: vb2: add prepare/finish callbacks to allocators v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator
Sumit Semwal (4): v4l: Add DMABUF as a memory type v4l: vb2: add support for shared buffer (dma_buf) v4l: vb: remove warnings about MEMORY_DMABUF v4l: vb2-dma-contig: add support for dma_buf importing
Tomasz Stanislawski (5): Documentation: media: description of DMABUF importing in V4L2 v4l: vb2-dma-contig: Remove unneeded allocation context structure v4l: vb2-dma-contig: change map/unmap behaviour for importers v4l: s5p-tv: mixer: support for dmabuf importing v4l: fimc: support for dmabuf importing
Documentation/DocBook/media/v4l/compat.xml | 4 + Documentation/DocBook/media/v4l/io.xml | 179 +++++++ .../DocBook/media/v4l/vidioc-create-bufs.xml | 1 + Documentation/DocBook/media/v4l/vidioc-qbuf.xml | 15 + Documentation/DocBook/media/v4l/vidioc-reqbufs.xml | 47 +- drivers/media/video/Kconfig | 1 + drivers/media/video/s5p-fimc/fimc-capture.c | 2 +- drivers/media/video/s5p-tv/Kconfig | 1 + drivers/media/video/s5p-tv/mixer_video.c | 2 +- drivers/media/video/videobuf-core.c | 4 + drivers/media/video/videobuf2-core.c | 207 +++++++- drivers/media/video/videobuf2-dma-contig.c | 553 +++++++++++++++++--- include/linux/videodev2.h | 7 + include/media/videobuf2-core.h | 34 ++ 14 files changed, 956 insertions(+), 101 deletions(-)
From: Sumit Semwal sumit.semwal@ti.com
Adds DMABUF memory type to v4l framework. Also adds the related file descriptor in v4l2_plane and v4l2_buffer.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [original work in the PoC for buffer sharing] Signed-off-by: Sumit Semwal sumit.semwal@ti.com Signed-off-by: Sumit Semwal sumit.semwal@linaro.org --- include/linux/videodev2.h | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/include/linux/videodev2.h b/include/linux/videodev2.h index c9c9a46..d884d4a 100644 --- a/include/linux/videodev2.h +++ b/include/linux/videodev2.h @@ -185,6 +185,7 @@ enum v4l2_memory { V4L2_MEMORY_MMAP = 1, V4L2_MEMORY_USERPTR = 2, V4L2_MEMORY_OVERLAY = 3, + V4L2_MEMORY_DMABUF = 4, };
/* see also http://vektor.theorem.ca/graphics/ycbcr/ */ @@ -588,6 +589,8 @@ struct v4l2_requestbuffers { * should be passed to mmap() called on the video node) * @userptr: when memory is V4L2_MEMORY_USERPTR, a userspace pointer * pointing to this plane + * @fd: when memory is V4L2_MEMORY_DMABUF, a userspace file + * descriptor associated with this plane * @data_offset: offset in the plane to the start of data; usually 0, * unless there is a header in front of the data * @@ -602,6 +605,7 @@ struct v4l2_plane { union { __u32 mem_offset; unsigned long userptr; + int fd; } m; __u32 data_offset; __u32 reserved[11]; @@ -624,6 +628,8 @@ struct v4l2_plane { * (or a "cookie" that should be passed to mmap() as offset) * @userptr: for non-multiplanar buffers with memory == V4L2_MEMORY_USERPTR; * a userspace pointer pointing to this buffer + * @fd: for non-multiplanar buffers with memory == V4L2_MEMORY_DMABUF; + * a userspace file descriptor associated with this buffer * @planes: for multiplanar buffers; userspace pointer to the array of plane * info structs for this buffer * @length: size in bytes of the buffer (NOT its payload) for single-plane @@ -650,6 +656,7 @@ struct v4l2_buffer { __u32 offset; unsigned long userptr; struct v4l2_plane *planes; + int fd; } m; __u32 length; __u32 input;
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:43 Tomasz Stanislawski wrote:
From: Sumit Semwal sumit.semwal@ti.com
Adds DMABUF memory type to v4l framework. Also adds the related file descriptor in v4l2_plane and v4l2_buffer.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [original work in the PoC for buffer sharing] Signed-off-by: Sumit Semwal sumit.semwal@ti.com Signed-off-by: Sumit Semwal sumit.semwal@linaro.org
Acked-by: Laurent Pinchart laurent.pinchart@ideasonboard.com
This patch adds description and usage examples for importing DMABUF file descriptor in V4L2.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com --- Documentation/DocBook/media/v4l/compat.xml | 4 + Documentation/DocBook/media/v4l/io.xml | 179 ++++++++++++++++++++ .../DocBook/media/v4l/vidioc-create-bufs.xml | 1 + Documentation/DocBook/media/v4l/vidioc-qbuf.xml | 15 ++ Documentation/DocBook/media/v4l/vidioc-reqbufs.xml | 47 +++--- 5 files changed, 224 insertions(+), 22 deletions(-)
diff --git a/Documentation/DocBook/media/v4l/compat.xml b/Documentation/DocBook/media/v4l/compat.xml index bce97c5..2a2083d 100644 --- a/Documentation/DocBook/media/v4l/compat.xml +++ b/Documentation/DocBook/media/v4l/compat.xml @@ -2523,6 +2523,10 @@ ioctls.</para> <listitem> <para>Selection API. <xref linkend="selection-api" /></para> </listitem> + <listitem> + <para>Importing DMABUF file descriptors as a new IO method described + in <xref linkend="dmabuf" />.</para> + </listitem> </itemizedlist> </section>
diff --git a/Documentation/DocBook/media/v4l/io.xml b/Documentation/DocBook/media/v4l/io.xml index b815929..dc5979d 100644 --- a/Documentation/DocBook/media/v4l/io.xml +++ b/Documentation/DocBook/media/v4l/io.xml @@ -472,6 +472,162 @@ rest should be evident.</para> </footnote></para> </section>
+ <section id="dmabuf"> + <title>Streaming I/O (DMA buffer importing)</title> + + <note> + <title>Experimental</title> + <para>This is an <link linkend="experimental"> experimental </link> + interface and may change in the future.</para> + </note> + +<para>The DMABUF framework provides a generic mean for sharing buffers between + multiple devices. Device drivers that support DMABUF can export a DMA buffer +to userspace as a file descriptor (known as the exporter role), import a DMA +buffer from userspace using a file descriptor previously exported for a +different or the same device (known as the importer role), or both. This +section describes the DMABUF importer role API in V4L2.</para> + +<para>Input and output devices support the streaming I/O method when the +<constant>V4L2_CAP_STREAMING</constant> flag in the +<structfield>capabilities</structfield> field of &v4l2-capability; returned by +the &VIDIOC-QUERYCAP; ioctl is set. Whether importing DMA buffers through +DMABUF file descriptors is supported is determined by calling the +&VIDIOC-REQBUFS; ioctl with the memory type set to +<constant>V4L2_MEMORY_DMABUF</constant>.</para> + + <para>This I/O method is dedicated for sharing DMA buffers between V4L and +other APIs. Buffers (planes) are allocated by a driver on behalf of the +application, and exported to the application as file descriptors using an API +specific to the allocator driver. Only those file descriptor are exchanged, +these files and meta-information are passed in &v4l2-buffer; (or in +&v4l2-plane; in the multi-planar API case). The driver must be switched into +DMABUF I/O mode by calling the &VIDIOC-REQBUFS; with the desired buffer type. +No buffers (planes) are allocated beforehand, consequently they are not indexed +and cannot be queried like mapped buffers with the +<constant>VIDIOC_QUERYBUF</constant> ioctl.</para> + + <example> + <title>Initiating streaming I/O with DMABUF file descriptors</title> + + <programlisting> +&v4l2-requestbuffers; reqbuf; + +memset (&reqbuf, 0, sizeof (reqbuf)); +reqbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; +reqbuf.memory = V4L2_MEMORY_DMABUF; + +if (ioctl (fd, &VIDIOC-REQBUFS;, &reqbuf) == -1) { + if (errno == EINVAL) + printf ("Video capturing or DMABUF streaming is not supported\n"); + else + perror ("VIDIOC_REQBUFS"); + + exit (EXIT_FAILURE); +} + </programlisting> + </example> + + <para>Buffer (plane) file is passed on the fly with the &VIDIOC-QBUF; +ioctl. In case of multiplanar buffers, every plane can be associated with a +different DMABUF descriptor.Although buffers are commonly cycled, applications +can pass different DMABUF descriptor at each <constant>VIDIOC_QBUF</constant> +call.</para> + + <example> + <title>Queueing DMABUF using single plane API</title> + + <programlisting> +int buffer_queue(int v4lfd, int index, int dmafd) +{ + &v4l2-buffer; buf; + + memset(&buf, 0, sizeof buf); + buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; + buf.memory = V4L2_MEMORY_DMABUF; + buf.index = index; + buf.m.fd = dmafd; + + if (ioctl (v4lfd, &VIDIOC-QBUF;, &buf) == -1) { + perror ("VIDIOC_QBUF"); + return -1; + } + + return 0; +} + </programlisting> + </example> + + <example> + <title>Queueing DMABUF using multi plane API</title> + + <programlisting> +int buffer_queue_mp(int v4lfd, int index, int dmafd[], int n_planes) +{ + &v4l2-buffer; buf; + &v4l2-plane; planes[VIDEO_MAX_PLANES]; + int i; + + memset(&buf, 0, sizeof buf); + buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; + buf.memory = V4L2_MEMORY_DMABUF; + buf.index = index; + buf.m.planes = planes; + buf.length = n_planes; + + memset(&planes, 0, sizeof planes); + + for (i = 0; i < n_planes; ++i) + buf.m.planes[i].m.fd = dmafd[i]; + + if (ioctl (v4lfd, &VIDIOC-QBUF;, &buf) == -1) { + perror ("VIDIOC_QBUF"); + return -1; + } + + return 0; +} + </programlisting> + </example> + + <para>Filled or displayed buffers are dequeued with the +&VIDIOC-DQBUF; ioctl. The driver can unpin the buffer at any +time between the completion of the DMA and this ioctl. The memory is +also unpinned when &VIDIOC-STREAMOFF; is called, &VIDIOC-REQBUFS;, or +when the device is closed.</para> + + <para>For capturing applications it is customary to enqueue a +number of empty buffers, to start capturing and enter the read loop. +Here the application waits until a filled buffer can be dequeued, and +re-enqueues the buffer when the data is no longer needed. Output +applications fill and enqueue buffers, when enough buffers are stacked +up output is started. In the write loop, when the application +runs out of free buffers it must wait until an empty buffer can be +dequeued and reused. Two methods exist to suspend execution of the +application until one or more buffers can be dequeued. By default +<constant>VIDIOC_DQBUF</constant> blocks when no buffer is in the +outgoing queue. When the <constant>O_NONBLOCK</constant> flag was +given to the &func-open; function, <constant>VIDIOC_DQBUF</constant> +returns immediately with an &EAGAIN; when no buffer is available. The +&func-select; or &func-poll; function are always available.</para> + + <para>To start and stop capturing or output applications call the +&VIDIOC-STREAMON; and &VIDIOC-STREAMOFF; ioctls. Note that +<constant>VIDIOC_STREAMOFF</constant> removes all buffers from both queues and +unlocks/unpins all buffers as a side effect. Since there is no notion of doing +anything "now" on a multitasking system, if an application needs to synchronize +with another event it should examine the &v4l2-buffer; +<structfield>timestamp</structfield> of captured buffers, or set the field +before enqueuing buffers for output.</para> + + <para>Drivers implementing DMABUF importing I/O must support the +<constant>VIDIOC_REQBUFS</constant>, <constant>VIDIOC_QBUF</constant>, +<constant>VIDIOC_DQBUF</constant>, <constant>VIDIOC_STREAMON</constant> and +<constant>VIDIOC_STREAMOFF</constant> ioctl, the <function>select()</function> +and <function>poll()</function> function.</para> + + </section> + <section id="async"> <title>Asynchronous I/O</title>
@@ -671,6 +827,14 @@ memory, set by the application. See <xref linkend="userp" /> for details. <structname>v4l2_buffer</structname> structure.</entry> </row> <row> + <entry></entry> + <entry>int</entry> + <entry><structfield>fd</structfield></entry> + <entry>For the single-plane API and when +<structfield>memory</structfield> is <constant>V4L2_MEMORY_DMABUF</constant> this +is the file descriptor associated with a DMABUF buffer.</entry> + </row> + <row> <entry>__u32</entry> <entry><structfield>length</structfield></entry> <entry></entry> @@ -746,6 +910,15 @@ should set this to 0.</entry> </entry> </row> <row> + <entry></entry> + <entry>int</entry> + <entry><structfield>fd</structfield></entry> + <entry>When the memory type in the containing &v4l2-buffer; is + <constant>V4L2_MEMORY_DMABUF</constant>, this is a file + descriptor associated with a DMABUF buffer, similar to the + <structfield>fd</structfield> field in &v4l2-buffer;.</entry> + </row> + <row> <entry>__u32</entry> <entry><structfield>data_offset</structfield></entry> <entry></entry> @@ -980,6 +1153,12 @@ pointer</link> I/O.</entry> <entry>3</entry> <entry>[to do]</entry> </row> + <row> + <entry><constant>V4L2_MEMORY_DMABUF</constant></entry> + <entry>2</entry> + <entry>The buffer is used for <link linkend="dmabuf">DMA shared +buffer</link> I/O.</entry> + </row> </tbody> </tgroup> </table> diff --git a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml index 73ae8a6..adc92be 100644 --- a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml +++ b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml @@ -98,6 +98,7 @@ information.</para> <entry><structfield>memory</structfield></entry> <entry>Applications set this field to <constant>V4L2_MEMORY_MMAP</constant> or +<constant>V4L2_MEMORY_DMABUF</constant> or <constant>V4L2_MEMORY_USERPTR</constant>.</entry> </row> <row> diff --git a/Documentation/DocBook/media/v4l/vidioc-qbuf.xml b/Documentation/DocBook/media/v4l/vidioc-qbuf.xml index 9caa49a..cb5f5ff 100644 --- a/Documentation/DocBook/media/v4l/vidioc-qbuf.xml +++ b/Documentation/DocBook/media/v4l/vidioc-qbuf.xml @@ -112,6 +112,21 @@ they cannot be swapped out to disk. Buffers remain locked until dequeued, until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the device is closed.</para>
+ <para>To enqueue a <link linkend="dmabuf">DMABUF</link> buffer applications +set the <structfield>memory</structfield> field to +<constant>V4L2_MEMORY_DMABUF</constant> and the <structfield>m.fd</structfield> +to a file descriptor associated with a DMABUF buffer. When the multi-planar API is +used and <structfield>m.fd</structfield> of the passed array of &v4l2-plane; +have to be used instead. When <constant>VIDIOC_QBUF</constant> is called with a +pointer to this structure the driver sets the +<constant>V4L2_BUF_FLAG_QUEUED</constant> flag and clears the +<constant>V4L2_BUF_FLAG_MAPPED</constant> and +<constant>V4L2_BUF_FLAG_DONE</constant> flags in the +<structfield>flags</structfield> field, or it returns an error code. This +ioctl locks the buffer. Buffers remain locked until dequeued, +until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the +device is closed.</para> + <para>Applications call the <constant>VIDIOC_DQBUF</constant> ioctl to dequeue a filled (capturing) or displayed (output) buffer from the driver's outgoing queue. They just set the diff --git a/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml b/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml index 7be4b1d..e3e709b 100644 --- a/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml +++ b/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml @@ -48,28 +48,30 @@ <refsect1> <title>Description</title>
- <para>This ioctl is used to initiate <link linkend="mmap">memory -mapped</link> or <link linkend="userp">user pointer</link> -I/O. Memory mapped buffers are located in device memory and must be -allocated with this ioctl before they can be mapped into the -application's address space. User buffers are allocated by -applications themselves, and this ioctl is merely used to switch the -driver into user pointer I/O mode and to setup some internal structures.</para> +<para>This ioctl is used to initiate <link linkend="mmap">memory mapped</link>, +<link linkend="userp">user pointer</link> or <link +linkend="dmabuf">DMABUF</link> based I/O. Memory mapped buffers are located in +device memory and must be allocated with this ioctl before they can be mapped +into the application's address space. User buffers are allocated by +applications themselves, and this ioctl is merely used to switch the driver +into user pointer I/O mode and to setup some internal structures. +Similarly, DMABUF buffers are allocated by applications through a device +driver, and this ioctl only configures the driver into DMABUF I/O mode without +performing any direct allocation.</para>
- <para>To allocate device buffers applications initialize all -fields of the <structname>v4l2_requestbuffers</structname> structure. -They set the <structfield>type</structfield> field to the respective -stream or buffer type, the <structfield>count</structfield> field to -the desired number of buffers, <structfield>memory</structfield> -must be set to the requested I/O method and the <structfield>reserved</structfield> array -must be zeroed. When the ioctl -is called with a pointer to this structure the driver will attempt to allocate -the requested number of buffers and it stores the actual number -allocated in the <structfield>count</structfield> field. It can be -smaller than the number requested, even zero, when the driver runs out -of free memory. A larger number is also possible when the driver requires -more buffers to function correctly. For example video output requires at least two buffers, -one displayed and one filled by the application.</para> + <para>To allocate device buffers applications initialize all fields of the +<structname>v4l2_requestbuffers</structname> structure. They set the +<structfield>type</structfield> field to the respective stream or buffer type, +the <structfield>count</structfield> field to the desired number of buffers, +<structfield>memory</structfield> must be set to the requested I/O method and +the <structfield>reserved</structfield> array must be zeroed. When the ioctl is +called with a pointer to this structure the driver will attempt to allocate the +requested number of buffers and it stores the actual number allocated in the +<structfield>count</structfield> field. It can be smaller than the number +requested, even zero, when the driver runs out of free memory. A larger number +is also possible when the driver requires more buffers to function correctly. +For example video output requires at least two buffers, one displayed and one +filled by the application.</para> <para>When the I/O method is not supported the ioctl returns an &EINVAL;.</para>
@@ -102,7 +104,8 @@ as the &v4l2-format; <structfield>type</structfield> field. See <xref <entry>&v4l2-memory;</entry> <entry><structfield>memory</structfield></entry> <entry>Applications set this field to -<constant>V4L2_MEMORY_MMAP</constant> or +<constant>V4L2_MEMORY_MMAP</constant>, +<constant>V4L2_MEMORY_DMABUF</constant> or <constant>V4L2_MEMORY_USERPTR</constant>.</entry> </row> <row>
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:44 Tomasz Stanislawski wrote:
This patch adds description and usage examples for importing DMABUF file descriptor in V4L2.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com
[snip]
diff --git a/Documentation/DocBook/media/v4l/io.xml b/Documentation/DocBook/media/v4l/io.xml index b815929..dc5979d 100644 --- a/Documentation/DocBook/media/v4l/io.xml +++ b/Documentation/DocBook/media/v4l/io.xml @@ -472,6 +472,162 @@ rest should be evident.</para> </footnote></para>
</section>
<section id="dmabuf">
<title>Streaming I/O (DMA buffer importing)</title>
This section is very similar to the Streaming I/O (User Pointers) section. Do you think we should merge the two ? I could handle that if you want.
On 04/17/2012 01:25 AM, Laurent Pinchart wrote:
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:44 Tomasz Stanislawski wrote:
This patch adds description and usage examples for importing DMABUF file descriptor in V4L2.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com
[snip]
diff --git a/Documentation/DocBook/media/v4l/io.xml b/Documentation/DocBook/media/v4l/io.xml index b815929..dc5979d 100644 --- a/Documentation/DocBook/media/v4l/io.xml +++ b/Documentation/DocBook/media/v4l/io.xml @@ -472,6 +472,162 @@ rest should be evident.</para> </footnote></para>
</section>
<section id="dmabuf">
<title>Streaming I/O (DMA buffer importing)</title>
This section is very similar to the Streaming I/O (User Pointers) section. Do you think we should merge the two ? I could handle that if you want.
Hi Laurent,
One may find similar sentences in MMAP, USERPTR and DMABUF. Maybe the common parts like description of STREAMON/OFF, QBUF/DQBUF shuffling should be moved to separate section like "Streaming" :).
Maybe it is worth to introduce a separate patch for this change.
Frankly, I would prefer to keep the Doc in the current form till importer support gets merged. Later the Doc could be fixed.
BTW. What is the sense of merging userptr and dmabuf section if userptr is going to dropped in long-term?
Regards, Tomasz Stanislawski
Em 19-04-2012 11:32, Tomasz Stanislawski escreveu:
On 04/17/2012 01:25 AM, Laurent Pinchart wrote:
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:44 Tomasz Stanislawski wrote:
This patch adds description and usage examples for importing DMABUF file descriptor in V4L2.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com
[snip]
diff --git a/Documentation/DocBook/media/v4l/io.xml b/Documentation/DocBook/media/v4l/io.xml index b815929..dc5979d 100644 --- a/Documentation/DocBook/media/v4l/io.xml +++ b/Documentation/DocBook/media/v4l/io.xml @@ -472,6 +472,162 @@ rest should be evident.</para> </footnote></para>
</section>
<section id="dmabuf">
<title>Streaming I/O (DMA buffer importing)</title>
This section is very similar to the Streaming I/O (User Pointers) section. Do you think we should merge the two ? I could handle that if you want.
Hi Laurent,
One may find similar sentences in MMAP, USERPTR and DMABUF. Maybe the common parts like description of STREAMON/OFF, QBUF/DQBUF shuffling should be moved to separate section like "Streaming" :).
Maybe it is worth to introduce a separate patch for this change.
Frankly, I would prefer to keep the Doc in the current form till importer support gets merged. Later the Doc could be fixed.
BTW. What is the sense of merging userptr and dmabuf section if userptr is going to dropped in long-term?
I didn't read the rest of the thread, so sorry, if I'm making wrong assumptions... Am I understanding wrong or are you saying that you want to drop userptr from V4L2 API in long-term?
Regards, Tomasz Stanislawski
Em 19-04-2012 11:32, Tomasz Stanislawski escreveu:
Hi Laurent,
One may find similar sentences in MMAP, USERPTR and DMABUF. Maybe the common parts like description of STREAMON/OFF, QBUF/DQBUF shuffling should be moved to separate section like "Streaming" :).
Maybe it is worth to introduce a separate patch for this change.
Frankly, I would prefer to keep the Doc in the current form till importer support gets merged. Later the Doc could be fixed.
BTW. What is the sense of merging userptr and dmabuf section if userptr is going to dropped in long-term?
I didn't read yet the rest of the thread, so sorry, if I'm making wrong assumptions... Am I understanding wrong or are you saying that you want to drop userptr from V4L2 API in long-term? If so, why?
Regards, Mauro
Hi Mauro,
On 04/19/2012 10:37 PM, Mauro Carvalho Chehab wrote:
Em 19-04-2012 11:32, Tomasz Stanislawski escreveu:
Hi Laurent,
One may find similar sentences in MMAP, USERPTR and DMABUF. Maybe the common parts like description of STREAMON/OFF, QBUF/DQBUF shuffling should be moved to separate section like "Streaming" :).
Maybe it is worth to introduce a separate patch for this change.
Frankly, I would prefer to keep the Doc in the current form till importer support gets merged. Later the Doc could be fixed.
BTW. What is the sense of merging userptr and dmabuf section if userptr is going to dropped in long-term?
I didn't read yet the rest of the thread, so sorry, if I'm making wrong assumptions... Am I understanding wrong or are you saying that you want to drop userptr from V4L2 API in long-term? If so, why?
Dropping userptr is just some brainstorming idea. It was found out that userptr is not a good mean for buffer exchange between to two devices. The USERPTR simplifies userspace code but introduce a lot of complexity problems for the kernel drivers and frameworks.
The problem is that memory mmaped to the userspace may not be a part of the system memory. It often happens for devices that use remap_pfn or dma_mmap_* to mmap the memory to the userspace.
It is was empirically conjured the it is not possible to access this kind of memory by the other device without a platform-specific hacks or workarounds.
The DMABUF was introduced to help in such a case.
The basic short-term idea is to drop userptr support for buffers that are MMAPed by other device.
The userptr will be used for memory allocated using malloc (anonymous pages) or (maybe) mmaped files. There are of course cache synchronization problems but there are a lesser concern.
However this approach will work only for devices that have its own IOMMU which can be configured to access system memory. Otherwise, the memory has to copied anyway to device's own buffers.
Moreover copying a large amount of data should not happen in the kernel-space.
All the reasons make userptr an unreliable and complex to implement feature.
So my rough-idea was to remove USERPTR support from kernel drivers (if possible of course) and to provide an emulation layer in the userspace code like libv4l2.
Please note that it is only a rough idea. Just brainstorming :)
It is *too early* to start any discussion on this topic. Especially until DMABUF is mature enough to become a good alternative for userptr.
Regards, Tomasz Stanislawski
Regards, Mauro
On Fri, 20 Apr 2012 10:41:37 +0200, Tomasz Stanislawski t.stanislaws@samsung.com wrote:
Am I understanding wrong or are you saying that you want to drop
userptr
from V4L2 API in long-term? If so, why?
Dropping userptr is just some brainstorming idea. It was found out that userptr is not a good mean for buffer exchange between to two devices.
I can believe that. But I am also inclined to believe that DMABUF is targetted at device-to-device transfer, while USERPTR is targetted at device-to-user (or user-to-device) transfers. Are you saying applications should use DMABUF and memory map the buffers? Or would you care to explain how DMABUF addresses the problem space of USERPTR?
The USERPTR simplifies userspace code but introduce a lot of complexity problems for the kernel drivers and frameworks.
It is not only a simplification. In some cases, USERPTR is the only I/O method that supports zero copy in pretty much any circumstance. When the user cannot reliably predict the maximum number of required buffers, predicts a value larger than the device will negotiate, or needs buffers to outlive STREAMOFF (?), MMAP requires memory copying. USERPTR does not.
Now, I do realize that some devices cannot support USERPTR efficiently, then they should not support USERPTR. But for those devices that can, it seems quite a nice performance enhancement.
Hi Remi,
On 04/20/2012 12:56 PM, RĂ©mi Denis-Courmont wrote:
On Fri, 20 Apr 2012 10:41:37 +0200, Tomasz Stanislawski t.stanislaws@samsung.com wrote:
Am I understanding wrong or are you saying that you want to drop
userptr
from V4L2 API in long-term? If so, why?
Dropping userptr is just some brainstorming idea. It was found out that userptr is not a good mean for buffer exchange between to two devices.
I can believe that. But I am also inclined to believe that DMABUF is targetted at device-to-device transfer, while USERPTR is targetted at device-to-user (or user-to-device) transfers. Are you saying applications should use DMABUF and memory map the buffers?
No. As I sad before: it is *too early* to drop userptr and expect application to use DMABUF and MMAPs only. This was just some hypothetical idea.
DMABUF is dedicated for dev-dev transfers. However, looking at the current speed of appearances of DMABUF extensions it may be expected that one day it starts supporting making DMA buffer from a user pointer. Currently there are already extensions for MMAP and cache synchronization. Who know what will happen future versions. However these are only hypothetical issues.
Or would you care to explain
how DMABUF addresses the problem space of USERPTR?
Allowing to attach a DMABUF to some userptr using some new magic IOCTL. I think that sooner or later someone will find this feature useful.
The USERPTR simplifies userspace code but introduce a lot of complexity problems for the kernel drivers and frameworks.
It is not only a simplification. In some cases, USERPTR is the only I/O method that supports zero copy in pretty much any circumstance.
Only for devices that have its own IOMMU that can access system memory. Moreover the userptr must come from malloc or be a mmaped file. The other case are drivers that touch memory using CPU in the kernel space like VIVI or USB drivers.
When the user cannot reliably predict the maximum number of required buffers, predicts a value larger than the device will negotiate, or needs buffers to outlive STREAMOFF (?), MMAP requires memory copying. USERPTR does not.
What does outlive STREAMOFF means in this context?
Anyway, IMO allocation of the buffers at VIDIOC_REQBUFS was not the best idea because it introduces an allocation overhead for negotiations of the number of the buffers. An allocation at mmap was to late. There is a need for some intermediate state between REQBUFS and mmap. The ioctl BUF_PREPARE may help here.
Can you give me an example of a sane application is forced to negotiate a larger number of buffers than it is actually going to use?
Now, I do realize that some devices cannot support USERPTR efficiently, then they should not support USERPTR.
The problem is not there is *NO* device that can handle USERPTR reliably. The can handle USERPTR generated by malloc or page cache (not sure). Memory mmaped from other devices, frameworks etc may or may not work. Even if the device has its IOMMU the DMA layer provides no generic way to transform from one device to the mapping in some other device.
It is done using platform-defendant hacks like extracting PFNs from mappings, hack-forming them into struct pages or scatterlists, mapping it and hoping that the memory is not going to release it in some other thread.
The only sure way is to copy data from userptr to MMAP buffer.
But for those devices that can, it
seems quite a nice performance enhancement.
The userptr has its niches were it works pretty well like Web cams or VIVI. I am saying that if ever DMABUF becomes a good alternative for USERPTR than maybe we should consider encouraging dropping USERPTR in the new drivers as 'obsolete' feature and providing some emulation layer in libv4l2 for legacy applications.
Regards, Tomasz Stanislawski
On Fri, 20 Apr 2012 14:25:01 +0200, Tomasz Stanislawski t.stanislaws@samsung.com wrote:
The USERPTR simplifies userspace code but introduce a lot of complexity problems for the kernel drivers and frameworks.
It is not only a simplification. In some cases, USERPTR is the only I/O method that supports zero copy in pretty much any circumstance.
Only for devices that have its own IOMMU that can access system memory.
Newer versions of the UVC driver have USERTPR, and simingly gspca seems too. That is practically all USB capture devices... That might be irrelevant for a smartphone manufacturer. That is very relevant for desktop applications.
Moreover the userptr must come from malloc or be a mmaped file. The other case are drivers that touch memory using CPU in the kernel space like VIVI or USB drivers.
I'd argue that USB is the most common case of V4L2 on the desktop...
When the user cannot reliably predict the maximum number of required buffers, predicts a value larger than the device will negotiate, or needs buffers to outlive STREAMOFF (?), MMAP requires memory copying. USERPTR does not.
What does outlive STREAMOFF means in this context?
Depending how your multimedia pipeline is built, it is plausible that the V4L2 source is shutdown (STREAMOFF then close()) before buffers coming from it are released/destroyed downstream. I might be wrong, but I would expect that V4L2 MMAP buffers become invalid after STREAMOFF+close()?
Anyway, IMO allocation of the buffers at VIDIOC_REQBUFS was not the best idea because it introduces an allocation overhead for negotiations of the number of the buffers. An allocation at mmap was to late. There is a need for some intermediate state between REQBUFS and mmap. The ioctl BUF_PREPARE may help here.
Can you give me an example of a sane application is forced to negotiate
a
larger number of buffers than it is actually going to use?
Outside the embedded world, the application typically does not know what the latency of the multimedia pipeline is. If the latency is not known, the number of buffers needed for zero copy cannot be precomputed for REQBUFS, say:
count = 1 + latency / frame interval.
Even for a trivial analog TV viewer application, lip synchronization requires picture frames to be bufferred to be long enough to account for the latency of the audio input, dejitter, filtering and audio output. Those values are usually not well determined at the time of requesting buffers from the video capture device. Also the application may want to play nice with PulseAudio. Then it will get very long audio buffers with very few audio periods... more latency.
It gets harder or outright impossible for frameworks dealing with complicated or arbitrary pipelines such as LibVLC or gstreamer. There is far too much unpredictability and variability downstream of the V4L2 source to estimate latency, and infer the number of buffers needed.
Now, I do realize that some devices cannot support USERPTR efficiently, then they should not support USERPTR.
The problem is not there is *NO* device that can handle USERPTR
reliably.
The can handle USERPTR generated by malloc or page cache (not sure). Memory mmaped from other devices, frameworks etc may or may not work. Even if the device has its IOMMU the DMA layer provides no generic way
to
transform from one device to the mapping in some other device.
I'm not saying that USERPTR should replace DMABUF. I'm saying USERPTR has advantages over MMAP that DMABUF does not seem to cover as yet (if only libv4l2 would not inhibit USERPTR...).
I'm definitely not saying that applications should rely on USERPTR being supported. We agree that not all devices can support USERPTR.
The userptr has its niches were it works pretty well like Web cams or
VIVI.
I am saying that if ever DMABUF becomes a good alternative for USERPTR than maybe we should consider encouraging dropping USERPTR in the new drivers as 'obsolete' feature and providing some emulation layer in
libv4l2
for legacy applications.
Sure.
Hi RĂ©mi,
On Friday 20 April 2012 15:03:17 RĂ©mi Denis-Courmont wrote:
On Fri, 20 Apr 2012 14:25:01 +0200, Tomasz Stanislawski wrote:
The USERPTR simplifies userspace code but introduce a lot of complexity problems for the kernel drivers and frameworks.
It is not only a simplification. In some cases, USERPTR is the only I/O method that supports zero copy in pretty much any circumstance.
Only for devices that have its own IOMMU that can access system memory.
Newer versions of the UVC driver have USERTPR, and simingly gspca seems too. That is practically all USB capture devices... That might be irrelevant for a smartphone manufacturer. That is very relevant for desktop applications.
Moreover the userptr must come from malloc or be a mmaped file. The other case are drivers that touch memory using CPU in the kernel space like VIVI or USB drivers.
I'd argue that USB is the most common case of V4L2 on the desktop...
When the user cannot reliably predict the maximum number of required buffers, predicts a value larger than the device will negotiate, or needs buffers to outlive STREAMOFF (?), MMAP requires memory copying. USERPTR does not.
What does outlive STREAMOFF means in this context?
Depending how your multimedia pipeline is built, it is plausible that the V4L2 source is shutdown (STREAMOFF then close()) before buffers coming from it are released/destroyed downstream. I might be wrong, but I would expect that V4L2 MMAP buffers become invalid after STREAMOFF+close()?
If the buffer is mmap()ed to userspace, it will not be freed before being munmap()ed.
Anyway, IMO allocation of the buffers at VIDIOC_REQBUFS was not the best idea because it introduces an allocation overhead for negotiations of the number of the buffers. An allocation at mmap was to late. There is a need for some intermediate state between REQBUFS and mmap. The ioctl BUF_PREPARE may help here.
Can you give me an example of a sane application is forced to negotiate a larger number of buffers than it is actually going to use?
Outside the embedded world, the application typically does not know what the latency of the multimedia pipeline is. If the latency is not known, the number of buffers needed for zero copy cannot be precomputed for REQBUFS, say:
count = 1 + latency / frame interval.
Even for a trivial analog TV viewer application, lip synchronization requires picture frames to be bufferred to be long enough to account for the latency of the audio input, dejitter, filtering and audio output. Those values are usually not well determined at the time of requesting buffers from the video capture device. Also the application may want to play nice with PulseAudio. Then it will get very long audio buffers with very few audio periods... more latency.
It gets harder or outright impossible for frameworks dealing with complicated or arbitrary pipelines such as LibVLC or gstreamer. There is far too much unpredictability and variability downstream of the V4L2 source to estimate latency, and infer the number of buffers needed.
If I'm not mistaken VIDIOC_CREATEBUF allows you to create additional buffers at runtime. You can thus cope with a latency increase (provided that the allocation overhead isn't prohibitive, in which case you're stuck whatever method you select). Deleting buffers at runtime is currently not possible though.
Now, I do realize that some devices cannot support USERPTR efficiently, then they should not support USERPTR.
The problem is not there is *NO* device that can handle USERPTR reliably. The can handle USERPTR generated by malloc or page cache (not sure). Memory mmaped from other devices, frameworks etc may or may not work. Even if the device has its IOMMU the DMA layer provides no generic way to transform from one device to the mapping in some other device.
I'm not saying that USERPTR should replace DMABUF. I'm saying USERPTR has advantages over MMAP that DMABUF does not seem to cover as yet (if only libv4l2 would not inhibit USERPTR...).
I'm definitely not saying that applications should rely on USERPTR being supported. We agree that not all devices can support USERPTR.
The userptr has its niches were it works pretty well like Web cams or VIVI.
I am saying that if ever DMABUF becomes a good alternative for USERPTR than maybe we should consider encouraging dropping USERPTR in the new drivers as 'obsolete' feature and providing some emulation layer in libv4l2 for legacy applications.
Sure.
Em 20-04-2012 09:25, Tomasz Stanislawski escreveu:
Hi Remi,
Now, I do realize that some devices cannot support USERPTR efficiently, then they should not support USERPTR.
The problem is not there is *NO* device that can handle USERPTR reliably. The can handle USERPTR generated by malloc or page cache (not sure). Memory mmaped from other devices, frameworks etc may or may not work. Even if the device has its IOMMU the DMA layer provides no generic way to transform from one device to the mapping in some other device.
It is done using platform-defendant hacks like extracting PFNs from mappings, hack-forming them into struct pages or scatterlists, mapping it and hoping that the memory is not going to release it in some other thread.
The only sure way is to copy data from userptr to MMAP buffer.
All you're talking about is related to userptr abuse that happened on Embedded devices, of using it for something that were never meant to be used (dev2dev).
While the DMABUF patches aren't applied, there's just one mode defined at the V4L2 API for dev2dev: overlay mode[1].
Most embedded applications and drivers decided that, instead of using overlay mode, to abuse of userptr to do dev2dev. As you've pointed, it was noticed in practice that this sometimes fail.
Yes, such abuse should be dropped, and DMABUF is the right way to address it.
That doesn't mean that USERPTR should be dropped for the thing it were originally created: dev2user or user2dev.
Regards, Mauro
[1] Even so, not all PC motherboards are capable of supporting the overlay mode: it is known that several chipsets have problems on their DMA engines, with causes data losses when a DMA transfer happens without passing through the system main memory (PCI2PCI transfers). So, drivers check the PCI quirks table to detect if dev2dev is supported, before exposing overlay mode to userspace.
Em 20-04-2012 07:56, RĂ©mi Denis-Courmont escreveu:
On Fri, 20 Apr 2012 10:41:37 +0200, Tomasz Stanislawski t.stanislaws@samsung.com wrote:
Am I understanding wrong or are you saying that you want to drop
userptr
from V4L2 API in long-term? If so, why?
Dropping userptr is just some brainstorming idea. It was found out that userptr is not a good mean for buffer exchange between to two devices.
I can believe that. But I am also inclined to believe that DMABUF is targetted at device-to-device transfer, while USERPTR is targetted at device-to-user (or user-to-device) transfers. Are you saying applications should use DMABUF and memory map the buffers? Or would you care to explain how DMABUF addresses the problem space of USERPTR?
I agree with RĂ©mi. Userptr were never meant to be used by dev2dev transfer. The overlay mode were designed for it.
I remember I've pointed it a few times at the mailing list.
The DMABUF is the proper replacement for the overlay mode, and, after having it fully implemented, we can deprecate and remove the overlay mode.
The USERPTR simplifies userspace code but introduce a lot of complexity problems for the kernel drivers and frameworks.
It is not only a simplification. In some cases, USERPTR is the only I/O method that supports zero copy in pretty much any circumstance. When the user cannot reliably predict the maximum number of required buffers, predicts a value larger than the device will negotiate, or needs buffers to outlive STREAMOFF (?), MMAP requires memory copying. USERPTR does not.
Yes, that's my understand too. USERPTR works helps to avoid buffer copying.
Now, I do realize that some devices cannot support USERPTR efficiently, then they should not support USERPTR. But for those devices that can, it seems quite a nice performance enhancement.
Agreed.
A quick note about that: for USB devices, with the current implementations, there will always be a copy inside the Kernel, as the USB and other transport headers should be removed.
For them, the cost of MMAP and USERPTR is the same (not all USB drivers export USERPTR, because of a limitation at videobuf-vmalloc).
The problem is that memory mmaped to the userspace may not be a part of the system memory. It often happens for devices that use remap_pfn or dma_mmap_* to mmap the memory to the userspace.
It is was empirically conjured the it is not possible to access this kind of memory by the other device without a platform-specific hacks or workarounds.
As I warned in the past: USERPTR were never meant to be used for dev2dev transfers.
The DMABUF was introduced to help in such a case.
The basic short-term idea is to drop userptr support for buffers that are MMAPed by other device.
You should, instead, just drop userptr support on devices where DMA scatter/gather is not supported, and migrate all dev2dev use cases to DMABUF.
The userptr will be used for memory allocated using malloc (anonymous pages) or (maybe) mmaped files. There are of course cache synchronization problems but there are a lesser concern.
However this approach will work only for devices that have its own IOMMU which can be configured to access system memory. Otherwise, the memory has to copied anyway to device's own buffers.
Moreover copying a large amount of data should not happen in the kernel-space.
All the reasons make userptr an unreliable and complex to implement feature.
So my rough-idea was to remove USERPTR support from kernel drivers (if possible of course) and to provide an emulation layer in the userspace code like libv4l2.
Please note that it is only a rough idea. Just brainstorming :)
It is *too early* to start any discussion on this topic. Especially until DMABUF is mature enough to become a good alternative for userptr.
Looking at the hole picture, dropping USERPTR would only make sense if it is broken on dev2user (or user2dev) transfers.
Dropping its usage on dev2dev transfers makes sense, after having DMABUF implemented.
Yet, if some userspace application wants to abuse of USERPTR in order to use it for dev2dev transfer, there's not much that can be done at Kernel level.
It makes sense to put a big warn at the V4L2 Docs telling that this is not officially supported and can cause all sorts of issues at the machine/system.
Regards, Mauro
Hi Mauro,
On Friday, April 20, 2012 3:37 PM Mauro Carvalho Chehab wrote:
(snipped)
So my rough-idea was to remove USERPTR support from kernel drivers (if possible of course) and to provide an emulation layer in the userspace code like libv4l2.
Please note that it is only a rough idea. Just brainstorming :)
It is *too early* to start any discussion on this topic. Especially until DMABUF is mature enough to become a good alternative for userptr.
Looking at the hole picture, dropping USERPTR would only make sense if it is broken on dev2user (or user2dev) transfers.
Dropping its usage on dev2dev transfers makes sense, after having DMABUF implemented.
Yet, if some userspace application wants to abuse of USERPTR in order to use it for dev2dev transfer, there's not much that can be done at Kernel level.
It makes sense to put a big warn at the V4L2 Docs telling that this is not officially supported and can cause all sorts of issues at the machine/system.
Please note that all current drivers which use videobuf/videobuf2-dma-contig are able to use userptr memory access method only with physically contiguous memory. This means that in fact they work only buffers, which come from other devices and dev2dev transfers are the only possibility. malloc()ed memory buffers are rejected.
Best regards
Em 23-04-2012 07:50, Marek Szyprowski escreveu:
Hi Mauro,
On Friday, April 20, 2012 3:37 PM Mauro Carvalho Chehab wrote:
(snipped)
So my rough-idea was to remove USERPTR support from kernel drivers (if possible of course) and to provide an emulation layer in the userspace code like libv4l2.
Please note that it is only a rough idea. Just brainstorming :)
It is *too early* to start any discussion on this topic. Especially until DMABUF is mature enough to become a good alternative for userptr.
Looking at the hole picture, dropping USERPTR would only make sense if it is broken on dev2user (or user2dev) transfers.
Dropping its usage on dev2dev transfers makes sense, after having DMABUF implemented.
Yet, if some userspace application wants to abuse of USERPTR in order to use it for dev2dev transfer, there's not much that can be done at Kernel level.
It makes sense to put a big warn at the V4L2 Docs telling that this is not officially supported and can cause all sorts of issues at the machine/system.
Please note that all current drivers which use videobuf/videobuf2-dma-contig are able to use userptr memory access method only with physically contiguous memory.
Yes.
This means that in fact they work only buffers, which come from other devices and dev2dev transfers are the only possibility. malloc()ed memory buffers are rejected.
Fragmented buffers can be detected, at Kernel level, and VB/VB2 can refuse a fragmented memory when the hardware doesn't support it. However, checking if the buffer is fragmented is not a safe way to detect that the buffer will be used by a dev2dev transfer.
If the buffers are allocated very soon just after boot time which malloc(), or if they use some different way of allocating the buffers (like reducing the max ram area addressed by the kernel or using CMU or a simila approach), it could be possible to use videobuf(1/2)-dma-contig for userptr with user2dev/dev2user transfers. This is actually used on some cases where this is used (like where the capture device only supports contiguous buffers).
If, for some reason, the hardware doesn't support dev2dev transfers on a reliable way, some other strategy should be used.
Regards, Mauro
From: Sumit Semwal sumit.semwal@ti.com
This patch adds support for DMABUF memory type in videobuf2. It calls relevant APIs of dma_buf for v4l reqbuf / qbuf / dqbuf operations.
For this version, the support is for videobuf2 as a user of the shared buffer; so the allocation of the buffer is done outside of V4L2. [A sample allocator of dma-buf shared buffer is given at [1]]
[1]: Rob Clark's DRM: https://github.com/robclark/kernel-omap4/commits/drmplane-dmabuf
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [original work in the PoC for buffer sharing] Signed-off-by: Sumit Semwal sumit.semwal@ti.com Signed-off-by: Sumit Semwal sumit.semwal@linaro.org --- drivers/media/video/videobuf2-core.c | 196 +++++++++++++++++++++++++++++++++- include/media/videobuf2-core.h | 27 +++++ 2 files changed, 219 insertions(+), 4 deletions(-)
diff --git a/drivers/media/video/videobuf2-core.c b/drivers/media/video/videobuf2-core.c index 2e8f1df..d26b1cc 100644 --- a/drivers/media/video/videobuf2-core.c +++ b/drivers/media/video/videobuf2-core.c @@ -106,6 +106,36 @@ static void __vb2_buf_userptr_put(struct vb2_buffer *vb) }
/** + * __vb2_plane_dmabuf_put() - release memory associated with + * a DMABUF shared plane + */ +static void __vb2_plane_dmabuf_put(struct vb2_queue *q, struct vb2_plane *p) +{ + if (!p->mem_priv) + return; + + if (p->dbuf_mapped) + call_memop(q, unmap_dmabuf, p->mem_priv); + + call_memop(q, detach_dmabuf, p->mem_priv); + dma_buf_put(p->dbuf); + memset(p, 0, sizeof *p); +} + +/** + * __vb2_buf_dmabuf_put() - release memory associated with + * a DMABUF shared buffer + */ +static void __vb2_buf_dmabuf_put(struct vb2_buffer *vb) +{ + struct vb2_queue *q = vb->vb2_queue; + unsigned int plane; + + for (plane = 0; plane < vb->num_planes; ++plane) + __vb2_plane_dmabuf_put(q, &vb->planes[plane]); +} + +/** * __setup_offsets() - setup unique offsets ("cookies") for every plane in * every buffer on the queue */ @@ -227,6 +257,8 @@ static void __vb2_free_mem(struct vb2_queue *q, unsigned int buffers) /* Free MMAP buffers or release USERPTR buffers */ if (q->memory == V4L2_MEMORY_MMAP) __vb2_buf_mem_free(vb); + else if (q->memory == V4L2_MEMORY_DMABUF) + __vb2_buf_dmabuf_put(vb); else __vb2_buf_userptr_put(vb); } @@ -349,6 +381,12 @@ static int __fill_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b) */ memcpy(b->m.planes, vb->v4l2_planes, b->length * sizeof(struct v4l2_plane)); + + if (q->memory == V4L2_MEMORY_DMABUF) { + unsigned int plane; + for (plane = 0; plane < vb->num_planes; ++plane) + b->m.planes[plane].m.fd = 0; + } } else { /* * We use length and offset in v4l2_planes array even for @@ -360,6 +398,8 @@ static int __fill_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b) b->m.offset = vb->v4l2_planes[0].m.mem_offset; else if (q->memory == V4L2_MEMORY_USERPTR) b->m.userptr = vb->v4l2_planes[0].m.userptr; + else if (q->memory == V4L2_MEMORY_DMABUF) + b->m.fd = 0; }
/* @@ -451,6 +491,20 @@ static int __verify_mmap_ops(struct vb2_queue *q) }
/** + * __verify_dmabuf_ops() - verify that all memory operations required for + * DMABUF queue type have been provided + */ +static int __verify_dmabuf_ops(struct vb2_queue *q) +{ + if (!(q->io_modes & VB2_DMABUF) || !q->mem_ops->attach_dmabuf || + !q->mem_ops->detach_dmabuf || !q->mem_ops->map_dmabuf || + !q->mem_ops->unmap_dmabuf) + return -EINVAL; + + return 0; +} + +/** * vb2_reqbufs() - Initiate streaming * @q: videobuf2 queue * @req: struct passed from userspace to vidioc_reqbufs handler in driver @@ -483,8 +537,9 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req) return -EBUSY; }
- if (req->memory != V4L2_MEMORY_MMAP - && req->memory != V4L2_MEMORY_USERPTR) { + if (req->memory != V4L2_MEMORY_MMAP && + req->memory != V4L2_MEMORY_DMABUF && + req->memory != V4L2_MEMORY_USERPTR) { dprintk(1, "reqbufs: unsupported memory type\n"); return -EINVAL; } @@ -513,6 +568,11 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req) return -EINVAL; }
+ if (req->memory == V4L2_MEMORY_DMABUF && __verify_dmabuf_ops(q)) { + dprintk(1, "reqbufs: DMABUF for current setup unsupported\n"); + return -EINVAL; + } + if (req->count == 0 || q->num_buffers != 0 || q->memory != req->memory) { /* * We already have buffers allocated, so first check if they @@ -619,8 +679,9 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create) return -EBUSY; }
- if (create->memory != V4L2_MEMORY_MMAP - && create->memory != V4L2_MEMORY_USERPTR) { + if (create->memory != V4L2_MEMORY_MMAP && + create->memory != V4L2_MEMORY_USERPTR && + create->memory != V4L2_MEMORY_DMABUF) { dprintk(1, "%s(): unsupported memory type\n", __func__); return -EINVAL; } @@ -644,6 +705,11 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create) return -EINVAL; }
+ if (create->memory == V4L2_MEMORY_DMABUF && __verify_dmabuf_ops(q)) { + dprintk(1, "%s(): DMABUF for current setup unsupported\n", __func__); + return -EINVAL; + } + if (q->num_buffers == VIDEO_MAX_FRAME) { dprintk(1, "%s(): maximum number of buffers already allocated\n", __func__); @@ -839,6 +905,14 @@ static int __fill_vb2_buffer(struct vb2_buffer *vb, const struct v4l2_buffer *b, b->m.planes[plane].length; } } + if (b->memory == V4L2_MEMORY_DMABUF) { + for (plane = 0; plane < vb->num_planes; ++plane) { + v4l2_planes[plane].bytesused = + b->m.planes[plane].bytesused; + v4l2_planes[plane].m.fd = + b->m.planes[plane].m.fd; + } + } } else { /* * Single-planar buffers do not use planes array, @@ -853,6 +927,10 @@ static int __fill_vb2_buffer(struct vb2_buffer *vb, const struct v4l2_buffer *b, v4l2_planes[0].m.userptr = b->m.userptr; v4l2_planes[0].length = b->length; } + + if (b->memory == V4L2_MEMORY_DMABUF) + v4l2_planes[0].m.fd = b->m.fd; + }
vb->v4l2_buf.field = b->field; @@ -957,6 +1035,100 @@ static int __qbuf_mmap(struct vb2_buffer *vb, const struct v4l2_buffer *b) }
/** + * __qbuf_dmabuf() - handle qbuf of a DMABUF buffer + */ +static int __qbuf_dmabuf(struct vb2_buffer *vb, const struct v4l2_buffer *b) +{ + struct v4l2_plane planes[VIDEO_MAX_PLANES]; + struct vb2_queue *q = vb->vb2_queue; + void *mem_priv; + unsigned int plane; + int ret; + int write = !V4L2_TYPE_IS_OUTPUT(q->type); + + /* Verify and copy relevant information provided by the userspace */ + ret = __fill_vb2_buffer(vb, b, planes); + if (ret) + return ret; + + for (plane = 0; plane < vb->num_planes; ++plane) { + struct dma_buf *dbuf = dma_buf_get(planes[plane].m.fd); + + if (IS_ERR_OR_NULL(dbuf)) { + dprintk(1, "qbuf: invalid dmabuf fd for " + "plane %d\n", plane); + ret = -EINVAL; + goto err; + } + + /* Skip the plane if already verified */ + if (dbuf == vb->planes[plane].dbuf) { + planes[plane].length = dbuf->size; + dma_buf_put(dbuf); + continue; + } + + dprintk(3, "qbuf: buffer description for plane %d changed, " + "reattaching dma buf\n", plane); + + /* Release previously acquired memory if present */ + __vb2_plane_dmabuf_put(q, &vb->planes[plane]); + + /* Acquire each plane's memory */ + mem_priv = call_memop(q, attach_dmabuf, q->alloc_ctx[plane], + dbuf, q->plane_sizes[plane], write); + if (IS_ERR(mem_priv)) { + dprintk(1, "qbuf: failed acquiring dmabuf " + "memory for plane %d\n", plane); + ret = PTR_ERR(mem_priv); + goto err; + } + + planes[plane].length = dbuf->size; + vb->planes[plane].dbuf = dbuf; + vb->planes[plane].mem_priv = mem_priv; + } + + /* TODO: This pins the buffer(s) with dma_buf_map_attachment()).. but + * really we want to do this just before the DMA, not while queueing + * the buffer(s).. + */ + for (plane = 0; plane < vb->num_planes; ++plane) { + ret = call_memop(q, map_dmabuf, vb->planes[plane].mem_priv); + if (ret) { + dprintk(1, "qbuf: failed mapping dmabuf " + "memory for plane %d\n", plane); + goto err; + } + vb->planes[plane].dbuf_mapped = 1; + } + + /* + * Call driver-specific initialization on the newly acquired buffer, + * if provided. + */ + ret = call_qop(q, buf_init, vb); + if (ret) { + dprintk(1, "qbuf: buffer initialization failed\n"); + goto err; + } + + /* + * Now that everything is in order, copy relevant information + * provided by userspace. + */ + for (plane = 0; plane < vb->num_planes; ++plane) + vb->v4l2_planes[plane] = planes[plane]; + + return 0; +err: + /* In case of errors, release planes that were already acquired */ + __vb2_buf_dmabuf_put(vb); + + return ret; +} + +/** * __enqueue_in_driver() - enqueue a vb2_buffer in driver for processing */ static void __enqueue_in_driver(struct vb2_buffer *vb) @@ -980,6 +1152,9 @@ static int __buf_prepare(struct vb2_buffer *vb, const struct v4l2_buffer *b) case V4L2_MEMORY_USERPTR: ret = __qbuf_userptr(vb, b); break; + case V4L2_MEMORY_DMABUF: + ret = __qbuf_dmabuf(vb, b); + break; default: WARN(1, "Invalid queue type\n"); ret = -EINVAL; @@ -1335,6 +1510,19 @@ int vb2_dqbuf(struct vb2_queue *q, struct v4l2_buffer *b, bool nonblocking) return ret; }
+ /* TODO: this unpins the buffer(dma_buf_unmap_attachment()).. but + * really we want to do this just after DMA, not when the + * buffer is dequeued.. + */ + if (q->memory == V4L2_MEMORY_DMABUF) { + unsigned int i; + + for (i = 0; i < vb->num_planes; ++i) { + call_memop(q, unmap_dmabuf, vb->planes[i].mem_priv); + vb->planes[i].dbuf_mapped = 0; + } + } + switch (vb->state) { case VB2_BUF_STATE_DONE: dprintk(3, "dqbuf: Returning done buffer\n"); diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h index a15d1f1..859bbaf 100644 --- a/include/media/videobuf2-core.h +++ b/include/media/videobuf2-core.h @@ -16,6 +16,7 @@ #include <linux/mutex.h> #include <linux/poll.h> #include <linux/videodev2.h> +#include <linux/dma-buf.h>
struct vb2_alloc_ctx; struct vb2_fileio_data; @@ -41,6 +42,20 @@ struct vb2_fileio_data; * argument to other ops in this structure * @put_userptr: inform the allocator that a USERPTR buffer will no longer * be used + * @attach_dmabuf: attach a shared struct dma_buf for a hardware operation; + * used for DMABUF memory types; alloc_ctx is the alloc context + * dbuf is the shared dma_buf; returns NULL on failure; + * allocator private per-buffer structure on success; + * this needs to be used for further accesses to the buffer + * @detach_dmabuf: inform the exporter of the buffer that the current DMABUF + * buffer is no longer used; the buf_priv argument is the + * allocator private per-buffer structure previously returned + * from the attach_dmabuf callback + * @map_dmabuf: request for access to the dmabuf from allocator; the allocator + * of dmabuf is informed that this driver is going to use the + * dmabuf + * @unmap_dmabuf: releases access control to the dmabuf - allocator is notified + * that this driver is done using the dmabuf for now * @vaddr: return a kernel virtual address to a given memory buffer * associated with the passed private structure or NULL if no * such mapping exists @@ -56,6 +71,8 @@ struct vb2_fileio_data; * Required ops for USERPTR types: get_userptr, put_userptr. * Required ops for MMAP types: alloc, put, num_users, mmap. * Required ops for read/write access types: alloc, put, num_users, vaddr + * Required ops for DMABUF types: attach_dmabuf, detach_dmabuf, map_dmabuf, + * unmap_dmabuf. */ struct vb2_mem_ops { void *(*alloc)(void *alloc_ctx, unsigned long size); @@ -65,6 +82,12 @@ struct vb2_mem_ops { unsigned long size, int write); void (*put_userptr)(void *buf_priv);
+ void *(*attach_dmabuf)(void *alloc_ctx, struct dma_buf *dbuf, + unsigned long size, int write); + void (*detach_dmabuf)(void *buf_priv); + int (*map_dmabuf)(void *buf_priv); + void (*unmap_dmabuf)(void *buf_priv); + void *(*vaddr)(void *buf_priv); void *(*cookie)(void *buf_priv);
@@ -75,6 +98,8 @@ struct vb2_mem_ops {
struct vb2_plane { void *mem_priv; + struct dma_buf *dbuf; + unsigned int dbuf_mapped; };
/** @@ -83,12 +108,14 @@ struct vb2_plane { * @VB2_USERPTR: driver supports USERPTR with streaming API * @VB2_READ: driver supports read() style access * @VB2_WRITE: driver supports write() style access + * @VB2_DMABUF: driver supports DMABUF with streaming API */ enum vb2_io_modes { VB2_MMAP = (1 << 0), VB2_USERPTR = (1 << 1), VB2_READ = (1 << 2), VB2_WRITE = (1 << 3), + VB2_DMABUF = (1 << 4), };
/**
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:45 Tomasz Stanislawski wrote:
From: Sumit Semwal sumit.semwal@ti.com
This patch adds support for DMABUF memory type in videobuf2. It calls relevant APIs of dma_buf for v4l reqbuf / qbuf / dqbuf operations.
For this version, the support is for videobuf2 as a user of the shared buffer; so the allocation of the buffer is done outside of V4L2. [A sample allocator of dma-buf shared buffer is given at [1]]
[1]: Rob Clark's DRM: https://github.com/robclark/kernel-omap4/commits/drmplane-dmabuf
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [original work in the PoC for buffer sharing] Signed-off-by: Sumit Semwal sumit.semwal@ti.com Signed-off-by: Sumit Semwal sumit.semwal@linaro.org
Acked-by: Laurent Pinchart laurent.pinchart@ideasonboard.com
From: Sumit Semwal sumit.semwal@ti.com
Adding DMABUF memory type causes videobuf to complain about not using it in some switch cases. This patch removes these warnings.
Signed-off-by: Sumit Semwal sumit.semwal@ti.com --- drivers/media/video/videobuf-core.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/drivers/media/video/videobuf-core.c b/drivers/media/video/videobuf-core.c index de4fa4e..b457c8b 100644 --- a/drivers/media/video/videobuf-core.c +++ b/drivers/media/video/videobuf-core.c @@ -335,6 +335,9 @@ static void videobuf_status(struct videobuf_queue *q, struct v4l2_buffer *b, case V4L2_MEMORY_OVERLAY: b->m.offset = vb->boff; break; + case V4L2_MEMORY_DMABUF: + /* DMABUF is not handled in videobuf framework */ + break; }
b->flags = 0; @@ -411,6 +414,7 @@ int __videobuf_mmap_setup(struct videobuf_queue *q, break; case V4L2_MEMORY_USERPTR: case V4L2_MEMORY_OVERLAY: + case V4L2_MEMORY_DMABUF: /* nothing */ break; }
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:46 Tomasz Stanislawski wrote:
From: Sumit Semwal sumit.semwal@ti.com
Adding DMABUF memory type causes videobuf to complain about not using it in some switch cases. This patch removes these warnings.
Signed-off-by: Sumit Semwal sumit.semwal@ti.com
Acked-by: Laurent Pinchart laurent.pinchart@ideasonboard.com
From: Laurent Pinchart laurent.pinchart@ideasonboard.com
Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com --- drivers/media/video/videobuf2-dma-contig.c | 36 ++++++++++++++-------------- 1 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index f17ad98..5207eb1 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -31,9 +31,9 @@ struct vb2_dc_buf { struct vb2_vmarea_handler handler; };
-static void vb2_dma_contig_put(void *buf_priv); +static void vb2_dc_put(void *buf_priv);
-static void *vb2_dma_contig_alloc(void *alloc_ctx, unsigned long size) +static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) { struct vb2_dc_conf *conf = alloc_ctx; struct vb2_dc_buf *buf; @@ -55,7 +55,7 @@ static void *vb2_dma_contig_alloc(void *alloc_ctx, unsigned long size) buf->size = size;
buf->handler.refcount = &buf->refcount; - buf->handler.put = vb2_dma_contig_put; + buf->handler.put = vb2_dc_put; buf->handler.arg = buf;
atomic_inc(&buf->refcount); @@ -63,7 +63,7 @@ static void *vb2_dma_contig_alloc(void *alloc_ctx, unsigned long size) return buf; }
-static void vb2_dma_contig_put(void *buf_priv) +static void vb2_dc_put(void *buf_priv) { struct vb2_dc_buf *buf = buf_priv;
@@ -74,14 +74,14 @@ static void vb2_dma_contig_put(void *buf_priv) } }
-static void *vb2_dma_contig_cookie(void *buf_priv) +static void *vb2_dc_cookie(void *buf_priv) { struct vb2_dc_buf *buf = buf_priv;
return &buf->dma_addr; }
-static void *vb2_dma_contig_vaddr(void *buf_priv) +static void *vb2_dc_vaddr(void *buf_priv) { struct vb2_dc_buf *buf = buf_priv; if (!buf) @@ -90,14 +90,14 @@ static void *vb2_dma_contig_vaddr(void *buf_priv) return buf->vaddr; }
-static unsigned int vb2_dma_contig_num_users(void *buf_priv) +static unsigned int vb2_dc_num_users(void *buf_priv) { struct vb2_dc_buf *buf = buf_priv;
return atomic_read(&buf->refcount); }
-static int vb2_dma_contig_mmap(void *buf_priv, struct vm_area_struct *vma) +static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) { struct vb2_dc_buf *buf = buf_priv;
@@ -110,7 +110,7 @@ static int vb2_dma_contig_mmap(void *buf_priv, struct vm_area_struct *vma) &vb2_common_vm_ops, &buf->handler); }
-static void *vb2_dma_contig_get_userptr(void *alloc_ctx, unsigned long vaddr, +static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, unsigned long size, int write) { struct vb2_dc_buf *buf; @@ -137,7 +137,7 @@ static void *vb2_dma_contig_get_userptr(void *alloc_ctx, unsigned long vaddr, return buf; }
-static void vb2_dma_contig_put_userptr(void *mem_priv) +static void vb2_dc_put_userptr(void *mem_priv) { struct vb2_dc_buf *buf = mem_priv;
@@ -149,14 +149,14 @@ static void vb2_dma_contig_put_userptr(void *mem_priv) }
const struct vb2_mem_ops vb2_dma_contig_memops = { - .alloc = vb2_dma_contig_alloc, - .put = vb2_dma_contig_put, - .cookie = vb2_dma_contig_cookie, - .vaddr = vb2_dma_contig_vaddr, - .mmap = vb2_dma_contig_mmap, - .get_userptr = vb2_dma_contig_get_userptr, - .put_userptr = vb2_dma_contig_put_userptr, - .num_users = vb2_dma_contig_num_users, + .alloc = vb2_dc_alloc, + .put = vb2_dc_put, + .cookie = vb2_dc_cookie, + .vaddr = vb2_dc_vaddr, + .mmap = vb2_dc_mmap, + .get_userptr = vb2_dc_get_userptr, + .put_userptr = vb2_dc_put_userptr, + .num_users = vb2_dc_num_users, }; EXPORT_SYMBOL_GPL(vb2_dma_contig_memops);
vb2-dma-contig returns a vb2_dc_conf structure instance as the vb2 allocation context. That structure only stores a pointer to the physical device. Remove it and use the device pointer directly as the allocation context.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com --- drivers/media/video/videobuf2-dma-contig.c | 29 ++++++--------------------- 1 files changed, 7 insertions(+), 22 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 5207eb1..ff0a662 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -17,12 +17,8 @@ #include <media/videobuf2-core.h> #include <media/videobuf2-memops.h>
-struct vb2_dc_conf { - struct device *dev; -}; - struct vb2_dc_buf { - struct vb2_dc_conf *conf; + struct device *dev; void *vaddr; dma_addr_t dma_addr; unsigned long size; @@ -35,23 +31,21 @@ static void vb2_dc_put(void *buf_priv);
static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) { - struct vb2_dc_conf *conf = alloc_ctx; + struct device *dev = alloc_ctx; struct vb2_dc_buf *buf;
buf = kzalloc(sizeof *buf, GFP_KERNEL); if (!buf) return ERR_PTR(-ENOMEM);
- buf->vaddr = dma_alloc_coherent(conf->dev, size, &buf->dma_addr, - GFP_KERNEL); + buf->vaddr = dma_alloc_coherent(dev, size, &buf->dma_addr, GFP_KERNEL); if (!buf->vaddr) { - dev_err(conf->dev, "dma_alloc_coherent of size %ld failed\n", - size); + dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size); kfree(buf); return ERR_PTR(-ENOMEM); }
- buf->conf = conf; + buf->dev = dev; buf->size = size;
buf->handler.refcount = &buf->refcount; @@ -68,7 +62,7 @@ static void vb2_dc_put(void *buf_priv) struct vb2_dc_buf *buf = buf_priv;
if (atomic_dec_and_test(&buf->refcount)) { - dma_free_coherent(buf->conf->dev, buf->size, buf->vaddr, + dma_free_coherent(buf->dev, buf->size, buf->vaddr, buf->dma_addr); kfree(buf); } @@ -162,21 +156,12 @@ EXPORT_SYMBOL_GPL(vb2_dma_contig_memops);
void *vb2_dma_contig_init_ctx(struct device *dev) { - struct vb2_dc_conf *conf; - - conf = kzalloc(sizeof *conf, GFP_KERNEL); - if (!conf) - return ERR_PTR(-ENOMEM); - - conf->dev = dev; - - return conf; + return dev; } EXPORT_SYMBOL_GPL(vb2_dma_contig_init_ctx);
void vb2_dma_contig_cleanup_ctx(void *alloc_ctx) { - kfree(alloc_ctx); } EXPORT_SYMBOL_GPL(vb2_dma_contig_cleanup_ctx);
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:48 Tomasz Stanislawski wrote:
vb2-dma-contig returns a vb2_dc_conf structure instance as the vb2 allocation context. That structure only stores a pointer to the physical device. Remove it and use the device pointer directly as the allocation context.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com
Acked-by: Laurent Pinchart laurent.pinchart@ideasonboard.com
From: Laurent Pinchart laurent.pinchart@ideasonboard.com
Group functions by buffer type.
Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com --- drivers/media/video/videobuf2-dma-contig.c | 92 ++++++++++++++++----------- 1 files changed, 54 insertions(+), 38 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index ff0a662..476e536 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -20,14 +20,56 @@ struct vb2_dc_buf { struct device *dev; void *vaddr; - dma_addr_t dma_addr; unsigned long size; - struct vm_area_struct *vma; - atomic_t refcount; + dma_addr_t dma_addr; + + /* MMAP related */ struct vb2_vmarea_handler handler; + atomic_t refcount; + + /* USERPTR related */ + struct vm_area_struct *vma; };
-static void vb2_dc_put(void *buf_priv); +/*********************************************/ +/* callbacks for all buffers */ +/*********************************************/ + +static void *vb2_dc_cookie(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + + return &buf->dma_addr; +} + +static void *vb2_dc_vaddr(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + + return buf->vaddr; +} + +static unsigned int vb2_dc_num_users(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + + return atomic_read(&buf->refcount); +} + +/*********************************************/ +/* callbacks for MMAP buffers */ +/*********************************************/ + +static void vb2_dc_put(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + + if (!atomic_dec_and_test(&buf->refcount)) + return; + + dma_free_coherent(buf->dev, buf->size, buf->vaddr, buf->dma_addr); + kfree(buf); +}
static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) { @@ -57,40 +99,6 @@ static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) return buf; }
-static void vb2_dc_put(void *buf_priv) -{ - struct vb2_dc_buf *buf = buf_priv; - - if (atomic_dec_and_test(&buf->refcount)) { - dma_free_coherent(buf->dev, buf->size, buf->vaddr, - buf->dma_addr); - kfree(buf); - } -} - -static void *vb2_dc_cookie(void *buf_priv) -{ - struct vb2_dc_buf *buf = buf_priv; - - return &buf->dma_addr; -} - -static void *vb2_dc_vaddr(void *buf_priv) -{ - struct vb2_dc_buf *buf = buf_priv; - if (!buf) - return 0; - - return buf->vaddr; -} - -static unsigned int vb2_dc_num_users(void *buf_priv) -{ - struct vb2_dc_buf *buf = buf_priv; - - return atomic_read(&buf->refcount); -} - static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) { struct vb2_dc_buf *buf = buf_priv; @@ -104,6 +112,10 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) &vb2_common_vm_ops, &buf->handler); }
+/*********************************************/ +/* callbacks for USERPTR buffers */ +/*********************************************/ + static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, unsigned long size, int write) { @@ -142,6 +154,10 @@ static void vb2_dc_put_userptr(void *mem_priv) kfree(buf); }
+/*********************************************/ +/* DMA CONTIG exported functions */ +/*********************************************/ + const struct vb2_mem_ops vb2_dma_contig_memops = { .alloc = vb2_dc_alloc, .put = vb2_dc_put,
From: Andrzej Pietrasiewicz andrzej.p@samsung.com
This patch introduces usage of dma_map_sg to map memory behind a userspace pointer to a device as dma-contiguous mapping.
Signed-off-by: Andrzej Pietrasiewicz andrzej.p@samsung.com Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com [bugfixing] Signed-off-by: Kamil Debski k.debski@samsung.com [bugfixing] Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [add sglist subroutines/code refactoring] Signed-off-by: Kyungmin Park kyungmin.park@samsung.com --- drivers/media/video/videobuf2-dma-contig.c | 287 ++++++++++++++++++++++++++-- 1 files changed, 270 insertions(+), 17 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 476e536..3a1e314 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -11,6 +11,8 @@ */
#include <linux/module.h> +#include <linux/scatterlist.h> +#include <linux/sched.h> #include <linux/slab.h> #include <linux/dma-mapping.h>
@@ -22,6 +24,8 @@ struct vb2_dc_buf { void *vaddr; unsigned long size; dma_addr_t dma_addr; + enum dma_data_direction dma_dir; + struct sg_table *dma_sgt;
/* MMAP related */ struct vb2_vmarea_handler handler; @@ -32,6 +36,103 @@ struct vb2_dc_buf { };
/*********************************************/ +/* scatterlist table functions */ +/*********************************************/ + +static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, + unsigned int n_pages, unsigned long offset, unsigned long size) +{ + struct sg_table *sgt; + unsigned int chunks; + unsigned int i; + unsigned int cur_page; + int ret; + struct scatterlist *s; + unsigned int offset_end = n_pages * PAGE_SIZE - size; + + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); + if (!sgt) + return ERR_PTR(-ENOMEM); + + /* compute number of chunks */ + chunks = 1; + for (i = 1; i < n_pages; ++i) + if (pages[i] != pages[i - 1] + 1) + ++chunks; + + ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); + if (ret) { + kfree(sgt); + return ERR_PTR(-ENOMEM); + } + + /* merging chunks and putting them into the scatterlist */ + cur_page = 0; + for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { + size_t size = PAGE_SIZE; + unsigned int j; + + for (j = cur_page + 1; j < n_pages; ++j) { + if (pages[j] != pages[j - 1] + 1) + break; + size += PAGE_SIZE; + } + + /* cut offset if chunk starts at the first page */ + if (cur_page == 0) + size -= offset; + /* cut offset_end if chunk ends at the last page */ + if (j == n_pages) + size -= offset_end; + + sg_set_page(s, pages[cur_page], size, offset); + offset = 0; + cur_page = j; + } + + return sgt; +} + +static void vb2_dc_release_sgtable(struct sg_table *sgt) +{ + sg_free_table(sgt); + kfree(sgt); +} + +static void vb2_dc_sgt_foreach_page(struct sg_table *sgt, + void (*cb)(struct page *pg)) +{ + struct scatterlist *s; + unsigned int i; + + for_each_sg(sgt->sgl, s, sgt->nents, i) { + struct page *page = sg_page(s); + unsigned int n_pages = PAGE_ALIGN(s->offset + s->length) + >> PAGE_SHIFT; + unsigned int j; + + for (j = 0; j < n_pages; ++j, ++page) + cb(page); + } +} + +static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt) +{ + struct scatterlist *s; + dma_addr_t expected = sg_dma_address(sgt->sgl); + unsigned int i; + unsigned long size = 0; + + for_each_sg(sgt->sgl, s, sgt->nents, i) { + if (sg_dma_address(s) != expected) + break; + expected = sg_dma_address(s) + sg_dma_len(s); + size += sg_dma_len(s); + } + return size; +} + +/*********************************************/ /* callbacks for all buffers */ /*********************************************/
@@ -116,42 +217,194 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) /* callbacks for USERPTR buffers */ /*********************************************/
+static inline int vma_is_io(struct vm_area_struct *vma) +{ + return !!(vma->vm_flags & (VM_IO | VM_PFNMAP)); +} + +static struct vm_area_struct *vb2_dc_get_user_vma( + unsigned long start, unsigned long size) +{ + struct vm_area_struct *vma; + + /* current->mm->mmap_sem is taken by videobuf2 core */ + vma = find_vma(current->mm, start); + if (!vma) { + printk(KERN_ERR "no vma for address %lu\n", start); + return ERR_PTR(-EFAULT); + } + + if (vma->vm_end - vma->vm_start < size) { + printk(KERN_ERR "vma at %lu is too small for %lu bytes\n", + start, size); + return ERR_PTR(-EFAULT); + } + + vma = vb2_get_vma(vma); + if (!vma) { + printk(KERN_ERR "failed to copy vma\n"); + return ERR_PTR(-ENOMEM); + } + + return vma; +} + +static int vb2_dc_get_user_pages(unsigned long start, struct page **pages, + int n_pages, struct vm_area_struct *vma, int write) +{ + if (vma_is_io(vma)) { + unsigned int i; + + for (i = 0; i < n_pages; ++i, start += PAGE_SIZE) { + unsigned long pfn; + int ret = follow_pfn(vma, start, &pfn); + + if (ret) { + printk(KERN_ERR "no page for address %lu\n", + start); + return ret; + } + pages[i] = pfn_to_page(pfn); + } + } else { + unsigned int n; + + n = get_user_pages(current, current->mm, start & PAGE_MASK, + n_pages, write, 1, pages, NULL); + if (n != n_pages) { + printk(KERN_ERR "got only %d of %d user pages\n", + n, n_pages); + while (n) + put_page(pages[--n]); + return -EFAULT; + } + } + + return 0; +} + +static void vb2_dc_put_dirty_page(struct page *page) +{ + set_page_dirty_lock(page); + put_page(page); +} + +static void vb2_dc_put_userptr(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + struct sg_table *sgt = buf->dma_sgt; + + dma_unmap_sg(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); + if (!vma_is_io(buf->vma)) + vb2_dc_sgt_foreach_page(sgt, vb2_dc_put_dirty_page); + + vb2_dc_release_sgtable(sgt); + vb2_put_vma(buf->vma); + kfree(buf); +} + static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, - unsigned long size, int write) + unsigned long size, int write) { struct vb2_dc_buf *buf; - struct vm_area_struct *vma; - dma_addr_t dma_addr = 0; - int ret; + unsigned long start; + unsigned long end; + unsigned long offset; + struct page **pages; + int n_pages; + int ret = 0; + struct sg_table *sgt; + unsigned long contig_size;
buf = kzalloc(sizeof *buf, GFP_KERNEL); if (!buf) return ERR_PTR(-ENOMEM);
- ret = vb2_get_contig_userptr(vaddr, size, &vma, &dma_addr); + buf->dev = alloc_ctx; + buf->dma_dir = write ? DMA_FROM_DEVICE : DMA_TO_DEVICE; + + start = vaddr & PAGE_MASK; + offset = vaddr & ~PAGE_MASK; + end = PAGE_ALIGN(vaddr + size); + n_pages = (end - start) >> PAGE_SHIFT; + + pages = kmalloc(n_pages * sizeof pages[0], GFP_KERNEL); + if (!pages) { + ret = -ENOMEM; + printk(KERN_ERR "failed to allocate pages table\n"); + goto fail_buf; + } + + buf->vma = vb2_dc_get_user_vma(start, size); + if (IS_ERR(buf->vma)) { + printk(KERN_ERR "failed to get VMA\n"); + ret = PTR_ERR(buf->vma); + goto fail_pages; + } + + /* extract page list from userspace mapping */ + ret = vb2_dc_get_user_pages(start, pages, n_pages, buf->vma, write); if (ret) { - printk(KERN_ERR "Failed acquiring VMA for vaddr 0x%08lx\n", - vaddr); - kfree(buf); - return ERR_PTR(ret); + printk(KERN_ERR "failed to get user pages\n"); + goto fail_vma; + } + + sgt = vb2_dc_pages_to_sgt(pages, n_pages, offset, size); + if (IS_ERR(sgt)) { + printk(KERN_ERR "failed to create scatterlist table\n"); + ret = -ENOMEM; + goto fail_get_user_pages; + } + + /* pages are no longer needed */ + kfree(pages); + pages = NULL; + + sgt->nents = dma_map_sg(buf->dev, sgt->sgl, sgt->orig_nents, + buf->dma_dir); + if (sgt->nents <= 0) { + printk(KERN_ERR "failed to map scatterlist\n"); + ret = -EIO; + goto fail_sgt; }
+ contig_size = vb2_dc_get_contiguous_size(sgt); + if (contig_size < size) { + printk(KERN_ERR "contiguous mapping is too small %lu/%lu\n", + contig_size, size); + ret = -EFAULT; + goto fail_map_sg; + } + + buf->dma_addr = sg_dma_address(sgt->sgl); buf->size = size; - buf->dma_addr = dma_addr; - buf->vma = vma; + buf->dma_sgt = sgt;
return buf; -}
-static void vb2_dc_put_userptr(void *mem_priv) -{ - struct vb2_dc_buf *buf = mem_priv; +fail_map_sg: + dma_unmap_sg(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
- if (!buf) - return; +fail_sgt: + if (!vma_is_io(buf->vma)) + vb2_dc_sgt_foreach_page(sgt, put_page); + vb2_dc_release_sgtable(sgt); + +fail_get_user_pages: + if (pages && !vma_is_io(buf->vma)) + while (n_pages) + put_page(pages[--n_pages]);
+fail_vma: vb2_put_vma(buf->vma); + +fail_pages: + kfree(pages); /* kfree is NULL-proof */ + +fail_buf: kfree(buf); + + return ERR_PTR(ret); }
/*********************************************/
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:50 Tomasz Stanislawski wrote:
From: Andrzej Pietrasiewicz andrzej.p@samsung.com
This patch introduces usage of dma_map_sg to map memory behind a userspace pointer to a device as dma-contiguous mapping.
Signed-off-by: Andrzej Pietrasiewicz andrzej.p@samsung.com Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com [bugfixing] Signed-off-by: Kamil Debski k.debski@samsung.com [bugfixing] Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [add sglist subroutines/code refactoring] Signed-off-by: Kyungmin Park kyungmin.park@samsung.com
drivers/media/video/videobuf2-dma-contig.c | 287 +++++++++++++++++++++++-- 1 files changed, 270 insertions(+), 17 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 476e536..3a1e314 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c
[snip]
+static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages,
- unsigned int n_pages, unsigned long offset, unsigned long size)
+{
- struct sg_table *sgt;
- unsigned int chunks;
- unsigned int i;
- unsigned int cur_page;
- int ret;
- struct scatterlist *s;
- unsigned int offset_end = n_pages * PAGE_SIZE - size;
Shouldn't offset_end be equal to n_page * PAGE_SIZE - size - offset ?
- sgt = kzalloc(sizeof *sgt, GFP_KERNEL);
- if (!sgt)
return ERR_PTR(-ENOMEM);
- /* compute number of chunks */
- chunks = 1;
- for (i = 1; i < n_pages; ++i)
if (pages[i] != pages[i - 1] + 1)
++chunks;
- ret = sg_alloc_table(sgt, chunks, GFP_KERNEL);
- if (ret) {
kfree(sgt);
return ERR_PTR(-ENOMEM);
- }
- /* merging chunks and putting them into the scatterlist */
- cur_page = 0;
- for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
size_t size = PAGE_SIZE;
This will shadow the size parameter, it's a bit confusing. You could rename it chunk_size.
unsigned int j;
for (j = cur_page + 1; j < n_pages; ++j) {
if (pages[j] != pages[j - 1] + 1)
break;
size += PAGE_SIZE;
}
/* cut offset if chunk starts at the first page */
if (cur_page == 0)
size -= offset;
/* cut offset_end if chunk ends at the last page */
if (j == n_pages)
size -= offset_end;
sg_set_page(s, pages[cur_page], size, offset);
offset = 0;
What about just
chunk_size -= offset; sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); size -= chunk_size; offset = 0;
You could then remove the offset_end calculation above.
cur_page = j;
- }
- return sgt;
+}
Hi Laurent,
On 04/17/2012 02:43 AM, Laurent Pinchart wrote:
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:50 Tomasz Stanislawski wrote:
From: Andrzej Pietrasiewicz andrzej.p@samsung.com
This patch introduces usage of dma_map_sg to map memory behind a userspace pointer to a device as dma-contiguous mapping.
Signed-off-by: Andrzej Pietrasiewicz andrzej.p@samsung.com Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com [bugfixing] Signed-off-by: Kamil Debski k.debski@samsung.com [bugfixing] Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [add sglist subroutines/code refactoring] Signed-off-by: Kyungmin Park kyungmin.park@samsung.com
drivers/media/video/videobuf2-dma-contig.c | 287 +++++++++++++++++++++++-- 1 files changed, 270 insertions(+), 17 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 476e536..3a1e314 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c
[snip]
I found out that the function below may be useful for DMABUF exporters and importers. I found that its code in '[PATCH 1/4] [RFC] drm/exynos: DMABUF: Added support for exporting non-contig buffers' by Prathyush K.
Therefore I posted a patch on linux-media: '[PATCH] scatterlist: add sg_alloc_table_by_pages function'.
For now I will keep the function below (with your fixes). If the patch above gets merged then the function will be removed.
Regards, Tomasz Stanislawski
+static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages,
- unsigned int n_pages, unsigned long offset, unsigned long size)
+{
- struct sg_table *sgt;
- unsigned int chunks;
- unsigned int i;
- unsigned int cur_page;
- int ret;
- struct scatterlist *s;
- unsigned int offset_end = n_pages * PAGE_SIZE - size;
Shouldn't offset_end be equal to n_page * PAGE_SIZE - size - offset ?
- sgt = kzalloc(sizeof *sgt, GFP_KERNEL);
- if (!sgt)
return ERR_PTR(-ENOMEM);
- /* compute number of chunks */
- chunks = 1;
- for (i = 1; i < n_pages; ++i)
if (pages[i] != pages[i - 1] + 1)
++chunks;
- ret = sg_alloc_table(sgt, chunks, GFP_KERNEL);
- if (ret) {
kfree(sgt);
return ERR_PTR(-ENOMEM);
- }
- /* merging chunks and putting them into the scatterlist */
- cur_page = 0;
- for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
size_t size = PAGE_SIZE;
This will shadow the size parameter, it's a bit confusing. You could rename it chunk_size.
unsigned int j;
for (j = cur_page + 1; j < n_pages; ++j) {
if (pages[j] != pages[j - 1] + 1)
break;
size += PAGE_SIZE;
}
/* cut offset if chunk starts at the first page */
if (cur_page == 0)
size -= offset;
/* cut offset_end if chunk ends at the last page */
if (j == n_pages)
size -= offset_end;
sg_set_page(s, pages[cur_page], size, offset);
offset = 0;
What about just
chunk_size -= offset; sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); size -= chunk_size; offset = 0;
You could then remove the offset_end calculation above.
cur_page = j;
- }
- return sgt;
+}
From: Marek Szyprowski m.szyprowski@samsung.com
This patch adds support for prepare/finish callbacks in VB2 allocators. These callback are used for buffer flushing.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com --- drivers/media/video/videobuf2-core.c | 11 +++++++++++ include/media/videobuf2-core.h | 7 +++++++ 2 files changed, 18 insertions(+), 0 deletions(-)
diff --git a/drivers/media/video/videobuf2-core.c b/drivers/media/video/videobuf2-core.c index d26b1cc..b431dc6 100644 --- a/drivers/media/video/videobuf2-core.c +++ b/drivers/media/video/videobuf2-core.c @@ -842,6 +842,7 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state) { struct vb2_queue *q = vb->vb2_queue; unsigned long flags; + unsigned int plane;
if (vb->state != VB2_BUF_STATE_ACTIVE) return; @@ -852,6 +853,10 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state) dprintk(4, "Done processing on buffer %d, state: %d\n", vb->v4l2_buf.index, vb->state);
+ /* sync buffers */ + for (plane = 0; plane < vb->num_planes; ++plane) + call_memop(q, finish, vb->planes[plane].mem_priv); + /* Add the buffer to the done buffers list */ spin_lock_irqsave(&q->done_lock, flags); vb->state = state; @@ -1134,9 +1139,15 @@ err: static void __enqueue_in_driver(struct vb2_buffer *vb) { struct vb2_queue *q = vb->vb2_queue; + unsigned int plane;
vb->state = VB2_BUF_STATE_ACTIVE; atomic_inc(&q->queued_count); + + /* sync buffers */ + for (plane = 0; plane < vb->num_planes; ++plane) + call_memop(q, prepare, vb->planes[plane].mem_priv); + q->ops->buf_queue(vb); }
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h index 859bbaf..d079f92 100644 --- a/include/media/videobuf2-core.h +++ b/include/media/videobuf2-core.h @@ -56,6 +56,10 @@ struct vb2_fileio_data; * dmabuf * @unmap_dmabuf: releases access control to the dmabuf - allocator is notified * that this driver is done using the dmabuf for now + * @prepare: called everytime the buffer is passed from userspace to the + * driver, usefull for cache synchronisation, optional + * @finish: called everytime the buffer is passed back from the driver + * to the userspace, also optional * @vaddr: return a kernel virtual address to a given memory buffer * associated with the passed private structure or NULL if no * such mapping exists @@ -82,6 +86,9 @@ struct vb2_mem_ops { unsigned long size, int write); void (*put_userptr)(void *buf_priv);
+ void (*prepare)(void *buf_priv); + void (*finish)(void *buf_priv); + void *(*attach_dmabuf)(void *alloc_ctx, struct dma_buf *dbuf, unsigned long size, int write); void (*detach_dmabuf)(void *buf_priv);
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:51 Tomasz Stanislawski wrote:
From: Marek Szyprowski m.szyprowski@samsung.com
This patch adds support for prepare/finish callbacks in VB2 allocators. These callback are used for buffer flushing.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com
Acked-by: Laurent Pinchart laurent.pinchart@ideasonboard.com
From: Marek Szyprowski m.szyprowski@samsung.com
Add prepare/finish callbacks to vb2-dma-contig allocator.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com --- drivers/media/video/videobuf2-dma-contig.c | 24 ++++++++++++++++++++++++ 1 files changed, 24 insertions(+), 0 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 3a1e314..3b1ac77d 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -157,6 +157,28 @@ static unsigned int vb2_dc_num_users(void *buf_priv) return atomic_read(&buf->refcount); }
+static void vb2_dc_prepare(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + struct sg_table *sgt = buf->dma_sgt; + + if (!sgt) + return; + + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); +} + +static void vb2_dc_finish(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + struct sg_table *sgt = buf->dma_sgt; + + if (!sgt) + return; + + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); +} + /*********************************************/ /* callbacks for MMAP buffers */ /*********************************************/ @@ -419,6 +441,8 @@ const struct vb2_mem_ops vb2_dma_contig_memops = { .mmap = vb2_dc_mmap, .get_userptr = vb2_dc_get_userptr, .put_userptr = vb2_dc_put_userptr, + .prepare = vb2_dc_prepare, + .finish = vb2_dc_finish, .num_users = vb2_dc_num_users, }; EXPORT_SYMBOL_GPL(vb2_dma_contig_memops);
From: Sumit Semwal sumit.semwal@ti.com
This patch makes changes for adding dma-contig as a dma_buf user. It provides function implementations for the {attach, detach, map, unmap}_dmabuf() mem_ops of DMABUF memory type.
Signed-off-by: Sumit Semwal sumit.semwal@ti.com Signed-off-by: Sumit Semwal sumit.semwal@linaro.org [author of the original patch] Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [integration with refactored dma-contig allocator] --- drivers/media/video/videobuf2-dma-contig.c | 112 ++++++++++++++++++++++++++++ 1 files changed, 112 insertions(+), 0 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 3b1ac77d..a829e7d 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -10,6 +10,7 @@ * the Free Software Foundation. */
+#include <linux/dma-buf.h> #include <linux/module.h> #include <linux/scatterlist.h> #include <linux/sched.h> @@ -33,6 +34,9 @@ struct vb2_dc_buf {
/* USERPTR related */ struct vm_area_struct *vma; + + /* DMABUF related */ + struct dma_buf_attachment *db_attach; };
/*********************************************/ @@ -430,6 +434,110 @@ fail_buf: }
/*********************************************/ +/* callbacks for DMABUF buffers */ +/*********************************************/ + +static int vb2_dc_map_dmabuf(void *mem_priv) +{ + struct vb2_dc_buf *buf = mem_priv; + struct sg_table *sgt; + unsigned long contig_size; + + if (WARN_ON(!buf->db_attach)) { + printk(KERN_ERR "trying to pin a non attached buffer\n"); + return -EINVAL; + } + + if (WARN_ON(buf->dma_sgt)) { + printk(KERN_ERR "dmabuf buffer is already pinned\n"); + return 0; + } + + /* get the associated scatterlist for this buffer */ + sgt = dma_buf_map_attachment(buf->db_attach, buf->dma_dir); + if (IS_ERR_OR_NULL(sgt)) { + printk(KERN_ERR "Error getting dmabuf scatterlist\n"); + return -EINVAL; + } + + /* checking if dmabuf is big enough to store contiguous chunk */ + contig_size = vb2_dc_get_contiguous_size(sgt); + if (contig_size < buf->size) { + printk(KERN_ERR "contiguous chunk is too small %lu/%lu b\n", + contig_size, buf->size); + dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); + return -EFAULT; + } + + buf->dma_addr = sg_dma_address(sgt->sgl); + buf->dma_sgt = sgt; + + return 0; +} + +static void vb2_dc_unmap_dmabuf(void *mem_priv) +{ + struct vb2_dc_buf *buf = mem_priv; + struct sg_table *sgt = buf->dma_sgt; + + if (WARN_ON(!buf->db_attach)) { + printk(KERN_ERR "trying to unpin a not attached buffer\n"); + return; + } + + if (WARN_ON(!sgt)) { + printk(KERN_ERR "dmabuf buffer is already unpinned\n"); + return; + } + + dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); + + buf->dma_addr = 0; + buf->dma_sgt = NULL; +} + +static void vb2_dc_detach_dmabuf(void *mem_priv) +{ + struct vb2_dc_buf *buf = mem_priv; + + if (WARN_ON(buf->dma_addr)) + vb2_dc_unmap_dmabuf(buf); + + /* detach this attachment */ + dma_buf_detach(buf->db_attach->dmabuf, buf->db_attach); + kfree(buf); +} + +static void *vb2_dc_attach_dmabuf(void *alloc_ctx, struct dma_buf *dbuf, + unsigned long size, int write) +{ + struct vb2_dc_buf *buf; + struct dma_buf_attachment *dba; + + if (dbuf->size < size) + return ERR_PTR(-EFAULT); + + buf = kzalloc(sizeof *buf, GFP_KERNEL); + if (!buf) + return ERR_PTR(-ENOMEM); + + buf->dev = alloc_ctx; + /* create attachment for the dmabuf with the user device */ + dba = dma_buf_attach(dbuf, buf->dev); + if (IS_ERR(dba)) { + printk(KERN_ERR "failed to attach dmabuf\n"); + kfree(buf); + return dba; + } + + buf->dma_dir = write ? DMA_FROM_DEVICE : DMA_TO_DEVICE; + buf->size = size; + buf->db_attach = dba; + + return buf; +} + +/*********************************************/ /* DMA CONTIG exported functions */ /*********************************************/
@@ -443,6 +551,10 @@ const struct vb2_mem_ops vb2_dma_contig_memops = { .put_userptr = vb2_dc_put_userptr, .prepare = vb2_dc_prepare, .finish = vb2_dc_finish, + .map_dmabuf = vb2_dc_map_dmabuf, + .unmap_dmabuf = vb2_dc_unmap_dmabuf, + .attach_dmabuf = vb2_dc_attach_dmabuf, + .detach_dmabuf = vb2_dc_detach_dmabuf, .num_users = vb2_dc_num_users, }; EXPORT_SYMBOL_GPL(vb2_dma_contig_memops);
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:53 Tomasz Stanislawski wrote:
From: Sumit Semwal sumit.semwal@ti.com
This patch makes changes for adding dma-contig as a dma_buf user. It provides function implementations for the {attach, detach, map, unmap}_dmabuf() mem_ops of DMABUF memory type.
Signed-off-by: Sumit Semwal sumit.semwal@ti.com Signed-off-by: Sumit Semwal sumit.semwal@linaro.org [author of the original patch] Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [integration with refactored dma-contig allocator]
Pending the comment below,
Acked-by: Laurent Pinchart laurent.pinchart@ideasonboard.com
+static void vb2_dc_detach_dmabuf(void *mem_priv) +{
- struct vb2_dc_buf *buf = mem_priv;
- if (WARN_ON(buf->dma_addr))
vb2_dc_unmap_dmabuf(buf);
This should never happen, and would be a videobuf2 bug otherwise, right ?
- /* detach this attachment */
- dma_buf_detach(buf->db_attach->dmabuf, buf->db_attach);
- kfree(buf);
+}
Hi Laurent,
On 04/17/2012 02:57 AM, Laurent Pinchart wrote:
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:53 Tomasz Stanislawski wrote:
From: Sumit Semwal sumit.semwal@ti.com
This patch makes changes for adding dma-contig as a dma_buf user. It provides function implementations for the {attach, detach, map, unmap}_dmabuf() mem_ops of DMABUF memory type.
Signed-off-by: Sumit Semwal sumit.semwal@ti.com Signed-off-by: Sumit Semwal sumit.semwal@linaro.org [author of the original patch] Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com [integration with refactored dma-contig allocator]
Pending the comment below,
Acked-by: Laurent Pinchart laurent.pinchart@ideasonboard.com
+static void vb2_dc_detach_dmabuf(void *mem_priv) +{
- struct vb2_dc_buf *buf = mem_priv;
- if (WARN_ON(buf->dma_addr))
vb2_dc_unmap_dmabuf(buf);
This should never happen, and would be a videobuf2 bug otherwise, right ?
Theoretically it should not happen with latest vb2-core patches. However there is little sense to crash the kernel if it is possible to handle this bug. Maybe I should add some comments before the check.
- /* detach this attachment */
- dma_buf_detach(buf->db_attach->dmabuf, buf->db_attach);
- kfree(buf);
+}
Regards, Tomasz Stanislawski
The DMABUF documentation says that the map_dma_buf callback should return scatterlist that is mapped into a caller's address space. In practice, almost none of existing implementations of DMABUF exporter does it. This patch breaks the DMABUF specification in order to allow exchange DMABUF buffers between other APIs like DRM.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com --- drivers/media/video/videobuf2-dma-contig.c | 21 +++++++++++++++++++-- 1 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index a829e7d..0ea3fcd 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -442,6 +442,7 @@ static int vb2_dc_map_dmabuf(void *mem_priv) struct vb2_dc_buf *buf = mem_priv; struct sg_table *sgt; unsigned long contig_size; + int ret = -EFAULT;
if (WARN_ON(!buf->db_attach)) { printk(KERN_ERR "trying to pin a non attached buffer\n"); @@ -460,19 +461,34 @@ static int vb2_dc_map_dmabuf(void *mem_priv) return -EINVAL; }
+ /* mapping new sglist to the client */ + sgt->nents = dma_map_sg(buf->dev, sgt->sgl, sgt->orig_nents, + buf->dma_dir); + if (sgt->nents <= 0) { + printk(KERN_ERR "failed to map scatterlist\n"); + goto fail_map_attachment; + } + /* checking if dmabuf is big enough to store contiguous chunk */ contig_size = vb2_dc_get_contiguous_size(sgt); if (contig_size < buf->size) { printk(KERN_ERR "contiguous chunk is too small %lu/%lu b\n", contig_size, buf->size); - dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); - return -EFAULT; + goto fail_map_sg; }
buf->dma_addr = sg_dma_address(sgt->sgl); buf->dma_sgt = sgt;
return 0; + +fail_map_sg: + dma_unmap_sg(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); + +fail_map_attachment: + dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); + + return ret; }
static void vb2_dc_unmap_dmabuf(void *mem_priv) @@ -490,6 +506,7 @@ static void vb2_dc_unmap_dmabuf(void *mem_priv) return; }
+ dma_unmap_sg(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir);
buf->dma_addr = 0;
Hi Tomasz,
Thanks for the patch.
On Friday 13 April 2012 17:47:54 Tomasz Stanislawski wrote:
The DMABUF documentation says that the map_dma_buf callback should return scatterlist that is mapped into a caller's address space. In practice, almost none of existing implementations of DMABUF exporter does it. This patch breaks the DMABUF specification in order to allow exchange DMABUF buffers between other APIs like DRM.
Once again, this means that it's time to either fix the documentation or fix the non-compliant drivers.
Could you please read the mail I've sent on 27/03/2012 in the "Minutes from V4L2 update call" thread and reply there to avoid scattering the discussion ?
This patch enhances s5p-tv with support for DMABUF importing via V4L2_MEMORY_DMABUF memory type.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com --- drivers/media/video/s5p-tv/Kconfig | 1 + drivers/media/video/s5p-tv/mixer_video.c | 2 +- 2 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/media/video/s5p-tv/Kconfig b/drivers/media/video/s5p-tv/Kconfig index f248b28..2e80126 100644 --- a/drivers/media/video/s5p-tv/Kconfig +++ b/drivers/media/video/s5p-tv/Kconfig @@ -10,6 +10,7 @@ config VIDEO_SAMSUNG_S5P_TV bool "Samsung TV driver for S5P platform (experimental)" depends on PLAT_S5P && PM_RUNTIME depends on EXPERIMENTAL + select DMA_SHARED_BUFFER default n ---help--- Say Y here to enable selecting the TV output devices for diff --git a/drivers/media/video/s5p-tv/mixer_video.c b/drivers/media/video/s5p-tv/mixer_video.c index f7ca5cc..6b45d93 100644 --- a/drivers/media/video/s5p-tv/mixer_video.c +++ b/drivers/media/video/s5p-tv/mixer_video.c @@ -1074,7 +1074,7 @@ struct mxr_layer *mxr_base_layer_create(struct mxr_device *mdev,
layer->vb_queue = (struct vb2_queue) { .type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE, - .io_modes = VB2_MMAP | VB2_USERPTR, + .io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF, .drv_priv = layer, .buf_struct_size = sizeof(struct mxr_buffer), .ops = &mxr_video_qops,
This patch enhances s5p-fimc with support for DMABUF importing via V4L2_MEMORY_DMABUF memory type.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com --- drivers/media/video/Kconfig | 1 + drivers/media/video/s5p-fimc/fimc-capture.c | 2 +- 2 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/media/video/Kconfig b/drivers/media/video/Kconfig index f2479c5..9de9ddc 100644 --- a/drivers/media/video/Kconfig +++ b/drivers/media/video/Kconfig @@ -1133,6 +1133,7 @@ config VIDEO_SAMSUNG_S5P_FIMC VIDEO_V4L2_SUBDEV_API && EXPERIMENTAL select VIDEOBUF2_DMA_CONTIG select V4L2_MEM2MEM_DEV + select DMA_SHARED_BUFFER ---help--- This is a v4l2 driver for Samsung S5P and EXYNOS4 camera host interface and video postprocessor. diff --git a/drivers/media/video/s5p-fimc/fimc-capture.c b/drivers/media/video/s5p-fimc/fimc-capture.c index b06efd2..38fb39e 100644 --- a/drivers/media/video/s5p-fimc/fimc-capture.c +++ b/drivers/media/video/s5p-fimc/fimc-capture.c @@ -1530,7 +1530,7 @@ int fimc_register_capture_device(struct fimc_dev *fimc, q = &fimc->vid_cap.vbq; memset(q, 0, sizeof(*q)); q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; - q->io_modes = VB2_MMAP | VB2_USERPTR; + q->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF; q->drv_priv = fimc->vid_cap.ctx; q->ops = &fimc_capture_qops; q->mem_ops = &vb2_dma_contig_memops;
Hi Tomasz,
On 04/13/2012 05:47 PM, Tomasz Stanislawski wrote:
This patch enhances s5p-fimc with support for DMABUF importing via V4L2_MEMORY_DMABUF memory type.
Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com
Acked-by: Sylwester Nawrocki s.nawrocki@samsung.com
Just one nitpick, please change the commit summary prefix from "v4l: fimc: ..." to "s5p-fimc: ..." when sending upstream.
Thanks.
linaro-mm-sig@lists.linaro.org