This is the first step in defining a buffer sharing framework. A new dma_buf buffer object is added, with hooks to allow for easy sharing of this buffer object across devices.
The framework allows:
- a new buffer-object to be created with fixed size.
- different devices to 'attach' themselves to this buffer, to facilitate backing storage negotiation, using the dma_buf_attach() API.
- association of a file pointer with each user-buffer and associated allocator-defined operations on that buffer. This operation is called the 'export' operation.
- this exported buffer-object to be shared with another entity by asking for its 'file descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed using the associated exporter-defined operations.
- the exporter and importer to share the scatterlist using the get_scatterlist and put_scatterlist operations.
At least one 'attach()' call must be made prior to calling the buffer_map() callback.
A couple of building blocks are added to get_scatterlist() to ease the introduction of syncing between the exporter and importers, and of late allocation by the exporter.
Optionally, an mmap() file operation is provided for the associated 'fd', as a wrapper over the allocator-defined mmap() (itself optional), for devices that might need one.
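For illustration, a hypothetical importer-side user of this API could look roughly like the sketch below; the function, its caller and the error handling are made up, and only the dma_buf calls follow the interface as posted:

static int my_importer_use_buffer(struct device *dev, int fd)
{
	struct dma_buf *dmabuf;
	struct dma_buf_attachment *attach;
	struct scatterlist *sg;

	dmabuf = dma_buf_get(fd);		/* fd received from userspace */
	if (IS_ERR(dmabuf))
		return PTR_ERR(dmabuf);

	attach = dma_buf_attach(dmabuf, dev);	/* join backing-storage negotiation */
	if (!attach) {
		dma_buf_put(dmabuf);
		return -ENOMEM;
	}

	/* the exporter may allocate backing storage only at this point */
	sg = dmabuf->ops->get_scatterlist(dmabuf, attach, DMA_BUF_OP_READ,
					  DMA_BUF_ATTR_DISCONTIG);
	if (IS_ERR_OR_NULL(sg)) {
		dma_buf_detach(dmabuf, attach);
		dma_buf_put(dmabuf);
		return -ENOMEM;
	}

	/* ... program the device using the scatterlist ... */

	dmabuf->ops->put_scatterlist(dmabuf, attach, sg);
	dma_buf_detach(dmabuf, attach);
	dma_buf_put(dmabuf);
	return 0;
}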
The idea was first mooted at the Linaro memory management mini-summit in Budapest in May 2011, as part of multiple things needed for a 'unified memory management framework'. It took a more concrete shape at Linaro memory-management mini-summit in Cambridge, Aug 2011.
This is based on design suggestions from many people at both the mini-summits, most notably from Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and Daniel Vetter daniel@ffwll.ch.
The implementation is inspired by the proof-of-concept patch-set from Tomasz Stanislawski t.stanislaws@samsung.com, who demonstrated buffer sharing between two v4l2 devices.
------
v1: initial RFC.
v2:
- added attach() / detach() dma_buf_ops, and dma_buf_attach(), dma_buf_detach().
- added handling of the list of attachments in the dma_buf central API itself.
- corrected copyright information.
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@ti.com>
---
 drivers/base/Kconfig    |   10 ++
 drivers/base/Makefile   |    1 +
 drivers/base/dma-buf.c  |  224 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/dma-buf.h |  179 +++++++++++++++++++++++++++++++++++++
 4 files changed, 414 insertions(+), 0 deletions(-)
 create mode 100644 drivers/base/dma-buf.c
 create mode 100644 include/linux/dma-buf.h
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index 21cf46f..07d8095 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -174,4 +174,14 @@ config SYS_HYPERVISOR
source "drivers/base/regmap/Kconfig"
+config DMA_SHARED_BUFFER + bool "Buffer framework to be shared between drivers" + default n + depends on ANON_INODES + help + This option enables the framework for buffer-sharing between + multiple drivers. A buffer is associated with a file using driver + APIs extension; the file's descriptor can then be passed on to other + driver. + endmenu diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 99a375a..d0df046 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -8,6 +8,7 @@ obj-$(CONFIG_DEVTMPFS) += devtmpfs.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER) += firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c new file mode 100644 index 0000000..8936da7 --- /dev/null +++ b/drivers/base/dma-buf.c @@ -0,0 +1,224 @@ +/* + * Framework for buffer objects that can be shared across devices/subsystems. + * + * Copyright(C) 2011 Linaro Limited. All rights reserved. + * Author: Sumit Semwal sumit.semwal@ti.com + * + * Many thanks to linaro-mm-sig list, and specially + * Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and + * Daniel Vetter daniel@ffwll.ch for their support in creation and + * refining of this idea. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. + */ + +#include <linux/fs.h> +#include <linux/slab.h> +#include <linux/dma-buf.h> +#include <linux/anon_inodes.h> + +static inline int is_dma_buf_file(struct file *); + +static int dma_buf_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct dma_buf *dmabuf; + + if (!is_dma_buf_file(file)) + return -EINVAL; + + dmabuf = file->private_data; + + if (!dmabuf->ops->mmap) + return -EINVAL; + + return dmabuf->ops->mmap(dmabuf, vma); +} + +static int dma_buf_release(struct inode *inode, struct file *file) +{ + struct dma_buf *dmabuf; + + if (!is_dma_buf_file(file)) + return -EINVAL; + + dmabuf = file->private_data; + + dmabuf->ops->release(dmabuf); + kfree(dmabuf); + return 0; +} + +static const struct file_operations dma_buf_fops = { + .mmap = dma_buf_mmap, + .release = dma_buf_release, +}; + +/* + * is_dma_buf_file - Check if struct file* is associated with dma_buf + */ +static inline int is_dma_buf_file(struct file *file) +{ + return file->f_op == &dma_buf_fops; +} + +/** + * dma_buf_export - Creates a new dma_buf, and associates an anon file + * with this buffer,so it can be exported. + * Also connect the allocator specific data and ops to the buffer. + * + * @priv: [in] Attach private data of allocator to this buffer + * @ops: [in] Attach allocator-defined dma buf ops to the new buffer. + * @flags: [in] mode flags for the file. + * + * Returns, on success, a newly created dma_buf object, which wraps the + * supplied private data and operations for dma_buf_ops. On failure to + * allocate the dma_buf object, it can return NULL. 
+ * + */ +struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops, + int flags) +{ + struct dma_buf *dmabuf; + struct file *file; + + BUG_ON(!priv || !ops); + + dmabuf = kzalloc(sizeof(struct dma_buf), GFP_KERNEL); + if (dmabuf == NULL) + return dmabuf; + + dmabuf->priv = priv; + dmabuf->ops = ops; + + file = anon_inode_getfile("dmabuf", &dma_buf_fops, dmabuf, flags); + + dmabuf->file = file; + + INIT_LIST_HEAD(&dmabuf->attachments); + + return dmabuf; +} +EXPORT_SYMBOL(dma_buf_export); + + +/** + * dma_buf_fd - returns a file descriptor for the given dma_buf + * @dmabuf: [in] pointer to dma_buf for which fd is required. + * + * On success, returns an associated 'fd'. Else, returns error. + */ +int dma_buf_fd(struct dma_buf *dmabuf) +{ + int error, fd; + + if (!dmabuf->file) + return -EINVAL; + + error = get_unused_fd_flags(0); + if (error < 0) + return error; + fd = error; + + fd_install(fd, dmabuf->file); + + return fd; +} +EXPORT_SYMBOL(dma_buf_fd); + +/** + * dma_buf_get - returns the dma_buf structure related to an fd + * @fd: [in] fd associated with the dma_buf to be returned + * + * On success, returns the dma_buf structure associated with an fd; uses + * file's refcounting done by fget to increase refcount. returns ERR_PTR + * otherwise. + */ +struct dma_buf *dma_buf_get(int fd) +{ + struct file *file; + + file = fget(fd); + + if (!file) + return ERR_PTR(-EBADF); + + if (!is_dma_buf_file(file)) { + fput(file); + return ERR_PTR(-EINVAL); + } + + return file->private_data; +} +EXPORT_SYMBOL(dma_buf_get); + +/** + * dma_buf_put - decreases refcount of the buffer + * @dmabuf: [in] buffer to reduce refcount of + * + * Uses file's refcounting done implicitly by fput() + */ +void dma_buf_put(struct dma_buf *dmabuf) +{ + BUG_ON(!dmabuf->file); + + fput(dmabuf->file); + + return; +} +EXPORT_SYMBOL(dma_buf_put); + +/** + * dma_buf_attach - Add the device to dma_buf's attachments list; calls the + * attach() of dma_buf_ops to allow device-specific attach functionality + * @dmabuf: [in] buffer to attach device to. + * @dev: [in] device to be attached. + * + * Returns struct dma_buf_attachment * for this attachment; may return NULL. + * + */ +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, + struct device *dev) +{ + struct dma_buf_attachment *attach; + + BUG_ON(!dmabuf || !dev); + + attach = kzalloc(sizeof(struct dma_buf_attachment), GFP_KERNEL); + if (attach == NULL) + return attach; + + dmabuf->ops->attach(dmabuf, dev, attach); + list_add(&attach->node, &dmabuf->attachments); + + return attach; +} +EXPORT_SYMBOL(dma_buf_attach); + +/** + * dma_buf_detach - Remove the given attachment from dmabuf's attachments list; + * calls detach() of dma_buf_ops for device-specific detach + * @dmabuf: [in] buffer to detach from. + * @attach: [in] attachment to be detached; is free'd after this call. + * + */ +void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) +{ + BUG_ON(!dmabuf || !attach); + + list_del(&attach->node); + dmabuf->ops->detach(dmabuf, attach); + + kfree(attach); + return; +} +EXPORT_SYMBOL(dma_buf_detach); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h new file mode 100644 index 0000000..2894b45 --- /dev/null +++ b/include/linux/dma-buf.h @@ -0,0 +1,179 @@ +/* + * Header file for dma buffer sharing framework. + * + * Copyright(C) 2011 Linaro Limited. All rights reserved. 
+ * Author: Sumit Semwal sumit.semwal@ti.com + * + * Many thanks to linaro-mm-sig list, and specially + * Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and + * Daniel Vetter daniel@ffwll.ch for their support in creation and + * refining of this idea. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. + */ +#ifndef __DMA_BUF_H__ +#define __DMA_BUF_H__ + +#include <linux/file.h> +#include <linux/err.h> +#include <linux/device.h> +#include <linux/scatterlist.h> +#include <linux/list.h> + +struct dma_buf; + +/** + * enum dma_buf_optype - Used to denote the operation for which + * get_scatterlist() is called. This will help in implementing a wait(op) + * for sync'ing. + * @DMA_BUF_OP_READ: read operation will be done on this scatterlist + * @DMA_BUF_OP_WRITE: write operation will be done on this scatterlist + */ +enum dma_buf_optype { + DMA_BUF_OP_READ = (1 << 0), + DMA_BUF_OP_WRITE = (1 << 1), + DMA_BUF_OP_MAX, +}; + +/** + * enum dma_buf_attr_flags - Defines the attributes for this buffer. This + * can allow 'late backing of buffer' based on first get_scatterlist() call. + * @DMA_BUF_ATTR_CONTIG: Contiguous buffer + * @DMA_BUF_ATTR_DISCONTIG: Discontiguous buffer + * @DMA_BUF_ATTR_CUSTOM: Platform specific buffer; should evolve into some + * attributes that define buffers on given specific platform + */ +enum dma_buf_attr_flags { + DMA_BUF_ATTR_CONTIG, + DMA_BUF_ATTR_DISCONTIG, + DMA_BUF_ATTR_CUSTOM, + DMA_BUF_ATTR_MAX, +}; + +/** + * struct dma_buf_attachment - holds device-buffer attachment data + * @node: list_head to allow manipulation of list of dma_buf_attachment. + * @dev: device attached to the buffer. + * @priv: exporter-specific attachment data. + */ +struct dma_buf_attachment { + struct list_head node; + struct device *dev; + void *priv; +}; + +/** + * struct dma_buf_ops - operations possible on struct dma_buf + * @create: creates a struct dma_buf of a fixed size. Actual allocation + * does not happen here. + * @attach: allows different devices to 'attach' themselves to the given + * buffer. It might return -EBUSY to signal that backing storage + * is already allocated and incompatible with the requirements + * of requesting device. + * @detach: detach a given device from this buffer. + * @get_scatterlist: returns list of scatter pages allocated, increases + * usecount of the buffer. Requires atleast one attach to be + * called before. + * @put_scatterlist: decreases usecount of buffer, might deallocate scatter + * pages. + * @mmap: memory map this buffer - optional. + * @release: release this buffer; to be called after the last dma_buf_put. + * @sync_sg_for_cpu: sync the sg list for cpu. + * @sync_sg_for_device: synch the sg list for device. 
+ */ +struct dma_buf_ops { + void (*attach)(struct dma_buf *, struct device *, + struct dma_buf_attachment *); + + void (*detach)(struct dma_buf *, struct dma_buf_attachment *); + + struct scatterlist * (*get_scatterlist)(struct dma_buf *, + struct dma_buf_attachment *, + enum dma_buf_optype, + enum dma_buf_attr_flags); + void (*put_scatterlist)(struct dma_buf *, struct dma_buf_attachment *, + struct scatterlist *); + + /* allow mmap optionally for devices that need it */ + int (*mmap)(struct dma_buf *, struct vm_area_struct *); + /* after final dma_buf_put() */ + void (*release)(struct dma_buf *); + + /* allow allocator to take care of cache ops */ + void (*sync_sg_for_cpu) (struct dma_buf *, struct device *); + void (*sync_sg_for_device)(struct dma_buf *, struct device *); +}; + +/** + * struct dma_buf - shared buffer object + * @file: file pointer used for sharing buffers across, and for refcounting. + * @attachments: list of dma_buf_attachment that denotes all devices attached. + * @ops: dma_buf_ops associated with this buffer object + * @priv: user specific private data + */ +struct dma_buf { + size_t size; + struct file *file; + struct list_head attachments; + struct dma_buf_ops *ops; + void *priv; +}; + +#ifdef CONFIG_DMA_SHARED_BUFFER +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, + struct device *dev); +void dma_buf_detach(struct dma_buf *dmabuf, + struct dma_buf_attachment *dmabuf_attach); +struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops, int flags); +int dma_buf_fd(struct dma_buf *dmabuf); +struct dma_buf *dma_buf_get(int fd); +void dma_buf_put(struct dma_buf *dmabuf); + +#else + +static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, + struct device *dev) +{ + return ERR_PTR(-ENODEV); +} + +static inline void dma_buf_detach(struct dma_buf *dmabuf, + struct dma_buf_attachment *dmabuf_attach) +{ + return; +} + +static inline struct dma_buf *dma_buf_export(void *priv, + struct dma_buf_ops *ops, + int flags) +{ + return ERR_PTR(-ENODEV); +} + +static inline int dma_buf_fd(struct dma_buf *dmabuf) +{ + return -ENODEV; +} + +static inline struct dma_buf *dma_buf_get(int fd) +{ + return ERR_PTR(-ENODEV); +} + +static inline void dma_buf_put(struct dma_buf *dmabuf) +{ + return; +} +#endif /* CONFIG_DMA_SHARED_BUFFER */ + +#endif /* __DMA_BUF_H__ */
Hi All,
The versions that I posted are also on feature branches at git.linaro.org, here:
version 2: http://git.linaro.org/gitweb?p=people/sumitsemwal/linux-3.x.git%3Ba=shortlog... dma-buf-v2
version 1: http://git.linaro.org/gitweb?p=people/sumitsemwal/linux-3.x.git%3Ba=shortlog... dma-buf-v1
Some comments inline.
Thanks, and best regards, ~Sumit.
On 13 September 2011 19:59, Sumit Semwal sumit.semwal@ti.com wrote:
This is the first step in defining a buffer sharing framework. A new dma_buf buffer object is added, with hooks to allow for easy sharing of this buffer object across devices.
The framework allows:
- a new buffer-object to be created with fixed size.
- different devices to 'attach' themselves to this buffer, to facilitate
backing storage negotiation, using dma_buf_attach() API.
- association of a file pointer with each user-buffer and associated allocator-defined operations on that buffer. This operation is called the 'export' operation.
- this exported buffer-object to be shared with the other entity by asking
for its 'file-descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed
using the associated exporter-defined operations.
- the exporter and importer to share the scatterlist using get_scatterlist
and put_scatterlist operations.
Atleast one 'attach()' call is required to be made prior to calling the buffer_map() callback.
Couple of building blocks in get_scatterlist() are added to ease introduction of sync'ing across exporter and importers, and late allocation by the exporter.
Optionally, mmap() file operation is provided for the associated 'fd', as wrapper over the allocator defined mmap()[optional], to be used by devices that might need one.
The idea was first mooted at the Linaro memory management mini-summit in Budapest in May 2011, as part of multiple things needed for a 'unified memory management framework'. It took a more concrete shape at Linaro memory-management mini-summit in Cambridge, Aug 2011.
This is based on design suggestions from many people at both the mini-summits, most notably from Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and Daniel Vetter daniel@ffwll.ch.
The implementation is inspired from proof-of-concept patch-set from Tomasz Stanislawski t.stanislaws@samsung.com, who demonstrated buffer sharing between two v4l2 devices.
v1: initial RFC. v2:
- added attach() / detach() dma_buf_ops, and
dma_buf_attach(),dma_buf_detach().
- added handling of list of attachment in the dma_buf central API itself.
- corrected copyright information.
I forgot to mention: as per review comments from Hans and others, the read/write ops are removed for now from both dma_buf_ops and the fops of the associated file. Also, mmap is made optional.
Signed-off-by: Sumit Semwal sumit.semwal@linaro.org Signed-off-by: Sumit Semwal sumit.semwal@ti.com
<snip>
On 09/15/2011 03:31 AM, Sumit Semwal wrote:
I forgot to mention: As per review comments from Hans, and others, the read / write ops are removed for now from both dma_buf_ops and fops of the associated file. Also mmap is made optional.
My summary of the previous mail thread was that everybody ended up agreeing that there is value in the read/write hooks. I know I was convinced that they could be useful.
Jordan
On Thu, Sep 15, 2011 at 08:53:24AM -0600, Jordan Crouse wrote:
On 09/15/2011 03:31 AM, Sumit Semwal wrote:
I forgot to mention: As per review comments from Hans, and others, the read / write ops are removed for now from both dma_buf_ops and fops of the associated file. Also mmap is made optional.
My summary of the previous mail thread was that everybody ended up agreeing that there is value in the read/write hooks. I know I was convinced that they could be useful.
Can we just ditch them, please ;-)
Really. Imo dma_buf's first concern should be to make buffer sharing possible. That needs absolutely zero support for actually getting at the data from userspace - existing subsystems can do that already. And most of all they (and their userspace parts) already know when it is safe to access a buffer, what kind of dance is required to ensure that and all these things.
So they need to be at least optional (like they currently are) so e.g. gpu drivers can refuse to implement them. When dma_buf is merged and picks up unforeseen users, they could maybe use such a thing (because they're new and don't yet have any interface), but till that happens, I don't see much use (at least for v1 of dma_buf). -Daniel
On Thu, Sep 15, 2011 at 9:53 AM, Jordan Crouse jcrouse@codeaurora.org wrote:
On 09/15/2011 03:31 AM, Sumit Semwal wrote:
I forgot to mention: As per review comments from Hans, and others, the read / write ops are removed for now from both dma_buf_ops and fops of the associated file. Also mmap is made optional.
My summary of the previous mail thread was that everybody ended up agreeing that there is value in the read/write hooks. I know I was convinced that they could be useful.
I guess it should be easier to add new APIs later, than take them away.. ;-)
It seems like read/write could be potentially useful.. but couldn't think of any immediate need for them, so for me it seems ok to add later if/when needed. If they are just optional dmabuf ops, then no harm in adding more fxn ptrs to the struct later.
BR, -R
Jordan
On Sep 15, 2011 5:09 PM, "Clark, Rob" rob@ti.com wrote:
On Thu, Sep 15, 2011 at 9:53 AM, Jordan Crouse jcrouse@codeaurora.org
wrote:
On 09/15/2011 03:31 AM, Sumit Semwal wrote:
I forgot to mention: As per review comments from Hans, and others, the
read
/ write ops are removed for now from both dma_buf_ops and fops of the associated file. Also mmap is made optional.
My summary of the previous mail thread was that everybody ended up
agreeing
that there is value in the read/write hooks. I know I was convinced
that
they could be useful.
I guess it should be easier to add new APIs later, than take them away..
;-)
It seems like read/write could be potentially useful.. but couldn't think of any immediate need for them, so for me it seems ok to add later if/when needed. If they are just optional dmabuf's, then no harm in adding more fxn ptrs to the struct later.
Although I do think read/write have immediate uses, I agree with Rob that they don't need to be part of this review.
BR, -R
Jordan
On Tue, Sep 13, 2011 at 9:29 AM, Sumit Semwal sumit.semwal@ti.com wrote:
+struct dma_buf_ops {
+	void (*attach)(struct dma_buf *, struct device *,
+			struct dma_buf_attachment *);
+
+	void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
+
+	struct scatterlist * (*get_scatterlist)(struct dma_buf *,
+						struct dma_buf_attachment *,
+						enum dma_buf_optype,
+						enum dma_buf_attr_flags);
I guess you don't need to pass both 'struct dma_buf' and 'struct dma_buf_attachment'.. the 'struct dma_buf_attachment' can have the 'struct dma_buf' ptr. This somehow seems a bit more logical and cleaner to me.
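(Purely as an illustration of that suggestion, not the posted code, the structures could end up looking something like this, with a back-pointer added to the attachment:)

struct dma_buf_attachment {
	struct dma_buf *dmabuf;		/* back-pointer, new in this sketch */
	struct list_head node;
	struct device *dev;
	void *priv;
};

struct dma_buf_ops {
	void (*attach)(struct dma_buf_attachment *);
	void (*detach)(struct dma_buf_attachment *);
	struct scatterlist * (*get_scatterlist)(struct dma_buf_attachment *,
						enum dma_buf_optype,
						enum dma_buf_attr_flags);
	void (*put_scatterlist)(struct dma_buf_attachment *,
				struct scatterlist *);
	/* ... */
};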
Also, I'm not sure about dma_buf_attr_flags. I think we could add whatever is needed to 'struct device_dma_parameters' (dev->dma_params).
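(To illustrate that idea with a made-up helper: instead of taking a dma_buf_attr_flags hint, the exporter could derive its constraints from what the attached devices already advertise through dev->dma_params, e.g. via dma_get_max_seg_size():)

/* Purely illustrative: pick the most restrictive DMA segment size among all
 * attached devices and use it when building the scatterlist. */
static unsigned int my_exporter_max_seg_size(struct dma_buf *dmabuf)
{
	struct dma_buf_attachment *attach;
	unsigned int max_seg = UINT_MAX;

	list_for_each_entry(attach, &dmabuf->attachments, node) {
		unsigned int seg = dma_get_max_seg_size(attach->dev);

		if (seg < max_seg)
			max_seg = seg;
	}

	return max_seg;	/* chunk the backing pages no larger than this */
}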
BR, -R
+	void (*put_scatterlist)(struct dma_buf *, struct dma_buf_attachment *,
+				struct scatterlist *);
+
+	/* allow mmap optionally for devices that need it */
+	int (*mmap)(struct dma_buf *, struct vm_area_struct *);
+	/* after final dma_buf_put() */
+	void (*release)(struct dma_buf *);
+
+	/* allow allocator to take care of cache ops */
+	void (*sync_sg_for_cpu) (struct dma_buf *, struct device *);
+	void (*sync_sg_for_device)(struct dma_buf *, struct device *);
+};
Hi Rob,
On 19 September 2011 05:37, Clark, Rob rob@ti.com wrote:
On Tue, Sep 13, 2011 at 9:29 AM, Sumit Semwal sumit.semwal@ti.com wrote:
+struct dma_buf_ops {
void (*attach)(struct dma_buf *, struct device *,
struct dma_buf_attachment *);
void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
struct scatterlist * (*get_scatterlist)(struct dma_buf *,
struct dma_buf_attachment
*,
enum dma_buf_optype,
enum dma_buf_attr_flags);
I guess you don't need to pass both 'struct dma_buf' and 'struct dma_buf_attachment'.. the 'struct dma_buf_attachment' can have the 'struct dma_buf' ptr. This somehow seems a bit more logical and cleaner to me.
Thanks, I agree, and will remove it in the next version.
Also, I'm not sure about dma_buf_attr_flags. I think we could add whatever is needed to 'struct device_dma_parameters' (dev->dma_params).
Yes, I think we will leave it for addition-as-and-when-required.
With these changes, I am thinking of publishing out this RFC to the upstream mailing lists - do you all think it's ready, or does it need some more obvious work?
If it is ok, could you all please let me know which mailing lists I should target? [I have thought of linux-arm-kernel, linux-mm, linux-arch - in addition, linux-media, dri-devel; CC to Arnd and Russell. Any others?]
Best regards, ~Sumit.
BR, -R
void (*put_scatterlist)(struct dma_buf *, struct
dma_buf_attachment *,
struct scatterlist *);
/* allow mmap optionally for devices that need it */
int (*mmap)(struct dma_buf *, struct vm_area_struct *);
/* after final dma_buf_put() */
void (*release)(struct dma_buf *);
/* allow allocator to take care of cache ops */
void (*sync_sg_for_cpu) (struct dma_buf *, struct device *);
void (*sync_sg_for_device)(struct dma_buf *, struct device *);
+};
Hi Sumit,
Don't we need a locking member in struct dma_buf? Below is one scenario I can think of:
When the client drivers call attach() on a dma_buf object, you would possibly be modifying the attachments list. If more than one driver calls attach() for the same buffer at the same time, shouldn't it be good to lock the buffer before we operate on it? The same may even apply to detach and other ops calls.
Regards, Subash
On 09/13/2011 07:59 PM, Sumit Semwal wrote:
+/**
- struct dma_buf_ops - operations possible on struct dma_buf
- @create: creates a struct dma_buf of a fixed size. Actual allocation
does not happen here.
- @attach: allows different devices to 'attach' themselves to the given
buffer. It might return -EBUSY to signal that backing storage
is already allocated and incompatible with the requirements
of requesting device.
- @detach: detach a given device from this buffer.
- @get_scatterlist: returns list of scatter pages allocated, increases
usecount of the buffer. Requires atleast one attach to be
called before.
- @put_scatterlist: decreases usecount of buffer, might deallocate scatter
pages.
- @mmap: memory map this buffer - optional.
- @release: release this buffer; to be called after the last dma_buf_put.
- @sync_sg_for_cpu: sync the sg list for cpu.
- @sync_sg_for_device: synch the sg list for device.
- */
+struct dma_buf_ops {
- void (*attach)(struct dma_buf *, struct device *,
struct dma_buf_attachment *);
- void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
- struct scatterlist * (*get_scatterlist)(struct dma_buf *,
struct dma_buf_attachment *,
enum dma_buf_optype,
enum dma_buf_attr_flags);
- void (*put_scatterlist)(struct dma_buf *, struct dma_buf_attachment *,
struct scatterlist *);
- /* allow mmap optionally for devices that need it */
- int (*mmap)(struct dma_buf *, struct vm_area_struct *);
- /* after final dma_buf_put() */
- void (*release)(struct dma_buf *);
- /* allow allocator to take care of cache ops */
- void (*sync_sg_for_cpu) (struct dma_buf *, struct device *);
- void (*sync_sg_for_device)(struct dma_buf *, struct device *);
+};
+/**
- struct dma_buf - shared buffer object
- @file: file pointer used for sharing buffers across, and for refcounting.
- @attachments: list of dma_buf_attachment that denotes all devices attached.
- @ops: dma_buf_ops associated with this buffer object
- @priv: user specific private data
- */
+struct dma_buf {
- size_t size;
- struct file *file;
- struct list_head attachments;
- struct dma_buf_ops *ops;
- void *priv;
+};
Hi Subash,
On Sep 23, 2011 5:20 PM, "Subash Patel" subashrp@gmail.com wrote:
Hi Sumit,
Dont we need to have a locking member for the struct dma_buf. Below is one
scenario I can think of:
When the client drivers call attach() on a dma_buf object, you would
possibly be modifying the attachments list. If more than one driver calls attach() for the same buffer at the same time, should'nt it be good to lock the buffer before we operate on it? The same may even apply for detach and other ops calls.
I would say since attach/detach and other ops are implementation-dependent on the exporter of buffers, if locking is required by the exporter, it can be a part of dma_buf->priv? That leaves the choice of lock implementation also on the exporter.
Please let me know if you feel otherwise.
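(As a sketch of what that could look like on the exporter side, with hypothetical names: the exporter keeps its own lock in the private data it attaches to the buffer. Note this only protects the exporter's bookkeeping; the framework's attachments list itself is a separate question.)

struct my_exporter_buf {
	struct mutex lock;
	/* ... backing storage bookkeeping ... */
};

static void my_exporter_attach(struct dma_buf *dmabuf, struct device *dev,
			       struct dma_buf_attachment *attach)
{
	struct my_exporter_buf *buf = dmabuf->priv;

	mutex_lock(&buf->lock);
	/* ... record the device's constraints, set attach->priv ... */
	mutex_unlock(&buf->lock);
}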
Regards, Subash
Thanks and regards, Sumit.
On 09/13/2011 07:59 PM, Sumit Semwal wrote:
+/**
- struct dma_buf_ops - operations possible on struct dma_buf
- @create: creates a struct dma_buf of a fixed size. Actual allocation
does not happen here.
- @attach: allows different devices to 'attach' themselves to the given
buffer. It might return -EBUSY to signal that backing storage
is already allocated and incompatible with the requirements
of requesting device.
- @detach: detach a given device from this buffer.
- @get_scatterlist: returns list of scatter pages allocated, increases
usecount of the buffer. Requires atleast one attach
to be
called before.
- @put_scatterlist: decreases usecount of buffer, might deallocate
scatter
pages.
- @mmap: memory map this buffer - optional.
- @release: release this buffer; to be called after the last
dma_buf_put.
- @sync_sg_for_cpu: sync the sg list for cpu.
- @sync_sg_for_device: synch the sg list for device.
- */
+struct dma_buf_ops {
void (*attach)(struct dma_buf *, struct device *,
struct dma_buf_attachment *);
void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
struct scatterlist * (*get_scatterlist)(struct dma_buf *,
struct dma_buf_attachment
*,
enum dma_buf_optype,
enum dma_buf_attr_flags);
void (*put_scatterlist)(struct dma_buf *, struct
dma_buf_attachment *,
struct scatterlist *);
/* allow mmap optionally for devices that need it */
int (*mmap)(struct dma_buf *, struct vm_area_struct *);
/* after final dma_buf_put() */
void (*release)(struct dma_buf *);
/* allow allocator to take care of cache ops */
void (*sync_sg_for_cpu) (struct dma_buf *, struct device *);
void (*sync_sg_for_device)(struct dma_buf *, struct device *);
+};
+/**
- struct dma_buf - shared buffer object
- @file: file pointer used for sharing buffers across, and for
refcounting.
- @attachments: list of dma_buf_attachment that denotes all devices
attached.
- @ops: dma_buf_ops associated with this buffer object
- @priv: user specific private data
- */
+struct dma_buf {
size_t size;
struct file *file;
struct list_head attachments;
struct dma_buf_ops *ops;
void *priv;
+};
On Tue, Sep 13, 2011 at 07:59:09PM +0530, Sumit Semwal wrote:
This is the first step in defining a buffer sharing framework. A new dma_buf buffer object is added, with hooks to allow for easy sharing of this buffer object across devices.
The framework allows:
- a new buffer-object to be created with fixed size.
- different devices to 'attach' themselves to this buffer, to facilitate backing storage negotiation, using dma_buf_attach() API.
- association of a file pointer with each user-buffer and associated allocator-defined operations on that buffer. This operation is called the 'export' operation.
- this exported buffer-object to be shared with the other entity by asking for its 'file-descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed using the associated exporter-defined operations.
- the exporter and importer to share the scatterlist using get_scatterlist and put_scatterlist operations.
Atleast one 'attach()' call is required to be made prior to calling the buffer_map() callback.
Couple of building blocks in get_scatterlist() are added to ease introduction of sync'ing across exporter and importers, and late allocation by the exporter.
Optionally, mmap() file operation is provided for the associated 'fd', as wrapper over the allocator defined mmap()[optional], to be used by devices that might need one.
The idea was first mooted at the Linaro memory management mini-summit in Budapest in May 2011, as part of multiple things needed for a 'unified memory management framework'. It took a more concrete shape at Linaro memory-management mini-summit in Cambridge, Aug 2011.
This is based on design suggestions from many people at both the mini-summits, most notably from Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and Daniel Vetter daniel@ffwll.ch.
The implementation is inspired from proof-of-concept patch-set from Tomasz Stanislawski t.stanislaws@samsung.com, who demonstrated buffer sharing between two v4l2 devices.
v1: initial RFC. v2:
- added attach() / detach() dma_buf_ops, and dma_buf_attach(),dma_buf_detach().
- added handling of list of attachment in the dma_buf central API itself.
- corrected copyright information.
Signed-off-by: Sumit Semwal sumit.semwal@linaro.org Signed-off-by: Sumit Semwal sumit.semwal@ti.com
Hi Sumit,
Sorry for the late review, I've been a bit busy. Some nitpicks below, but most of it has already been brought up by Rob and Subash.
drivers/base/Kconfig | 10 ++ drivers/base/Makefile | 1 + drivers/base/dma-buf.c | 224 +++++++++++++++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 179 +++++++++++++++++++++++++++++++++++++ 4 files changed, 414 insertions(+), 0 deletions(-) create mode 100644 drivers/base/dma-buf.c create mode 100644 include/linux/dma-buf.h
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index 21cf46f..07d8095 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -174,4 +174,14 @@ config SYS_HYPERVISOR source "drivers/base/regmap/Kconfig" +config DMA_SHARED_BUFFER
- bool "Buffer framework to be shared between drivers"
- default n
- depends on ANON_INODES
- help
This option enables the framework for buffer-sharing between
multiple drivers. A buffer is associated with a file using driver
APIs extension; the file's descriptor can then be passed on to other
driver.
endmenu diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 99a375a..d0df046 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -8,6 +8,7 @@ obj-$(CONFIG_DEVTMPFS) += devtmpfs.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER) += firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c new file mode 100644 index 0000000..8936da7 --- /dev/null +++ b/drivers/base/dma-buf.c @@ -0,0 +1,224 @@ +/*
- Framework for buffer objects that can be shared across devices/subsystems.
- Copyright(C) 2011 Linaro Limited. All rights reserved.
- Author: Sumit Semwal sumit.semwal@ti.com
- Many thanks to linaro-mm-sig list, and specially
- Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and
- Daniel Vetter daniel@ffwll.ch for their support in creation and
- refining of this idea.
- This program is free software; you can redistribute it and/or modify it
- under the terms of the GNU General Public License version 2 as published by
- the Free Software Foundation.
- This program is distributed in the hope that it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
- You should have received a copy of the GNU General Public License along with
- this program. If not, see http://www.gnu.org/licenses/.
- */
+#include <linux/fs.h> +#include <linux/slab.h> +#include <linux/dma-buf.h> +#include <linux/anon_inodes.h>
+static inline int is_dma_buf_file(struct file *);
+static int dma_buf_mmap(struct file *file, struct vm_area_struct *vma) +{
- struct dma_buf *dmabuf;
- if (!is_dma_buf_file(file))
return -EINVAL;
- dmabuf = file->private_data;
- if (!dmabuf->ops->mmap)
return -EINVAL;
- return dmabuf->ops->mmap(dmabuf, vma);
+}
+static int dma_buf_release(struct inode *inode, struct file *file) +{
- struct dma_buf *dmabuf;
- if (!is_dma_buf_file(file))
return -EINVAL;
- dmabuf = file->private_data;
- dmabuf->ops->release(dmabuf);
- kfree(dmabuf);
- return 0;
+}
+static const struct file_operations dma_buf_fops = {
- .mmap = dma_buf_mmap,
- .release = dma_buf_release,
+};
+/*
- is_dma_buf_file - Check if struct file* is associated with dma_buf
- */
+static inline int is_dma_buf_file(struct file *file) +{
- return file->f_op == &dma_buf_fops;
+}
+/**
- dma_buf_export - Creates a new dma_buf, and associates an anon file
- with this buffer,so it can be exported.
- Also connect the allocator specific data and ops to the buffer.
- @priv: [in] Attach private data of allocator to this buffer
- @ops: [in] Attach allocator-defined dma buf ops to the new buffer.
- @flags: [in] mode flags for the file.
- Returns, on success, a newly created dma_buf object, which wraps the
- supplied private data and operations for dma_buf_ops. On failure to
- allocate the dma_buf object, it can return NULL.
- */
+struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops,
int flags)
+{
- struct dma_buf *dmabuf;
- struct file *file;
- BUG_ON(!priv || !ops);
- dmabuf = kzalloc(sizeof(struct dma_buf), GFP_KERNEL);
- if (dmabuf == NULL)
return dmabuf;
- dmabuf->priv = priv;
- dmabuf->ops = ops;
- file = anon_inode_getfile("dmabuf", &dma_buf_fops, dmabuf, flags);
- dmabuf->file = file;
- INIT_LIST_HEAD(&dmabuf->attachments);
- return dmabuf;
+} +EXPORT_SYMBOL(dma_buf_export);
+/**
- dma_buf_fd - returns a file descriptor for the given dma_buf
- @dmabuf: [in] pointer to dma_buf for which fd is required.
- On success, returns an associated 'fd'. Else, returns error.
- */
+int dma_buf_fd(struct dma_buf *dmabuf) +{
- int error, fd;
- if (!dmabuf->file)
return -EINVAL;
- error = get_unused_fd_flags(0);
- if (error < 0)
return error;
- fd = error;
- fd_install(fd, dmabuf->file);
- return fd;
+} +EXPORT_SYMBOL(dma_buf_fd);
+/**
- dma_buf_get - returns the dma_buf structure related to an fd
- @fd: [in] fd associated with the dma_buf to be returned
- On success, returns the dma_buf structure associated with an fd; uses
- file's refcounting done by fget to increase refcount. returns ERR_PTR
- otherwise.
- */
+struct dma_buf *dma_buf_get(int fd) +{
- struct file *file;
- file = fget(fd);
- if (!file)
return ERR_PTR(-EBADF);
- if (!is_dma_buf_file(file)) {
fput(file);
return ERR_PTR(-EINVAL);
- }
- return file->private_data;
+} +EXPORT_SYMBOL(dma_buf_get);
+/**
- dma_buf_put - decreases refcount of the buffer
- @dmabuf: [in] buffer to reduce refcount of
- Uses file's refcounting done implicitly by fput()
- */
+void dma_buf_put(struct dma_buf *dmabuf) +{
- BUG_ON(!dmabuf->file);
- fput(dmabuf->file);
- return;
+} +EXPORT_SYMBOL(dma_buf_put);
+/**
- dma_buf_attach - Add the device to dma_buf's attachments list; calls the
- attach() of dma_buf_ops to allow device-specific attach functionality
- @dmabuf: [in] buffer to attach device to.
- @dev: [in] device to be attached.
- Returns struct dma_buf_attachment * for this attachment; may return NULL.
- */
+struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
struct device *dev)
+{
- struct dma_buf_attachment *attach;
- BUG_ON(!dmabuf || !dev);
- attach = kzalloc(sizeof(struct dma_buf_attachment), GFP_KERNEL);
- if (attach == NULL)
return attach;
- dmabuf->ops->attach(dmabuf, dev, attach);
- list_add(&attach->node, &dmabuf->attachments);
- return attach;
+} +EXPORT_SYMBOL(dma_buf_attach);
I agree with Subash that we need a lock here to protect the list manipulation. I think for simplicity just add a mutex and wrap it around both the list manipulation and the call to attach/detach.
Also, I think the common code should set dma_buf_attachment->dev, ops->attach/detach should be optional (for simpler cases), and attach needs to be able to return an error (which should then be converted into an ERR_PTR by this function) in case the allocation is already fixed and the device can't access it at its current position.
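(Roughly, with those changes, dma_buf_attach() could look something like the sketch below. It assumes a new 'lock' mutex in struct dma_buf and an attach callback that returns int - neither is in the posted patch:)

struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
					  struct device *dev)
{
	struct dma_buf_attachment *attach;
	int ret = 0;

	if (WARN_ON(!dmabuf || !dev))
		return ERR_PTR(-EINVAL);

	attach = kzalloc(sizeof(*attach), GFP_KERNEL);
	if (!attach)
		return ERR_PTR(-ENOMEM);

	attach->dev = dev;			/* set by the common code */

	mutex_lock(&dmabuf->lock);		/* assumed new member */
	if (dmabuf->ops->attach) {		/* attach is now optional */
		ret = dmabuf->ops->attach(dmabuf, dev, attach);
		if (ret)
			goto err_unlock;
	}
	list_add(&attach->node, &dmabuf->attachments);
	mutex_unlock(&dmabuf->lock);

	return attach;

err_unlock:
	mutex_unlock(&dmabuf->lock);
	kfree(attach);
	return ERR_PTR(ret);
}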
+/**
- dma_buf_detach - Remove the given attachment from dmabuf's attachments list;
- calls detach() of dma_buf_ops for device-specific detach
- @dmabuf: [in] buffer to detach from.
- @attach: [in] attachment to be detached; is free'd after this call.
- */
+void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) +{
- BUG_ON(!dmabuf || !attach);
- list_del(&attach->node);
- dmabuf->ops->detach(dmabuf, attach);
- kfree(attach);
- return;
+} +EXPORT_SYMBOL(dma_buf_detach); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h new file mode 100644 index 0000000..2894b45 --- /dev/null +++ b/include/linux/dma-buf.h @@ -0,0 +1,179 @@ +/*
- Header file for dma buffer sharing framework.
- Copyright(C) 2011 Linaro Limited. All rights reserved.
- Author: Sumit Semwal sumit.semwal@ti.com
- Many thanks to linaro-mm-sig list, and specially
- Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and
- Daniel Vetter daniel@ffwll.ch for their support in creation and
- refining of this idea.
- This program is free software; you can redistribute it and/or modify it
- under the terms of the GNU General Public License version 2 as published by
- the Free Software Foundation.
- This program is distributed in the hope that it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
- You should have received a copy of the GNU General Public License along with
- this program. If not, see http://www.gnu.org/licenses/.
- */
+#ifndef __DMA_BUF_H__ +#define __DMA_BUF_H__
+#include <linux/file.h> +#include <linux/err.h> +#include <linux/device.h> +#include <linux/scatterlist.h> +#include <linux/list.h>
+struct dma_buf;
+/**
- enum dma_buf_optype - Used to denote the operation for which
- get_scatterlist() is called. This will help in implementing a wait(op)
- for sync'ing.
- @DMA_BUF_OP_READ: read operation will be done on this scatterlist
- @DMA_BUF_OP_WRITE: write operation will be done on this scatterlist
- */
+enum dma_buf_optype {
- DMA_BUF_OP_READ = (1 << 0),
- DMA_BUF_OP_WRITE = (1 << 1),
- DMA_BUF_OP_MAX,
+};
Can we just use DMA_TO_DEVICE, DMA_FROM_DEVICE, DMA_BIDIRECTIONAL from the dma api? I think that would be much better for 2 reasons:
- consistency between dma api and dma_buf api
- on sane machines (x86) the dma_buf allocator could just grab an sg_list and directly map the get_scatterlist to the dma_map_sg call.
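(A purely illustrative sketch of what that could look like: get/put take the DMA API's enum dma_data_direction, and a simple exporter forwards more or less straight to dma_map_sg(). The my_exporter_buf type and its fields are made up:)

struct dma_buf_ops {
	/* ... */
	struct scatterlist * (*get_scatterlist)(struct dma_buf *,
						struct dma_buf_attachment *,
						enum dma_data_direction);
	void (*put_scatterlist)(struct dma_buf *, struct dma_buf_attachment *,
				struct scatterlist *,
				enum dma_data_direction);
	/* ... */
};

/* hypothetical exporter with an sg list already built in its priv data */
static struct scatterlist *my_get_scatterlist(struct dma_buf *dmabuf,
					      struct dma_buf_attachment *attach,
					      enum dma_data_direction dir)
{
	struct my_exporter_buf *buf = dmabuf->priv;

	if (dma_map_sg(attach->dev, buf->sgl, buf->nents, dir) <= 0)
		return ERR_PTR(-ENOMEM);

	return buf->sgl;
}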
+/**
- enum dma_buf_attr_flags - Defines the attributes for this buffer. This
- can allow 'late backing of buffer' based on first get_scatterlist() call.
- @DMA_BUF_ATTR_CONTIG: Contiguous buffer
- @DMA_BUF_ATTR_DISCONTIG: Discontiguous buffer
- @DMA_BUF_ATTR_CUSTOM: Platform specific buffer; should evolve into some
- attributes that define buffers on given specific platform
- */
+enum dma_buf_attr_flags {
- DMA_BUF_ATTR_CONTIG,
- DMA_BUF_ATTR_DISCONTIG,
- DMA_BUF_ATTR_CUSTOM,
- DMA_BUF_ATTR_MAX,
+};
I don't like this for essentially the same reasons as Rob. Also, if we really need this I think it should be passed in at attach time, not at get_scatterlist (where it might be too late). Also, for the same reasons as with the direction flags above, I want this to be consistent with the solution the normal dma mapping api provides.
Furthermore, adding flags for extendability always sounds great until you actually want to extend your api: usually you then notice gazillions of ad-hoc extensions and the associated mess, making extensions harder rather than easier. So for a first try of dma_buf, I'd aim for the simplest possible solution that can get the job done. No strange flags, please ;-)
+/**
- struct dma_buf_attachment - holds device-buffer attachment data
- @node: list_head to allow manipulation of list of dma_buf_attachment.
- @dev: device attached to the buffer.
- @priv: exporter-specific attachment data.
- */
+struct dma_buf_attachment {
- struct list_head node;
- struct device *dev;
- void *priv;
+};
+/**
- struct dma_buf_ops - operations possible on struct dma_buf
- @create: creates a struct dma_buf of a fixed size. Actual allocation
does not happen here.
- @attach: allows different devices to 'attach' themselves to the given
buffer. It might return -EBUSY to signal that backing storage
is already allocated and incompatible with the requirements
of requesting device.
- @detach: detach a given device from this buffer.
- @get_scatterlist: returns list of scatter pages allocated, increases
usecount of the buffer. Requires atleast one attach to be
called before.
I think the important bit to spec here is that the returned sg list is already mapped into the _device_ address space. This would support the use case of a special remapper (like OMAP's TILER) that a group of devices could access to e.g. see tiled buffers as linear buffers. That would obviously require some driver/platform specific magic, but I think we should design the interface to allow this.
Also, I think this will give us a cleaner interface, because about the only thing the driver can do with that sg list is to map it. So better fold that into get_scatterlist.
I'd still like to see a name for get/put_scatterlist that matches the dma api more closely (i.e. a wrapper dma_buf_map or something), but this is a bit bikeshedding, so I don't care much.
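(One possible shape for that, as a sketch only - the names and the attachment-only arguments follow the suggestions made earlier in this thread, not the posted patch:)

struct dma_buf_ops {
	int (*attach)(struct dma_buf *, struct device *,
		      struct dma_buf_attachment *);
	void (*detach)(struct dma_buf *, struct dma_buf_attachment *);

	/* returned table is already mapped for attach->dev; backing storage
	 * may be allocated here on first use */
	struct sg_table * (*map_dma_buf)(struct dma_buf_attachment *,
					 enum dma_data_direction);
	void (*unmap_dma_buf)(struct dma_buf_attachment *,
			      struct sg_table *);
	/* ... */
};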
- @put_scatterlist: decreases usecount of buffer, might deallocate scatter
pages.
- @mmap: memory map this buffer - optional.
- @release: release this buffer; to be called after the last dma_buf_put.
- @sync_sg_for_cpu: sync the sg list for cpu.
- @sync_sg_for_device: synch the sg list for device.
- */
+struct dma_buf_ops {
- void (*attach)(struct dma_buf *, struct device *,
struct dma_buf_attachment *);
- void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
- struct scatterlist * (*get_scatterlist)(struct dma_buf *,
struct dma_buf_attachment *,
enum dma_buf_optype,
enum dma_buf_attr_flags);
- void (*put_scatterlist)(struct dma_buf *, struct dma_buf_attachment *,
struct scatterlist *);
- /* allow mmap optionally for devices that need it */
- int (*mmap)(struct dma_buf *, struct vm_area_struct *);
- /* after final dma_buf_put() */
- void (*release)(struct dma_buf *);
- /* allow allocator to take care of cache ops */
- void (*sync_sg_for_cpu) (struct dma_buf *, struct device *);
- void (*sync_sg_for_device)(struct dma_buf *, struct device *);
+};
+/**
- struct dma_buf - shared buffer object
- @file: file pointer used for sharing buffers across, and for refcounting.
- @attachments: list of dma_buf_attachment that denotes all devices attached.
- @ops: dma_buf_ops associated with this buffer object
- @priv: user specific private data
- */
+struct dma_buf {
- size_t size;
- struct file *file;
- struct list_head attachments;
- struct dma_buf_ops *ops;
- void *priv;
+};
+#ifdef CONFIG_DMA_SHARED_BUFFER +struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
struct device *dev);
+void dma_buf_detach(struct dma_buf *dmabuf,
struct dma_buf_attachment *dmabuf_attach);
+struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops, int flags); +int dma_buf_fd(struct dma_buf *dmabuf); +struct dma_buf *dma_buf_get(int fd); +void dma_buf_put(struct dma_buf *dmabuf);
+#else
+static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
struct device *dev)
+{
- return ERR_PTR(-ENODEV);
+}
+static inline void dma_buf_detach(struct dma_buf *dmabuf,
struct dma_buf_attachment *dmabuf_attach)
+{
- return;
+}
+static inline struct dma_buf *dma_buf_export(void *priv,
struct dma_buf_ops *ops,
int flags)
+{
- return ERR_PTR(-ENODEV);
+}
+static inline int dma_buf_fd(struct dma_buf *dmabuf) +{
- return -ENODEV;
+}
+static inline struct dma_buf *dma_buf_get(int fd) +{
- return ERR_PTR(-ENODEV);
+}
+static inline void dma_buf_put(struct dma_buf *dmabuf) +{
- return;
+} +#endif /* CONFIG_DMA_SHARED_BUFFER */
+#endif /* __DMA_BUF_H__ */
Hi Sumit,
Thanks for your work! Sorry for being absent recently, but Real Life (tm) interfered with my normal review duties in the past 3-4 weeks.
So I've been reading the backlog today and I have a few questions and remarks below.
On Tuesday, September 13, 2011 16:29:09 Sumit Semwal wrote:
This is the first step in defining a buffer sharing framework. A new dma_buf buffer object is added, with hooks to allow for easy sharing of this buffer object across devices.
The framework allows:
- a new buffer-object to be created with fixed size.
- different devices to 'attach' themselves to this buffer, to facilitate backing storage negotiation, using dma_buf_attach() API.
- association of a file pointer with each user-buffer and associated allocator-defined operations on that buffer. This operation is called the 'export' operation.
- this exported buffer-object to be shared with the other entity by asking for its 'file-descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed using the associated exporter-defined operations.
- the exporter and importer to share the scatterlist using get_scatterlist and put_scatterlist operations.
Atleast one 'attach()' call is required to be made prior to calling the buffer_map() callback.
Couple of building blocks in get_scatterlist() are added to ease introduction of sync'ing across exporter and importers, and late allocation by the exporter.
Optionally, mmap() file operation is provided for the associated 'fd', as wrapper over the allocator defined mmap()[optional], to be used by devices that might need one.
The idea was first mooted at the Linaro memory management mini-summit in Budapest in May 2011, as part of multiple things needed for a 'unified memory management framework'. It took a more concrete shape at Linaro memory-management mini-summit in Cambridge, Aug 2011.
This is based on design suggestions from many people at both the mini-summits, most notably from Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and Daniel Vetter daniel@ffwll.ch.
The implementation is inspired from proof-of-concept patch-set from Tomasz Stanislawski t.stanislaws@samsung.com, who demonstrated buffer sharing between two v4l2 devices.
v1: initial RFC. v2:
- added attach() / detach() dma_buf_ops, and dma_buf_attach(),dma_buf_detach().
- added handling of list of attachment in the dma_buf central API itself.
- corrected copyright information.
Signed-off-by: Sumit Semwal sumit.semwal@linaro.org Signed-off-by: Sumit Semwal sumit.semwal@ti.com
drivers/base/Kconfig | 10 ++ drivers/base/Makefile | 1 + drivers/base/dma-buf.c | 224 +++++++++++++++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 179 +++++++++++++++++++++++++++++++++++++ 4 files changed, 414 insertions(+), 0 deletions(-) create mode 100644 drivers/base/dma-buf.c create mode 100644 include/linux/dma-buf.h
<cut>
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h new file mode 100644 index 0000000..2894b45 --- /dev/null +++ b/include/linux/dma-buf.h @@ -0,0 +1,179 @@ +/*
- Header file for dma buffer sharing framework.
- Copyright(C) 2011 Linaro Limited. All rights reserved.
- Author: Sumit Semwal sumit.semwal@ti.com
- Many thanks to linaro-mm-sig list, and specially
- Arnd Bergmann arnd@arndb.de, Rob Clark rob@ti.com and
- Daniel Vetter daniel@ffwll.ch for their support in creation and
- refining of this idea.
- This program is free software; you can redistribute it and/or modify it
- under the terms of the GNU General Public License version 2 as published by
- the Free Software Foundation.
- This program is distributed in the hope that it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
- You should have received a copy of the GNU General Public License along with
- this program. If not, see http://www.gnu.org/licenses/.
- */
+#ifndef __DMA_BUF_H__ +#define __DMA_BUF_H__
+#include <linux/file.h> +#include <linux/err.h> +#include <linux/device.h> +#include <linux/scatterlist.h> +#include <linux/list.h>
+struct dma_buf;
+/**
- enum dma_buf_optype - Used to denote the operation for which
- get_scatterlist() is called. This will help in implementing a wait(op)
- for sync'ing.
- @DMA_BUF_OP_READ: read operation will be done on this scatterlist
- @DMA_BUF_OP_WRITE: write operation will be done on this scatterlist
- */
+enum dma_buf_optype {
- DMA_BUF_OP_READ = (1 << 0),
- DMA_BUF_OP_WRITE = (1 << 1),
- DMA_BUF_OP_MAX,
+};
+/**
- enum dma_buf_attr_flags - Defines the attributes for this buffer. This
- can allow 'late backing of buffer' based on first get_scatterlist() call.
- @DMA_BUF_ATTR_CONTIG: Contiguous buffer
- @DMA_BUF_ATTR_DISCONTIG: Discontiguous buffer
- @DMA_BUF_ATTR_CUSTOM: Platform specific buffer; should evolve into some
- attributes that define buffers on given specific platform
- */
+enum dma_buf_attr_flags {
- DMA_BUF_ATTR_CONTIG,
- DMA_BUF_ATTR_DISCONTIG,
- DMA_BUF_ATTR_CUSTOM,
- DMA_BUF_ATTR_MAX,
+};
+/**
- struct dma_buf_attachment - holds device-buffer attachment data
- @node: list_head to allow manipulation of list of dma_buf_attachment.
- @dev: device attached to the buffer.
- @priv: exporter-specific attachment data.
- */
+struct dma_buf_attachment {
- struct list_head node;
- struct device *dev;
- void *priv;
+};
+/**
- struct dma_buf_ops - operations possible on struct dma_buf
- @create: creates a struct dma_buf of a fixed size. Actual allocation
does not happen here.
- @attach: allows different devices to 'attach' themselves to the given
buffer. It might return -EBUSY to signal that backing storage
is already allocated and incompatible with the requirements
of requesting device.
- @detach: detach a given device from this buffer.
- @get_scatterlist: returns list of scatter pages allocated, increases
usecount of the buffer. Requires atleast one attach to be
called before.
- @put_scatterlist: decreases usecount of buffer, might deallocate scatter
pages.
- @mmap: memory map this buffer - optional.
- @release: release this buffer; to be called after the last dma_buf_put.
- @sync_sg_for_cpu: sync the sg list for cpu.
- @sync_sg_for_device: synch the sg list for device.
- */
+struct dma_buf_ops {
- void (*attach)(struct dma_buf *, struct device *,
struct dma_buf_attachment *);
- void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
- struct scatterlist * (*get_scatterlist)(struct dma_buf *,
struct dma_buf_attachment *,
enum dma_buf_optype,
enum dma_buf_attr_flags);
- void (*put_scatterlist)(struct dma_buf *, struct dma_buf_attachment *,
struct scatterlist *);
- /* allow mmap optionally for devices that need it */
- int (*mmap)(struct dma_buf *, struct vm_area_struct *);
- /* after final dma_buf_put() */
- void (*release)(struct dma_buf *);
- /* allow allocator to take care of cache ops */
- void (*sync_sg_for_cpu) (struct dma_buf *, struct device *);
- void (*sync_sg_for_device)(struct dma_buf *, struct device *);
+};
OK, it is not clear to me what the purpose is of the attachments.
If I understand the discussion from this list correctly, then the idea is that each device that wants to use this buffer first attaches itself by calling dma_buf_attach().
Then at some point the application asks some driver to export the buffer. So the driver calls dma_buf_export() and passes its own dma_buf_ops. In other words, this becomes the driver that 'controls' the memory, right?
Another driver that receives the fd will call dma_buf_get() and can then call e.g. get_scatterlist from dma_buf->ops. (As an aside: I would make inline functions that take a dma_buf pointer and call the corresponding op, rather than requiring drivers to go through ops directly)
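(Something like this, for instance - just a sketch of the wrapper idea, using the ops as posted:)

static inline struct scatterlist *
dma_buf_get_scatterlist(struct dma_buf *dmabuf,
			struct dma_buf_attachment *attach,
			enum dma_buf_optype op,
			enum dma_buf_attr_flags flags)
{
	/* importers call this helper instead of dereferencing dmabuf->ops */
	return dmabuf->ops->get_scatterlist(dmabuf, attach, op, flags);
}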
But what I miss in this picture is the role of dma_buf_attachment. I'm passing it to get_scatterlist, but which attachment is that? That of the calling driver? And what is the get_scatterlist implementation supposed to do with it?
I also read some discussion about what is supposed to happen if another device is attached after get_scatterlist was already called. Apparently the idea was that the old scatterlist is somehow migrated to a new one if that should be necessary? Although I got the impression that that involved a lot of hand-waving with a pinch of wishful thinking. But I may be wrong about that.
Anyway, I guess my main point is that this patch does not explain the role of the attachments and how they should be used (and who uses them).
One other thing: once you call REQBUFS on a V4L device, the V4L spec says that the memory should be allocated at that time. Because V4L often needs a lot of memory, that behavior makes sense: you know immediately whether you can get the memory or not. In addition, that memory is mmap-ed before the DMA is started.
This behavior may pose a problem if the idea is to wait with actually allocating memory until the pipeline is started.
Hmm, I'm rambling a bit, but I hope the gist of my mail is clear.
Regards,
Hans
+/**
+ * struct dma_buf - shared buffer object
+ * @file: file pointer used for sharing buffers across, and for refcounting.
+ * @attachments: list of dma_buf_attachment that denotes all devices attached.
+ * @ops: dma_buf_ops associated with this buffer object
+ * @priv: user specific private data
+ */
+struct dma_buf {
+	size_t size;
+	struct file *file;
+	struct list_head attachments;
+	struct dma_buf_ops *ops;
+	void *priv;
+};
+#ifdef CONFIG_DMA_SHARED_BUFFER
+struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
+						struct device *dev);
+void dma_buf_detach(struct dma_buf *dmabuf,
+				struct dma_buf_attachment *dmabuf_attach);
+struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops, int flags);
+int dma_buf_fd(struct dma_buf *dmabuf);
+struct dma_buf *dma_buf_get(int fd);
+void dma_buf_put(struct dma_buf *dmabuf);
+#else
+static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
+							struct device *dev)
+{
+	return ERR_PTR(-ENODEV);
+}
+static inline void dma_buf_detach(struct dma_buf *dmabuf,
+				struct dma_buf_attachment *dmabuf_attach)
+{
+	return;
+}
+static inline struct dma_buf *dma_buf_export(void *priv,
+						struct dma_buf_ops *ops,
+						int flags)
+{
+	return ERR_PTR(-ENODEV);
+}
+static inline int dma_buf_fd(struct dma_buf *dmabuf)
+{
+	return -ENODEV;
+}
+static inline struct dma_buf *dma_buf_get(int fd)
+{
+	return ERR_PTR(-ENODEV);
+}
+static inline void dma_buf_put(struct dma_buf *dmabuf)
+{
+	return;
+}
+#endif /* CONFIG_DMA_SHARED_BUFFER */
+#endif /* __DMA_BUF_H__ */
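As a rough illustration of how an exporting driver might use the declarations above (my_alloc_buf and my_dma_buf_ops are hypothetical allocator-side objects, and error handling is reduced to a single check):

	struct dma_buf *dmabuf;
	int fd;

	/* wrap the allocator's private object and its ops into a dma_buf */
	dmabuf = dma_buf_export(my_alloc_buf, &my_dma_buf_ops, O_RDWR);
	if (IS_ERR_OR_NULL(dmabuf))
		return -EINVAL;

	/* get an fd that userspace can pass on to the importing drivers */
	fd = dma_buf_fd(dmabuf);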
Hi Hans,
I'll try to explain a bit, after all I've been pushing this attachment business quite a bit.
On Tue, Sep 27, 2011 at 03:24:24PM +0200, Hans Verkuil wrote:
OK, it is not clear to me what the purpose is of the attachments.
If I understand the discussion from this list correctly, then the idea is that each device that wants to use this buffer first attaches itself by calling dma_buf_attach().
Then at some point the application asks some driver to export the buffer. So the driver calls dma_buf_export() and passes its own dma_buf_ops. In other words, this becomes the driver that 'controls' the memory, right?
Actually, the ordering is the other way round. First, some driver calls dma_buf_export, userspace then passes around the fd to all other drivers, they do an import and call attach. While all this happens, the driver that exported the dma_buf does not (yet) allocate any backing storage.
Another driver that receives the fd will call dma_buf_get() and can then call e.g. get_scatterlist from dma_buf->ops. (As an aside: I would make inline functions that take a dma_buf pointer and call the corresponding op, rather than requiring drivers to go through ops directly)
Well, drivers should only call get_scatterlist when they actually need to access the memory. This way the originating driver can go through the list of all attached devices and decide where to allocate backing storage on the first get_scatterlist.
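To make that ordering concrete, an importer's flow could look roughly like this (optype/flags values and error handling are left out, 'dev' stands for the importing device; this is only a sketch of the intended call sequence, not code from the patch):

	struct dma_buf *dmabuf = dma_buf_get(fd);	/* fd received via userspace */
	struct dma_buf_attachment *attach;
	struct scatterlist *sg;

	attach = dma_buf_attach(dmabuf, dev);	/* bookkeeping only, no allocation yet */

	/* ... later, only when the DMA is actually about to start ... */
	sg = dmabuf->ops->get_scatterlist(dmabuf, attach, optype, flags);

	/* ... program the DMA using sg ... */

	dmabuf->ops->put_scatterlist(dmabuf, attach, sg);
	dma_buf_detach(dmabuf, attach);
	dma_buf_put(dmabuf);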
But what I miss in this picture is the role of dma_buf_attachment. I'm passing it to get_scatterlist, but which attachment is that? That of the calling driver? And what is the get_scatterlist implementation supposed to do with it?
See above, essentially an attachment is just list bookkeeping for all the devices that take part in a buffer sharing.
I also read some discussion about what is supposed to happen if another device is attached after get_scatterlist was already called. Apparently the idea was that the old scatterlist is somehow migrated to a new one if that should be necessary? Although I got the impression that that involved a lot of hand-waving with a pinch of wishful thinking. But I may be wrong about that.
Now this is where it's getting "interesting". If all drivers guard their usage with get_scatterlist/put_scatterlist, and we add a new driver, and all drivers that currently hold onto a mapping are known to release that with put_scatterlist in a finite time, we can do Cool Stuff (tm). First, that scenario actually happens for e.g. a video pipe, where we cycle through buffers.
Now when adding a new device with stricter backing storage constraints, the originator can just stall in the get_scatterlist call until all outstanding access has completed (signalled by put_scatterlist), move the object around and let things continue merrily. The video pipe might stutter a bit when e.g. switching on the encoder until all buffers have settled into the new place, but it should Just Work (tm).
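A very rough sketch of what that stall could look like inside an exporter's get_scatterlist (none of this is in the posted patch; usecount, release_wq and the helper functions are hypothetical exporter-side pieces):

static struct scatterlist *my_get_scatterlist(struct dma_buf *dmabuf,
				struct dma_buf_attachment *attach,
				enum dma_buf_optype optype,
				enum dma_buf_attr_flags flags)
{
	struct my_buffer *buf = dmabuf->priv;

	if (needs_relocation_for(buf, attach)) {
		/* wait until every importer has dropped its mapping */
		wait_event(buf->release_wq, atomic_read(&buf->usecount) == 0);
		relocate_backing_storage(buf);
	}

	atomic_inc(&buf->usecount);	/* dropped again in put_scatterlist */
	return map_into_device(buf, attach);
}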
Anyway, I guess my main point is that this patch does not explain the role of the attachments and how they should be used (and who uses them).
I agree.
One other thing: once you call REQBUFS on a V4L device the V4L spec says that the memory should be allocated at that time. Because V4L often needs a lot of memory that behavior makes sense: you know immediately if you can get the memory or not. In addition, that memory is mmap-ed before the DMA is started.
If that is actually a fixed requirement for v4l, that's a good reason for mmap support on the dma_buf object. We could hide all the complexity of shooting down userspace mappings on buffer movements from the drivers. Can you elaborate a bit on this?
This behavior may pose a problem if the idea is to wait with actually allocating memory until the pipeline is started.
I think you're looking at v4lv3 ;-)
More seriously all modern linux apis for pushing frames out use one of two modes:
- gimme the next frame to draw into (dri2)
- here's the next frame I've drawn into (wayland)
To make that fast, we obviously need to recycle buffers. But from a semantic point of view, you only ever have one buffer, namely the current one. All the other N buffers to make the graphics pipeline not stutter are transparently in-flight somewhere.
Imo such a dynamic scheme has a few advantages:
- there's just no way to know the amount of buffers you need up-front on any reasonable complex graphics pipeline. As soon as a gpu is in the mix, it's best effort. With a dynamic limit on the in-flight buffers you can cope with latencies until you hit -ENOMEM. With a fixed set you always have to make a compromise and can't really allocate for the worst case - it will hinder stuff running in the background.
- in the usual case you need much fewer buffers to make any given pipeline run stutter-free than in the worst case. No point wasting that memory.
Now I have no idea how you could shoe-horn that onto the current v4l interfaces.
Hmm, I'm rambling a bit, but I hope the gist of my mail is clear.
It's clear and I think you're raising good points.
Cheers, Daniel
On Tue, Sep 27, 2011 at 9:19 AM, Daniel Vetter daniel@ffwll.ch wrote:
One other thing: once you call REQBUFS on a V4L device the V4L spec says that the memory should be allocated at that time. Because V4L often needs a lot of memory that behavior makes sense: you know immediately if you can get the memory or not. In addition, that memory is mmap-ed before the DMA is started.
If that is actually a fixed requirement for v4l, that's a good reason for mmap support on the dma_buf object. We could hide all the complexity of shooting down userspace mappings on buffer movements from the drivers. Can you elaborate a bit on this?
I would hope if we are using V4L2_MEMORY_DMABUF that V4L should not create a user mapping of the buffer.
Given that V4L2_MEMORY_DMABUF is something new, I think it is fine to spec out this part of the V4L spec to not require a userspace mapping.
BR, -R
Hi Hans, Daniel, Rob,
On 28 September 2011 04:07, Clark, Rob rob@ti.com wrote:
On Tue, Sep 27, 2011 at 9:19 AM, Daniel Vetter daniel@ffwll.ch wrote:
One other thing: once you call REQBUFS on a V4L device the V4L spec says that the memory should be allocated at that time. Because V4L often needs a lot of memory that behavior makes sense: you know immediately if you can get the memory or not. In addition, that memory is mmap-ed before the DMA is started.
If that is actually a fixed requirement for v4l, that's a good reason for mmap support on the dma_buf object. We could hide all the complexity of shooting down userspace mappings on buffer movements from the drivers. Can you elaborate a bit on this?
I would hope if we are using V4L2_MEMORY_DMABUF that V4L should not create a user mapping of the buffer.
Given that V4L2_MEMORY_DMABUF is something new, I think it is fine to spec out this part of the V4L spec to not require a userspace mapping.
Thanks for your questions and comments - and for clarifying the use cases to Hans. I will create a Documentation file explaining it as clearly and crisply as Daniel has done. I will also update the patch based on all your comments - I think a v3 posted here is warranted before I post to the upstream lists.
Thanks and best regards,
Sumit Semwal
Linaro Kernel Engineer - Graphics working group
Linaro.org http://www.linaro.org/ │ Open source software for ARM SoCs
Follow Linaro: Facebook http://www.facebook.com/pages/Linaro | Twitter http://twitter.com/#!/linaroorg | Blog http://www.linaro.org/linaro-blog/
On Tuesday, September 27, 2011 16:19:56 Daniel Vetter wrote:
Hi Hans,
I'll try to explain a bit, after all I've been pushing this attachment business quite a bit.
On Tue, Sep 27, 2011 at 03:24:24PM +0200, Hans Verkuil wrote:
OK, it is not clear to me what the purpose is of the attachments.
If I understand the discussion from this list correctly, then the idea is that each device that wants to use this buffer first attaches itself by calling dma_buf_attach().
Then at some point the application asks some driver to export the buffer. So the driver calls dma_buf_export() and passes its own dma_buf_ops. In other words, this becomes the driver that 'controls' the memory, right?
Actually, the ordering is the other way round. First, some driver calls dma_buf_export, userspace then passes around the fd to all other drivers, they do an import and call attach. While all this happens, the driver that exported the dma_buf does not (yet) allocate any backing storage.
Ah. This really should be documented better :-)
Another driver that receives the fd will call dma_buf_get() and can then call e.g. get_scatterlist from dma_buf->ops. (As an aside: I would make inline functions that take a dma_buf pointer and call the corresponding op, rather than requiring drivers to go through ops directly)
Well, drivers should only call get_scatterlist when they actually need to access the memory. This way the originating driver can go through the list of all attached devices and decide where to allocate backing storage on the first get_scatterlist.
Unless I am mistaken there is nothing there in the attachment datastructure at this moment to help determine the best device to use for the scatterlist. Right? It's just a list of devices together with an opaque pointer.
Another thing I don't get (sorry, I must be really dense today) is that the get_scatterlist op is set by the dma_buf_export call. I would expect that that op is part of the attachment data since I would expect that to be device specific.
At least, my assumption is that the actual memory is allocated by the first call to get_scatterlist?
Actually, I think this might be key: who allocates the actual memory and when? There is no documentation whatsoever on this crucial topic.
Before attempting to post this to a wider audience you really need to have proper documentation for this API and an example implementation.
But what I miss in this picture is the role of dma_buf_attachment. I'm passing it to get_scatterlist, but which attachment is that? That of the calling driver? And what is the get_scatterlist implementation supposed to do with it?
See above, essentially an attachment is just list bookkeeping for all the devices that take part in a buffer sharing.
I also read some discussion about what is supposed to happen if another device is attached after get_scatterlist was already called. Apparently the idea was that the old scatterlist is somehow migrated to a new one if that should be necessary? Although I got the impression that that involved a lot of hand-waving with a pinch of wishful thinking. But I may be wrong about that.
Now this is where it's getting "interesting". If all drivers guard their usage with get_scatterlist/put_scatterlist, and we add a new driver, and all drivers that currently hold onto a mapping are known to release that with put_scatterlist in a finite time, we can do Cool Stuff (tm). First, that scenario actually happens for e.g. a video pipe, where we cycle through buffers.
Now when adding a new device with stricter backing storage constraints, the originator can just stall in the get_scatterlist call until all outstanding access has completed (signalled by put_scatterlist), move the object around and let things continue merrily. The video pipe might stutter a bit when e.g. switching on the encoder until all buffers have settled into the new place, but it should Just Work (tm).
Sounds great. Of course how to determine 'stricter backing storage constraints' is currently just a lot of hand-waving :-)
Anyway, I guess my main point is that this patch does not explain the role of the attachments and how they should be used (and who uses them).
I agree.
One other thing: once you call REQBUFS on a V4L device the V4L spec says that the memory should be allocated at that time. Because V4L often needs a lot of memory that behavior makes sense: you know immediately if you can get the memory or not. In addition, that memory is mmap-ed before the DMA is started.
If that is actually a fixed requirement for v4l, that's a good reason for mmap support on the dma_buf object. We could hide all the complexity of shooting down userspace mappings on buffer movements from the drivers. Can you elaborate a bit on this?
There's not much to elaborate on. Calling VIDIOC_REQBUFS is supposed to allocate all buffers and pin them in memory, thus ensuring that all is ready for DMA.
Note that drivers based around the videobuf framework actually postpone the allocation until the first use. This violates the spec and is fixed in the videobuf2 framework.
As Rob suggested, this is a requirement that can probably be relaxed for new memory types (such as DMA_BUF).
While not a requirement, it is common practice that applications mmap the buffers immediately after the call to REQBUFS. For dma_buf it is very likely that such mmap operations would go through the dma_buf fd. Of course, if you just hand over the buffer to another device, then there is no need for userspace to call mmap.
This behavior may pose a problem if the idea is to wait with actually allocating memory until the pipeline is started.
I think you're looking at v4lv3 ;-)
More seriously all modern linux apis for pushing frames out use one of two modes:
- gimme the next frame to draw into (dri2)
- here's the next frame I've drawn into (wayland)
To make that fast, we obviously need to recycle buffers. But from a semantic point of view, you only ever have one buffer, namely the current one. All the other N buffers to make the graphics pipeline not stutter are transparently in-flight somewhere.
Imo such a dynamic scheme has a few advantages:
- there's just no way to know the amount of buffers you need up-front on any reasonable complex graphics pipeline. As soon as a gpu is in the mix, it's best effort. With a dynamic limit on the in-flight buffers you can cope with latencies until you hit -ENOMEM. With a fixed set you always have to make a compromise and can't really allocate for the worst case - it will hinder stuff running in the background.
Currently in V4L2 the number of buffers is fixed after calling REQBUFS. That is, you can't add more buffers later. However, work is being done to lift this limitation (it's almost finished, I expect to see this in 3.2 or 3.3).
- in the usual case you need much fewer buffers to make any given pipeline run stutter-free than in the worst case. No point wasting that memory.
Now I have no idea how you could shoe-horn that onto the current v4l interfaces.
Not in the current API, but we hope to have the required flexibility quite soon. Certainly in time for dma_buf.
Hmm, I'm rambling a bit, but I hope the gist of my mail is clear.
It's clear and I think you're raising good points.
Good :-)
Hans
On Wed, Sep 28, 2011 at 10:51:08AM +0200, Hans Verkuil wrote:
On Tuesday, September 27, 2011 16:19:56 Daniel Vetter wrote:
Hi Hans,
I'll try to explain a bit, after all I've been pushing this attachment business quite a bit.
On Tue, Sep 27, 2011 at 03:24:24PM +0200, Hans Verkuil wrote:
OK, it is not clear to me what the purpose is of the attachments.
If I understand the discussion from this list correctly, then the idea is that each device that wants to use this buffer first attaches itself by calling dma_buf_attach().
Then at some point the application asks some driver to export the buffer. So the driver calls dma_buf_export() and passes its own dma_buf_ops. In other words, this becomes the driver that 'controls' the memory, right?
Actually, the ordering is the other way round. First, some driver calls dma_buf_export, userspace then passes around the fd to all other drivers, they do an import and call attach. While all this happens, the driver that exported the dma_buf does not (yet) allocate any backing storage.
Ah. This really should be documented better :-)
Another driver that receives the fd will call dma_buf_get() and can then call e.g. get_scatterlist from dma_buf->ops. (As an aside: I would make inline functions that take a dma_buf pointer and call the corresponding op, rather than requiring drivers to go through ops directly)
Well, drivers should only call get_scatterlist when they actually need to access the memory. This way the originating driver can go through the list of all attached devices and decide where to allocate backing storage on the first get_scatterlist.
Unless I am mistaken there is nothing there in the attachment datastructure at this moment to help determine the best device to use for the scatterlist. Right? It's just a list of devices together with an opaque pointer.
Well, I want a struct device pointer in there, not something opaque. But yes, an attachment is just bookkeeping.
Another thing I don't get (sorry, I must be really dense today) is that the get_scatterlist op is set by the dma_buf_export call. I would expect that that op is part of the attachment data since I would expect that to be device specific.
There's a dma_buf_attachment parameter so you can get at the device the mapping is for.
At least, my assumption is that the actual memory is allocated by the first call to get_scatterlist?
Yes.
Actually, I think this might be key: who allocates the actual memory and when? There is no documentation whatsoever on this crucial topic.
Ok, let me elaborate a bit on how I see this. The memory will be allocated by whichever driver exported the dma_buf originally. We probably end up with a bunch of helper functions for the common case, but this approach allows crazy stuff like sharing special memory (e.g. TILER on omap) which can only be accessed by certain devices. I think this is easiest with a few examples:
- On a sane hw, e.g. buffer exported by i915 on x86: attach is a noop (and hence should be optional), in get_scatterlist I'll just create a sg_list out of the struct page * array I get from the gem backing storage and then call map_sg and return that sg_list (now mapped into the device address space).
- On insane hw with strange dma requirements (like dram bank placement, contiguous, hugepagesize for puny tlbs, ...): Attach could check whether the buffer is already fixed or not (e.g. the kernel fb console probably can't be moved) and if the fixed buffer could be mapped by the device. That's why attach needs a return value. If the buffer is not pinned or not yet allocated, do nothing.
On get_scatterlist walk the attachment list and grab the dma requirements of all devices (via some platform/dma specific stuff attached to struct device). Then walk through a list of dma allocators your platform provides (page_alloc, cma, ...) until you have one that satisfies the requirements and alloc the backing storage.
Then create an sg_list out of it and map_sg it, like above.
For arm we probably need some helper functions to make this easier.
- Sharing of a special buffer object, like a tiled buffer remapping into omap's TILER. attach would check whether the device can actually access that special io area (iirc on omap only certain devices can access the tiler). get_scatterlist would just create a 1-entry sg_list that points directly to the physical address (in device space) of the remapped buffer. This is why I want get_scatterlist to return a mapped buffer (and for api cleanliness).
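For the first ("sane hw") example, a get_scatterlist for an exporter that already holds a struct page * array might look roughly like this (assuming the attachment records the struct device passed to dma_buf_attach(); my_obj and its fields are illustrative and error handling is omitted):

static struct scatterlist *my_get_scatterlist(struct dma_buf *dmabuf,
				struct dma_buf_attachment *attach,
				enum dma_buf_optype optype,
				enum dma_buf_attr_flags flags)
{
	struct my_obj *obj = dmabuf->priv;	/* exporter's own object */
	struct sg_table *st;
	struct scatterlist *sg;
	int i;

	st = kmalloc(sizeof(*st), GFP_KERNEL);
	sg_alloc_table(st, obj->npages, GFP_KERNEL);

	for_each_sg(st->sgl, sg, obj->npages, i)
		sg_set_page(sg, obj->pages[i], PAGE_SIZE, 0);

	/* map into the importing device's address space */
	dma_map_sg(attach->dev, st->sgl, st->nents, DMA_BIDIRECTIONAL);

	return st->sgl;
}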
Cheers, Daniel
On Wednesday, September 28, 2011 12:52:58 Daniel Vetter wrote:
On Wed, Sep 28, 2011 at 10:51:08AM +0200, Hans Verkuil wrote:
On Tuesday, September 27, 2011 16:19:56 Daniel Vetter wrote:
Hi Hans,
I'll try to explain a bit, after all I've been pushing this attachment business quite a bit.
On Tue, Sep 27, 2011 at 03:24:24PM +0200, Hans Verkuil wrote:
OK, it is not clear to me what the purpose is of the attachments.
If I understand the discussion from this list correctly, then the idea is that each device that wants to use this buffer first attaches itself by calling dma_buf_attach().
Then at some point the application asks some driver to export the buffer. So the driver calls dma_buf_export() and passes its own dma_buf_ops. In other words, this becomes the driver that 'controls' the memory, right?
Actually, the ordering is the other way round. First, some driver calls dma_buf_export, userspace then passes around the fd to all other drivers, they do an import and call attach. While all this happens, the driver that exported the dma_buf does not (yet) allocate any backing storage.
Ah. This really should be documented better :-)
Another driver that receives the fd will call dma_buf_get() and can then call e.g. get_scatterlist from dma_buf->ops. (As an aside: I would make inline functions that take a dma_buf pointer and call the corresponding op, rather than requiring drivers to go through ops directly)
Well, drivers should only call get_scatterlist when they actually need to access the memory. This way the originating driver can go through the list of all attached devices and decide where to allocate backing storage on the first get_scatterlist.
Unless I am mistaken there is nothing there in the attachment datastructure at this moment to help determine the best device to use for the scatterlist. Right? It's just a list of devices together with an opaque pointer.
Well, I want a struct device pointer in there, not something opaque. But yes, an attachment is just bookkeeping.
Another thing I don't get (sorry, I must be really dense today) is that the get_scatterlist op is set by the dma_buf_export call. I would expect that that op is part of the attachment data since I would expect that to be device specific.
There's a dma_buf_attachment parameter so you can get at the device the mapping is for.
At least, my assumption is that the actual memory is allocated by the first call to get_scatterlist?
Yes.
Actually, I think this might be key: who allocates the actual memory and when? There is no documentation whatsoever on this crucial topic.
Ok, let me elaborate a bit on how I see this. The memory will be allocated by whichever driver exported the dma_buf originally. We probably end up with a bunch of helper functions for the common case, but this approach allows crazy stuff like sharing special memory (e.g. TILER on omap) which can only be accessed by certain devices. I think this is easiest with a few examples:
- On a sane hw, e.g. buffer exported by i915 on x86: attach is a noop (and hence should be optional), in get_scatterlist I'll just create a sg_list out of the struct page * array I get from the gem backing storage and then call map_sg and return that sg_list (now mapped into the device address space).
OK.
- On insane hw with strange dma requirements (like dram bank placement, contiguous, hugepagesize for puny tlbs, ...): Attach could check whether the buffer is already fixed or not (e.g. the kernel fb console probably can't be moved) and if the fixed buffer could be mapped by the device. That's why attach needs a return value. If the buffer is not pinned or not yet allocated, do nothing.
OK.
On get_scatterlist walk the attachment list and grab the dma requirements of all devices (via some platform/dma specific stuff attached to struct device). Then walk through a list of dma allocators your platform provides (page_alloc, cma, ...) until you have one that satisfies the requirements and alloc the backing storage.
Then create an sg_list out of it and map_sg it, like above.
Hmm. Here is where I run into problems. Right now the get_scatterlist op is set by the driver that exports the buffer. So each driver supporting dma_buf would have its own get_scatterlist implementation, each able to walk the attachment list. This feels wrong to me. get_scatterlist is doing too much and this functionality should probably be split up somehow.
There is also the problem of how to determine whether all requirements are satisfied: that's currently left undefined.
For arm we probably need some helper functions to make this easier.
- Sharing of a special buffer object, like a tiled buffer remapping into omap's TILER. attach would check whether the device can actually access that special io area (iirc on omap only certain devices can access the tiler). get_scatterlist would just create a 1-entry sg_list that points directly to the physical address (in device space) of the remapped buffer. This is why I want get_scatterlist to return a mapped buffer (and for api cleanliness).
OK.
I think I would like to see a demo implementation and proper documentation. Both help a lot to bring the API into focus and identify what works and what doesn't. If it's hard to explain or to implement, then we probably should improve that part :-)
Regards,
Hans
On Wed, Sep 28, 2011 at 13:12, Hans Verkuil hverkuil@xs4all.nl wrote:
On Wednesday, September 28, 2011 12:52:58 Daniel Vetter wrote:
On get_scatterlist walk the attachment list and grab the dma requirements of all devices (via some platform/dma specific stuff attached to struct device). Then walk through a list of dma allocators your platform provides (page_alloc, cma, ...) until you have one that satisfies the requirements and alloc the backing storage.
Then create an sg_list out of it and map_sg it, like above.
Hmm. Here is where I run into problems. Right now the get_scatterlist op is set by the driver that exports the buffer. So each driver supporting dma_buf would have its own get_scatterlist implementation, each able to walk the attachment list. This feels wrong to me. get_scatterlist is doing too much and this functionality should probably be split up somehow.
I think we'll end up with a default implementation that just creates a dma_buf object and hides all the magic from drivers. But for special cases (like when using objects mapped into auto-tiling iommus) and drivers with special infrastructure (drm drivers) I think we should be able to override this.
Maybe we should add such a thing as an official function to the api. The x86 implementation should be fairly simple (array of struct page* for backing storage), on arm it would deal with all the insanity there is.
There is also the problem of how to determine whether all requirements are satisfied: that's currently left undefined.
I think that should be a platform/architecture problem. And I'm happy not to be forced to deal with arm ;-) In general, if anything of that arm-craziness leaks into the interface, that's bad, so I regard your question here as a good sign ...
For arm we probably need some helper functions to make this easier.
- Sharing of a special buffer object, like a tiled buffer remapping into omap's TILER. attach would check whether the device can actually access that special io area (iirc on omap only certain devices can access the tiler). get_scatterlist would just create a 1-entry sg_list that points directly to the physical address (in device space) of the remapped buffer. This is why I want get_scatterlist to return a mapped buffer (and for api cleanliness).
OK.
I think I would like to see a demo implementation and proper documentation. Both help a lot to bring the API into focus and identify what works and what doesn't. If it's hard to explain or to implement, then we probably should improve that part :-)
Yeah, definitely. I think it'll take a while for the arm-insanity-induced infrastructure to shape up unfortunately, but maybe I can hack something together on x86 with i915 as a proof of concept.
Cheers, Daniel
Hi Daniel,
On 09/28/2011 04:22 PM, Daniel Vetter wrote:
On Wed, Sep 28, 2011 at 10:51:08AM +0200, Hans Verkuil wrote:
On Tuesday, September 27, 2011 16:19:56 Daniel Vetter wrote:
Hi Hans,
I'll try to explain a bit, after all I've been pushing this attachment business quite a bit.
On Tue, Sep 27, 2011 at 03:24:24PM +0200, Hans Verkuil wrote:
OK, it is not clear to me what the purpose is of the attachments.
If I understand the discussion from this list correctly, then the idea is that each device that wants to use this buffer first attaches itself by calling dma_buf_attach().
Then at some point the application asks some driver to export the buffer. So the driver calls dma_buf_export() and passes its own dma_buf_ops. In other words, this becomes the driver that 'controls' the memory, right?
Actually, the ordering is the other way round. First, some driver calls dma_buf_export, userspace then passes around the fd to all other drivers, they do an import and call attach. While all this happens, the driver that exported the dma_buf does not (yet) allocate any backing storage.
Ah. This really should be documented better :-)
Another driver that receives the fd will call dma_buf_get() and can then call e.g. get_scatterlist from dma_buf->ops. (As an aside: I would make inline functions that take a dma_buf pointer and call the corresponding op, rather than requiring drivers to go through ops directly)
Well, drivers should only call get_scatterlist when they actually need to access the memory. This way the originating driver can go through the list of all attached devices and decide where to allocate backing storage on the first get_scatterlist.
Time is something I am now worried about with the above approach. At the Linaro Connect in Cambridge, I had raised an idea of drivers registering their memory requirement capabilities at their startup (and vice-versa). I will try to elaborate it again below:
Driver A and driver B wish to share the same buffer pool. For some reason both have different buffer allocation requirements (one needs a contiguous allocation and the other can use its own IOMMU to manage the buffer chunks). When driver A's probe is called, it will *also* register its capabilities with some buffer bank (we can think of extending vb2 in the case of v4l2). Similarly, driver B will register its capabilities. Where to keep this piece of meta-data, I am not sure at this point of time. When the first allocation call happens (from the application, which may call A or B in whatever order), the allocator will check the common set of buffer requirements and then make an appropriate allocation, going through all the drivers' buffer capability information (which should be quick). Even if the application asks the drivers to release buffers and tries to request them again after some time, the allocation will not take much time, as the meta-data will be ready with the (central) buffer bank. However, I am again not sure how to handle the scenario where the buffer capabilities of both devices don't match at all. Maybe such cases can be taken care of during the driver design itself. See the sketch after the next paragraph for an illustration of the idea.
For the above, I was thinking of some multimedia application like a camera. The user may open the camera app, which starts showing a preview. He may choose to close it and then immediately reopen it. If we take too much time just to decide the kind of buffer to allocate, the start-up time of the application will be too long.
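Purely as an illustration of the idea (none of these types exist in the posted patch), each attaching device could register a small constraint record at probe time, and the allocator could intersect them before the first allocation:

struct dma_buf_constraints {
	bool		must_be_contiguous;
	unsigned long	alignment;	/* required start alignment, in bytes */
	u64		dma_mask;	/* address range the device can reach */
};

static void merge_constraints(struct dma_buf_constraints *acc,
				const struct dma_buf_constraints *dev)
{
	acc->must_be_contiguous |= dev->must_be_contiguous;
	acc->alignment = max(acc->alignment, dev->alignment);
	acc->dma_mask &= dev->dma_mask;	/* memory must be reachable by both */
}

If the merged constraints turn out to be unsatisfiable, that is exactly the "capabilities don't match at all" case above, and it would at least be detected before any allocation is attempted.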
Unless I am mistaken there is nothing there in the attachment datastructure at this moment to help determine the best device to use for the scatterlist. Right? It's just a list of devices together with an opaque pointer.
Well, I want a struct device pointer in there, not something opaque. But yes, an attachment is just bookkeeping.
Another thing I don't get (sorry, I must be really dense today) is that the get_scatterlist op is set by the dma_buf_export call. I would expect that that op is part of the attachment data since I would expect that to be device specific.
There's a dma_buf_attachment parameter so you can get at the device the mapping is for.
At least, my assumption is that the actual memory is allocated by the first call to get_scatterlist?
Yes.
Actually, I think this might be key: who allocates the actual memory and when? There is no documentation whatsoever on this crucial topic.
Ok, let me elaborate a bit on how I see this. The memory will be allocated by whichever driver exported the dma_buf originally. We probably end up with a bunch of helper functions for the common case, but this approach allows crazy stuff like sharing special memory (e.g. TILER on omap) which can only be accessed by certain devices. I think this is easiest with a few examples:
On a sane hw, e.g. buffer exported by i915 on x86: attach is a noop (and hence should be optional), in get_scatterlist I'll just create a sg_list out of the struct page * array I get from the gem backing storage and then call map_sg and return that sg_list (now mapped into the device address space).
On insane hw with strange dma requirements (like dram bank placement, contiguous, hugepagesize for puny tlbs, ...): Attach could check whether the buffer is already fixed or not (e.g. the kernel fb console probably can't be moved) and if the fixed buffer could be mapped by the device. That's why attach needs a return value. If the buffer is not pinned or not yet allocated, do nothing.
On get_scatterlist walk the attachment list and grab the dma requirements of all devices (via some platform/dma specific stuff attached to struct device). Then walk through a list of dma allocators your platform provides (page_alloc, cma, ...) until you have one that satisfies the requirements and alloc the backing storage.
Then create an sg_list out of it and map_sg it, like above.
For arm we probably need some helper functions to make this easier.
Sharing of a special buffer object, like a tiled buffer remapping into omap's TILER. attach would check whether the device can actually access that special io area (iirc on omap only certain devices can access the tiler). get_scatterlist would just create a 1-entry sg_list that points directly to the physical address (in device space) of the remapped buffer. This is why I want get_scatterlist to return a mapped buffer (and for api cleanliness).
Cheers, Daniel
Regards, Subash