Linaro-mm-sig October 2013

linaro-mm-sig@lists.linaro.org

12 participants
5 discussions

[PATCH 0/2] Revert support for reserved memory regions defined in device tree

by Marek Szyprowski

Hi all! Benjamin Herrenschmidt pointed a few issues in the proposed design of device tree bindings for contiguous memory allocator and reserved memory regions: https://lkml.org/lkml/2013/9/15/151 http://www.spinics.net/lists/arm-kernel/msg273548.html Some time has passed, but there is still no consensus on the bindings for the reserved memory and various drawback of this solution has been shown, so in my opinion the best I can do now is to revert them completely and start from scratch again later. This patch series reverts patches related to device tree bindings proposed in the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/263216 and merged by commit 64c353864e3f7ccba0ade1bd6f562f9a3bc7e68d ("Merge branch 'for-v3.12' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping"). Best regards Marek Szyprowski Samsung R&D Institute Poland Marek Szyprowski (2): Revert "ARM: init: add support for reserved memory defined by device tree" Revert "drivers: of: add initialization code for dma reserved memory" Documentation/devicetree/bindings/memory.txt | 168 ------------------------- arch/arm/mm/init.c | 3 - drivers/of/Kconfig | 6 - drivers/of/Makefile | 1 - drivers/of/of_reserved_mem.c | 173 -------------------------- drivers/of/platform.c | 4 - include/linux/of_reserved_mem.h | 14 --- 7 files changed, 369 deletions(-) delete mode 100644 Documentation/devicetree/bindings/memory.txt delete mode 100644 drivers/of/of_reserved_mem.c delete mode 100644 include/linux/of_reserved_mem.h -- 1.7.9.5

11 years, 7 months

dma-buf non-coherent mmap

by Thomas Hellstrom

Hi! I'm just looking over what's needed to implement drm Prime / dma-buf exports + imports in the vmwgfx driver. It seems like most dma-bufs ops are quite straightforward to implement except user-space mmap(). The reason being that vmwgfx dma-bufs will be using completely non-coherent memory, whenever there needs to be CPU accesses. The accelerated contents resides in an opaque structure on the device into which we can DMA to and from, so for mmap to work we need to zap ptes and DMA to the device when doing something accelerated, and on the first page-fault DMA data back and wait for idle if the device did a write to the dma-buf. Now this shouldn't really be a problem if dma-bufs were only used for cross-device sharing, but since people apparently want to use dma-buf file handles to share CPU data between processes it really becomes a serious problem. Needless to say we'd want to limit the size of the DMAs, and have mmap users request regions for read, and mark regions dirty for write, something similar to gallium's texture transfer objects. Any ideas? /Thomas

11 years, 7 months

thoughts of looking at android fences

by Maarten Lankhorst

Hey, So I took a look at the sync stuff in android, in a lot of ways I believe that they're similar, yet subtly different. Most of the stuff I looked at is from the sync.h header in drivers/staging, so maybe my knowledge is incomplete. The timeline is similar to what I called a fence context. Each command stream on a gpu can have a context. Because nvidia hardware can have 4095 separate timelines, I didn't want to keep the bookkeeping for each timeline, although I guess that it's already done. Maybe it could be done in a unified way for each driver, making a transition to timelines that can be used by android easier. I did not have an explicit syncpoint addition, but I think that sync points + sync_fence were similar to what I did with my dma-fence stuff, except slightly different. In my approach the dma-fence is signaled after all sync_points are done AND the queued commands are executed. In effect the dma-fence becomes the next syncpoint, depending on all previous dma-fence syncpoints. An important thing to note is that dma-fence is kernelspace only, so it might be better to rename it to syncpoint, and use fence for the userspace interface. A big difference is locking, I assume in my code that most fences emitted are not waited on, so the fastpath fence_signal is a test_and_set_bit plus test_bit. A single lock is used for the waitqueue and callbacks, with the waitqueue being implemented internally as an asynchronous callback. The lock is provided by the driver, which makes adding support for old hardware that has no reliable way of notifying completion of events easier. I avoided using global locks, but I think for debugfs support I may end up having to add some. The dma fence looks similar overall, except that I allow overriding some stuff and keep less track about state. I do believe that I can create a userspace interface around dma_fence that works similar to android, and the kernel space interface could be done in a similar way too. One thing though: is it really required to merge fences? It seems to me that if I add a poll callback userspace could simply do a poll on a list of fences. This would give userspace all the information it needs about each individual fence. The thing about wait/wound mutexes can be ignored for this discussion. It's really just a method of adding a fence to a dma-buf, and building a list of all dma-fences to wait on in the kernel before starting a command buffer, and setting a new fence to all the dma-bufs to signal completion of the event. Regardless of the sync mechanism we'll decide on, this stuff wouldn't change. Depending on feedback I'll try reflashing my nexus 7 to stock android, and work on trying to convert android syncpoints to dma-fence, which I'll probably rename to syncpoints. ~Maarten

11 years, 7 months

[PATCH] gpu: drm: Add helpers to allow export gem cma objects

by benjamin.gaignard＠linaro.org

From: Benjamin Gaignard <benjamin.gaignard(a)linaro.org> DRM already offer helpers to use CMA for dumb buffers. This patch add helpers to export/import gem_cam objects and allow them to be mmap from userland. The goal is to make working this kind of sequence: create_dumb, get fd from buffer handle and then use fd (maybe in another process which may ignore it is comming from DRM) to mmap the buffer. drm_gem_cma_prime_export() add O_RDWR to flags to be sure that memory could be mmapped later with PROT_WRITE flag. Signed-off-by: Benjamin Gaignard <benjamin.gaignard(a)linaro.org> --- drivers/gpu/drm/drm_gem_cma_helper.c | 192 ++++++++++++++++++++++++++++++++++ include/drm/drm_gem_cma_helper.h | 6 ++ 2 files changed, 198 insertions(+) diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c b/drivers/gpu/drm/drm_gem_cma_helper.c index bad85bb..936c337 100644 --- a/drivers/gpu/drm/drm_gem_cma_helper.c +++ b/drivers/gpu/drm/drm_gem_cma_helper.c @@ -21,12 +21,204 @@ #include <linux/slab.h> #include <linux/mutex.h> #include <linux/export.h> +#include <linux/dma-buf.h> #include <linux/dma-mapping.h> #include <drm/drmP.h> #include <drm/drm.h> #include <drm/drm_gem_cma_helper.h> +struct drm_gem_cma_dmabuf_attachment { + struct sg_table sgt; + enum dma_data_direction dir; + bool is_mapped; +}; + +static int drm_gem_cma_attach_dma_buf(struct dma_buf *dmabuf, + struct device *dev, + struct dma_buf_attachment *attach) +{ + struct drm_gem_cma_dmabuf_attachment *drm_gem_cma_attach; + + drm_gem_cma_attach = kzalloc(sizeof(*drm_gem_cma_attach), GFP_KERNEL); + if (!drm_gem_cma_attach) + return -ENOMEM; + + drm_gem_cma_attach->dir = DMA_NONE; + attach->priv = drm_gem_cma_attach; + + return 0; +} + +static void drm_gem_cma_detach_dma_buf(struct dma_buf *dmabuf, + struct dma_buf_attachment *attach) +{ + struct drm_gem_cma_dmabuf_attachment *drm_gem_cma_attach = attach->priv; + struct sg_table *sgt; + + if (!drm_gem_cma_attach) + return; + + sgt = &drm_gem_cma_attach->sgt; + + if (drm_gem_cma_attach->dir != DMA_NONE) + dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, + drm_gem_cma_attach->dir); + + sg_free_table(sgt); + kfree(drm_gem_cma_attach); + attach->priv = NULL; +} + +static struct sg_table * +drm_gem_cma_map_dma_buf(struct dma_buf_attachment *attach, + enum dma_data_direction dir) +{ + struct drm_gem_cma_dmabuf_attachment *drm_gem_cma_attach = attach->priv; + struct drm_gem_cma_object *cma_obj = attach->dmabuf->priv; + struct drm_device *dev = cma_obj->base.dev; + struct sg_table *sgt = NULL; + int nents, ret; + + /* just return current sgt if already requested. */ + if (drm_gem_cma_attach->dir == dir && drm_gem_cma_attach->is_mapped) + return &drm_gem_cma_attach->sgt; + + sgt = &drm_gem_cma_attach->sgt; + + ret = dma_common_get_sgtable(dev->dev, sgt, + cma_obj->vaddr, cma_obj->paddr, cma_obj->base.size); + if (ret) { + DRM_ERROR("failed to get sgt.\n"); + return ERR_PTR(-ENOMEM); + } + + mutex_lock(&dev->struct_mutex); + + if (dir != DMA_NONE) { + nents = dma_map_sg(attach->dev, sgt->sgl, sgt->orig_nents, dir); + if (!nents) { + DRM_ERROR("failed to map sgl with iommu.\n"); + sg_free_table(sgt); + sgt = ERR_PTR(-EIO); + goto err_unlock; + } + } + + drm_gem_cma_attach->is_mapped = true; + drm_gem_cma_attach->dir = dir; + attach->priv = drm_gem_cma_attach; + +err_unlock: + mutex_unlock(&dev->struct_mutex); + return sgt; +} + +static void drm_gem_cma_unmap_dma_buf(struct dma_buf_attachment *attach, + struct sg_table *sgt, + enum dma_data_direction dir) +{ + /* Nothing to do */ +} + +static void drm_gem_cma_dmabuf_release(struct dma_buf *dmabuf) +{ + struct drm_gem_cma_object *cma_obj = dmabuf->priv; + + /* + * drm_gem_cma_dmabuf_release() call means that file object's + * f_count is 0 and it calls drm_gem_object_handle_unreference() + * to drop the references that these values had been increased + * at drm_prime_handle_to_fd() + */ + if (cma_obj->base.export_dma_buf == dmabuf) { + cma_obj->base.export_dma_buf = NULL; + + /* + * drop this gem object refcount to release allocated buffer + * and resources. + */ + drm_gem_object_unreference_unlocked(&cma_obj->base); + } +} + +static void *drm_gem_cma_dmabuf_kmap_atomic(struct dma_buf *dma_buf, + unsigned long page_num) +{ + struct drm_gem_cma_object *cma_obj = dma_buf->priv; + return cma_obj->paddr; +} + +static int drm_gem_cma_dmabuf_mmap(struct dma_buf *dmabuf, + struct vm_area_struct *vma) +{ + struct drm_gem_cma_object *cma_obj = dmabuf->priv; + struct drm_device *dev = cma_obj->base.dev; + int ret; + + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); + + ret = dma_mmap_coherent(dev->dev, vma, + cma_obj->vaddr, cma_obj->paddr, cma_obj->base.size); + + if (ret) { + DRM_DEBUG_PRIME("Remapping memory failed, error: %d\n", ret); + return ret; + } + DRM_DEBUG_PRIME("%s: mapped dma addr 0x%08lx at 0x%08lx, size %ld\n", + __func__, (unsigned long)cma_obj->paddr, vma->vm_start, + cma_obj->base.size); + + return ret; +} + +static struct dma_buf_ops drm_gem_cma_dmabuf_ops = { + .attach = drm_gem_cma_attach_dma_buf, + .detach = drm_gem_cma_detach_dma_buf, + .map_dma_buf = drm_gem_cma_map_dma_buf, + .unmap_dma_buf = drm_gem_cma_unmap_dma_buf, + .kmap = drm_gem_cma_dmabuf_kmap_atomic, + .kmap_atomic = drm_gem_cma_dmabuf_kmap_atomic, + .mmap = drm_gem_cma_dmabuf_mmap, + .release = drm_gem_cma_dmabuf_release, +}; + +struct dma_buf *drm_gem_cma_prime_export(struct drm_device *drm_dev, + struct drm_gem_object *obj, int flags) +{ + struct drm_gem_cma_object *cma_obj = to_drm_gem_cma_obj(obj); + + flags |= O_RDWR; + return dma_buf_export(cma_obj, &drm_gem_cma_dmabuf_ops, + cma_obj->base.size, flags); +} +EXPORT_SYMBOL_GPL(drm_gem_cma_prime_export); + +struct drm_gem_object *drm_gem_cma_prime_import(struct drm_device *drm_dev, + struct dma_buf *dmabuf) +{ + struct drm_gem_cma_object *cma_obj; + + if (dmabuf->ops == &drm_gem_cma_dmabuf_ops) { + struct drm_gem_object *obj; + + cma_obj = dmabuf->priv; + obj = &cma_obj->base; + + /* is it from our device? */ + if (obj->dev == drm_dev) { + /* + * Importing dmabuf exported from out own gem increases + * refcount on gem itself instead of f_count of dmabuf. + */ + drm_gem_object_reference(obj); + return obj; + } + } + return NULL; +} +EXPORT_SYMBOL_GPL(drm_gem_cma_prime_import); + static unsigned int get_gem_mmap_offset(struct drm_gem_object *obj) { return (unsigned int)obj->map_list.hash.key << PAGE_SHIFT; diff --git a/include/drm/drm_gem_cma_helper.h b/include/drm/drm_gem_cma_helper.h index 63397ce..8ce21df 100644 --- a/include/drm/drm_gem_cma_helper.h +++ b/include/drm/drm_gem_cma_helper.h @@ -45,4 +45,10 @@ extern const struct vm_operations_struct drm_gem_cma_vm_ops; void drm_gem_cma_describe(struct drm_gem_cma_object *obj, struct seq_file *m); #endif +struct dma_buf *drm_gem_cma_prime_export(struct drm_device *drm_dev, + struct drm_gem_object *obj, int flags); + +struct drm_gem_object *drm_gem_cma_prime_import(struct drm_device *drm_dev, + struct dma_buf *dmabuf); + #endif /* __DRM_GEM_CMA_HELPER_H__ */ -- 1.7.9.5

11 years, 8 months

Re: [Linaro-mm-sig] [PATCH 0/2] Revert support for reserved memory regions defined in device tree

by Benjamin Herrenschmidt

On Thu, 2013-10-17 at 13:37 -0500, Matt Sealey wrote: > This may be late, but please can you consider re-using the CHRP > reserved node (i.e. device_type = "reserved")? > > Since it does exactly the same thing, is well defined since the dark ages? > > It's CHRP 1.7 section 5.9 by the way (just before /chosen gets defined). > > It would solve a selection of the issues; and require zero binding > work except describing perhaps a couple freakish Linux-specific > properties that may be only as intrusive as, say, linux,initrd would > be in /chosen. > > The most effective, multi-OS way of using it ("available" property not > currently implemented in Linux for some reason, but it could come in > so handy - not only because it matches the way Linux resource > structures are handled) First, the original /reserved on CHRP was supposed to be about reserved bus space for things like hidden HW devices, but yes, it could be used for that. However it would be nice to enrich the binding to provide at least some kind of specific identification of what a given reserved area is about, see my comments about that in the previous threads. The available property is of no use to us. It purely indicates what is available while OFW is still running. Once we get rid of OFW its content is utterly meaningless. The original OFW was design with the idea that OFW remains alive along with the operating system, and that has been done on some platforms, but that idea has been ditched very early on in powerpc space for many reasons, one of them being that most implementations of OFW around were way to broken to bother. > memory@0x70000000 { > device_type = "memory"; > reg = <0x70000000 0x40000000>; > available = <0x70000000 0x10000000 > 0x90000000 0x1ffc00000>; /* top 16KiB of memory > is where the secure firmware keeps it's mailboxes */ > }; > freaky-codec-memory: reserved@0x80000000 { > device_type = "reserved"; > reg = <0x80000000 0x10000000>; > available = <0x80000000 0x8000000 > 0x88000000 0x8000000>; /* two 128MiB buffers */ > non-objectionable-mark-as-contiguous-property-name-here; > cacheable; > }; > > Any driver that has, previously, required a bunch of it's own memory > carved out of DDR *should* be gaining a phandle reference to that > reserved node however it likes (it would be up to that devices' > binding). > > On Linux under CMA, it may well be ignored and just stuffed into the > generic CMA regions, and the driver MAY allocate anywhere it likes, > but it COULD ask for memory based on a region phandle or, horribly, by > name (since there's no other way to search for it, the OF "name" for > reserved SHALL be "reserved") and be given memory in that region > defined by the reserved node if it had any addressing restrictions. > > /videodecoder@0x43f01000 { > compatible = "freaky,codec"; > : > decode-buffer = &freaky-codec-memory; > }; > > On another OS it may manually map and use a custom allocator for that > memory region, since otherwise the OS would not have even looked at > it. > > Also this discussion of Jeremy Kerr's proposal seems to be 'missing' > on Google. Do you happen to have a link to it? > > Thanks, > Matt Sealey <neko(a)bakuhatsu.net>

11 years, 8 months

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig October 2013