From rob.clark at linaro.org Tue May 1 16:04:51 2012
From: rob.clark at linaro.org (Rob Clark)
Date: Tue, 1 May 2012 11:04:51 -0500
Subject: [Linaro-mm-sig] [PATCH] drm: pass dev to drm_vm_{open, close}_locked()
Message-ID: <1335888291-9311-1-git-send-email-rob.clark@linaro.org>

From: Rob Clark

Previously these functions would assume that vma->vm_file was the drm_file. However, in cases where the drm driver needs to use something else for the backing file (such as the tmpfs filp), this assumption is no longer true. But vma->vm_private_data is still the GEM object.

With this change, the drm_device now comes from the GEM object rather than the drm_file, so the driver is more free to play with vma->vm_file.

The scenario where this comes up is mmap'ing of cached dmabufs on non-coherent systems, where the driver needs to use fault handling and PTE shootdown to simulate coherency. We can't use the vma->vm_file of the dmabuf, which is using anon_inode's address_space. The most straightforward thing to do is to use the GEM object's obj->filp for vma->vm_file in all cases, for which we need this patch.
Signed-off-by: Rob Clark --- drivers/gpu/drm/drm_gem.c | 6 +++--- drivers/gpu/drm/drm_vm.c | 18 ++++++++---------- include/drm/drmP.h | 4 ++-- 3 files changed, 13 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 83114b5..475b300 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -628,7 +628,7 @@ void drm_gem_vm_open(struct vm_area_struct *vma) drm_gem_object_reference(obj); mutex_lock(&obj->dev->struct_mutex); - drm_vm_open_locked(vma); + drm_vm_open_locked(obj->dev, vma); mutex_unlock(&obj->dev->struct_mutex); } EXPORT_SYMBOL(drm_gem_vm_open); @@ -639,7 +639,7 @@ void drm_gem_vm_close(struct vm_area_struct *vma) struct drm_device *dev = obj->dev; mutex_lock(&dev->struct_mutex); - drm_vm_close_locked(vma); + drm_vm_close_locked(obj->dev, vma); drm_gem_object_unreference(obj); mutex_unlock(&dev->struct_mutex); } @@ -712,7 +712,7 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma) */ drm_gem_object_reference(obj); - drm_vm_open_locked(vma); + drm_vm_open_locked(dev, vma); out_unlock: mutex_unlock(&dev->struct_mutex); diff --git a/drivers/gpu/drm/drm_vm.c b/drivers/gpu/drm/drm_vm.c index 1495618..961ee08 100644 --- a/drivers/gpu/drm/drm_vm.c +++ b/drivers/gpu/drm/drm_vm.c @@ -406,10 +406,9 @@ static const struct vm_operations_struct drm_vm_sg_ops = { * Create a new drm_vma_entry structure as the \p vma private data entry and * add it to drm_device::vmalist. 
*/ -void drm_vm_open_locked(struct vm_area_struct *vma) +void drm_vm_open_locked(struct drm_device *dev, + struct vm_area_struct *vma) { - struct drm_file *priv = vma->vm_file->private_data; - struct drm_device *dev = priv->minor->dev; struct drm_vma_entry *vma_entry; DRM_DEBUG("0x%08lx,0x%08lx\n", @@ -430,14 +429,13 @@ static void drm_vm_open(struct vm_area_struct *vma) struct drm_device *dev = priv->minor->dev; mutex_lock(&dev->struct_mutex); - drm_vm_open_locked(vma); + drm_vm_open_locked(dev, vma); mutex_unlock(&dev->struct_mutex); } -void drm_vm_close_locked(struct vm_area_struct *vma) +void drm_vm_close_locked(struct drm_device *dev, + struct vm_area_struct *vma) { - struct drm_file *priv = vma->vm_file->private_data; - struct drm_device *dev = priv->minor->dev; struct drm_vma_entry *pt, *temp; DRM_DEBUG("0x%08lx,0x%08lx\n", @@ -467,7 +465,7 @@ static void drm_vm_close(struct vm_area_struct *vma) struct drm_device *dev = priv->minor->dev; mutex_lock(&dev->struct_mutex); - drm_vm_close_locked(vma); + drm_vm_close_locked(dev, vma); mutex_unlock(&dev->struct_mutex); } @@ -519,7 +517,7 @@ static int drm_mmap_dma(struct file *filp, struct vm_area_struct *vma) vma->vm_flags |= VM_RESERVED; /* Don't swap */ vma->vm_flags |= VM_DONTEXPAND; - drm_vm_open_locked(vma); + drm_vm_open_locked(dev, vma); return 0; } @@ -670,7 +668,7 @@ int drm_mmap_locked(struct file *filp, struct vm_area_struct *vma) vma->vm_flags |= VM_RESERVED; /* Don't swap */ vma->vm_flags |= VM_DONTEXPAND; - drm_vm_open_locked(vma); + drm_vm_open_locked(dev, vma); return 0; } diff --git a/include/drm/drmP.h b/include/drm/drmP.h index dd73104..213a85a 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -1309,8 +1309,8 @@ extern int drm_release(struct inode *inode, struct file *filp); /* Mapping support (drm_vm.h) */ extern int drm_mmap(struct file *filp, struct vm_area_struct *vma); extern int drm_mmap_locked(struct file *filp, struct vm_area_struct *vma); -extern void drm_vm_open_locked(struct 
vm_area_struct *vma); -extern void drm_vm_close_locked(struct vm_area_struct *vma); +extern void drm_vm_open_locked(struct drm_device *dev, struct vm_area_struct *vma); +extern void drm_vm_close_locked(struct drm_device *dev, struct vm_area_struct *vma); extern unsigned int drm_poll(struct file *filp, struct poll_table_struct *wait); /* Memory management support (drm_memory.h) */ -- 1.7.9.5 From laurent.pinchart at ideasonboard.com Mon May 7 13:27:36 2012 From: laurent.pinchart at ideasonboard.com (Laurent Pinchart) Date: Mon, 07 May 2012 15:27:36 +0200 Subject: [Linaro-mm-sig] [RFC 05/13] v4l: vb2-dma-contig: add support for DMABUF exporting In-Reply-To: <4F8FEC04.3030700@samsung.com> References: <1334063447-16824-1-git-send-email-t.stanislaws@samsung.com> <3143149.ZCeOjVLXgh@avalon> <4F8FEC04.3030700@samsung.com> Message-ID: <1813415.rG2izL3i2h@avalon> Hi Tomasz, Sorry for the late reply, this one slipped through the cracks. On Thursday 19 April 2012 12:42:12 Tomasz Stanislawski wrote: > On 04/17/2012 04:08 PM, Laurent Pinchart wrote: > > On Tuesday 10 April 2012 15:10:39 Tomasz Stanislawski wrote: > >> This patch adds support for exporting a dma-contig buffer using > >> DMABUF interface. > >> > >> Signed-off-by: Tomasz Stanislawski > >> Signed-off-by: Kyungmin Park > >> --- > > [snip] > > >> +static struct sg_table *vb2_dc_dmabuf_ops_map( > >> + struct dma_buf_attachment *db_attach, enum dma_data_direction dir) > >> +{ > >> + struct dma_buf *dbuf = db_attach->dmabuf; > >> + struct vb2_dc_buf *buf = dbuf->priv; > >> + struct vb2_dc_attachment *attach = db_attach->priv; > >> + struct sg_table *sgt; > >> + struct scatterlist *rd, *wr; > >> + int i, ret; > > > > You can make i an unsigned int :-) > > Right.. splitting declaration may be also a good idea :) > > >> + > >> + /* return previously mapped sg table */ > >> + if (attach) > >> + return &attach->sgt; > > > > This effectively keeps the mapping around as long as the attachment > > exists. 
> > We don't try to swap out buffers in V4L2 as is done in DRM at the
> > moment, so it might not be too much of an issue, but the behaviour of the
> > implementation will change if we later decide to map/unmap the buffers in
> > the map/unmap handlers. Do you think that could be a problem ?
>
> I don't think that it is a problem. If an importer calls dma_map_sg then caching
> the sgt on the exporter side reduces the cost of allocating and
> initializing the sgt.
>
> >> +
> >> +	attach = kzalloc(sizeof *attach, GFP_KERNEL);
> >> +	if (!attach)
> >> +		return ERR_PTR(-ENOMEM);
> >
> > Why don't you allocate the vb2_dc_attachment here instead of
> > vb2_dc_dmabuf_ops_attach() ?
>
> Good point.
> The attachment could be allocated at vb2_dc_attachment but all its
> fields would be uninitialized. I mean an empty sgt and an undefined
> dma direction. I decided to allocate the attachment in vb2_dc_dmabuf_ops_map
> because only then is all the information needed to create a valid attachment
> object available.
>
> The other solution might be the allocation at vb2_dc_attachment. The field
> dir would be set to DMA_NONE. If this field is equal to DMA_NONE at
> vb2_dc_dmabuf_ops_map then sgt is allocated and mapped and direction field
> is updated. If value is not DMA_NONE then the sgt is reused.
>
> Do you think that it is a good idea?

I think I would prefer that. It sounds more logical to allocate the attachment
in the attach operation handler.

> >> +	sgt = &attach->sgt;
> >> +	attach->dir = dir;
> >> +
> >> +	/* copying the buf->base_sgt to attachment */
> >
> > I would add an explanation regarding why you need to copy the SG list.
> > Something like.
> >
> > "Copy the buf->base_sgt scatter list to the attachment, as we can't map
> > the same scatter list to multiple devices at the same time."
> >
> ok
>
> >> +	ret = sg_alloc_table(sgt, buf->sgt_base->orig_nents, GFP_KERNEL);
> >> +	if (ret) {
> >> +		kfree(attach);
> >> +		return ERR_PTR(-ENOMEM);
> >> +	}
> >> +
> >> +	rd = buf->sgt_base->sgl;
> >> +	wr = sgt->sgl;
> >> +	for (i = 0; i < sgt->orig_nents; ++i) {
> >> +		sg_set_page(wr, sg_page(rd), rd->length, rd->offset);
> >> +		rd = sg_next(rd);
> >> +		wr = sg_next(wr);
> >> +	}
> >>
> >> +	/* mapping new sglist to the client */
> >> +	ret = dma_map_sg(db_attach->dev, sgt->sgl, sgt->orig_nents, dir);
> >> +	if (ret <= 0) {
> >> +		printk(KERN_ERR "failed to map scatterlist\n");
> >> +		sg_free_table(sgt);
> >> +		kfree(attach);
> >> +		return ERR_PTR(-EIO);
> >> +	}
> >> +
> >> +	db_attach->priv = attach;
> >> +
> >> +	return sgt;
> >> +}
> >> +
> >> +static void vb2_dc_dmabuf_ops_unmap(struct dma_buf_attachment *db_attach,
> >> +	struct sg_table *sgt, enum dma_data_direction dir)
> >> +{
> >> +	/* nothing to be done here */
> >> +}
> >> +
> >> +static void vb2_dc_dmabuf_ops_release(struct dma_buf *dbuf)
> >> +{
> >> +	/* drop reference obtained in vb2_dc_get_dmabuf */
> >> +	vb2_dc_put(dbuf->priv);
> >
> > Shouldn't you set vb2_dc_buf::dma_buf to NULL here ? Otherwise the next
> > vb2_dc_get_dmabuf() call will return a DMABUF object that has been freed.
>
> No.
>
> The buffer object is destroyed at vb2_dc_put when reference count drops to
> 0. It could happen only after REQBUF(count=0) or on last close().
> The DMABUF object is created only for MMAP buffers. The DMABUF object is
> based only on results of dma_alloc_coherent and dma_get_pages (or its future
> equivalent). Therefore the DMABUF object is valid as long as the buffer is
> valid.

OK.

> Notice that dmabuf object could be created in vb2_dc_alloc. I moved it to
> vb2_dc_get_dmabuf to avoid a creation of an object that may not be used.
> > >> +} > >> + > >> +static struct dma_buf_ops vb2_dc_dmabuf_ops = { > >> + .attach = vb2_dc_dmabuf_ops_attach, > >> + .detach = vb2_dc_dmabuf_ops_detach, > >> + .map_dma_buf = vb2_dc_dmabuf_ops_map, > >> + .unmap_dma_buf = vb2_dc_dmabuf_ops_unmap, > >> + .release = vb2_dc_dmabuf_ops_release, > >> +}; > >> + > >> +static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv) > >> +{ > >> + struct vb2_dc_buf *buf = buf_priv; > >> + struct dma_buf *dbuf; > >> + > >> + if (buf->dma_buf) > >> + return buf->dma_buf; > > > > Can't there be a race condition here if the user closes the DMABUF file > > handle before vb2 core calls dma_buf_fd() ? > > The user cannot access the file until it is associated with a file > descriptor. How can the user close it? Could you give me a more detailed > description of this potential race condition? Let's assume the V4L2 buffer has already been exported once. buf->dma_buf is set to a non-NULL value, and the application has an open file handle for the buffer. The application then tries to export the buffer a second time. vb2_dc_get_dmabuf() gets called, checks buf->dma_buf and returns it as it's non-NULL. Right after that, before the vb2 core calls dma_buf_fd() on the struct dma_buf, the application closes the file handle to the exported buffer. The struct dma_buf object gets freed, as the reference count drops to 0. The vb2 core will then try to call dma_buf_fd() on a dma_buf object that has been freed. 
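As a rough illustration only, the interleaving described above can be replayed in a small user-space model. This is not vb2 or dma-buf code: model_dmabuf, model_buf, model_put, get_cached_racy, get_cached_pinned and freed_before_fd_install are all made-up names for the sketch, and the refcount field merely stands in for the file reference that keeps a struct dma_buf alive. The model is sequential, so it says nothing about the locking a real fix would also need.

```c
/* Hypothetical user-space model; none of these names exist in vb2 or dma-buf. */
struct model_dmabuf {
	int refcount;	/* stands in for the file reference count */
	int freed;	/* set instead of calling free(), so the state stays inspectable */
};

struct model_buf {
	struct model_dmabuf *dma_buf;	/* cached export, like buf->dma_buf */
};

/* Drop one reference; the object counts as freed when the count reaches zero. */
static void model_put(struct model_dmabuf *d)
{
	if (--d->refcount == 0)
		d->freed = 1;
}

/* Racy lookup: hands out the cached pointer without pinning it. */
static struct model_dmabuf *get_cached_racy(struct model_buf *buf)
{
	return buf->dma_buf;
}

/* Pinned lookup: takes a reference before handing the pointer out. */
static struct model_dmabuf *get_cached_pinned(struct model_buf *buf)
{
	if (buf->dma_buf)
		buf->dma_buf->refcount++;
	return buf->dma_buf;
}

/*
 * Replay the sequence: the second export looks up the cached object, the
 * application closes its only file handle, then the core would call
 * dma_buf_fd(). Returns 1 if the object was already freed at that point.
 */
static int freed_before_fd_install(int pinned)
{
	struct model_dmabuf d = { .refcount = 1, .freed = 0 };
	struct model_buf buf = { .dma_buf = &d };
	struct model_dmabuf *found;

	found = pinned ? get_cached_pinned(&buf) : get_cached_racy(&buf);
	model_put(&d);		/* application close() drops its reference */
	return found->freed;	/* state seen when dma_buf_fd() would run */
}
```

Calling freed_before_fd_install(0) replays the unpinned lookup and freed_before_fd_install(1) the pinned one. In the model only the unpinned path reaches the fd-install step with a freed object.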
> >> +	/* dmabuf keeps reference to vb2 buffer */
> >> +	atomic_inc(&buf->refcount);
> >> +	dbuf = dma_buf_export(buf, &vb2_dc_dmabuf_ops, buf->size, 0);
> >> +	if (IS_ERR(dbuf)) {
> >> +		atomic_dec(&buf->refcount);
> >> +		return NULL;
> >> +	}
> >> +
> >> +	buf->dma_buf = dbuf;
> >> +
> >> +	return dbuf;
> >> +}

--
Regards,

Laurent Pinchart

From subashrp at gmail.com Mon May 7 14:38:25 2012
From: subashrp at gmail.com (Subash Patel)
Date: Mon, 07 May 2012 20:08:25 +0530
Subject: [Linaro-mm-sig] [PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode
In-Reply-To: <1334933134-4688-9-git-send-email-t.stanislaws@samsung.com>
References: <1334933134-4688-1-git-send-email-t.stanislaws@samsung.com> <1334933134-4688-9-git-send-email-t.stanislaws@samsung.com>
Message-ID: <4FA7DE61.7000705@gmail.com>

Hello Tomasz, Laurent,

I found an issue in the function vb2_dc_pages_to_sgt() below. I saw that during the attach, the size of the SGT and the size requested mismatched (by at least 8k bytes). Hence I made a small correction to the code as below. I could then attach the importer properly.

Regards,
Subash

On 04/20/2012 08:15 PM, Tomasz Stanislawski wrote:
> From: Andrzej Pietrasiewicz
>
> This patch introduces usage of dma_map_sg to map memory behind
> a userspace pointer to a device as dma-contiguous mapping.
> > Signed-off-by: Andrzej Pietrasiewicz > Signed-off-by: Marek Szyprowski > [bugfixing] > Signed-off-by: Kamil Debski > [bugfixing] > Signed-off-by: Tomasz Stanislawski > [add sglist subroutines/code refactoring] > Signed-off-by: Kyungmin Park > --- > drivers/media/video/videobuf2-dma-contig.c | 279 ++++++++++++++++++++++++++-- > 1 files changed, 262 insertions(+), 17 deletions(-) > > diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c > index 476e536..9cbc8d4 100644 > --- a/drivers/media/video/videobuf2-dma-contig.c > +++ b/drivers/media/video/videobuf2-dma-contig.c > @@ -11,6 +11,8 @@ > */ > > #include > +#include > +#include > #include > #include > > @@ -22,6 +24,8 @@ struct vb2_dc_buf { > void *vaddr; > unsigned long size; > dma_addr_t dma_addr; > + enum dma_data_direction dma_dir; > + struct sg_table *dma_sgt; > > /* MMAP related */ > struct vb2_vmarea_handler handler; > @@ -32,6 +36,95 @@ struct vb2_dc_buf { > }; > > /*********************************************/ > +/* scatterlist table functions */ > +/*********************************************/ > + > +static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, > + unsigned int n_pages, unsigned long offset, unsigned long size) > +{ > + struct sg_table *sgt; > + unsigned int chunks; > + unsigned int i; > + unsigned int cur_page; > + int ret; > + struct scatterlist *s; > + > + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); > + if (!sgt) > + return ERR_PTR(-ENOMEM); > + > + /* compute number of chunks */ > + chunks = 1; > + for (i = 1; i< n_pages; ++i) > + if (pages[i] != pages[i - 1] + 1) > + ++chunks; > + > + ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); > + if (ret) { > + kfree(sgt); > + return ERR_PTR(-ENOMEM); > + } > + > + /* merging chunks and putting them into the scatterlist */ > + cur_page = 0; > + for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { > + unsigned long chunk_size; > + unsigned int j; size = PAGE_SIZE; > + > + for (j = cur_page + 1; j< 
n_pages; ++j) for (j = cur_page + 1; j < n_pages; ++j) { > + if (pages[j] != pages[j - 1] + 1) > + break; size += PAGE } > + > + chunk_size = ((j - cur_page)<< PAGE_SHIFT) - offset; > + sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); [DELETE] size -= chunk_size; > + offset = 0; > + cur_page = j; > + } > + > + return sgt; > +} > + > +static void vb2_dc_release_sgtable(struct sg_table *sgt) > +{ > + sg_free_table(sgt); > + kfree(sgt); > +} > + > +static void vb2_dc_sgt_foreach_page(struct sg_table *sgt, > + void (*cb)(struct page *pg)) > +{ > + struct scatterlist *s; > + unsigned int i; > + > + for_each_sg(sgt->sgl, s, sgt->nents, i) { > + struct page *page = sg_page(s); > + unsigned int n_pages = PAGE_ALIGN(s->offset + s->length) > + >> PAGE_SHIFT; > + unsigned int j; > + > + for (j = 0; j< n_pages; ++j, ++page) > + cb(page); > + } > +} > + > +static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt) > +{ > + struct scatterlist *s; > + dma_addr_t expected = sg_dma_address(sgt->sgl); > + unsigned int i; > + unsigned long size = 0; > + > + for_each_sg(sgt->sgl, s, sgt->nents, i) { > + if (sg_dma_address(s) != expected) > + break; > + expected = sg_dma_address(s) + sg_dma_len(s); > + size += sg_dma_len(s); > + } > + return size; > +} > + > +/*********************************************/ > /* callbacks for all buffers */ > /*********************************************/ > > @@ -116,42 +209,194 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) > /* callbacks for USERPTR buffers */ > /*********************************************/ > > +static inline int vma_is_io(struct vm_area_struct *vma) > +{ > + return !!(vma->vm_flags& (VM_IO | VM_PFNMAP)); > +} > + > +static struct vm_area_struct *vb2_dc_get_user_vma( > + unsigned long start, unsigned long size) > +{ > + struct vm_area_struct *vma; > + > + /* current->mm->mmap_sem is taken by videobuf2 core */ > + vma = find_vma(current->mm, start); > + if (!vma) { > + printk(KERN_ERR 
"no vma for address %lu\n", start); > + return ERR_PTR(-EFAULT); > + } > + > + if (vma->vm_end - vma->vm_start< size) { > + printk(KERN_ERR "vma at %lu is too small for %lu bytes\n", > + start, size); > + return ERR_PTR(-EFAULT); > + } > + > + vma = vb2_get_vma(vma); > + if (!vma) { > + printk(KERN_ERR "failed to copy vma\n"); > + return ERR_PTR(-ENOMEM); > + } > + > + return vma; > +} > + > +static int vb2_dc_get_user_pages(unsigned long start, struct page **pages, > + int n_pages, struct vm_area_struct *vma, int write) > +{ > + if (vma_is_io(vma)) { > + unsigned int i; > + > + for (i = 0; i< n_pages; ++i, start += PAGE_SIZE) { > + unsigned long pfn; > + int ret = follow_pfn(vma, start,&pfn); > + > + if (ret) { > + printk(KERN_ERR "no page for address %lu\n", > + start); > + return ret; > + } > + pages[i] = pfn_to_page(pfn); > + } > + } else { > + unsigned int n; > + > + n = get_user_pages(current, current->mm, start& PAGE_MASK, > + n_pages, write, 1, pages, NULL); > + if (n != n_pages) { > + printk(KERN_ERR "got only %d of %d user pages\n", > + n, n_pages); > + while (n) > + put_page(pages[--n]); > + return -EFAULT; > + } > + } > + > + return 0; > +} > + > +static void vb2_dc_put_dirty_page(struct page *page) > +{ > + set_page_dirty_lock(page); > + put_page(page); > +} > + > +static void vb2_dc_put_userptr(void *buf_priv) > +{ > + struct vb2_dc_buf *buf = buf_priv; > + struct sg_table *sgt = buf->dma_sgt; > + > + dma_unmap_sg(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); > + if (!vma_is_io(buf->vma)) > + vb2_dc_sgt_foreach_page(sgt, vb2_dc_put_dirty_page); > + > + vb2_dc_release_sgtable(sgt); > + vb2_put_vma(buf->vma); > + kfree(buf); > +} > + > static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, > - unsigned long size, int write) > + unsigned long size, int write) > { > struct vb2_dc_buf *buf; > - struct vm_area_struct *vma; > - dma_addr_t dma_addr = 0; > - int ret; > + unsigned long start; > + unsigned long end; > + unsigned long offset; 
> + struct page **pages; > + int n_pages; > + int ret = 0; > + struct sg_table *sgt; > + unsigned long contig_size; > > buf = kzalloc(sizeof *buf, GFP_KERNEL); > if (!buf) > return ERR_PTR(-ENOMEM); > > - ret = vb2_get_contig_userptr(vaddr, size,&vma,&dma_addr); > + buf->dev = alloc_ctx; > + buf->dma_dir = write ? DMA_FROM_DEVICE : DMA_TO_DEVICE; > + > + start = vaddr& PAGE_MASK; > + offset = vaddr& ~PAGE_MASK; > + end = PAGE_ALIGN(vaddr + size); > + n_pages = (end - start)>> PAGE_SHIFT; > + > + pages = kmalloc(n_pages * sizeof pages[0], GFP_KERNEL); > + if (!pages) { > + ret = -ENOMEM; > + printk(KERN_ERR "failed to allocate pages table\n"); > + goto fail_buf; > + } > + > + buf->vma = vb2_dc_get_user_vma(start, size); > + if (IS_ERR(buf->vma)) { > + printk(KERN_ERR "failed to get VMA\n"); > + ret = PTR_ERR(buf->vma); > + goto fail_pages; > + } > + > + /* extract page list from userspace mapping */ > + ret = vb2_dc_get_user_pages(start, pages, n_pages, buf->vma, write); > if (ret) { > - printk(KERN_ERR "Failed acquiring VMA for vaddr 0x%08lx\n", > - vaddr); > - kfree(buf); > - return ERR_PTR(ret); > + printk(KERN_ERR "failed to get user pages\n"); > + goto fail_vma; > + } > + > + sgt = vb2_dc_pages_to_sgt(pages, n_pages, offset, size); > + if (IS_ERR(sgt)) { > + printk(KERN_ERR "failed to create scatterlist table\n"); > + ret = -ENOMEM; > + goto fail_get_user_pages; > + } > + > + /* pages are no longer needed */ > + kfree(pages); > + pages = NULL; > + > + sgt->nents = dma_map_sg(buf->dev, sgt->sgl, sgt->orig_nents, > + buf->dma_dir); > + if (sgt->nents<= 0) { > + printk(KERN_ERR "failed to map scatterlist\n"); > + ret = -EIO; > + goto fail_sgt; > } > > + contig_size = vb2_dc_get_contiguous_size(sgt); > + if (contig_size< size) { > + printk(KERN_ERR "contiguous mapping is too small %lu/%lu\n", > + contig_size, size); > + ret = -EFAULT; > + goto fail_map_sg; > + } > + > + buf->dma_addr = sg_dma_address(sgt->sgl); > buf->size = size; > - buf->dma_addr = dma_addr; > - 
buf->vma = vma; > + buf->dma_sgt = sgt; > > return buf; > -} > > -static void vb2_dc_put_userptr(void *mem_priv) > -{ > - struct vb2_dc_buf *buf = mem_priv; > +fail_map_sg: > + dma_unmap_sg(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > > - if (!buf) > - return; > +fail_sgt: > + if (!vma_is_io(buf->vma)) > + vb2_dc_sgt_foreach_page(sgt, put_page); > + vb2_dc_release_sgtable(sgt); > + > +fail_get_user_pages: > + if (pages&& !vma_is_io(buf->vma)) > + while (n_pages) > + put_page(pages[--n_pages]); > > +fail_vma: > vb2_put_vma(buf->vma); > + > +fail_pages: > + kfree(pages); /* kfree is NULL-proof */ > + > +fail_buf: > kfree(buf); > + > + return ERR_PTR(ret); > } > > /*********************************************/ From t.stanislaws at samsung.com Mon May 7 15:11:31 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Mon, 07 May 2012 17:11:31 +0200 Subject: [Linaro-mm-sig] [PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode In-Reply-To: <4FA7DE61.7000705@gmail.com> References: <1334933134-4688-1-git-send-email-t.stanislaws@samsung.com> <1334933134-4688-9-git-send-email-t.stanislaws@samsung.com> <4FA7DE61.7000705@gmail.com> Message-ID: <4FA7E623.2060500@samsung.com> Hi Subash, Could you provide a detailed description of a test case that causes a failure of vb2_dc_pages_to_sgt? 
Regards,
Tomasz Stanislawski

From jesse.barker at linaro.org Mon May 7 16:00:25 2012
From: jesse.barker at linaro.org (Jesse Barker)
Date: Mon, 7 May 2012 09:00:25 -0700
Subject: [Linaro-mm-sig] Minutes from V4L2 update call
In-Reply-To:
References: <20120320010534.GA7318@async.com.br> <20120322145455.GA30254@async.com.br> <20120322180101.GB21393@phenom.ffwll.local> <2913216.445obUR0t9@avalon> <4F8D56BF.8070201@samsung.com>
Message-ID:

Hi,

I never saw this answered (sorry if it was and I just missed it) and it seemed like a generally useful detail to clarify, so here's my understanding (from Documentation/dma-buf-sharing.txt):

When the importer calls dma_buf_map_attachment(), the struct sg_table* returned by the exporter will already have been appropriately mapped for the importer's IOMMU. This is expected as part of the API contract and is possible because of the struct device* passed in by the importer in the call to dma_buf_attach().

cheers,
Jesse

On Tue, Apr 17, 2012 at 4:58 AM, Abhinav Kochhar wrote:
> Hi,
> What about the mapping for importing devices which have an IOMMU?
> To update the mapping in page tables accessed by importing device's IOMMU do
> we need to create a mapping in the exporter side or the importing device
> must use the mapped sg returned by exporter and create a mapping for IOMMU?
>
> Regards,
> Abhinav
>
>
> On Tue, Apr 17, 2012 at 8:40 PM, Tomasz Stanislawski wrote:
>>
>> Hello everyone,
>> I would like to ask about the agreement on a behavior of
>> a DMABUF exporter for dma_buf_map_attachment.
>>
>> According to the DMABUF spec the exporter should return a scatterlist
>> mapped into importers DMA space. However there were issues about the
>> concept.
>>
>> I made a short survey for mapping strategy for DMABUF patches for some
>> drivers:
>>
>> 1. V4L2 - support for dmabuf importing hopefully consistent with dmabuf spec.
>>    The patch "v4l: vb2-dma-contig: change map/unmap behaviour for importers"
>>    implements DMA mapping performed on the importer side. However the patch
>>    can be dropped at no cost.
>> 2. Exynos DRM - the latest version implements mapping on the exporter side
>> 3. Omap/DRM - 'mapping' is done on exporter side by setting a physical address
>>    as DMA address in the scatterlist. The dma_map_sg should be used for this
>>    purpose
>> 4. nouveau/i915 by Dave Airlie - mapping for client is done on importer side.
>>    http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-dmabuf2
>>
>> Does it mean that it is agreed that the exporter is responsible for
>> mapping into the client space?
>>
>> Regards,
>> Tomasz Stanislawski
>>
>>
>> On 03/27/2012 11:39 AM, Laurent Pinchart wrote:
>> > Hi Daniel,
>> >
>> > On Thursday 22 March 2012 19:01:01 Daniel Vetter wrote:
>> >> On Thu, Mar 22, 2012 at 11:54:55AM -0300, Christian Robottom Reis wrote:
>> >>>     Tomasz: proposed extension to DMA Mapping -- dma_get_pages
>> >>>
>> >>>         Currently difficult to change the camera address into list of pages
>> >>>         DMA framework has the knowledge of this list and could do this
>> >>>
>> >>>             Depends on dma_get_pages
>> >>>             Needs to be merged first
>> >>>
>> >>>         Test application posted to dri-devel with dependencies to run demo
>> >>>
>> >>>             Many dependencies
>> >>
>> >> I kinda missed to yell at this patch when it first showed up, so I'll do
>> >> that here ;-)
>> >>
>> >> I think this is a gross layering violation and I don't like it at all. The
>> >> entire point of the dma api is that device drivers only get to see device
>> >> addresses and can forget about all the remapping/contig-alloc madness. And
>> >> dma-buf should just follow this with its map/unmap interfaces.
>> >>
>> >> Furthermore the exporter memory might simply not have any associated
>> >> struct pages. The two examples I always bring up:
>> >> - special purpose remapping units (like omap's TILER) which are managed by
>> >>   the exporter and can do crazy things like tiling or rotation
>> >>   transparently for all devices.
>> >> - special carve-out memory which is unknown to linux memory management.
>> >>   drm/i915 is totally abusing this, mostly because windows is lame and
>> >>   doesn't have decent largepage allocation support. This is just plain
>> >>   system memory, but there's no struct page for it (because it's not part
>> >>   of the system map).
>> >
>> > I agree with you that the DMA API is the proper layer to abstract physical
>> > memory and provide devices with a DMA address. DMA addresses are specific to a
>> > device, while dma-buf needs to share buffers between separate devices
>> > (otherwise it would be pretty pointless). As DMA addresses are device-local,
>> > they can't be used to describe a cross-device buffer.
>> >
>> > When allocating a buffer using the DMA API, memory is "allocated" behind the
>> > scene and mapped to the device address space ("allocated" in this case means
>> > anything from plain physical memory allocation to reservation of a
>> > special-purpose memory range, like in the OMAP TILER example). All the device
>> > driver gets to see is the DMA address and/or the DMA scatter list. So far, so
>> > good.
>> >
>> > Then, when we want to share the memory with a second device, we need a way to
>> > map the memory to the second device's address space. There are several options
>> > here (and this is related to the "[RFCv2 PATCH 7/9] v4l: vb2-dma-contig:
>> > change map/unmap behaviour" mail thread).
>> >
>> > - Let the importer driver map the memory to its own address space.
>> > This makes sense from the importer device's point of view, as that's where
>> > knowledge about the importer device is located (although you could argue that
>> > knowledge about the importer device is located in its struct device, which can
>> > be passed around - and I could agree with that). The importer driver would thus
>> > need to receive a cookie identifying the memory. As explained before, the
>> > exporter's DMA address isn't enough. There are various options here as well
>> > (list of pages or page frame numbers, exporter's DMA address + exporter's
>> > struct device, a new kind of DMA API-related cookie, ... to just list a few).
>> > The importer driver would then use that cookie to map the memory to the
>> > importer device's address space (and this should most probably be implemented
>> > in the DMA API, which would require extensions).
>> >
>> > - Let the exporter driver map the memory to the importer device's address
>> > space. This makes sense from the exporter device's point of view, as that's
>> > where knowledge about the exported memory is located. In this case we also
>> > most probably want to extend the DMA API to handle the mapping operation, and
>> > we will need to pass the same kind of cookie as in the first option to the
>> > API.
>> >
>> >> Now the core dma api isn't fully up to snuff for everything yet and there
>> >> are things missing. But it's certainly not dma_get_pages, but more things
>> >> like mmap support for coherent memory or allocating coherent memory which
>> >> doesn't have a static mapping in the kernel address space. I very much
>> >> hope that the interfaces we develop for dma-buf (and the insights gained)
>> >> could help as examples here, so that in the future there's not such a
>> >> gaping difference for the driver between dma_coherent allocations of its
>> >> own and imported buffer objects.
>> >
>>
>>
>> _______________________________________________
>> Linaro-mm-sig mailing list
>> Linaro-mm-sig at lists.linaro.org
>> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

From subashrp at gmail.com Tue May 8 05:41:08 2012
From: subashrp at gmail.com (Subash Patel)
Date: Tue, 08 May 2012 11:11:08 +0530
Subject: [Linaro-mm-sig] [PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode
In-Reply-To: <4FA7E623.2060500@samsung.com>
References: <1334933134-4688-1-git-send-email-t.stanislaws@samsung.com> <1334933134-4688-9-git-send-email-t.stanislaws@samsung.com> <4FA7DE61.7000705@gmail.com> <4FA7E623.2060500@samsung.com>
Message-ID: <4FA8B1F4.2000200@gmail.com>

Hi Tomasz,

I have extended the MFC-FIMC test case posted by Kamil some time ago to sanity test the UMM patches. This test is multi-threaded (further explanation for developers who may not have seen it yet): thread one parses the encoded stream and feeds it into the codec IP driver (aka MFC driver). The second thread dequeues the buffer from the MFC driver (DMA_BUF export) and queues it into a CSC IP (aka FIMC) driver (DMA_BUF import). The third thread dequeues the frame from the FIMC driver and either pushes it into the LCD driver for display or dumps it to a flat file for analysis.

The MFC driver exports the fds and the FIMC driver imports and attaches them. During FIMC QBUF (that's when the attach and map happen), it is observed that in the function vb2_dc_map_dmabuf() the call to vb2_dc_get_contiguous_size() fails. This is because contig_size < buf->size. contig_size is calculated from the SGT which we would have constructed in the function vb2_dc_pages_to_sgt().

Let me know if you need more details.
Regards,
Subash

On 05/07/2012 08:41 PM, Tomasz Stanislawski wrote:
> Hi Subash,
> Could you provide a detailed description of a test case
> that causes a failure of vb2_dc_pages_to_sgt?
>
> Regards,
> Tomasz Stanislawski

From laurent.pinchart at ideasonboard.com Tue May 8 09:14:44 2012
From: laurent.pinchart at ideasonboard.com (Laurent Pinchart)
Date: Tue, 08 May 2012 11:14:44 +0200
Subject: [Linaro-mm-sig] [PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode
In-Reply-To: <4FA7DE61.7000705@gmail.com>
References: <1334933134-4688-1-git-send-email-t.stanislaws@samsung.com> <1334933134-4688-9-git-send-email-t.stanislaws@samsung.com> <4FA7DE61.7000705@gmail.com>
Message-ID: <4675433.ieio0xx0Y0@avalon>

Hi Subash,

On Monday 07 May 2012 20:08:25 Subash Patel wrote:
> Hello Tomasz, Laurent,
>
> I found an issue in the function vb2_dc_pages_to_sgt() below. I saw that
> during the attach, the size of the SGT and size requested mismatched
> (by at least 8k bytes). Hence I made a small correction to the code as
> below. I could then attach the importer properly.

Thank you for the report. Could you print the content of the sglist (number of chunks and size of each chunk) before and after your modifications, as well as the values of n_pages, offset and size ?
> On 04/20/2012 08:15 PM, Tomasz Stanislawski wrote: [snip] > > +static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, > > + unsigned int n_pages, unsigned long offset, unsigned long size) > > +{ > > + struct sg_table *sgt; > > + unsigned int chunks; > > + unsigned int i; > > + unsigned int cur_page; > > + int ret; > > + struct scatterlist *s; > > + > > + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); > > + if (!sgt) > > + return ERR_PTR(-ENOMEM); > > + > > + /* compute number of chunks */ > > + chunks = 1; > > + for (i = 1; i< n_pages; ++i) > > + if (pages[i] != pages[i - 1] + 1) > > + ++chunks; > > + > > + ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); > > + if (ret) { > > + kfree(sgt); > > + return ERR_PTR(-ENOMEM); > > + } > > + > > + /* merging chunks and putting them into the scatterlist */ > > + cur_page = 0; > > + for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { > > + unsigned long chunk_size; > > + unsigned int j; > > size = PAGE_SIZE; > > > + > > + for (j = cur_page + 1; j< n_pages; ++j) > > for (j = cur_page + 1; j < n_pages; ++j) { > > > + if (pages[j] != pages[j - 1] + 1) > > + break; > > size += PAGE > } > > > + > > + chunk_size = ((j - cur_page)<< PAGE_SHIFT) - offset; > > + sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); > > [DELETE] size -= chunk_size; > > > + offset = 0; > > + cur_page = j; > > + } > > + > > + return sgt; > > +} -- Regards, Laurent Pinchart From subashrp at gmail.com Tue May 8 11:15:43 2012 From: subashrp at gmail.com (Subash Patel) Date: Tue, 08 May 2012 16:45:43 +0530 Subject: [Linaro-mm-sig] [PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode In-Reply-To: <4675433.ieio0xx0Y0@avalon> References: <1334933134-4688-1-git-send-email-t.stanislaws@samsung.com> <1334933134-4688-9-git-send-email-t.stanislaws@samsung.com> <4FA7DE61.7000705@gmail.com> <4675433.ieio0xx0Y0@avalon> Message-ID: <4FA9005F.6020901@gmail.com> Hi Laurent, On 05/08/2012 02:44 PM, Laurent Pinchart wrote: > Hi Subash, > 
> On Monday 07 May 2012 20:08:25 Subash Patel wrote: >> Hello Thomasz, Laurent, >> >> I found an issue in the function vb2_dc_pages_to_sgt() below. I saw that >> during the attach, the size of the SGT and size requested mis-matched >> (by atleast 8k bytes). Hence I made a small correction to the code as >> below. I could then attach the importer properly. > > Thank you for the report. > > Could you print the content of the sglist (number of chunks and size of each > chunk) before and after your modifications, as well as the values of n_pages, > offset and size ? I will put back all the printk's and generate this. As of now, my setup has changed and will do this when I get sometime. > >> On 04/20/2012 08:15 PM, Tomasz Stanislawski wrote: > > [snip] > >>> +static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, >>> + unsigned int n_pages, unsigned long offset, unsigned long size) >>> +{ >>> + struct sg_table *sgt; >>> + unsigned int chunks; >>> + unsigned int i; >>> + unsigned int cur_page; >>> + int ret; >>> + struct scatterlist *s; >>> + >>> + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); >>> + if (!sgt) >>> + return ERR_PTR(-ENOMEM); >>> + >>> + /* compute number of chunks */ >>> + chunks = 1; >>> + for (i = 1; i< n_pages; ++i) >>> + if (pages[i] != pages[i - 1] + 1) >>> + ++chunks; >>> + >>> + ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); >>> + if (ret) { >>> + kfree(sgt); >>> + return ERR_PTR(-ENOMEM); >>> + } >>> + >>> + /* merging chunks and putting them into the scatterlist */ >>> + cur_page = 0; >>> + for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { >>> + unsigned long chunk_size; >>> + unsigned int j; >> >> size = PAGE_SIZE; >> >>> + >>> + for (j = cur_page + 1; j< n_pages; ++j) >> >> for (j = cur_page + 1; j< n_pages; ++j) { >> >>> + if (pages[j] != pages[j - 1] + 1) >>> + break; >> >> size += PAGE >> } >> >>> + >>> + chunk_size = ((j - cur_page)<< PAGE_SHIFT) - offset; >>> + sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); >> >> [DELETE] 
size -= chunk_size; >> >>> + offset = 0; >>> + cur_page = j; >>> + } >>> + >>> + return sgt; >>> +} > Regards, Subash From subashrp at gmail.com Wed May 9 06:46:31 2012 From: subashrp at gmail.com (Subash Patel) Date: Wed, 09 May 2012 12:16:31 +0530 Subject: [Linaro-mm-sig] [PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode In-Reply-To: <4FA9005F.6020901@gmail.com> References: <1334933134-4688-1-git-send-email-t.stanislaws@samsung.com> <1334933134-4688-9-git-send-email-t.stanislaws@samsung.com> <4FA7DE61.7000705@gmail.com> <4675433.ieio0xx0Y0@avalon> <4FA9005F.6020901@gmail.com> Message-ID: <4FAA12C7.8020307@gmail.com> Hello Tomasz, Laurent, I have printed some logs during the dmabuf export and attach for the SGT issue below. Please find it in the attachment. I hope it will be useful. Regards, Subash On 05/08/2012 04:45 PM, Subash Patel wrote: > Hi Laurent, > > On 05/08/2012 02:44 PM, Laurent Pinchart wrote: >> Hi Subash, >> >> On Monday 07 May 2012 20:08:25 Subash Patel wrote: >>> Hello Thomasz, Laurent, >>> >>> I found an issue in the function vb2_dc_pages_to_sgt() below. I saw that >>> during the attach, the size of the SGT and size requested mis-matched >>> (by atleast 8k bytes). Hence I made a small correction to the code as >>> below. I could then attach the importer properly. >> >> Thank you for the report. >> >> Could you print the content of the sglist (number of chunks and size >> of each >> chunk) before and after your modifications, as well as the values of >> n_pages, >> offset and size ? > I will put back all the printk's and generate this. As of now, my setup > has changed and will do this when I get sometime. 
>> >>> On 04/20/2012 08:15 PM, Tomasz Stanislawski wrote: >> >> [snip] >> >>>> +static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, >>>> + unsigned int n_pages, unsigned long offset, unsigned long size) >>>> +{ >>>> + struct sg_table *sgt; >>>> + unsigned int chunks; >>>> + unsigned int i; >>>> + unsigned int cur_page; >>>> + int ret; >>>> + struct scatterlist *s; >>>> + >>>> + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); >>>> + if (!sgt) >>>> + return ERR_PTR(-ENOMEM); >>>> + >>>> + /* compute number of chunks */ >>>> + chunks = 1; >>>> + for (i = 1; i< n_pages; ++i) >>>> + if (pages[i] != pages[i - 1] + 1) >>>> + ++chunks; >>>> + >>>> + ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); >>>> + if (ret) { >>>> + kfree(sgt); >>>> + return ERR_PTR(-ENOMEM); >>>> + } >>>> + >>>> + /* merging chunks and putting them into the scatterlist */ >>>> + cur_page = 0; >>>> + for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { >>>> + unsigned long chunk_size; >>>> + unsigned int j; >>> >>> size = PAGE_SIZE; >>> >>>> + >>>> + for (j = cur_page + 1; j< n_pages; ++j) >>> >>> for (j = cur_page + 1; j< n_pages; ++j) { >>> >>>> + if (pages[j] != pages[j - 1] + 1) >>>> + break; >>> >>> size += PAGE >>> } >>> >>>> + >>>> + chunk_size = ((j - cur_page)<< PAGE_SHIFT) - offset; >>>> + sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); >>> >>> [DELETE] size -= chunk_size; >>> >>>> + offset = 0; >>>> + cur_page = j; >>>> + } >>>> + >>>> + return sgt; >>>> +} >> > Regards, > Subash -------------- next part -------------- [ 178.545000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 178.545000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.550000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.555000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 178.560000] vb2_dc_pages_to_sgt():89 size=4294836224 [ 178.565000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 178.570000] vb2_dc_pages_to_sgt():83 cur_page=32 [ 178.575000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.580000] vb2_dc_pages_to_sgt():86 
chunk_size=262144 [ 178.585000] vb2_dc_pages_to_sgt():89 size=4294574080 [ 178.590000] vb2_dc_pages_to_sgt() sgt->orig_nents=1 [ 178.595000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.600000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.605000] vb2_dc_pages_to_sgt():86 chunk_size=8192 [ 178.610000] vb2_dc_pages_to_sgt():89 size=4294959104 [ 178.625000] vb2_dc_pages_to_sgt() sgt->orig_nents=1 [ 178.625000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.630000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.635000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 178.640000] vb2_dc_pages_to_sgt():89 size=4294836224 [ 178.645000] vb2_dc_pages_to_sgt() sgt->orig_nents=1 [ 178.650000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.655000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.660000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 178.665000] vb2_dc_pages_to_sgt():89 size=4294836224 [ 178.670000] vb2_dc_mmap: mapped dma addr 0x20060000 at 0xb6e01000, size 131072 [ 178.670000] vb2_dc_mmap: mapped dma addr 0x20080000 at 0xb6de1000, size 131072 [ 178.680000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 178.685000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.690000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.695000] vb2_dc_pages_to_sgt():86 chunk_size=4096 [ 178.700000] vb2_dc_pages_to_sgt():89 size=4294963200 [ 178.705000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 178.710000] vb2_dc_pages_to_sgt():83 cur_page=1 [ 178.715000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.715000] vb2_dc_pages_to_sgt():86 chunk_size=8192 [ 178.720000] vb2_dc_pages_to_sgt():89 size=4294955008 [ 178.725000] vb2_dc_pages_to_sgt() sgt->orig_nents=1 [ 178.730000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.735000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.740000] vb2_dc_pages_to_sgt():86 chunk_size=8192 [ 178.745000] vb2_dc_pages_to_sgt():89 size=4294959104 [ 178.750000] vb2_dc_pages_to_sgt() sgt->orig_nents=1 [ 178.755000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.760000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.765000] 
vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 178.770000] vb2_dc_pages_to_sgt():89 size=4294836224 [ 178.780000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 178.780000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.785000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.790000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 178.795000] vb2_dc_pages_to_sgt():89 size=4294901760 [ 178.800000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 178.805000] vb2_dc_pages_to_sgt():83 cur_page=16 [ 178.810000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.815000] vb2_dc_pages_to_sgt():86 chunk_size=393216 [ 178.820000] vb2_dc_pages_to_sgt():89 size=4294508544 [ 178.825000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 178.830000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.835000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.840000] vb2_dc_pages_to_sgt():86 chunk_size=32768 [ 178.845000] vb2_dc_pages_to_sgt():89 size=4294934528 [ 178.850000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 178.855000] vb2_dc_pages_to_sgt():83 cur_page=8 [ 178.855000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.860000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 178.865000] vb2_dc_pages_to_sgt():89 size=4294868992 [ 178.870000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 178.875000] vb2_dc_pages_to_sgt():83 cur_page=24 [ 178.880000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.885000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 178.890000] vb2_dc_pages_to_sgt():89 size=4294737920 [ 178.895000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 178.900000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.905000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.910000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 178.915000] vb2_dc_pages_to_sgt():89 size=4294901760 [ 178.920000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 178.925000] vb2_dc_pages_to_sgt():83 cur_page=16 [ 178.930000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.935000] vb2_dc_pages_to_sgt():86 chunk_size=393216 [ 178.940000] vb2_dc_pages_to_sgt():89 size=4294508544 [ 178.945000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 
[ 178.950000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 178.955000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.960000] vb2_dc_pages_to_sgt():86 chunk_size=32768 [ 178.965000] vb2_dc_pages_to_sgt():89 size=4294934528 [ 178.970000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 178.975000] vb2_dc_pages_to_sgt():83 cur_page=8 [ 178.975000] vb2_dc_pages_to_sgt():84 offset=0 [ 178.980000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 178.985000] vb2_dc_pages_to_sgt():89 size=4294868992 [ 178.990000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 178.995000] vb2_dc_pages_to_sgt():83 cur_page=24 [ 179.000000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.005000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 179.010000] vb2_dc_pages_to_sgt():89 size=4294737920 [ 179.015000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.020000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.025000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.030000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.035000] vb2_dc_pages_to_sgt():89 size=4294901760 [ 179.040000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.045000] vb2_dc_pages_to_sgt():83 cur_page=16 [ 179.050000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.055000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 179.060000] vb2_dc_pages_to_sgt():89 size=4294770688 [ 179.065000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.070000] vb2_dc_pages_to_sgt():83 cur_page=48 [ 179.075000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.080000] vb2_dc_pages_to_sgt():86 chunk_size=262144 [ 179.085000] vb2_dc_pages_to_sgt():89 size=4294508544 [ 179.090000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.095000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.100000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.100000] vb2_dc_pages_to_sgt():86 chunk_size=32768 [ 179.105000] vb2_dc_pages_to_sgt():89 size=4294934528 [ 179.110000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.115000] vb2_dc_pages_to_sgt():83 cur_page=8 [ 179.120000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.125000] vb2_dc_pages_to_sgt():86 
chunk_size=65536 [ 179.130000] vb2_dc_pages_to_sgt():89 size=4294868992 [ 179.135000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.140000] vb2_dc_pages_to_sgt():83 cur_page=24 [ 179.145000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.150000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 179.155000] vb2_dc_pages_to_sgt():89 size=4294737920 [ 179.160000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 179.165000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.170000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.175000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.180000] vb2_dc_pages_to_sgt():89 size=4294901760 [ 179.185000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 179.190000] vb2_dc_pages_to_sgt():83 cur_page=16 [ 179.195000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.200000] vb2_dc_pages_to_sgt():86 chunk_size=393216 [ 179.205000] vb2_dc_pages_to_sgt():89 size=4294508544 [ 179.210000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.215000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.220000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.220000] vb2_dc_pages_to_sgt():86 chunk_size=32768 [ 179.225000] vb2_dc_pages_to_sgt():89 size=4294934528 [ 179.230000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.235000] vb2_dc_pages_to_sgt():83 cur_page=8 [ 179.240000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.245000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.250000] vb2_dc_pages_to_sgt():89 size=4294868992 [ 179.255000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.260000] vb2_dc_pages_to_sgt():83 cur_page=24 [ 179.265000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.270000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 179.275000] vb2_dc_pages_to_sgt():89 size=4294737920 [ 179.280000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.285000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.290000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.295000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.300000] vb2_dc_pages_to_sgt():89 size=4294901760 [ 179.305000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.310000] 
vb2_dc_pages_to_sgt():83 cur_page=16 [ 179.315000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.320000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 179.325000] vb2_dc_pages_to_sgt():89 size=4294770688 [ 179.330000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.335000] vb2_dc_pages_to_sgt():83 cur_page=48 [ 179.340000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.340000] vb2_dc_pages_to_sgt():86 chunk_size=262144 [ 179.350000] vb2_dc_pages_to_sgt():89 size=4294508544 [ 179.355000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.355000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.360000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.365000] vb2_dc_pages_to_sgt():86 chunk_size=32768 [ 179.370000] vb2_dc_pages_to_sgt():89 size=4294934528 [ 179.375000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.380000] vb2_dc_pages_to_sgt():83 cur_page=8 [ 179.385000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.390000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.395000] vb2_dc_pages_to_sgt():89 size=4294868992 [ 179.400000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.405000] vb2_dc_pages_to_sgt():83 cur_page=24 [ 179.410000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.415000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 179.420000] vb2_dc_pages_to_sgt():89 size=4294737920 [ 179.425000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 179.430000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.435000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.440000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.445000] vb2_dc_pages_to_sgt():89 size=4294901760 [ 179.450000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 179.455000] vb2_dc_pages_to_sgt():83 cur_page=16 [ 179.460000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.460000] vb2_dc_pages_to_sgt():86 chunk_size=393216 [ 179.470000] vb2_dc_pages_to_sgt():89 size=4294508544 [ 179.475000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.475000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.480000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.485000] vb2_dc_pages_to_sgt():86 chunk_size=32768 [ 
179.490000] vb2_dc_pages_to_sgt():89 size=4294934528 [ 179.495000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.500000] vb2_dc_pages_to_sgt():83 cur_page=8 [ 179.505000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.510000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.515000] vb2_dc_pages_to_sgt():89 size=4294868992 [ 179.520000] vb2_dc_pages_to_sgt() sgt->orig_nents=3 [ 179.525000] vb2_dc_pages_to_sgt():83 cur_page=24 [ 179.530000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.535000] vb2_dc_pages_to_sgt():86 chunk_size=131072 [ 179.540000] vb2_dc_pages_to_sgt():89 size=4294737920 [ 179.545000] mmc0: Too large timeout requested for CMD25! [ 179.545000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 179.550000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.555000] mmc0: Too large timeout requested for CMD25! [ 179.555000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.560000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.565000] mmc0: Too large timeout requested for CMD25! [ 179.565000] vb2_dc_pages_to_sgt():89 size=4294901760 [ 179.570000] vb2_dc_pages_to_sgt() sgt->orig_nents=2 [ 179.575000] vb2_dc_pages_to_sgt():83 cur_page=16 [ 179.580000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.580000] vb2_dc_pages_to_sgt():86 chunk_size=262144 [ 179.590000] vb2_dc_pages_to_sgt():89 size=4294639616 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x40000000 at 0xb6d71000, size 458752 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x20100000 at 0xb6d39000, size 229376 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x40080000 at 0xb6cc9000, size 458752 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x20140000 at 0xb6c91000, size 229376 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x40100000 at 0xb6c21000, size 458752 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x20180000 at 0xb6be9000, size 229376 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x40180000 at 0xb6b79000, size 458752 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x201c0000 at 0xb6b41000, size 229376 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x40200000 at 
0xb6ad1000, size 458752 [ 179.595000] vb2_dc_mmap: mapped dma addr 0x20200000 at 0xb6a99000, size 229376 [ 179.600000] vb2_dc_mmap: mapped dma addr 0x40280000 at 0xb6a29000, size 458752 [ 179.600000] vb2_dc_mmap: mapped dma addr 0x20240000 at 0xb69f1000, size 229376 [ 179.600000] vb2_dc_pages_to_sgt() sgt->orig_nents=4 [ 179.605000] vb2_dc_pages_to_sgt():83 cur_page=0 [ 179.610000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.615000] vb2_dc_pages_to_sgt():86 chunk_size=8192 [ 179.620000] vb2_dc_pages_to_sgt():89 size=4294959104 [ 179.625000] vb2_dc_pages_to_sgt() sgt->orig_nents=4 [ 179.630000] vb2_dc_pages_to_sgt():83 cur_page=2 [ 179.635000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.635000] vb2_dc_pages_to_sgt():86 chunk_size=16384 [ 179.640000] vb2_dc_pages_to_sgt():89 size=4294942720 [ 179.645000] vb2_dc_pages_to_sgt() sgt->orig_nents=4 [ 179.650000] vb2_dc_pages_to_sgt():83 cur_page=6 [ 179.655000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.660000] vb2_dc_pages_to_sgt():86 chunk_size=65536 [ 179.665000] vb2_dc_pages_to_sgt():89 size=4294877184 [ 179.670000] vb2_dc_pages_to_sgt() sgt->orig_nents=4 [ 179.675000] vb2_dc_pages_to_sgt():83 cur_page=22 [ 179.680000] vb2_dc_pages_to_sgt():84 offset=0 [ 179.685000] vb2_dc_pages_to_sgt():86 chunk_size=524288 [ 179.690000] vb2_dc_pages_to_sgt():89 size=4294352896 [ 179.695000] vb2_dc_mmap: mapped dma addr 0x20000000 at 0xb695b000, size 614400 [ 180.000000] mmc0: Too large timeout requested for CMD25! 
[ 180.075000] Entering vb2_dc_get_contiguous_size()
[ 180.080000] sg_dma_address=0x20100000, sg_dma_len=393216
[ 180.085000] expected=0x20160000
[ 180.090000] Leaving vb2_dc_get_contiguous_size()
[ 180.095000] vb2_dc_map_dmabuf():671contig_size=393216, buf->size=430080
[ 180.100000] contiguous chunk is too small 393216/430080 b

From t.stanislaws at samsung.com Wed May 9 09:58:53 2012
From: t.stanislaws at samsung.com (Tomasz Stanislawski)
Date: Wed, 09 May 2012 11:58:53 +0200
Subject: [Linaro-mm-sig] [PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode
In-Reply-To: <4FAA12C7.8020307@gmail.com>
References: <1334933134-4688-1-git-send-email-t.stanislaws@samsung.com> <1334933134-4688-9-git-send-email-t.stanislaws@samsung.com> <4FA7DE61.7000705@gmail.com> <4675433.ieio0xx0Y0@avalon> <4FA9005F.6020901@gmail.com> <4FAA12C7.8020307@gmail.com>
Message-ID: <4FAA3FDD.1040400@samsung.com>

Hi Subash,

Could you post the code of vb2_dc_pages_to_sgt() with all the printk calls in it? It will help us avoid guessing where and what is being debugged in the log.

Moreover, I found a line 'size=4294836224' in the log. It means that size is equal to -131072 when read as a signed 32-bit value (!?!), or there are some invalid conversions in printk. Are you sure that you do not pass size = 0 as the function argument? Notice that earlier versions of the dmabuf-for-vb2 patches had an offset2 argument instead of size. It was the offset at the end of the buffer. In (I guess) 95% of cases this offset was 0.

Could you provide only the function arguments that cause the failure? I mean the pages array + size (I assume that offset is zero for your test). Having the arguments, we could reproduce that bug.

Regards,
Tomasz Stanislawski

On 05/09/2012 08:46 AM, Subash Patel wrote:
> Hello Tomasz, Laurent,
>
> I have printed some logs during the dmabuf export and attach for the SGT issue below. Please find it in the attachment. I hope
> it will be useful.
> > Regards, > Subash > > On 05/08/2012 04:45 PM, Subash Patel wrote: >> Hi Laurent, >> >> On 05/08/2012 02:44 PM, Laurent Pinchart wrote: >>> Hi Subash, >>> >>> On Monday 07 May 2012 20:08:25 Subash Patel wrote: >>>> Hello Thomasz, Laurent, >>>> >>>> I found an issue in the function vb2_dc_pages_to_sgt() below. I saw that >>>> during the attach, the size of the SGT and size requested mis-matched >>>> (by atleast 8k bytes). Hence I made a small correction to the code as >>>> below. I could then attach the importer properly. >>> >>> Thank you for the report. >>> >>> Could you print the content of the sglist (number of chunks and size >>> of each >>> chunk) before and after your modifications, as well as the values of >>> n_pages, >>> offset and size ? >> I will put back all the printk's and generate this. As of now, my setup >> has changed and will do this when I get sometime. >>> >>>> On 04/20/2012 08:15 PM, Tomasz Stanislawski wrote: >>> >>> [snip] >>> >>>>> +static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, >>>>> + unsigned int n_pages, unsigned long offset, unsigned long size) >>>>> +{ >>>>> + struct sg_table *sgt; >>>>> + unsigned int chunks; >>>>> + unsigned int i; >>>>> + unsigned int cur_page; >>>>> + int ret; >>>>> + struct scatterlist *s; >>>>> + >>>>> + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); >>>>> + if (!sgt) >>>>> + return ERR_PTR(-ENOMEM); >>>>> + >>>>> + /* compute number of chunks */ >>>>> + chunks = 1; >>>>> + for (i = 1; i< n_pages; ++i) >>>>> + if (pages[i] != pages[i - 1] + 1) >>>>> + ++chunks; >>>>> + >>>>> + ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); >>>>> + if (ret) { >>>>> + kfree(sgt); >>>>> + return ERR_PTR(-ENOMEM); >>>>> + } >>>>> + >>>>> + /* merging chunks and putting them into the scatterlist */ >>>>> + cur_page = 0; >>>>> + for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { >>>>> + unsigned long chunk_size; >>>>> + unsigned int j; >>>> >>>> size = PAGE_SIZE; >>>> >>>>> + >>>>> + for (j = cur_page + 1; j< n_pages; ++j) 
>>>> >>>> for (j = cur_page + 1; j< n_pages; ++j) { >>>> >>>>> + if (pages[j] != pages[j - 1] + 1) >>>>> + break; >>>> >>>> size += PAGE >>>> } >>>> >>>>> + >>>>> + chunk_size = ((j - cur_page)<< PAGE_SHIFT) - offset; >>>>> + sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); >>>> >>>> [DELETE] size -= chunk_size; >>>> >>>>> + offset = 0; >>>>> + cur_page = j; >>>>> + } >>>>> + >>>>> + return sgt; >>>>> +} >>> >> Regards, >> Subash From m.szyprowski at samsung.com Thu May 10 13:19:35 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 10 May 2012 15:19:35 +0200 Subject: [Linaro-mm-sig] [PATCH] drivers: cma: don't fail if migration returns -EAGAIN In-Reply-To: <1333462221-3987-1-git-send-email-m.szyprowski@samsung.com> References: <1333462221-3987-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1336655975-15729-1-git-send-email-m.szyprowski@samsung.com> alloc_contig_range() function might return -EAGAIN if migrate_pages() call fails for some temporarily locked pages. Such case should not be fatal to dma_alloc_from_contiguous(), which should retry allocation like in case of -EBUSY error. 
Reported-by: Haojian Zhuang
Signed-off-by: Marek Szyprowski
---
 drivers/base/dma-contiguous.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
index 78efb03..e46e2fb 100644
--- a/drivers/base/dma-contiguous.c
+++ b/drivers/base/dma-contiguous.c
@@ -346,7 +346,7 @@ struct page *dma_alloc_from_contiguous(struct device *dev, int count,
 		if (ret == 0) {
 			bitmap_set(cma->bitmap, pageno, count);
 			break;
-		} else if (ret != -EBUSY) {
+		} else if (ret != -EBUSY && ret != -EAGAIN) {
 			goto error;
 		}
 		pr_debug("%s(): memory range at %p is busy, retrying\n",
--
1.7.1.569.g6f426

From minchan at kernel.org Thu May 10 15:14:27 2012
From: minchan at kernel.org (Minchan Kim)
Date: Fri, 11 May 2012 00:14:27 +0900
Subject: [Linaro-mm-sig] [PATCH] drivers: cma: don't fail if migration returns -EAGAIN
In-Reply-To: <1336655975-15729-1-git-send-email-m.szyprowski@samsung.com>
References: <1333462221-3987-1-git-send-email-m.szyprowski@samsung.com> <1336655975-15729-1-git-send-email-m.szyprowski@samsung.com>
Message-ID: <20120510151427.GB2394@barrios>

On Thu, May 10, 2012 at 03:19:35PM +0200, Marek Szyprowski wrote:
> alloc_contig_range() function might return -EAGAIN if migrate_pages() call

migrate_page never returns -EAGAIN, and I can't find any -EAGAIN return in alloc_contig_range. Am I looking at a different tree?

> fails for some temporarily locked pages. Such case should not be fatal
> to dma_alloc_from_contiguous(), which should retry allocation like in case
> of -EBUSY error.
>
> Reported-by: Haojian Zhuang
> Signed-off-by: Marek Szyprowski
> ---
>  drivers/base/dma-contiguous.c | 2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
> index 78efb03..e46e2fb 100644
> --- a/drivers/base/dma-contiguous.c
> +++ b/drivers/base/dma-contiguous.c
> @@ -346,7 +346,7 @@ struct page *dma_alloc_from_contiguous(struct device *dev, int count,
>  		if (ret == 0) {
>  			bitmap_set(cma->bitmap, pageno, count);
>  			break;
> -		} else if (ret != -EBUSY) {
> +		} else if (ret != -EBUSY && ret != -EAGAIN) {
>  			goto error;
>  		}
>  		pr_debug("%s(): memory range at %p is busy, retrying\n",
> --
> 1.7.1.569.g6f426

From lauraa at codeaurora.org Thu May 10 20:07:41 2012
From: lauraa at codeaurora.org (Laura Abbott)
Date: Thu, 10 May 2012 13:07:41 -0700
Subject: [Linaro-mm-sig] Bad use of highmem with buffer_migrate_page?
Message-ID: <4FAC200D.2080306@codeaurora.org>

Hi,

I did a backport of the Contiguous Memory Allocator to a 3.0.8 tree. I wrote a fairly simple test case that, in 1MB chunks, allocates up to 40MB from a reserved area, then maps, writes, unmaps and frees it in an infinite loop. When running this with another program in parallel to put some stress on the filesystem, I hit data aborts in the filesystem/journal layer, although not always with the same backtrace.
As an example:

[] (__ext4_check_dir_entry+0x20/0x184) from [] (add_dirent_to_buf+0x70/0x2ac)
[] (add_dirent_to_buf+0x70/0x2ac) from [] (ext4_add_entry+0xd8/0x4bc)
[] (ext4_add_entry+0xd8/0x4bc) from [] (ext4_add_nondir+0x14/0x64)
[] (ext4_add_nondir+0x14/0x64) from [] (ext4_create+0xd8/0x120)
[] (ext4_create+0xd8/0x120) from [] (vfs_create+0x74/0xa4)
[] (vfs_create+0x74/0xa4) from [] (do_last+0x588/0x8d4)
[] (do_last+0x588/0x8d4) from [] (path_openat+0xc4/0x394)
[] (path_openat+0xc4/0x394) from [] (do_filp_open+0x30/0x7c)
[] (do_filp_open+0x30/0x7c) from [] (do_sys_open+0xd8/0x174)
[] (do_sys_open+0xd8/0x174) from [] (ret_fast_syscall+0x0/0x30)

Every panic had the same issue: a struct buffer_head [1] had a b_data that was unexpectedly NULL.

During the course of CMA, buffer_migrate_page could be called to migrate from a CMA page to a new page. buffer_migrate_page calls set_bh_page [2] to set the new page for the buffer_head. If the new page is a highmem page, though, bh->b_data ends up as NULL, which could produce the panics seen above.

This seems to indicate that highmem pages are not appropriate for use as pages to migrate to. The following made the problem go away for me:

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5753,7 +5753,7 @@ static struct page *
 __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
 			     int **resultp)
 {
-	return alloc_page(GFP_HIGHUSER_MOVABLE);
+	return alloc_page(GFP_USER | __GFP_MOVABLE);
 }

Does this seem like an actual issue, or is this an artifact of my backport to 3.0? I'm not familiar enough with the filesystem layer to be able to tell where highmem can actually be used.

Thanks,
Laura

[1] http://lxr.free-electrons.com/source/include/linux/buffer_head.h#L59
[2] http://lxr.free-electrons.com/source/fs/buffer.c?v=3.0#L1441

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
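The failure mode described in this report can be illustrated with a small user-space model. It assumes that, in the 3.0 tree, set_bh_page() stores only the in-page offset in b_data when the target page is in highmem, since such a page has no permanent kernel mapping to point at. The struct definitions and model_set_bh_page() below are simplified, hypothetical stand-ins and not kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct page: only what the model needs. */
struct page {
	int   highmem;	/* nonzero if the page has no permanent kernel mapping */
	char *kaddr;	/* kernel virtual address, meaningful for lowmem only */
};

/* Simplified stand-in for struct buffer_head. */
struct buffer_head {
	struct page *b_page;
	char        *b_data;
};

/* Model of pointing a buffer_head at a (possibly new) page. */
static void model_set_bh_page(struct buffer_head *bh, struct page *page,
			      unsigned long offset)
{
	bh->b_page = page;
	if (page->highmem)
		/* No mapping to point at: only the offset is preserved. */
		bh->b_data = (char *)(uintptr_t)offset;
	else
		bh->b_data = page->kaddr + offset;
}
```

In this model, migrating a buffer at offset 0 to a highmem page leaves b_data as NULL, so any code that dereferences b_data directly would fault. Allocating the migration target with GFP_USER | __GFP_MOVABLE, as in the change in the report, sidesteps this because that mask does not include __GFP_HIGHMEM.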
From kochhar.abhinav at gmail.com Fri May 11 02:01:46 2012
From: kochhar.abhinav at gmail.com (Abhinav Kochhar)
Date: Fri, 11 May 2012 11:01:46 +0900
Subject: [Linaro-mm-sig] [PATCH 0/3] [RFC] Kernel Virtual Memory allocation issue in dma-mapping framework
In-Reply-To: References: Message-ID:

Hello,

This is a request for comments on dma-mapping patches for ARM. I made some additions to address an issue with kernel virtual memory allocations in the IOMMU ops defined in the dma-mapping framework.

The patches are based on:
git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.4-rc3-arm-dma-v9

The code has been tested on Samsung Exynos5 SMDK5250.

These patches do the following:

1. Define a new dma attribute to identify user space allocations.
2. Add new wrapper functions to pass the dma attribute defined in (1) above, because the current framework has no way to pass the new attribute that differentiates between kernel and user allocations.
3. Extend the existing arm_dma_ops for IOMMU-enabled devices to differentiate between kernel and user space allocations.

Patch summary:

[PATCH 1/3]:
Common: add DMA_ATTR_USER_SPACE to dma-attr. This can be passed to arm_dma_ops to identify the type of allocation, which can be either from kernel or from user.

[PATCH 2/3]:
ARM: add "struct page_infodma" to hold information about allocated pages. This can be attached to any device that makes use of the dma-mapping APIs. Any interested device should allocate this structure and store all the relevant information about the allocated pages, to be able to look it up for all future references.
ARM: add dma_alloc_writecombine_user() function to pass DMA_ATTR_USER_SPACE attribute ARM: add dma_free_writecombine_user() function to pass DMA_ATTR_USER_SPACE attribute ARM: add dma_mmap_writecombine_user() function to pass DMA_ATTR_USER_SPACE attribute [PATCH 3/3]: ARM: add check for allocation type in __dma_alloc_remap() function ARM: add check for allocation type in arm_iommu_alloc_attrs() function ARM: add check for allocation type in arm_iommu_mmap_attrs() function ARM: re-used dma_addr as a flag to check for memory allocation type. It was an unused argument and the prototype does not pass dma-attrs, so used this as a means to pass the flag. ARM: add check for allocation type in arm_iommu_free_attrs() function arch/arm/include/asm/dma-mapping.h | 31 +++++++ arch/arm/mm/dma-mapping.c | 168 ++++++++++++++++++++++++++---------- include/linux/dma-attrs.h | 1 + 3 files changed, 155 insertions(+), 45 deletions(-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kochhar.abhinav at gmail.com Fri May 11 02:01:52 2012 From: kochhar.abhinav at gmail.com (Abhinav Kochhar) Date: Fri, 11 May 2012 11:01:52 +0900 Subject: [Linaro-mm-sig] [PATCH 1/3] [RFC] Kernel Virtual Memory allocation issue in dma-mapping framework In-Reply-To: References: Message-ID: With this add a new attribute that can be passsed to dma-mapping IOMMU apis to differentiate between kernel and user allcoations. diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h index 547ab56..861df09 100644 --- a/include/linux/dma-attrs.h +++ b/include/linux/dma-attrs.h @@ -15,6 +15,7 @@ enum dma_attr { DMA_ATTR_WEAK_ORDERING, DMA_ATTR_WRITE_COMBINE, DMA_ATTR_NON_CONSISTENT, + DMA_ATTR_USER_SPACE, DMA_ATTR_MAX, }; -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kochhar.abhinav at gmail.com Fri May 11 02:01:58 2012 From: kochhar.abhinav at gmail.com (Abhinav Kochhar) Date: Fri, 11 May 2012 11:01:58 +0900 Subject: [Linaro-mm-sig] [PATCH 2/3] [RFC] Kernel Virtual Memory allocation issue in dma-mapping framework In-Reply-To: References: Message-ID: With this define new wrapper functions which enables to pass the new dma attribute to IOMMU ops of dma-mapping framework diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index bbef15d..7fc003a 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -14,6 +14,12 @@ #define DMA_ERROR_CODE (~0) extern struct dma_map_ops arm_dma_ops; +struct page_infodma { + struct page **pages; + unsigned long nr_pages; + unsigned long shared; +}; + static inline struct dma_map_ops *get_dma_ops(struct device *dev) { if (dev && dev->archdata.dma_ops) @@ -205,6 +211,14 @@ static inline void *dma_alloc_writecombine(struct device *dev, size_t size, return dma_alloc_attrs(dev, size, dma_handle, flag, &attrs); } +static inline void *dma_alloc_writecombine_user(struct device *dev, size_t size, + dma_addr_t *dma_handle, gfp_t flag) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_USER_SPACE, &attrs); + return dma_alloc_attrs(dev, size, dma_handle, flag, &attrs); +} + static inline void dma_free_writecombine(struct device *dev, size_t size, void *cpu_addr, dma_addr_t dma_handle) { @@ -213,6 +227,14 @@ static inline void dma_free_writecombine(struct device *dev, size_t size, return dma_free_attrs(dev, size, cpu_addr, dma_handle, &attrs); } +static inline void dma_free_writecombine_user(struct device *dev, size_t size, + void *cpu_addr, dma_addr_t dma_handle) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_USER_SPACE, &attrs); + return dma_free_attrs(dev, size, cpu_addr, dma_handle, &attrs); +} + static inline int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma, void *cpu_addr, dma_addr_t dma_addr, 
size_t size) { @@ -221,6 +243,14 @@ static inline int dma_mmap_writecombine(struct device *dev, struct vm_area_struc return dma_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, &attrs); } +static inline int dma_mmap_writecombine_user(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_USER_SPACE, &attrs); + return dma_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, &attrs); +} + /* * This can be called during boot to increase the size of the consistent * DMA region above it's default value of 2MB. It must be called before the -------------- next part -------------- An HTML attachment was scrubbed... URL: From kochhar.abhinav at gmail.com Fri May 11 02:02:02 2012 From: kochhar.abhinav at gmail.com (Abhinav Kochhar) Date: Fri, 11 May 2012 11:02:02 +0900 Subject: [Linaro-mm-sig] [PATCH 3/3] [RFC] Kernel Virtual Memory allocation issue in dma-mapping framework In-Reply-To: References: Message-ID: With this we can do a run time check on the allocation type for either kernel or user using the dma attribute passed to dma-mapping iommu ops. 
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 2d11aa0..1f454cc 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -428,6 +428,7 @@ static void __dma_free_remap(void *cpu_addr, size_t size) arm_vmregion_free(&consistent_head, c); } + #else /* !CONFIG_MMU */ #define __dma_alloc_remap(page, size, gfp, prot, c) page_address(page) @@ -894,6 +895,35 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) size_t align; size_t count = size >> PAGE_SHIFT; int bit; + unsigned long mem_type = (unsigned long)gfp; + + + if(mem_type){ + + struct page_infodma *pages_in; + + pages_in = kzalloc( sizeof(struct page_infodma*), GFP_KERNEL); + if(!pages_in) + return NULL; + + pages_in->nr_pages = count; + + return (void*)pages_in; + + } + + /* + * Align the virtual region allocation - maximum alignment is + * a section size, minimum is a page size. This helps reduce + * fragmentation of the DMA space, and also prevents allocations + * smaller than a section from crossing a section boundary. + */ + + bit = fls(size - 1); + if (bit > SECTION_SHIFT) + bit = SECTION_SHIFT; + align = 1 << bit; + if (!consistent_pte[0]) { pr_err("%s: not initialised\n", __func__); @@ -901,16 +931,6 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) return NULL; } - /* - * Align the virtual region allocation - maximum alignment is - * a section size, minimum is a page size. This helps reduce - * fragmentation of the DMA space, and also prevents allocations - * smaller than a section from crossing a section boundary. - */ - bit = fls(size - 1); - if (bit > SECTION_SHIFT) - bit = SECTION_SHIFT; - align = 1 << bit; /* * Allocate a virtual address in the consistent mapping region. 
@@ -946,6 +966,7 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) return NULL; } + /* * Create a mapping in device IO address space for specified pages */ @@ -973,13 +994,16 @@ __iommu_create_mapping(struct device *dev, struct page **pages, size_t size) len = (j - i) << PAGE_SHIFT; ret = iommu_map(mapping->domain, iova, phys, len, 0); + if (ret < 0) goto fail; + iova += len; i = j; } return dma_addr; fail: + iommu_unmap(mapping->domain, dma_addr, iova-dma_addr); __free_iova(mapping, dma_addr, size); return DMA_ERROR_CODE; @@ -1007,6 +1031,8 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); struct page **pages; void *addr = NULL; + struct page_infodma *page_ret; + unsigned long mem_type; *handle = DMA_ERROR_CODE; size = PAGE_ALIGN(size); @@ -1019,11 +1045,19 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, if (*handle == DMA_ERROR_CODE) goto err_buffer; - addr = __iommu_alloc_remap(pages, size, gfp, prot); + mem_type = dma_get_attr(DMA_ATTR_USER_SPACE, attrs); + + addr = __iommu_alloc_remap(pages, size, mem_type, prot); if (!addr) goto err_mapping; - return addr; + if(mem_type){ + page_ret = (struct page_infodma *)addr; + page_ret->pages = pages; + return page_ret; + } + else + return addr; err_mapping: __iommu_remove_mapping(dev, *handle, size); @@ -1071,18 +1105,34 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, struct dma_attrs *attrs) { - struct arm_vmregion *c; + + unsigned long mem_type = dma_get_attr(DMA_ATTR_USER_SPACE, attrs); + size = PAGE_ALIGN(size); - c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); - if (c) { - struct page **pages = c->priv; - __dma_free_remap(cpu_addr, size); - __iommu_remove_mapping(dev, handle, size); - __iommu_free_buffer(dev, pages, size); + + if(mem_type){ 
+ + struct page_infodma *pagesin = cpu_addr; + if (pagesin) { + struct page **pages = pagesin->pages; + __iommu_remove_mapping(dev, handle, size); + __iommu_free_buffer(dev, pages, size); + } + } + else{ + struct arm_vmregion *c; + c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); + if (c) { + struct page **pages = c->priv; + __dma_free_remap(cpu_addr, size); + __iommu_remove_mapping(dev, handle, size); + __iommu_free_buffer(dev, pages, size); + } } } + /* * Map a part of the scatter-gather list into contiguous io address space */ -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.szyprowski at samsung.com Fri May 11 07:52:53 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Fri, 11 May 2012 09:52:53 +0200 Subject: [Linaro-mm-sig] [PATCHv9 10/10] ARM: dma-mapping: add support for IOMMU mapper In-Reply-To: References: <1334756652-30830-1-git-send-email-m.szyprowski@samsung.com> <1334756652-30830-11-git-send-email-m.szyprowski@samsung.com> Message-ID: <02e301cd2f4b$11e02f90$35a08eb0$%szyprowski@samsung.com> Hello, On Friday, May 11, 2012 4:09 AM Paul Gortmaker wrote: > On Wed, Apr 18, 2012 at 9:44 AM, Marek Szyprowski > wrote: > > This patch add a complete implementation of DMA-mapping API for > > devices which have IOMMU support. > > Hi Marek, > > It looks like this patch breaks no-MMU builds on ARM, at least > according to git bisect. Here is a link to a linux-next failure: > > http://kisskb.ellerman.id.au/kisskb/buildresult/6291233/ > > arch/arm/mm/dma-mapping.c:726:42: error: 'pgprot_kernel' undeclared > (first use in this function) > make[2]: *** [arch/arm/mm/dma-mapping.o] Error 1 > > Please have a look, thanks. Thanks for reporting this issue, I will send a fix in a minute. 
Best regards -- Marek Szyprowski Samsung Poland R&D Center From m.szyprowski at samsung.com Fri May 11 08:30:47 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Fri, 11 May 2012 10:30:47 +0200 Subject: [Linaro-mm-sig] Bad use of highmem with buffer_migrate_page? In-Reply-To: <4FAC200D.2080306@codeaurora.org> References: <4FAC200D.2080306@codeaurora.org> Message-ID: <02fc01cd2f50$5d77e4c0$1867ae40$%szyprowski@samsung.com> Hello, On Thursday, May 10, 2012 10:08 PM Laura Abbott wrote: > I did a backport of the Contiguous Memory Allocator to a 3.0.8 tree. I > wrote fairly simple test case that, in 1MB chunks, allocs up to 40MB > from a reserved area, maps, writes, unmaps and then frees in an infinite > loop. When running this with another program in parallel to put some > stress on the filesystem, I hit data aborts in the filesystem/journal > layer, although not always the same backtrace. As an example: > > [] (__ext4_check_dir_entry+0x20/0x184) from [] > (add_dirent_to_buf+0x70/0x2ac) > [] (add_dirent_to_buf+0x70/0x2ac) from [] > (ext4_add_entry+0xd8/0x4bc) > [] (ext4_add_entry+0xd8/0x4bc) from [] > (ext4_add_nondir+0x14/0x64) > [] (ext4_add_nondir+0x14/0x64) from [] > (ext4_create+0xd8/0x120) > [] (ext4_create+0xd8/0x120) from [] > (vfs_create+0x74/0xa4) > [] (vfs_create+0x74/0xa4) from [] (do_last+0x588/0x8d4) > [] (do_last+0x588/0x8d4) from [] > (path_openat+0xc4/0x394) > [] (path_openat+0xc4/0x394) from [] > (do_filp_open+0x30/0x7c) > [] (do_filp_open+0x30/0x7c) from [] > (do_sys_open+0xd8/0x174) > [] (do_sys_open+0xd8/0x174) from [] > (ret_fast_syscall+0x0/0x30) > > Every panic had the same issue where a struct buffer_head [1] had a > b_data that was unexpectedly NULL. > > During the course of CMA, buffer_migrate_page could be called to migrate > from a CMA page to a new page. buffer_migrate_page calls set_bh_page[2] > to set the new page for the buffer_head. 
If the new page is a highmem > page though, the bh->b_data ends up as NULL, which could produce the > panics seen above. > > This seems to indicate that highmem pages are not not appropriate for > use as pages to migrate to. The following made the problem go away for me: > > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -5753,7 +5753,7 @@ static struct page * > __alloc_contig_migrate_alloc(struct page *page, unsigned long private, > int **resultp) > { > - return alloc_page(GFP_HIGHUSER_MOVABLE); > + return alloc_page(GFP_USER | __GFP_MOVABLE); > } > > > Does this seem like an actual issue or is this an artifact of my > backport to 3.0? I'm not familiar enough with the filesystem layer to be > able to tell where highmem can actually be used. I will need to investigate this further as this issue doesn't appear on v3.3+ kernels, but I remember I saw something similar when I tried CMA backported to v3.0. You have pointed to an important issue which we need to solve somehow. CMA wasn't fully tested with high mem and it looks that there are some issues here and there. Your patch will prevent using highmem for any migration triggered by CMA. IMHO this is a bit limited, but right now I have no better idea. For a quick backport it should be ok. 
Best regards -- Marek Szyprowski Samsung Poland R&D Center From m.szyprowski at samsung.com Fri May 11 08:33:36 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Fri, 11 May 2012 10:33:36 +0200 Subject: [Linaro-mm-sig] [PATCH] ARM: dma-mapping: fix build break on no-MMU systems In-Reply-To: <1334756652-30830-1-git-send-email-m.szyprowski@samsung.com> References: <1334756652-30830-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1336725216-24434-1-git-send-email-m.szyprowski@samsung.com> Fix the following build issue: arch/arm/mm/dma-mapping.c:726:42: error: 'pgprot_kernel' undeclared (first use in this function) make[2]: *** [arch/arm/mm/dma-mapping.o] Error 1 Reported-by: Paul Gortmaker Signed-off-by: Marek Szyprowski --- arch/arm/mm/dma-mapping.c | 17 +++++++++-------- 1 files changed, 9 insertions(+), 8 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 2d11aa0..686ef02 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -428,10 +428,19 @@ static void __dma_free_remap(void *cpu_addr, size_t size) arm_vmregion_free(&consistent_head, c); } +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) +{ + prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? + pgprot_writecombine(prot) : + pgprot_dmacoherent(prot); + return prot; +} + #else /* !CONFIG_MMU */ #define __dma_alloc_remap(page, size, gfp, prot, c) page_address(page) #define __dma_free_remap(addr, size) do { } while (0) +#define __get_dma_pgprot(attrs, prot) __pgprot(0) #endif /* CONFIG_MMU */ @@ -471,14 +480,6 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, return addr; } -static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) -{ - prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? 
- pgprot_writecombine(prot) : - pgprot_dmacoherent(prot); - return prot; -} - /* * Allocate DMA-coherent memory space and return both the kernel remapped * virtual and bus address for that space. -- 1.7.1.569.g6f426 From m.szyprowski at samsung.com Fri May 11 13:27:49 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Fri, 11 May 2012 15:27:49 +0200 Subject: [Linaro-mm-sig] how to avoid allocating or freeze MOVABLE memory in userspace In-Reply-To: References: Message-ID: <035501cd2f79$dbfde410$93f9ac30$%szyprowski@samsung.com> Hello, On Wednesday, April 18, 2012 9:37 AM Haojian Zhuang wrote: > On Mon, Apr 16, 2012 at 9:55 PM, Christoph Lameter wrote: > > On Sat, 14 Apr 2012, Haojian Zhuang wrote: > > > >> On Sat, Apr 14, 2012 at 2:27 AM, Christoph Lameter wrote: > >> > On Fri, 13 Apr 2012, Haojian Zhuang wrote: > >> > > >> >> I have one question on memory migration. As we know, malloc() from > >> >> user app will allocate MIGRATE_MOVABLE pages. But if we want to use > >> >> this memory as DMA usage, we can't accept MIGRATE_MOVABLE type. Could > >> >> we change its behavior before DMA working? > >> > > >> > MIGRATE_MOVABLE works fine for DMA. If you keep a reference from a device > >> > driver to user pages then you will have to increase the page refcount > >> > which will in turn pin the page and make it non movable for as long as you > >> > keep the refcount. > >> > >> Hi Christoph, > >> > >> Thanks for your illustration. But it's a little abstract. Could you > >> give me a simple example > >> or show me the code? > > > > Run get_user_pages() on the memory you are interest in pinning. See how > > other drivers do that by looking up other use cases. F.e. ib_umem_get() > > does a similar thing. > > > > > Got it. And I think there's conflict in CMA. > > For example, user process A malloc() memory, page->_count is 1. After > using get_user_pages() > in device driver for DMA usage, page->_count becomes 2. 
> > If the page is in CMA region, it results migrate_pages() returns > -EAGAIN. But error handling in CMA is in below. > > ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA); > if (ret == 0) { > bitmap_set(cma->bitmap, pageno, count); > break; > } else if (ret != -EBUSY) { > goto error; > } > > Since EAGAIN doesn't equal to EBUSY, dma_alloc_from_contiguous() > aborts. Should dma_alloc_from_contiguous() handle EAGAIN? > I've checked again and it is really not possible for migrate_pages() to return -EAGAIN, so the CMA code is correct. The only special case which needs retry is -EBUSY. Best regards -- Marek Szyprowski Samsung Poland R&D Center From paul.gortmaker at windriver.com Fri May 11 02:08:39 2012 From: paul.gortmaker at windriver.com (Paul Gortmaker) Date: Thu, 10 May 2012 22:08:39 -0400 Subject: [Linaro-mm-sig] [PATCHv9 10/10] ARM: dma-mapping: add support for IOMMU mapper In-Reply-To: <1334756652-30830-11-git-send-email-m.szyprowski@samsung.com> References: <1334756652-30830-1-git-send-email-m.szyprowski@samsung.com> <1334756652-30830-11-git-send-email-m.szyprowski@samsung.com> Message-ID: On Wed, Apr 18, 2012 at 9:44 AM, Marek Szyprowski wrote: > This patch add a complete implementation of DMA-mapping API for > devices which have IOMMU support. Hi Marek, It looks like this patch breaks no-MMU builds on ARM, at least according to git bisect. Here is a link to a linux-next failure: http://kisskb.ellerman.id.au/kisskb/buildresult/6291233/ arch/arm/mm/dma-mapping.c:726:42: error: 'pgprot_kernel' undeclared (first use in this function) make[2]: *** [arch/arm/mm/dma-mapping.o] Error 1 Please have a look, thanks. Paul. --- > > This implementation tries to optimize dma address space usage by remapping > all possible physical memory chunks into a single dma address space chunk. > > DMA address space is managed on top of the bitmap stored in the > dma_iommu_mapping structure stored in device->archdata. 
Platform setup > code has to initialize parameters of the dma address space (base address, > size, allocation precision order) with arm_iommu_create_mapping() > function. > To reduce the size of the bitmap, all allocations are aligned to the > specified order of base 4 KiB pages. > > dma_alloc_* functions allocate physical memory in chunks, each with > alloc_pages() function to avoid failing if the physical memory gets > fragmented. In worst case the allocated buffer is composed of 4 KiB page > chunks. > > dma_map_sg() function minimizes the total number of dma address space > chunks by merging of physical memory chunks into one larger dma address > space chunk. If requested chunk (scatter list entry) boundaries > match physical page boundaries, most calls to dma_map_sg() requests will > result in creating only one chunk in dma address space. > > dma_map_page() simply creates a mapping for the given page(s) in the dma > address space. > > All dma functions also perform required cache operation like their > counterparts from the arm linear physical memory mapping version. > > This patch contains code and fixes kindly provided by: > - Krishna Reddy , > - Andrzej Pietrasiewicz , > - Hiroshi DOYU > > Signed-off-by: Marek Szyprowski > Acked-by: Kyungmin Park > Reviewed-by: Konrad Rzeszutek Wilk > Tested-By: Subash Patel > --- > ?arch/arm/Kconfig ? ? ? ? ? ? ? ? | ? ?8 + > ?arch/arm/include/asm/device.h ? ?| ? ?3 + > ?arch/arm/include/asm/dma-iommu.h | ? 34 ++ > ?arch/arm/mm/dma-mapping.c ? ? ? ?| ?727 > +++++++++++++++++++++++++++++++++++++- > ?arch/arm/mm/vmregion.h ? ? ? ? ? | ? 
?2 +- > ?5 files changed, 759 insertions(+), 15 deletions(-) > ?create mode 100644 arch/arm/include/asm/dma-iommu.h > From rob.clark at linaro.org Fri May 11 15:30:47 2012 From: rob.clark at linaro.org (Rob Clark) Date: Fri, 11 May 2012 10:30:47 -0500 Subject: [Linaro-mm-sig] [PATCH] dma-buf: mmap support In-Reply-To: <1335258532-20739-1-git-send-email-daniel.vetter@ffwll.ch> References: <1335258532-20739-1-git-send-email-daniel.vetter@ffwll.ch> Message-ID: On Tue, Apr 24, 2012 at 4:08 AM, Daniel Vetter wrote: > Compared to Rob Clark's RFC I've ditched the prepare/finish hooks > and corresponding ioctls on the dma_buf file. The major reason for > that is that many people seem to be under the impression that this is > also for synchronization with outstanding asynchronous processsing. > I'm pretty massively opposed to this because: > > - It boils down reinventing a new rather general-purpose userspace > ?synchronization interface. If we look at things like futexes, this > ?is hard to get right. > - Furthermore a lot of kernel code has to interact with this > ?synchronization primitive. This smells a look like the dri1 hw_lock, > ?a horror show I prefer not to reinvent. > - Even more fun is that multiple different subsystems would interact > ?here, so we have plenty of opportunities to create funny deadlock > ?scenarios. > > I think synchronization is a wholesale different problem from data > sharing and should be tackled as an orthogonal problem. > > Now we could demand that prepare/finish may only ensure cache > coherency (as Rob intended), but that runs up into the next problem: > We not only need mmap support to facilitate sw-only processing nodes > in a pipeline (without jumping through hoops by importing the dma_buf > into some sw-access only importer), which allows for a nicer > ION->dma-buf upgrade path for existing Android userspace. We also need > mmap support for existing importing subsystems to support existing > userspace libraries. 
And a loot of these subsystems are expected to > export coherent userspace mappings. > > So prepare/finish can only ever be optional and the exporter /needs/ > to support coherent mappings. Given that mmap access is always > somewhat fallback-y in nature I've decided to drop this optimization, > instead of just making it optional. If we demonstrate a clear need for > this, supported by benchmark results, we can always add it in again > later as an optional extension. > > Other differences compared to Rob's RFC is the above mentioned support > for mapping a dma-buf through facilities provided by the importer. > Which results in mmap support no longer being optional. > > Note that this dma-buf mmap patch does _not_ support every possible > insanity an existing subsystem could pull of with mmap: Because it > does not allow to intercept pagefaults and shoot down ptes importing > subsystems can't add some magic of their own at these points (e.g. to > automatically synchronize with outstanding rendering or set up some > special resources). I've done a cursory read through a few mmap > implementions of various subsytems and I'm hopeful that we can avoid > this (and the complexity it'd bring with it). > > Additonally I've extended the documentation a bit to explain the hows > and whys of this mmap extension. > > In case we ever want to add support for explicitly cache maneged > userspace mmap with a prepare/finish ioctl pair, we could specify that > userspace needs to mmap a different part of the dma_buf, e.g. the > range starting at dma_buf->size up to dma_buf->size*2. This works > because the size of a dma_buf is invariant over it's lifetime. The > exporter would obviously need to fall back to coherent mappings for > both ranges if a legacy clients maps the coherent range and the > architecture cannot suppor conflicting caching policies. Also, this > would obviously be optional and userspace needs to be able to fall > back to coherent mappings. 
> > v2: > - Spelling fixes from Rob Clark. > - Compile fix for !DMA_BUF from Rob Clark. > - Extend commit message to explain how explicitly cache managed mmap > ?support could be added later. > - Extend the documentation with implementations notes for exporters > ?that need to manually fake coherency. > > v3: > - dma_buf pointer initialization goof-up noticed by Rebecca Schultz > ?Zavin. > > Cc: Rob Clark > Cc: Rebecca Schultz Zavin > Signed-Off-by: Daniel Vetter Acked-by: Rob Clark > --- > ?Documentation/dma-buf-sharing.txt | ? 98 ++++++++++++++++++++++++++++++++++--- > ?drivers/base/dma-buf.c ? ? ? ? ? ?| ? 64 +++++++++++++++++++++++- > ?include/linux/dma-buf.h ? ? ? ? ? | ? 16 ++++++ > ?3 files changed, 170 insertions(+), 8 deletions(-) > > diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt > index 3bbd5c5..5ff4d2b 100644 > --- a/Documentation/dma-buf-sharing.txt > +++ b/Documentation/dma-buf-sharing.txt > @@ -29,13 +29,6 @@ The buffer-user > ? ?in memory, mapped into its own address space, so it can access the same area > ? ?of memory. > > -*IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details] > -For this first version, A buffer shared using the dma_buf sharing API: > -- *may* be exported to user space using "mmap" *ONLY* by exporter, outside of > - ?this framework. > -- with this new iteration of the dma-buf api cpu access from the kernel has been > - ?enable, see below for the details. > - > ?dma-buf operations for device dma only > ?-------------------------------------- > > @@ -313,6 +306,83 @@ Access to a dma_buf from the kernel context involves three steps: > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?enum dma_data_direction dir); > > > +Direct Userspace Access/mmap Support > +------------------------------------ > + > +Being able to mmap an export dma-buf buffer object has 2 main use-cases: > +- CPU fallback processing in a pipeline and > +- supporting existing mmap interfaces in importers. > + > +1. 
CPU fallback processing in a pipeline > + > + ? In many processing pipelines it is sometimes required that the cpu can access > + ? the data in a dma-buf (e.g. for thumbnail creation, snapshots, ...). To avoid > + ? the need to handle this specially in userspace frameworks for buffer sharing > + ? it's ideal if the dma_buf fd itself can be used to access the backing storage > + ? from userspace using mmap. > + > + ? Furthermore Android's ION framework already supports this (and is otherwise > + ? rather similar to dma-buf from a userspace consumer side with using fds as > + ? handles, too). So it's beneficial to support this in a similar fashion on > + ? dma-buf to have a good transition path for existing Android userspace. > + > + ? No special interfaces, userspace simply calls mmap on the dma-buf fd. > + > +2. Supporting existing mmap interfaces in exporters > + > + ? Similar to the motivation for kernel cpu access it is again important that > + ? the userspace code of a given importing subsystem can use the same interfaces > + ? with a imported dma-buf buffer object as with a native buffer object. This is > + ? especially important for drm where the userspace part of contemporary OpenGL, > + ? X, and other drivers is huge, and reworking them to use a different way to > + ? mmap a buffer rather invasive. > + > + ? The assumption in the current dma-buf interfaces is that redirecting the > + ? initial mmap is all that's needed. A survey of some of the existing > + ? subsystems shows that no driver seems to do any nefarious thing like syncing > + ? up with outstanding asynchronous processing on the device or allocating > + ? special resources at fault time. So hopefully this is good enough, since > + ? adding interfaces to intercept pagefaults and allow pte shootdowns would > + ? increase the complexity quite a bit. > + > + ? Interface: > + ? ? ?int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *, > + ? ? ? ? ? ? ? ? ? ? ?unsigned long); > + > + ? 
If the importing subsystem simply provides a special-purpose mmap call to set > + ? up a mapping in userspace, calling do_mmap with dma_buf->file will equally > + ? achieve that for a dma-buf object. > + > +3. Implementation notes for exporters > + > + ? Because dma-buf buffers have invariant size over their lifetime, the dma-buf > + ? core checks whether a vma is too large and rejects such mappings. The > + ? exporter hence does not need to duplicate this check. > + > + ? Because existing importing subsystems might presume coherent mappings for > + ? userspace, the exporter needs to set up a coherent mapping. If that's not > + ? possible, it needs to fake coherency by manually shooting down ptes when > + ? leaving the cpu domain and flushing caches at fault time. Note that all the > + ? dma_buf files share the same anon inode, hence the exporter needs to replace > + ? the dma_buf file stored in vma->vm_file with it's own if pte shootdown is > + ? requred. This is because the kernel uses the underlying inode's address_space > + ? for vma tracking (and hence pte tracking at shootdown time with > + ? unmap_mapping_range). > + > + ? If the above shootdown dance turns out to be too expensive in certain > + ? scenarios, we can extend dma-buf with a more explicit cache tracking scheme > + ? for userspace mappings. But the current assumption is that using mmap is > + ? always a slower path, so some inefficiencies should be acceptable. > + > + ? Exporters that shoot down mappings (for any reasons) shall not do any > + ? synchronization at fault time with outstanding device operations. > + ? Synchronization is an orthogonal issue to sharing the backing storage of a > + ? buffer and hence should not be handled by dma-buf itself. This is explictly > + ? mentioned here because many people seem to want something like this, but if > + ? different exporters handle this differently, buffer sharing can fail in > + ? 
interesting ways depending upong the exporter (if userspace starts depending > + ? upon this implicit synchronization). > + > ?Miscellaneous notes > ?------------------- > > @@ -336,6 +406,20 @@ Miscellaneous notes > ? the exporting driver to create a dmabuf fd must provide a way to let > ? userspace control setting of O_CLOEXEC flag passed in to dma_buf_fd(). > > +- If an exporter needs to manually flush caches and hence needs to fake > + ?coherency for mmap support, it needs to be able to zap all the ptes pointing > + ?at the backing storage. Now linux mm needs a struct address_space associated > + ?with the struct file stored in vma->vm_file to do that with the function > + ?unmap_mapping_range. But the dma_buf framework only backs every dma_buf fd > + ?with the anon_file struct file, i.e. all dma_bufs share the same file. > + > + ?Hence exporters need to setup their own file (and address_space) association > + ?by setting vma->vm_file and adjusting vma->vm_pgoff in the dma_buf mmap > + ?callback. In the specific case of a gem driver the exporter could use the > + ?shmem file already provided by gem (and set vm_pgoff = 0). Exporters can then > + ?zap ptes by unmapping the corresponding range of the struct address_space > + ?associated with their own file. > + > ?References: > ?[1] struct dma_buf_ops in include/linux/dma-buf.h > ?[2] All interfaces mentioned above defined in include/linux/dma-buf.h > diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c > index 07cbbc6..7cfb405 100644 > --- a/drivers/base/dma-buf.c > +++ b/drivers/base/dma-buf.c > @@ -44,8 +44,26 @@ static int dma_buf_release(struct inode *inode, struct file *file) > ? ? ? ?return 0; > ?} > > +static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma) > +{ > + ? ? ? struct dma_buf *dmabuf; > + > + ? ? ? if (!is_dma_buf_file(file)) > + ? ? ? ? ? ? ? return -EINVAL; > + > + ? ? ? dmabuf = file->private_data; > + > + ? ? ? 
/* check for overflowing the buffer's size */ > + if (vma->vm_pgoff + ((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) > > + dmabuf->size >> PAGE_SHIFT) > + return -EINVAL; > + > + return dmabuf->ops->mmap(dmabuf, vma); > +} > + > static const struct file_operations dma_buf_fops = { > .release = dma_buf_release, > + .mmap = dma_buf_mmap_internal, > }; > > /* > @@ -82,7 +100,8 @@ struct dma_buf *dma_buf_export(void *priv, const struct dma_buf_ops *ops, > || !ops->unmap_dma_buf > || !ops->release > || !ops->kmap_atomic > - || !ops->kmap)) { > + || !ops->kmap > + || !ops->mmap)) { > return ERR_PTR(-EINVAL); > } > > @@ -406,3 +425,46 @@ void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num, > dmabuf->ops->kunmap(dmabuf, page_num, vaddr); > } > EXPORT_SYMBOL_GPL(dma_buf_kunmap); > + > + > +/** > + * dma_buf_mmap - Set up a userspace mmap with the given vma > + * @dmabuf: [in] buffer that should back the vma > + * @vma: [in] vma for the mmap > + * @pgoff: [in] offset in pages where this mmap should start within the > + * dma-buf buffer. > + * > + * This function adjusts the passed in vma so that it points at the file of the > + * dma_buf operation. It also adjusts the starting pgoff and does bounds > + * checking on the size of the vma. Then it calls the exporter's mmap function to > + * set up the mapping. > + * > + * Can return negative error values, returns 0 on success. > + */ > +int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma, > + unsigned long pgoff) > +{ > + if (WARN_ON(!dmabuf || !vma)) > + return -EINVAL; > + > + /* check for offset overflow */ > +
if (pgoff + ((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) < pgoff) > + return -EOVERFLOW; > + > + /* check for overflowing the buffer's size */ > + if (pgoff + ((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) > > + dmabuf->size >> PAGE_SHIFT) > + return -EINVAL; > + > + /* readjust the vma */ > + if (vma->vm_file) > + fput(vma->vm_file); > + > + vma->vm_file = dmabuf->file; > + get_file(vma->vm_file); > + > + vma->vm_pgoff = pgoff; > + > + return dmabuf->ops->mmap(dmabuf, vma); > +} > +EXPORT_SYMBOL_GPL(dma_buf_mmap); > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index 3efbfc2..1f78d15 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -61,6 +61,10 @@ struct dma_buf_attachment; > * This Callback must not sleep. > * @kmap: maps a page from the buffer into kernel address space. > * @kunmap: [optional] unmaps a page from the buffer. > + * @mmap: used to expose the backing storage to userspace. Note that the > + * mapping needs to be coherent - if the exporter doesn't directly > + * support this, it needs to fake coherency by shooting down any ptes > + * when transitioning away from the cpu domain. > */ > struct dma_buf_ops { > int (*attach)(struct dma_buf *, struct device *, > @@ -92,6 +96,8 @@ struct dma_buf_ops { > void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *); > void *(*kmap)(struct dma_buf *, unsigned long); > void (*kunmap)(struct dma_buf *, unsigned long, void *); > + > +
int (*mmap)(struct dma_buf *, struct vm_area_struct *vma); > }; > > /** > @@ -167,6 +173,9 @@ void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long); > void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *); > void *dma_buf_kmap(struct dma_buf *, unsigned long); > void dma_buf_kunmap(struct dma_buf *, unsigned long, void *); > + > +int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *, > + unsigned long); > #else > > static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > @@ -248,6 +257,13 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, > unsigned long pnum, void *vaddr) > { > } > + > +static inline int dma_buf_mmap(struct dma_buf *dmabuf, > + struct vm_area_struct *vma, > + unsigned long pgoff) > +{ > + return -ENODEV; > +} > #endif /* CONFIG_DMA_SHARED_BUFFER */ > > #endif /* __DMA_BUF_H__ */ > -- > 1.7.10 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-media" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html From lauraa at codeaurora.org Fri May 11 21:51:24 2012 From: lauraa at codeaurora.org (Laura Abbott) Date: Fri, 11 May 2012 14:51:24 -0700 Subject: [Linaro-mm-sig] Bad use of highmem with buffer_migrate_page? In-Reply-To: <02fc01cd2f50$5d77e4c0$1867ae40$%szyprowski@samsung.com> References: <4FAC200D.2080306@codeaurora.org> <02fc01cd2f50$5d77e4c0$1867ae40$%szyprowski@samsung.com> Message-ID: <4FAD89DC.2090307@codeaurora.org> On 5/11/2012 1:30 AM, Marek Szyprowski wrote: > Hello, > > On Thursday, May 10, 2012 10:08 PM Laura Abbott wrote: > >> I did a backport of the Contiguous Memory Allocator to a 3.0.8 tree. I >> wrote a fairly simple test case that, in 1MB chunks, allocs up to 40MB >> from a reserved area, maps, writes, unmaps and then frees in an infinite >> loop.
When running this with another program in parallel to put some >> stress on the filesystem, I hit data aborts in the filesystem/journal >> layer, although not always the same backtrace. As an example: >> >> [] (__ext4_check_dir_entry+0x20/0x184) from [] >> (add_dirent_to_buf+0x70/0x2ac) >> [] (add_dirent_to_buf+0x70/0x2ac) from [] >> (ext4_add_entry+0xd8/0x4bc) >> [] (ext4_add_entry+0xd8/0x4bc) from [] >> (ext4_add_nondir+0x14/0x64) >> [] (ext4_add_nondir+0x14/0x64) from [] >> (ext4_create+0xd8/0x120) >> [] (ext4_create+0xd8/0x120) from [] >> (vfs_create+0x74/0xa4) >> [] (vfs_create+0x74/0xa4) from [] (do_last+0x588/0x8d4) >> [] (do_last+0x588/0x8d4) from [] >> (path_openat+0xc4/0x394) >> [] (path_openat+0xc4/0x394) from [] >> (do_filp_open+0x30/0x7c) >> [] (do_filp_open+0x30/0x7c) from [] >> (do_sys_open+0xd8/0x174) >> [] (do_sys_open+0xd8/0x174) from [] >> (ret_fast_syscall+0x0/0x30) >> >> Every panic had the same issue where a struct buffer_head [1] had a >> b_data that was unexpectedly NULL. >> >> During the course of CMA, buffer_migrate_page could be called to migrate >> from a CMA page to a new page. buffer_migrate_page calls set_bh_page[2] >> to set the new page for the buffer_head. If the new page is a highmem >> page though, the bh->b_data ends up as NULL, which could produce the >> panics seen above. >> >> This seems to indicate that highmem pages are not appropriate for >> use as pages to migrate to. The following made the problem go away for me: >> >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -5753,7 +5753,7 @@ static struct page * >> __alloc_contig_migrate_alloc(struct page *page, unsigned long private, >> int **resultp) >> { >> - return alloc_page(GFP_HIGHUSER_MOVABLE); >> + return alloc_page(GFP_USER | __GFP_MOVABLE); >> } >> >> >> Does this seem like an actual issue or is this an artifact of my >> backport to 3.0? I'm not familiar enough with the filesystem layer to be >> able to tell where highmem can actually be used.
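The failure mode reported above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `model_page`, `model_bh` and `model_set_bh_page` are hypothetical stand-ins, not the kernel's struct page, struct buffer_head or set_bh_page(). The one behaviour taken from the report is that a page without a permanent kernel mapping (highmem) leaves the buffer head with nothing but the bare offset as its data pointer, which is NULL for offset 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical stand-in for struct page. A lowmem page has a permanent
 * kernel virtual address, a highmem page does not. */
struct model_page {
	int   highmem;	/* non-zero: no permanent kernel mapping */
	char *vaddr;	/* kernel address of the page, lowmem only */
};

/* Hypothetical stand-in for struct buffer_head. */
struct model_bh {
	struct model_page *page;
	char              *data;
};

/* Point a buffer head at a (possibly new) page, as happens when the
 * buffer's page is migrated. */
void model_set_bh_page(struct model_bh *bh, struct model_page *page,
		       unsigned long offset)
{
	assert(offset < MODEL_PAGE_SIZE);
	bh->page = page;
	if (page->highmem)
		/* Only the offset survives; offset 0 yields NULL. */
		bh->data = (char *)(uintptr_t)offset;
	else
		bh->data = page->vaddr + offset;
}
```

Any code that later dereferences the data pointer of such a buffer head without mapping the page first would fault, which would be consistent with the data aborts in the backtrace.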
> > I will need to investigate this further as this issue doesn't appear on > v3.3+ kernels, but I remember I saw something similar when I tried CMA > backported to v3.0. > The 3.0 kernel was the most stable I had around and easiest to work with. I'll be trying 3.4 sometime in the near future. > You have pointed to an important issue which we need to solve somehow. CMA > wasn't fully tested with high mem and it looks that there are some issues here > and there. Your patch will prevent using highmem for any migration triggered by > CMA. IMHO this is a bit limited, but right now I have no better idea. For a > quick backport it should be ok. > All the systems I'll be testing with will have highmem so I'll continue to report any problems I find. I'll be curious if this will show up again when I try 3.4. If it does, I'll see if I can get evidence of why this is the proper fix or a more nuanced patch. > Best regards Thanks, Laura -- Sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum. From abhinav.k at samsung.com Sat May 12 08:41:42 2012 From: abhinav.k at samsung.com (Abhinav) Date: Sat, 12 May 2012 14:11:42 +0530 Subject: [Linaro-mm-sig] [PATCH 1/3] [RFC]:DMA-MAPPING:Add a new DMA attribute to avoid Kernel virtual memory allocation Message-ID: <1336812102-9975-1-git-send-email-abhinav.k@samsung.com> This dma attribute can be used to pass to the iommu ops of dma-mapping framework to differentiate between kernel and user space allocations. 
Signed-off-by: Abhinav --- include/linux/dma-attrs.h | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h index 547ab56..861df09 100644 --- a/include/linux/dma-attrs.h +++ b/include/linux/dma-attrs.h @@ -15,6 +15,7 @@ enum dma_attr { DMA_ATTR_WEAK_ORDERING, DMA_ATTR_WRITE_COMBINE, DMA_ATTR_NON_CONSISTENT, + DMA_ATTR_USER_SPACE, DMA_ATTR_MAX, }; -- 1.7.0.4 From abhinav.k at samsung.com Sat May 12 08:41:58 2012 From: abhinav.k at samsung.com (Abhinav) Date: Sat, 12 May 2012 14:11:58 +0530 Subject: [Linaro-mm-sig] [PATCH 2/3] [RFC]:DMA-MAPPING:Define new wrapper functions to pass new dma attribute to IOMMU ops Message-ID: <1336812118-10024-1-git-send-email-abhinav.k@samsung.com> With these new functions the drivers can pass the new dma attribute to IOMMU ops of dma mapping framework to differentiate between kernel and user space allocations. Signed-off-by: Abhinav --- arch/arm/include/asm/dma-mapping.h | 30 ++++++++++++++++++++++++++++++ 1 files changed, 30 insertions(+), 0 deletions(-) diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index bbef15d..7fc003a 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -14,6 +14,12 @@ #define DMA_ERROR_CODE (~0) extern struct dma_map_ops arm_dma_ops; +struct page_infodma { + struct page **pages; + unsigned long nr_pages; + unsigned long shared; +}; + static inline struct dma_map_ops *get_dma_ops(struct device *dev) { if (dev && dev->archdata.dma_ops) @@ -205,6 +211,14 @@ static inline void *dma_alloc_writecombine(struct device *dev, size_t size, return dma_alloc_attrs(dev, size, dma_handle, flag, &attrs); } +static inline void *dma_alloc_writecombine_user(struct device *dev, size_t size, + dma_addr_t *dma_handle, gfp_t flag) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_USER_SPACE, &attrs); + return dma_alloc_attrs(dev, size, dma_handle, flag, &attrs); +} + static inline void 
dma_free_writecombine(struct device *dev, size_t size, void *cpu_addr, dma_addr_t dma_handle) { @@ -213,6 +227,14 @@ static inline void dma_free_writecombine(struct device *dev, size_t size, return dma_free_attrs(dev, size, cpu_addr, dma_handle, &attrs); } +static inline void dma_free_writecombine_user(struct device *dev, size_t size, + void *cpu_addr, dma_addr_t dma_handle) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_USER_SPACE, &attrs); + return dma_free_attrs(dev, size, cpu_addr, dma_handle, &attrs); +} + static inline int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma, void *cpu_addr, dma_addr_t dma_addr, size_t size) { @@ -221,6 +243,14 @@ static inline int dma_mmap_writecombine(struct device *dev, struct vm_area_struc return dma_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, &attrs); } +static inline int dma_mmap_writecombine_user(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_USER_SPACE, &attrs); + return dma_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, &attrs); +} + /* * This can be called during boot to increase the size of the consistent * DMA region above it's default value of 2MB. It must be called before the -- 1.7.0.4 From abhinav.k at samsung.com Sat May 12 08:42:10 2012 From: abhinav.k at samsung.com (Abhinav) Date: Sat, 12 May 2012 14:12:10 +0530 Subject: [Linaro-mm-sig] [PATCH 3/3] [RFC]:DMA-MAPPING:Add check inside IOMMU ops for kernel or user space allcoation Message-ID: <1336812130-10132-1-git-send-email-abhinav.k@samsung.com> With this we can do a run time check on the allocation type for either kernel or user using the dma attribute passed to dma-mapping iommu ops. 
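As a rough illustration of the run-time check described here, the following userspace sketch branches on an attribute bit and, for the user space case, returns only a small descriptor instead of a kernel mapping. All names (`model_attrs`, `model_page_info`, `model_iommu_alloc`, the `MODEL_*` constants) are hypothetical and chosen for this example; heap memory stands in for the kernel mapping, and sizing the descriptor with `sizeof(*info)` is an assumption of the sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical attribute numbers, loosely modelled on enum dma_attr. */
enum model_dma_attr {
	MODEL_ATTR_WRITE_COMBINE,
	MODEL_ATTR_USER_SPACE,
	MODEL_ATTR_MAX
};

struct model_attrs {
	unsigned long flags;	/* one bit per attribute */
};

void model_set_attr(enum model_dma_attr attr, struct model_attrs *attrs)
{
	assert(attr < MODEL_ATTR_MAX);
	attrs->flags |= 1UL << attr;
}

/* A NULL attrs pointer means "no attributes set". */
int model_get_attr(enum model_dma_attr attr, const struct model_attrs *attrs)
{
	return attrs != NULL && ((attrs->flags >> attr) & 1UL) != 0;
}

/* Descriptor handed back for user space allocations. */
struct model_page_info {
	void          **pages;
	unsigned long   nr_pages;
	unsigned long   shared;
};

/* With the user space attribute set, allocate only the descriptor and
 * skip the kernel mapping; otherwise return zeroed, mapped memory. */
void *model_iommu_alloc(size_t size, const struct model_attrs *attrs)
{
	unsigned long nr_pages = (size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;

	if (model_get_attr(MODEL_ATTR_USER_SPACE, attrs)) {
		/* Size by the object, not by a pointer to it. */
		struct model_page_info *info = calloc(1, sizeof(*info));

		if (!info)
			return NULL;
		info->nr_pages = nr_pages;
		return info;
	}
	return calloc(nr_pages, MODEL_PAGE_SIZE);
}
```

The caller has to remember which kind of cookie it received, so the matching free path needs to test the same attribute before interpreting the pointer.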
Signed-off-by: Abhinav --- arch/arm/mm/dma-mapping.c | 88 +++++++++++++++++++++++++++++++++++---------- 1 files changed, 69 insertions(+), 19 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 2c5a285..4cd46b4 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -428,6 +428,7 @@ static void __dma_free_remap(void *cpu_addr, size_t size) arm_vmregion_free(&consistent_head, c); } + #else /* !CONFIG_MMU */ #define __dma_alloc_remap(page, size, gfp, prot, c) page_address(page) @@ -894,6 +895,35 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) size_t align; size_t count = size >> PAGE_SHIFT; int bit; + unsigned long mem_type = (unsigned long)gfp; + + + if(mem_type){ + + struct page_infodma *pages_in; + + pages_in = kzalloc( sizeof(struct page_infodma*), GFP_KERNEL); + if(!pages_in) + return NULL; + + pages_in->nr_pages = count; + + return (void*)pages_in; + + } + + /* + * Align the virtual region allocation - maximum alignment is + * a section size, minimum is a page size. This helps reduce + * fragmentation of the DMA space, and also prevents allocations + * smaller than a section from crossing a section boundary. + */ + + bit = fls(size - 1); + if (bit > SECTION_SHIFT) + bit = SECTION_SHIFT; + align = 1 << bit; + if (!consistent_pte[0]) { pr_err("%s: not initialised\n", __func__); @@ -901,16 +931,6 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) return NULL; } - /* - * Align the virtual region allocation - maximum alignment is - * a section size, minimum is a page size. This helps reduce - * fragmentation of the DMA space, and also prevents allocations - * smaller than a section from crossing a section boundary. - */ - bit = fls(size - 1); - if (bit > SECTION_SHIFT) - bit = SECTION_SHIFT; - align = 1 << bit; /* * Allocate a virtual address in the consistent mapping region. 
@@ -946,6 +966,7 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) return NULL; } + /* * Create a mapping in device IO address space for specified pages */ @@ -973,13 +994,16 @@ __iommu_create_mapping(struct device *dev, struct page **pages, size_t size) len = (j - i) << PAGE_SHIFT; ret = iommu_map(mapping->domain, iova, phys, len, 0); + if (ret < 0) goto fail; + iova += len; i = j; } return dma_addr; fail: + iommu_unmap(mapping->domain, dma_addr, iova-dma_addr); __free_iova(mapping, dma_addr, size); return DMA_ERROR_CODE; @@ -1007,6 +1031,8 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); struct page **pages; void *addr = NULL; + struct page_infodma *page_ret; + unsigned long mem_type; *handle = DMA_ERROR_CODE; size = PAGE_ALIGN(size); @@ -1019,11 +1045,19 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, if (*handle == DMA_ERROR_CODE) goto err_buffer; - addr = __iommu_alloc_remap(pages, size, gfp, prot); + mem_type = dma_get_attr(DMA_ATTR_USER_SPACE, attrs); + + addr = __iommu_alloc_remap(pages, size, mem_type, prot); if (!addr) goto err_mapping; - return addr; + if(mem_type){ + page_ret = (struct page_infodma *)addr; + page_ret->pages = pages; + return page_ret; + } + else + return addr; err_mapping: __iommu_remove_mapping(dev, *handle, size); @@ -1071,18 +1105,34 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, struct dma_attrs *attrs) { - struct arm_vmregion *c; + + unsigned long mem_type = dma_get_attr(DMA_ATTR_USER_SPACE, attrs); + size = PAGE_ALIGN(size); - c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); - if (c) { - struct page **pages = c->priv; - __dma_free_remap(cpu_addr, size); - __iommu_remove_mapping(dev, handle, size); - __iommu_free_buffer(dev, pages, size); + + if(mem_type){ 
+ + struct page_infodma *pagesin = cpu_addr; + if (pagesin) { + struct page **pages = pagesin->pages; + __iommu_remove_mapping(dev, handle, size); + __iommu_free_buffer(dev, pages, size); + } + } + else{ + struct arm_vmregion *c; + c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); + if (c) { + struct page **pages = c->priv; + __dma_free_remap(cpu_addr, size); + __iommu_remove_mapping(dev, handle, size); + __iommu_free_buffer(dev, pages, size); + } } } + /* * Map a part of the scatter-gather list into contiguous io address space */ -- 1.7.0.4 From vitalya at ti.com Mon May 14 17:49:56 2012 From: vitalya at ti.com (Vitaly Andrianov) Date: Mon, 14 May 2012 13:49:56 -0400 Subject: [Linaro-mm-sig] [PATCH] ARM: dma-mapping: use PMD size for section unmap Message-ID: <1337017796-27267-1-git-send-email-vitalya@ti.com> The dma_contiguous_remap() function clears existing section maps using the wrong size (PGDIR_SIZE instead of PMD_SIZE). This is a bug which does not affect non-LPAE systems, where PGDIR_SIZE and PMD_SIZE are the same. On LPAE systems, however, this bug causes the kernel to hang at this point. This fix has been tested on both LPAE and non-LPAE kernel builds. 
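To make the size mismatch concrete, here is a small userspace sketch that counts how many section entries a clearing loop visits for a given stride. The constants are assumptions for illustration (2 MiB for PMD_SIZE in both configurations, 2 MiB for PGDIR_SIZE without LPAE, 1 GiB with LPAE), and `model_entries_visited` is a hypothetical helper, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes, assumed for this sketch. */
#define MODEL_PMD_SIZE           (UINT64_C(1) << 21)	/* 2 MiB */
#define MODEL_PGDIR_SIZE_CLASSIC (UINT64_C(1) << 21)	/* 2 MiB, non-LPAE */
#define MODEL_PGDIR_SIZE_LPAE    (UINT64_C(1) << 30)	/* 1 GiB, LPAE */

/* Count the entries a loop of the form
 *   for (addr = start; addr < end; addr += stride) clear(addr);
 * touches while walking [start, end). */
uint64_t model_entries_visited(uint64_t start, uint64_t end, uint64_t stride)
{
	uint64_t addr, visited = 0;

	assert(stride != 0);
	for (addr = start; addr < end; addr += stride)
		visited++;
	return visited;
}
```

For a 16 MiB region, a 2 MiB stride visits all eight section entries, while a 1 GiB stride visits only the first one and leaves the other seven untouched.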
Signed-off-by: Vitaly Andrianov --- arch/arm/mm/dma-mapping.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 446dc1b..d220d4f 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -414,7 +414,7 @@ void __init dma_contiguous_remap(void) * Clear previous low-memory mapping */ for (addr = __phys_to_virt(start); addr < __phys_to_virt(end); - addr += PGDIR_SIZE) + addr += PMD_SIZE) pmd_clear(pmd_off_k(addr)); iotable_init(&map, 1); -- 1.7.5.4 From vitalya at ti.com Mon May 14 17:54:57 2012 From: vitalya at ti.com (Vitaly Andrianov) Date: Mon, 14 May 2012 13:54:57 -0400 Subject: [Linaro-mm-sig] [PATCH] ARM: LPAE: fix access flag setup in mem_type_table Message-ID: <1337018097-27308-1-git-send-email-vitalya@ti.com> A zero value for prot_sect in the memory types table implies that section mappings should never be created for the memory type in question. This is checked for in alloc_init_section(). With LPAE, we set a bit to mask access flag faults for kernel mappings. This breaks the aforementioned (!prot_sect) check in alloc_init_section(). This patch fixes this bug by first checking for a non-zero prot_sect before setting the PMD_SECT_AF flag. 
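The interaction between the two checks can be sketched in userspace as follows. `model_mem_type`, `model_set_access_flag` and `model_may_use_sections` are hypothetical names, and the two `MODEL_*_AF` bit positions are placeholders picked for the example rather than the real hardware definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit positions, assumed for this sketch only. */
#define MODEL_PTE_AF  (1UL << 10)
#define MODEL_SECT_AF (1UL << 10)

struct model_mem_type {
	unsigned long prot_pte;
	unsigned long prot_sect;	/* 0 means: never use section mappings */
};

/* Set the access flag on every memory type, but keep a zero prot_sect
 * at zero so the "no section mappings" marker is preserved. */
void model_set_access_flag(struct model_mem_type *types, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		types[i].prot_pte |= MODEL_PTE_AF;
		if (types[i].prot_sect)
			types[i].prot_sect |= MODEL_SECT_AF;
	}
}

/* The kind of test a mapping routine would apply before creating
 * a section mapping. */
int model_may_use_sections(const struct model_mem_type *type)
{
	return type->prot_sect != 0;
}
```

Without the inner `if`, the first entry below would end up with a non-zero `prot_sect` and would wrongly pass the section mapping test.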
Signed-off-by: Vitaly Andrianov Acked-by: Catalin Marinas --- arch/arm/mm/mmu.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index b9fbec2..1e16b20 100644 --- a/arch/arm/mm/mmu.c +++ b/arch/arm/mm/mmu.c @@ -494,7 +494,8 @@ static void __init build_mem_type_table(void) */ for (i = 0; i < ARRAY_SIZE(mem_types); i++) { mem_types[i].prot_pte |= PTE_EXT_AF; - mem_types[i].prot_sect |= PMD_SECT_AF; + if (mem_types[i].prot_sect) + mem_types[i].prot_sect |= PMD_SECT_AF; } kern_pgprot |= PTE_EXT_AF; vecs_pgprot |= PTE_EXT_AF; -- 1.7.5.4 From vitalya at ti.com Mon May 14 17:56:29 2012 From: vitalya at ti.com (Vitaly Andrianov) Date: Mon, 14 May 2012 13:56:29 -0400 Subject: [Linaro-mm-sig] [PATCH] ARM: LPAE: fix access flag setup in mem_type_table Message-ID: <1337018189-27356-1-git-send-email-vitalya@ti.com> A zero value for prot_sect in the memory types table implies that section mappings should never be created for the memory type in question. This is checked for in alloc_init_section(). With LPAE, we set a bit to mask access flag faults for kernel mappings. This breaks the aforementioned (!prot_sect) check in alloc_init_section(). This patch fixes this bug by first checking for a non-zero prot_sect before setting the PMD_SECT_AF flag. 
Signed-off-by: Vitaly Andrianov Acked-by: Catalin Marinas --- arch/arm/mm/mmu.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index b9fbec2..1e16b20 100644 --- a/arch/arm/mm/mmu.c +++ b/arch/arm/mm/mmu.c @@ -494,7 +494,8 @@ static void __init build_mem_type_table(void) */ for (i = 0; i < ARRAY_SIZE(mem_types); i++) { mem_types[i].prot_pte |= PTE_EXT_AF; - mem_types[i].prot_sect |= PMD_SECT_AF; + if (mem_types[i].prot_sect) + mem_types[i].prot_sect |= PMD_SECT_AF; } kern_pgprot |= PTE_EXT_AF; vecs_pgprot |= PTE_EXT_AF; -- 1.7.5.4 From linux at arm.linux.org.uk Tue May 15 12:22:16 2012 From: linux at arm.linux.org.uk (Russell King - ARM Linux) Date: Tue, 15 May 2012 13:22:16 +0100 Subject: [Linaro-mm-sig] [PATCH] ARM: LPAE: fix access flag setup in mem_type_table In-Reply-To: <1337018097-27308-1-git-send-email-vitalya@ti.com> References: <1337018097-27308-1-git-send-email-vitalya@ti.com> Message-ID: <20120515122215.GK10453@n2100.arm.linux.org.uk> On Mon, May 14, 2012 at 01:54:57PM -0400, Vitaly Andrianov wrote: > A zero value for prot_sect in the memory types table implies that > section mappings should never be created for the memory type in question. > This is checked for in alloc_init_section(). > > With LPAE, we set a bit to mask access flag faults for kernel mappings. > This breaks the aforementioned (!prot_sect) check in alloc_init_section(). > > This patch fixes this bug by first checking for a non-zero > prot_sect before setting the PMD_SECT_AF flag. > > Signed-off-by: Vitaly Andrianov > > Acked-by: Catalin Marinas Please put this in the patch system. No blank line is needed between s-off-by and acked-by. 
From linux at arm.linux.org.uk Tue May 15 12:22:36 2012 From: linux at arm.linux.org.uk (Russell King - ARM Linux) Date: Tue, 15 May 2012 13:22:36 +0100 Subject: [Linaro-mm-sig] [PATCH] ARM: LPAE: fix access flag setup in mem_type_table In-Reply-To: <1337018189-27356-1-git-send-email-vitalya@ti.com> References: <1337018189-27356-1-git-send-email-vitalya@ti.com> Message-ID: <20120515122236.GL10453@n2100.arm.linux.org.uk> On Mon, May 14, 2012 at 01:56:29PM -0400, Vitaly Andrianov wrote: > A zero value for prot_sect in the memory types table implies that > section mappings should never be created for the memory type in question. > This is checked for in alloc_init_section(). > > With LPAE, we set a bit to mask access flag faults for kernel mappings. > This breaks the aforementioned (!prot_sect) check in alloc_init_section(). > > This patch fixes this bug by first checking for a non-zero > prot_sect before setting the PMD_SECT_AF flag. > > Signed-off-by: Vitaly Andrianov > > Acked-by: Catalin Marinas Same comments as for the previous patch. From m.szyprowski at samsung.com Tue May 15 20:51:27 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Tue, 15 May 2012 22:51:27 +0200 Subject: [Linaro-mm-sig] [PATCH] ARM: LPAE: fix access flag setup in mem_type_table In-Reply-To: <20120515122215.GK10453@n2100.arm.linux.org.uk> References: <1337018097-27308-1-git-send-email-vitalya@ti.com> <20120515122215.GK10453@n2100.arm.linux.org.uk> Message-ID: <4FB2C1CF.2030208@samsung.com> Hello, On 5/15/2012 2:22 PM, Russell King - ARM Linux wrote: > On Mon, May 14, 2012 at 01:54:57PM -0400, Vitaly Andrianov wrote: >> A zero value for prot_sect in the memory types table implies that >> section mappings should never be created for the memory type in question. >> This is checked for in alloc_init_section(). >> >> With LPAE, we set a bit to mask access flag faults for kernel mappings. >> This breaks the aforementioned (!prot_sect) check in alloc_init_section(). 
>> >> This patch fixes this bug by first checking for a non-zero >> prot_sect before setting the PMD_SECT_AF flag. >> >> Signed-off-by: Vitaly Andrianov >> >> Acked-by: Catalin Marinas > > Please put this in the patch system. No blank line is needed between > s-off-by and acked-by. This patch fixes the issue introduced by adding CMA to ARM architecture which I've pushed for testing to linux-next. I've added it to my for-next-cma branch: http://git.linaro.org/gitweb?p=people/mszyprowski/linux-dma-mapping.git;a=shortlog;h=refs/heads/for-next-cma Best regards -- Marek Szyprowski Samsung Poland R&D Center From airlied at gmail.com Thu May 17 10:31:50 2012 From: airlied at gmail.com (Dave Airlie) Date: Thu, 17 May 2012 11:31:50 +0100 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add vmap interface (v2) Message-ID: <1337250710-31113-1-git-send-email-airlied@gmail.com> From: Dave Airlie The main requirement I have for this interface is for scanning out using the USB gpu devices. Since these devices have to read the framebuffer on updates and linearly compress it, using kmaps is a major overhead for every update. v2: fix warn issues pointed out by Sylwester Nawrocki. Signed-off-by: Dave Airlie --- drivers/base/dma-buf.c | 34 ++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 14 ++++++++++++++ 2 files changed, 48 insertions(+), 0 deletions(-) diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c index 07cbbc6..750f92c 100644 --- a/drivers/base/dma-buf.c +++ b/drivers/base/dma-buf.c @@ -406,3 +406,37 @@ void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num, dmabuf->ops->kunmap(dmabuf, page_num, vaddr); } EXPORT_SYMBOL_GPL(dma_buf_kunmap); + +/** + * dma_buf_vmap - Create virtual mapping for the buffer object into kernel address space. The same restrictions as for vmap and friends apply. + * @dma_buf: [in] buffer to vmap + * + * This call may fail due to lack of virtual mapping address space. + * These calls are optional in drivers. 
The intended use for them + * is for mapping objects linear in kernel space for high use objects. + * Please attempt to use kmap/kunmap before thinking about these interfaces. + */ +void *dma_buf_vmap(struct dma_buf *dmabuf) +{ + if (WARN_ON(!dmabuf)) + return NULL; + + if (dmabuf->ops->vmap) + return dmabuf->ops->vmap(dmabuf); + return NULL; +} +EXPORT_SYMBOL(dma_buf_vmap); + +/** + * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. + * @dma_buf: [in] buffer to vunmap + */ +void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) +{ + if (WARN_ON(!dmabuf)) + return; + + if (dmabuf->ops->vunmap) + dmabuf->ops->vunmap(dmabuf, vaddr); +} +EXPORT_SYMBOL(dma_buf_vunmap); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 3efbfc2..b92b6de 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -92,6 +92,9 @@ struct dma_buf_ops { void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *); void *(*kmap)(struct dma_buf *, unsigned long); void (*kunmap)(struct dma_buf *, unsigned long, void *); + + void *(*vmap)(struct dma_buf *); + void (*vunmap)(struct dma_buf *, void *vaddr); }; /** @@ -167,6 +170,9 @@ void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long); void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *); void *dma_buf_kmap(struct dma_buf *, unsigned long); void dma_buf_kunmap(struct dma_buf *, unsigned long, void *); + +void *dma_buf_vmap(struct dma_buf *); +void dma_buf_vunmap(struct dma_buf *, void *vaddr); #else static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, @@ -248,6 +254,15 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long pnum, void *vaddr) { } + +static inline void *dma_buf_vmap(struct dma_buf *dmabuf) +{ + return NULL; +} + +static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) +{ +} #endif /* CONFIG_DMA_SHARED_BUFFER */ #endif /* __DMA_BUF_H__ */ -- 1.7.6 From m.szyprowski at samsung.com Thu May 17 10:54:41 2012 From: m.szyprowski at samsung.com
(Marek Szyprowski) Date: Thu, 17 May 2012 12:54:41 +0200 Subject: [Linaro-mm-sig] [PATCHv2 0/4] ARM: replace custom consistent dma region with vmalloc Message-ID: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> Hello! Recent changes to ioremap and the unification of vmalloc regions on ARM significantly reduce the possible size of the consistent dma region, which in turn significantly limits the allowed dma coherent/writecombine allocations. This experimental patchset replaces the custom consistent dma region in the dma-mapping framework with generic vmalloc areas created on demand for each coherent and writecombine allocation. The main purpose of this patchset is to remove the 2MiB limit on dma coherent/writecombine allocations. Atomic allocations are served from a special pool preallocated at boot, because vmalloc areas cannot be reliably created in atomic context. This patchset is based on the vanilla v3.4-rc7 release. Atomic allocations have been tested with the s3c-sdhci driver on a Samsung UniversalC210 board with dmabounce code enabled to force dma_alloc_coherent() use on each dma_map_* call (some of them are made from interrupts).
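The idea of serving atomic requests from a preallocated pool can be sketched in userspace as below. This is a minimal model under assumptions, not the code from this series: `model_pool`, `model_pool_alloc` and `model_pool_free` are hypothetical names, the pool size is arbitrary, and a simple per-page "used" flag array stands in for whatever bookkeeping the real pool uses. The point it shows is that allocation only scans and marks existing bookkeeping, so it never has to create a new mapping.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_POOL_PAGES 64
#define MODEL_PAGE_SIZE  4096

struct model_pool {
	unsigned char used[MODEL_POOL_PAGES];	/* one flag per page */
	char *base;				/* memory set up once, at "boot" */
};

/* Hand out `count` consecutive free pages from the pool, or NULL when
 * no such run exists. Nothing is mapped or allocated here. */
void *model_pool_alloc(struct model_pool *pool, size_t count)
{
	size_t start, i;

	if (count == 0 || count > MODEL_POOL_PAGES)
		return NULL;

	for (start = 0; start + count <= MODEL_POOL_PAGES; start++) {
		for (i = 0; i < count && !pool->used[start + i]; i++)
			;
		if (i == count) {
			for (i = 0; i < count; i++)
				pool->used[start + i] = 1;
			return pool->base + start * MODEL_PAGE_SIZE;
		}
	}
	return NULL;
}

/* Return `count` pages starting at `addr` to the pool. */
void model_pool_free(struct model_pool *pool, void *addr, size_t count)
{
	size_t start = (size_t)((char *)addr - pool->base) / MODEL_PAGE_SIZE;
	size_t i;

	assert(start + count <= MODEL_POOL_PAGES);
	for (i = 0; i < count; i++)
		pool->used[start + i] = 0;
}
```

The trade-off of such a design is that the pool size has to be chosen up front, so atomic allocations can fail once it is exhausted even if plenty of memory is free elsewhere.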
Best regards Marek Szyprowski Samsung Poland R&D Center Changelog: v2: - added support for atomic allocations (served from preallocated pool) - minor cleanup here and there - rebased onto v3.4-rc7 v1: http://thread.gmane.org/gmane.linux.kernel.mm/76703 - initial version Patch summary: Marek Szyprowski (4): mm: vmalloc: use const void * for caller argument mm: vmalloc: export find_vm_area() function mm: vmalloc: add VM_DMA flag to indicate areas used by dma-mapping framework ARM: dma-mapping: remove custom consistent dma region Documentation/kernel-parameters.txt | 4 + arch/arm/include/asm/dma-mapping.h | 2 +- arch/arm/mm/dma-mapping.c | 360 ++++++++++++++++------------------- include/linux/vmalloc.h | 10 +- mm/vmalloc.c | 31 ++-- 5 files changed, 185 insertions(+), 196 deletions(-) -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 10:54:42 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 12:54:42 +0200 Subject: [Linaro-mm-sig] [PATCHv2 1/4] mm: vmalloc: use const void * for caller argument In-Reply-To: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337252085-22039-2-git-send-email-m.szyprowski@samsung.com> 'const void *' is a safer type for caller function type. This patch updates all references to caller function type. 
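A short userspace sketch of why the const qualifier helps: the caller value is only recorded for later inspection and never written through, so accepting `const void *` lets callers pass pointers to read-only objects without a cast. `model_vm_struct` and `model_setup_vm` are hypothetical names used for this example only.

```c
#include <assert.h>
#include <stddef.h>

struct model_vm_struct {
	void       *addr;
	const void *caller;	/* recorded for debugging, never written through */
};

/* Record the area's address and who asked for it. Taking the caller as
 * const void * accepts both plain and const-qualified pointers. */
void model_setup_vm(struct model_vm_struct *vm, void *addr, const void *caller)
{
	vm->addr = addr;
	vm->caller = caller;
}
```

With a plain `void *` parameter, passing the address of a const object would need an explicit cast that discards the qualifier.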
Signed-off-by: Marek Szyprowski Reviewed-by: Kyungmin Park --- include/linux/vmalloc.h | 8 ++++---- mm/vmalloc.c | 18 +++++++++--------- 2 files changed, 13 insertions(+), 13 deletions(-) diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index dcdfc2b..2e28f4d 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -32,7 +32,7 @@ struct vm_struct { struct page **pages; unsigned int nr_pages; phys_addr_t phys_addr; - void *caller; + const void *caller; }; /* @@ -62,7 +62,7 @@ extern void *vmalloc_32_user(unsigned long size); extern void *__vmalloc(unsigned long size, gfp_t gfp_mask, pgprot_t prot); extern void *__vmalloc_node_range(unsigned long size, unsigned long align, unsigned long start, unsigned long end, gfp_t gfp_mask, - pgprot_t prot, int node, void *caller); + pgprot_t prot, int node, const void *caller); extern void vfree(const void *addr); extern void *vmap(struct page **pages, unsigned int count, @@ -85,13 +85,13 @@ static inline size_t get_vm_area_size(const struct vm_struct *area) extern struct vm_struct *get_vm_area(unsigned long size, unsigned long flags); extern struct vm_struct *get_vm_area_caller(unsigned long size, - unsigned long flags, void *caller); + unsigned long flags, const void *caller); extern struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags, unsigned long start, unsigned long end); extern struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags, unsigned long start, unsigned long end, - void *caller); + const void *caller); extern struct vm_struct *remove_vm_area(const void *addr); extern int map_vm_area(struct vm_struct *area, pgprot_t prot, diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 94dff88..8bc7f3ef 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -1279,7 +1279,7 @@ DEFINE_RWLOCK(vmlist_lock); struct vm_struct *vmlist; static void setup_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va, - unsigned long flags, void *caller) + unsigned long flags, const 
void *caller) { vm->flags = flags; vm->addr = (void *)va->va_start; @@ -1305,7 +1305,7 @@ static void insert_vmalloc_vmlist(struct vm_struct *vm) } static void insert_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va, - unsigned long flags, void *caller) + unsigned long flags, const void *caller) { setup_vmalloc_vm(vm, va, flags, caller); insert_vmalloc_vmlist(vm); @@ -1313,7 +1313,7 @@ static void insert_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va, static struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long align, unsigned long flags, unsigned long start, - unsigned long end, int node, gfp_t gfp_mask, void *caller) + unsigned long end, int node, gfp_t gfp_mask, const void *caller) { struct vmap_area *va; struct vm_struct *area; @@ -1374,7 +1374,7 @@ EXPORT_SYMBOL_GPL(__get_vm_area); struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags, unsigned long start, unsigned long end, - void *caller) + const void *caller) { return __get_vm_area_node(size, 1, flags, start, end, -1, GFP_KERNEL, caller); @@ -1396,7 +1396,7 @@ struct vm_struct *get_vm_area(unsigned long size, unsigned long flags) } struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags, - void *caller) + const void *caller) { return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END, -1, GFP_KERNEL, caller); @@ -1567,9 +1567,9 @@ EXPORT_SYMBOL(vmap); static void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask, pgprot_t prot, - int node, void *caller); + int node, const void *caller); static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, - pgprot_t prot, int node, void *caller) + pgprot_t prot, int node, const void *caller) { const int order = 0; struct page **pages; @@ -1642,7 +1642,7 @@ fail: */ void *__vmalloc_node_range(unsigned long size, unsigned long align, unsigned long start, unsigned long end, gfp_t gfp_mask, - pgprot_t prot, int node, void *caller) + pgprot_t prot, int 
node, const void *caller) { struct vm_struct *area; void *addr; @@ -1698,7 +1698,7 @@ fail: */ static void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask, pgprot_t prot, - int node, void *caller) + int node, const void *caller) { return __vmalloc_node_range(size, align, VMALLOC_START, VMALLOC_END, gfp_mask, prot, node, caller); -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 10:54:43 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 12:54:43 +0200 Subject: [Linaro-mm-sig] [PATCHv2 2/4] mm: vmalloc: export find_vm_area() function In-Reply-To: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337252085-22039-3-git-send-email-m.szyprowski@samsung.com> The find_vm_area() function is useful for other core subsystems (like dma-mapping) to get access to vm_area internals. Signed-off-by: Marek Szyprowski Reviewed-by: Kyungmin Park --- include/linux/vmalloc.h | 1 + mm/vmalloc.c | 10 +++++++++- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 2e28f4d..6071e91 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -93,6 +93,7 @@ extern struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long start, unsigned long end, const void *caller); extern struct vm_struct *remove_vm_area(const void *addr); +extern struct vm_struct *find_vm_area(const void *addr); extern int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page ***pages); diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 8bc7f3ef..8cb7f22 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -1402,7 +1402,15 @@ struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags, -1, GFP_KERNEL, caller); } -static struct vm_struct *find_vm_area(const void *addr) +/** + * find_vm_area - find a continuous kernel virtual area + * @addr: base address + * + * 
Search for the kernel VM area starting at @addr, and return it. + * It is up to the caller to do all required locking to keep the returned + * pointer valid. + */ +struct vm_struct *find_vm_area(const void *addr) { struct vmap_area *va; -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 10:54:44 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 12:54:44 +0200 Subject: [Linaro-mm-sig] [PATCHv2 3/4] mm: vmalloc: add VM_DMA flag to indicate areas used by dma-mapping framework In-Reply-To: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337252085-22039-4-git-send-email-m.szyprowski@samsung.com> Add a new type of vm_area intended to be used for consistent mappings created by the dma-mapping framework. Signed-off-by: Marek Szyprowski Reviewed-by: Kyungmin Park --- include/linux/vmalloc.h | 1 + mm/vmalloc.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 6071e91..8a9555a 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -14,6 +14,7 @@ struct vm_area_struct; /* vma defining user mapping in mm_types.h */ #define VM_USERMAP 0x00000008 /* suitable for remap_vmalloc_range */ #define VM_VPAGES 0x00000010 /* buffer for pages was vmalloc'ed */ #define VM_UNLIST 0x00000020 /* vm_struct is not listed in vmlist */ +#define VM_DMA 0x00000040 /* used by dma-mapping framework */ /* bits [20..32] reserved for arch specific ioremap internals */ /* diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 8cb7f22..9c13bab 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2582,6 +2582,9 @@ static int s_show(struct seq_file *m, void *p) if (v->flags & VM_IOREMAP) seq_printf(m, " ioremap"); + if (v->flags & VM_DMA) + seq_printf(m, " dma"); + if (v->flags & VM_ALLOC) seq_printf(m, " vmalloc"); -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 10:54:45 2012 From: m.szyprowski at samsung.com 
(Marek Szyprowski) Date: Thu, 17 May 2012 12:54:45 +0200 Subject: [Linaro-mm-sig] [PATCHv2 4/4] ARM: dma-mapping: remove custom consistent dma region In-Reply-To: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337252085-22039-5-git-send-email-m.szyprowski@samsung.com> This patch changes the dma-mapping subsystem to use generic vmalloc areas for all consistent dma allocations. This increases the total size limit of the consistent allocations and removes platform hacks and a lot of duplicated code. Atomic allocations are served from a special pool preallocated on boot, because vmalloc areas cannot be reliably created in atomic context. Signed-off-by: Marek Szyprowski --- Documentation/kernel-parameters.txt | 4 + arch/arm/include/asm/dma-mapping.h | 2 +- arch/arm/mm/dma-mapping.c | 360 ++++++++++++++++------------------- 3 files changed, 171 insertions(+), 195 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index c1601e5..ba58f50 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -515,6 +515,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. a hypervisor. Default: yes + coherent_pool=nn[KMG] [ARM,KNL] + Sets the size of memory pool for coherent, atomic dma + allocations. + code_bytes [X86] How many bytes of object code to print in an oops report. Range: 0 - 8192 diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index cb3b7c9..92b0afb 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -210,7 +210,7 @@ int dma_mmap_writecombine(struct device *, struct vm_area_struct *, * DMA region above it's default value of 2MB. It must be called before the * memory allocator is initialised, i.e. before any core_initcall. 
*/ -extern void __init init_consistent_dma_size(unsigned long size); +static inline void init_consistent_dma_size(unsigned long size) { } #ifdef CONFIG_DMABOUNCE diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index db23ae4..3be4de2 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -19,6 +19,8 @@ #include #include #include +#include +#include #include #include @@ -119,210 +121,178 @@ static void __dma_free_buffer(struct page *page, size_t size) } #ifdef CONFIG_MMU - -#define CONSISTENT_OFFSET(x) (((unsigned long)(x) - consistent_base) >> PAGE_SHIFT) -#define CONSISTENT_PTE_INDEX(x) (((unsigned long)(x) - consistent_base) >> PMD_SHIFT) - -/* - * These are the page tables (2MB each) covering uncached, DMA consistent allocations - */ -static pte_t **consistent_pte; - -#define DEFAULT_CONSISTENT_DMA_SIZE SZ_2M - -unsigned long consistent_base = CONSISTENT_END - DEFAULT_CONSISTENT_DMA_SIZE; - -void __init init_consistent_dma_size(unsigned long size) -{ - unsigned long base = CONSISTENT_END - ALIGN(size, SZ_2M); - - BUG_ON(consistent_pte); /* Check we're called before DMA region init */ - BUG_ON(base < VMALLOC_END); - - /* Grow region to accommodate specified size */ - if (base < consistent_base) - consistent_base = base; -} - -#include "vmregion.h" - -static struct arm_vmregion_head consistent_head = { - .vm_lock = __SPIN_LOCK_UNLOCKED(&consistent_head.vm_lock), - .vm_list = LIST_HEAD_INIT(consistent_head.vm_list), - .vm_end = CONSISTENT_END, -}; - #ifdef CONFIG_HUGETLB_PAGE #error ARM Coherent DMA allocator does not (yet) support huge TLB #endif -/* - * Initialise the consistent memory allocation. 
- */ -static int __init consistent_init(void) -{ - int ret = 0; - pgd_t *pgd; - pud_t *pud; - pmd_t *pmd; - pte_t *pte; - int i = 0; - unsigned long base = consistent_base; - unsigned long num_ptes = (CONSISTENT_END - base) >> PMD_SHIFT; - - consistent_pte = kmalloc(num_ptes * sizeof(pte_t), GFP_KERNEL); - if (!consistent_pte) { - pr_err("%s: no memory\n", __func__); - return -ENOMEM; - } - - pr_debug("DMA memory: 0x%08lx - 0x%08lx:\n", base, CONSISTENT_END); - consistent_head.vm_start = base; - - do { - pgd = pgd_offset(&init_mm, base); - - pud = pud_alloc(&init_mm, pgd, base); - if (!pud) { - printk(KERN_ERR "%s: no pud tables\n", __func__); - ret = -ENOMEM; - break; - } - - pmd = pmd_alloc(&init_mm, pud, base); - if (!pmd) { - printk(KERN_ERR "%s: no pmd tables\n", __func__); - ret = -ENOMEM; - break; - } - WARN_ON(!pmd_none(*pmd)); - - pte = pte_alloc_kernel(pmd, base); - if (!pte) { - printk(KERN_ERR "%s: no pte tables\n", __func__); - ret = -ENOMEM; - break; - } - - consistent_pte[i++] = pte; - base += PMD_SIZE; - } while (base < CONSISTENT_END); - - return ret; -} - -core_initcall(consistent_init); - static void * __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, const void *caller) { - struct arm_vmregion *c; - size_t align; - int bit; + struct vm_struct *area; + unsigned long addr; - if (!consistent_pte) { - printk(KERN_ERR "%s: not initialised\n", __func__); + area = get_vm_area_caller(size, VM_DMA | VM_USERMAP, caller); + if (!area) + return NULL; + addr = (unsigned long)area->addr; + area->phys_addr = __pfn_to_phys(page_to_pfn(page)); + + if (ioremap_page_range(addr, addr + size, area->phys_addr, prot)) { + vunmap((void *)addr); + return NULL; + } + return (void *)addr; +} + +static void __dma_free_remap(void *cpu_addr, size_t size) +{ + struct vm_struct *area; + + read_lock(&vmlist_lock); + area = find_vm_area(cpu_addr); + if (!area) { + pr_err("%s: trying to free invalid coherent area: %p\n", + __func__, cpu_addr); + 
dump_stack(); + read_unlock(&vmlist_lock); + return; + } + unmap_kernel_range((unsigned long)cpu_addr, size); + read_unlock(&vmlist_lock); + vunmap(cpu_addr); +} + +struct dma_pool { + size_t size; + spinlock_t lock; + unsigned long *bitmap; + unsigned long count; + void *vaddr; + struct page *page; +}; + +static struct dma_pool atomic_pool = { + .size = SZ_256K, +}; + +static int __init early_coherent_pool(char *p) +{ + atomic_pool.size = memparse(p, &p); + return 0; +} +early_param("coherent_pool", early_coherent_pool); + +/* + * Initialise the coherent pool for atomic allocations. + */ +static int __init atomic_pool_init(void) +{ + struct dma_pool *pool = &atomic_pool; + pgprot_t prot = pgprot_dmacoherent(pgprot_kernel); + unsigned long count = pool->size >> PAGE_SHIFT; + gfp_t gfp = GFP_KERNEL | GFP_DMA; + unsigned long *bitmap; + struct page *page; + void *ptr; + int bitmap_size = BITS_TO_LONGS(count) * sizeof(long); + + bitmap = kzalloc(bitmap_size, GFP_KERNEL); + if (!bitmap) + goto no_bitmap; + + page = __dma_alloc_buffer(NULL, pool->size, gfp); + if (!page) + goto no_page; + + ptr = __dma_alloc_remap(page, pool->size, gfp, prot, NULL); + if (ptr) { + spin_lock_init(&pool->lock); + pool->vaddr = ptr; + pool->page = page; + pool->bitmap = bitmap; + pool->count = count; + pr_info("DMA: preallocated %u KiB pool for atomic coherent allocations\n", + (unsigned)pool->size / 1024); + return 0; + } + + __dma_free_buffer(page, pool->size); +no_page: + kfree(bitmap); +no_bitmap: + pr_err("DMA: failed to allocate %u KiB pool for atomic coherent allocation\n", + (unsigned)pool->size / 1024); + return -ENOMEM; +} +core_initcall(atomic_pool_init); + +static void *__alloc_from_pool(size_t size, struct page **ret_page) +{ + struct dma_pool *pool = &atomic_pool; + unsigned int count = size >> PAGE_SHIFT; + unsigned int pageno; + unsigned long flags; + void *ptr = NULL; + size_t align; + + if (!pool->vaddr) { + pr_err("%s: coherent pool not initialised!\n", __func__); 
dump_stack(); return NULL; } /* - * Align the virtual region allocation - maximum alignment is - * a section size, minimum is a page size. This helps reduce - * fragmentation of the DMA space, and also prevents allocations - * smaller than a section from crossing a section boundary. + * Align the region allocation - allocations from pool are rather + * small, so align them to their order in pages, minimum is a page + * size. This helps reduce fragmentation of the DMA space. */ - bit = fls(size - 1); - if (bit > SECTION_SHIFT) - bit = SECTION_SHIFT; - align = 1 << bit; + align = PAGE_SIZE << get_order(size); - /* - * Allocate a virtual address in the consistent mapping region. - */ - c = arm_vmregion_alloc(&consistent_head, align, size, - gfp & ~(__GFP_DMA | __GFP_HIGHMEM), caller); - if (c) { - pte_t *pte; - int idx = CONSISTENT_PTE_INDEX(c->vm_start); - u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); - - pte = consistent_pte[idx] + off; - c->vm_pages = page; - - do { - BUG_ON(!pte_none(*pte)); - - set_pte_ext(pte, mk_pte(page, prot), 0); - page++; - pte++; - off++; - if (off >= PTRS_PER_PTE) { - off = 0; - pte = consistent_pte[++idx]; - } - } while (size -= PAGE_SIZE); - - dsb(); - - return (void *)c->vm_start; + spin_lock_irqsave(&pool->lock, flags); + pageno = bitmap_find_next_zero_area(pool->bitmap, pool->count, + 0, count, (1 << align) - 1); + if (pageno < pool->count) { + bitmap_set(pool->bitmap, pageno, count); + ptr = pool->vaddr + PAGE_SIZE * pageno; + *ret_page = pool->page + pageno; } - return NULL; + spin_unlock_irqrestore(&pool->lock, flags); + + return ptr; } -static void __dma_free_remap(void *cpu_addr, size_t size) +static int __free_from_pool(void *start, size_t size) { - struct arm_vmregion *c; - unsigned long addr; - pte_t *ptep; - int idx; - u32 off; + struct dma_pool *pool = &atomic_pool; + unsigned long pageno, count; + unsigned long flags; - c = arm_vmregion_find_remove(&consistent_head, (unsigned long)cpu_addr); - if (!c) { - 
printk(KERN_ERR "%s: trying to free invalid coherent area: %p\n", - __func__, cpu_addr); + if (start < pool->vaddr || start > pool->vaddr + pool->size) + return 0; + + if (start + size > pool->vaddr + pool->size) { + pr_err("%s: freeing wrong coherent size from pool\n", __func__); dump_stack(); - return; + return 0; } - if ((c->vm_end - c->vm_start) != size) { - printk(KERN_ERR "%s: freeing wrong coherent size (%ld != %d)\n", - __func__, c->vm_end - c->vm_start, size); - dump_stack(); - size = c->vm_end - c->vm_start; - } + pageno = (start - pool->vaddr) >> PAGE_SHIFT; + count = size >> PAGE_SHIFT; - idx = CONSISTENT_PTE_INDEX(c->vm_start); - off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); - ptep = consistent_pte[idx] + off; - addr = c->vm_start; - do { - pte_t pte = ptep_get_and_clear(&init_mm, addr, ptep); + spin_lock_irqsave(&pool->lock, flags); + bitmap_clear(pool->bitmap, pageno, count); + spin_unlock_irqrestore(&pool->lock, flags); - ptep++; - addr += PAGE_SIZE; - off++; - if (off >= PTRS_PER_PTE) { - off = 0; - ptep = consistent_pte[++idx]; - } - - if (pte_none(pte) || !pte_present(pte)) - printk(KERN_CRIT "%s: bad page in kernel page table\n", - __func__); - } while (size -= PAGE_SIZE); - - flush_tlb_kernel_range(c->vm_start, c->vm_end); - - arm_vmregion_free(&consistent_head, c); + return 1; } #else /* !CONFIG_MMU */ #define __dma_alloc_remap(page, size, gfp, prot, c) page_address(page) #define __dma_free_remap(addr, size) do { } while (0) +#define __alloc_from_pool(size, ret_page) NULL +#define __free_from_pool(addr, size) 0 #endif /* CONFIG_MMU */ @@ -345,6 +315,16 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, *handle = ~0; size = PAGE_ALIGN(size); + /* + * Atomic allocations need special handling + */ + if (gfp & GFP_ATOMIC && !arch_is_coherent()) { + addr = __alloc_from_pool(size, &page); + if (addr) + *handle = pfn_to_dma(dev, page_to_pfn(page)); + return addr; + } + page = __dma_alloc_buffer(dev, size, gfp); 
if (!page) return NULL; @@ -398,24 +378,16 @@ static int dma_mmap(struct device *dev, struct vm_area_struct *vma, { int ret = -ENXIO; #ifdef CONFIG_MMU - unsigned long user_size, kern_size; - struct arm_vmregion *c; + unsigned long user_count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; + unsigned long pfn = dma_to_pfn(dev, dma_addr); + unsigned long off = vma->vm_pgoff; - user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; - - c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); - if (c) { - unsigned long off = vma->vm_pgoff; - - kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT; - - if (off < kern_size && - user_size <= (kern_size - off)) { - ret = remap_pfn_range(vma, vma->vm_start, - page_to_pfn(c->vm_pages) + off, - user_size << PAGE_SHIFT, - vma->vm_page_prot); - } + if (off < count && user_count <= (count - off)) { + ret = remap_pfn_range(vma, vma->vm_start, + pfn + off, + user_count << PAGE_SHIFT, + vma->vm_page_prot); } #endif /* CONFIG_MMU */ @@ -444,13 +416,16 @@ EXPORT_SYMBOL(dma_mmap_writecombine); */ void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle) { - WARN_ON(irqs_disabled()); - if (dma_release_from_coherent(dev, get_order(size), cpu_addr)) return; size = PAGE_ALIGN(size); + if (__free_from_pool(cpu_addr, size)) + return; + + WARN_ON(irqs_disabled()); + if (!arch_is_coherent()) __dma_free_remap(cpu_addr, size); @@ -726,9 +701,6 @@ EXPORT_SYMBOL(dma_set_mask); static int __init dma_debug_do_init(void) { -#ifdef CONFIG_MMU - arm_vmregion_create_proc("dma-mappings", &consistent_head); -#endif dma_debug_init(PREALLOC_DMA_DEBUG_ENTRIES); return 0; } -- 1.7.10.1 From rob.clark at linaro.org Thu May 17 12:32:19 2012 From: rob.clark at linaro.org (Rob Clark) Date: Thu, 17 May 2012 06:32:19 -0600 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add vmap interface (v2) In-Reply-To: <1337250710-31113-1-git-send-email-airlied@gmail.com> References: 
<1337250710-31113-1-git-send-email-airlied@gmail.com> Message-ID: On Thu, May 17, 2012 at 4:31 AM, Dave Airlie wrote: > From: Dave Airlie > > The main requirement I have for this interface is for scanning out > using the USB gpu devices. Since these devices have to read the > framebuffer on updates and linearly compress it, using kmaps > is a major overhead for every update. > > v2: fix warn issues pointed out by Sylwester Nawrocki. > > Signed-off-by: Dave Airlie > --- > ?drivers/base/dma-buf.c ?| ? 34 ++++++++++++++++++++++++++++++++++ > ?include/linux/dma-buf.h | ? 14 ++++++++++++++ > ?2 files changed, 48 insertions(+), 0 deletions(-) > > diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c > index 07cbbc6..750f92c 100644 > --- a/drivers/base/dma-buf.c > +++ b/drivers/base/dma-buf.c > @@ -406,3 +406,37 @@ void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num, > ? ? ? ? ? ? ? ?dmabuf->ops->kunmap(dmabuf, page_num, vaddr); > ?} > ?EXPORT_SYMBOL_GPL(dma_buf_kunmap); > + > +/** > + * dma_buf_vmap - Create virtual mapping for the buffer object into kernel address space. The same restrictions as for vmap and friends apply. > + * @dma_buf: ? [in] ? ?buffer to vmap > + * > + * This call may fail due to lack of virtual mapping address space. > + * These calls are optional in drivers. The intended use for them > + * is for mapping objects linear in kernel space for high use objects. > + * Please attempt to use kmap/kunmap before thinking about these interfaces. > + */ > +void *dma_buf_vmap(struct dma_buf *dmabuf) > +{ > + ? ? ? if (WARN_ON(!dmabuf)) > + ? ? ? ? ? ? ? return NULL; > + > + ? ? ? if (dmabuf->ops->vmap) > + ? ? ? ? ? ? ? return dmabuf->ops->vmap(dmabuf); > + ? ? ? return NULL; > +} > +EXPORT_SYMBOL(dma_buf_vmap); > + > +/** > + * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. > + * @dma_buf: ? [in] ? ?buffer to vmap > + */ > +void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) > +{ > + ? ? ? if (WARN_ON(!dmabuf)) > + ? ? ? ? 
? ? ? return; > + > + ? ? ? if (dmabuf->ops->vunmap) > + ? ? ? ? ? ? ? dmabuf->ops->vunmap(dmabuf, vaddr); > +} > +EXPORT_SYMBOL(dma_buf_vunmap); > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index 3efbfc2..b92b6de 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -92,6 +92,9 @@ struct dma_buf_ops { > ? ? ? ?void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *); > ? ? ? ?void *(*kmap)(struct dma_buf *, unsigned long); > ? ? ? ?void (*kunmap)(struct dma_buf *, unsigned long, void *); > + > + ? ? ? void *(*vmap)(struct dma_buf *); > + ? ? ? void (*vunmap)(struct dma_buf *, void *vaddr); > ?}; > > ?/** > @@ -167,6 +170,9 @@ void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long); > ?void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *); > ?void *dma_buf_kmap(struct dma_buf *, unsigned long); > ?void dma_buf_kunmap(struct dma_buf *, unsigned long, void *); > + > +void *dma_buf_vmap(struct dma_buf *); > +void dma_buf_vunmap(struct dma_buf *, void *vaddr); > ?#else > > ?static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > @@ -248,6 +254,14 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?unsigned long pnum, void *vaddr) > ?{ > ?} > + > +static inline void *dma_buf_vmap(struct dma_buf *) > +{ > +} > + > +static inline void dma_buf_vunmap(struct dma_buf *, void *vaddr); > +{ > +} I think these two will cause compile issues for !CONFIG_DMA_SHARED_BUFFER case due to no parameter name for first arg. Other than that, it looks ok.. although is vmap really less overhead? Using kmap can use existing lowmem address for lowmem pages. Or is the issue that you somehow need access to the entire buffer in one shot? 
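For the stub issue mentioned above, here is a minimal sketch of how the !CONFIG_DMA_SHARED_BUFFER fallbacks might look once fixed. It assumes the vmap stub should simply return NULL, which is an assumption and not something the patch states. Every parameter gets a name, the stray semicolon after the dma_buf_vunmap() prototype is dropped, and the non-void stub returns a value. struct dma_buf is only forward-declared here so the fragment compiles on its own.

```c
#include <stddef.h>

struct dma_buf;	/* opaque in this sketch; the real definition is in linux/dma-buf.h */

/* Fallback when dma-buf support is compiled out: nothing can be mapped.
 * Returning NULL is an assumption made for this sketch. */
static inline void *dma_buf_vmap(struct dma_buf *dmabuf)
{
	(void)dmabuf;
	return NULL;
}

/* Matching no-op: there is never a mapping to tear down. */
static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
{
	(void)dmabuf;
	(void)vaddr;
}
```

With stubs shaped like this, a caller built without CONFIG_DMA_SHARED_BUFFER would see dma_buf_vmap() fail the same way it does when an exporter provides no vmap op, so the NULL check it already needs covers both cases.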
BR, -R > ?#endif /* CONFIG_DMA_SHARED_BUFFER */ > > ?#endif /* __DMA_BUF_H__ */ > -- > 1.7.6 > > _______________________________________________ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel From airlied at gmail.com Thu May 17 13:14:15 2012 From: airlied at gmail.com (Dave Airlie) Date: Thu, 17 May 2012 14:14:15 +0100 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add vmap interface (v2) In-Reply-To: References: <1337250710-31113-1-git-send-email-airlied@gmail.com> Message-ID: On Thu, May 17, 2012 at 1:32 PM, Rob Clark wrote: > On Thu, May 17, 2012 at 4:31 AM, Dave Airlie wrote: >> From: Dave Airlie >> >> The main requirement I have for this interface is for scanning out >> using the USB gpu devices. Since these devices have to read the >> framebuffer on updates and linearly compress it, using kmaps >> is a major overhead for every update. >> >> v2: fix warn issues pointed out by Sylwester Nawrocki. >> >> Signed-off-by: Dave Airlie >> --- >> ?drivers/base/dma-buf.c ?| ? 34 ++++++++++++++++++++++++++++++++++ >> ?include/linux/dma-buf.h | ? 14 ++++++++++++++ >> ?2 files changed, 48 insertions(+), 0 deletions(-) >> >> diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c >> index 07cbbc6..750f92c 100644 >> --- a/drivers/base/dma-buf.c >> +++ b/drivers/base/dma-buf.c >> @@ -406,3 +406,37 @@ void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num, >> ? ? ? ? ? ? ? ?dmabuf->ops->kunmap(dmabuf, page_num, vaddr); >> ?} >> ?EXPORT_SYMBOL_GPL(dma_buf_kunmap); >> + >> +/** >> + * dma_buf_vmap - Create virtual mapping for the buffer object into kernel address space. The same restrictions as for vmap and friends apply. >> + * @dma_buf: ? [in] ? ?buffer to vmap >> + * >> + * This call may fail due to lack of virtual mapping address space. >> + * These calls are optional in drivers. The intended use for them >> + * is for mapping objects linear in kernel space for high use objects. 
>> + * Please attempt to use kmap/kunmap before thinking about these interfaces. >> + */ >> +void *dma_buf_vmap(struct dma_buf *dmabuf) >> +{ >> + ? ? ? if (WARN_ON(!dmabuf)) >> + ? ? ? ? ? ? ? return NULL; >> + >> + ? ? ? if (dmabuf->ops->vmap) >> + ? ? ? ? ? ? ? return dmabuf->ops->vmap(dmabuf); >> + ? ? ? return NULL; >> +} >> +EXPORT_SYMBOL(dma_buf_vmap); >> + >> +/** >> + * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. >> + * @dma_buf: ? [in] ? ?buffer to vmap >> + */ >> +void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) >> +{ >> + ? ? ? if (WARN_ON(!dmabuf)) >> + ? ? ? ? ? ? ? return; >> + >> + ? ? ? if (dmabuf->ops->vunmap) >> + ? ? ? ? ? ? ? dmabuf->ops->vunmap(dmabuf, vaddr); >> +} >> +EXPORT_SYMBOL(dma_buf_vunmap); >> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h >> index 3efbfc2..b92b6de 100644 >> --- a/include/linux/dma-buf.h >> +++ b/include/linux/dma-buf.h >> @@ -92,6 +92,9 @@ struct dma_buf_ops { >> ? ? ? ?void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *); >> ? ? ? ?void *(*kmap)(struct dma_buf *, unsigned long); >> ? ? ? ?void (*kunmap)(struct dma_buf *, unsigned long, void *); >> + >> + ? ? ? void *(*vmap)(struct dma_buf *); >> + ? ? ? void (*vunmap)(struct dma_buf *, void *vaddr); >> ?}; >> >> ?/** >> @@ -167,6 +170,9 @@ void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long); >> ?void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *); >> ?void *dma_buf_kmap(struct dma_buf *, unsigned long); >> ?void dma_buf_kunmap(struct dma_buf *, unsigned long, void *); >> + >> +void *dma_buf_vmap(struct dma_buf *); >> +void dma_buf_vunmap(struct dma_buf *, void *vaddr); >> ?#else >> >> ?static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, >> @@ -248,6 +254,14 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
?unsigned long pnum, void *vaddr) >> ?{ >> ?} >> + >> +static inline void *dma_buf_vmap(struct dma_buf *) >> +{ >> +} >> + >> +static inline void dma_buf_vunmap(struct dma_buf *, void *vaddr); >> +{ >> +} > > I think these two will cause compile issues for > !CONFIG_DMA_SHARED_BUFFER case due to no parameter name for first arg. Oops, will send a new one, > > Other than that, it looks ok.. although is vmap really less overhead? > Using kmap can use existing lowmem address for lowmem pages. ?Or is > the issue that you somehow need access to the entire buffer in one > shot? Well the USB code has to linearly read chunks of the framebuffer and RLE compress them, having to cut things up into kmapped chunks and shove those out, now if we are rendering something that draws a lot, the CPU overheads start to mount right up. Dave. From m.szyprowski at samsung.com Thu May 17 13:13:50 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:50 +0200 Subject: [Linaro-mm-sig] [PATCHv10 00/11] ARM: DMA-mapping framework redesign Message-ID: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Hello, This is another update on dma-mapping redesign patches for ARM. I integrated a few minor fixes that were needed to solve the issues reported after putting these patches for testing in linux-next branch. 
The patches are also available on my git repository at: git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.4-rc7-arm-dma-v10 History of the development: v1: (initial version of the DMA-mapping redesign patches): http://www.spinics.net/lists/linux-mm/msg21241.html v2: http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000571.html http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000577.html v3: http://www.spinics.net/lists/linux-mm/msg25490.html v4 and v5: http://www.spinics.net/lists/arm-kernel/msg151147.html http://www.spinics.net/lists/arm-kernel/msg154889.html v6: http://www.spinics.net/lists/linux-mm/msg29903.html v7: http://www.spinics.net/lists/arm-kernel/msg162149.html v8: http://www.spinics.net/lists/arm-kernel/msg168478.html v9: http://www.spinics.net/lists/linux-arch/msg17443.html Best regards Marek Szyprowski Samsung Poland R&D Center Patch summary: Marek Szyprowski (11): common: add dma_mmap_from_coherent() function ARM: dma-mapping: use dma_mmap_from_coherent() ARM: dma-mapping: use pr_* instread of printk ARM: dma-mapping: introduce DMA_ERROR_CODE constant ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops ARM: dma-mapping: use asm-generic/dma-mapping-common.h ARM: dma-mapping: implement dma sg methods on top of any generic dma ops ARM: dma-mapping: move all dma bounce code to separate dma ops structure ARM: dma-mapping: remove redundant code and do the cleanup ARM: dma-mapping: use alloc, mmap, free from dma_ops ARM: dma-mapping: add support for IOMMU mapper arch/arm/Kconfig | 9 + arch/arm/common/dmabounce.c | 84 ++- arch/arm/include/asm/device.h | 4 + arch/arm/include/asm/dma-iommu.h | 34 ++ arch/arm/include/asm/dma-mapping.h | 407 ++++----------- arch/arm/mm/dma-mapping.c | 1015 ++++++++++++++++++++++++++++++------ arch/arm/mm/vmregion.h | 2 +- drivers/base/dma-coherent.c | 42 ++ include/asm-generic/dma-coherent.h | 4 +- 9 files changed, 1134 insertions(+), 467 deletions(-) create mode 
100644 arch/arm/include/asm/dma-iommu.h -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:51 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:51 +0200 Subject: [Linaro-mm-sig] [PATCHv10 01/11] common: add dma_mmap_from_coherent() function In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-2-git-send-email-m.szyprowski@samsung.com> Add a common helper for dma-mapping core for mapping a coherent buffer to userspace. Reported-by: Subash Patel Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Tested-By: Subash Patel --- drivers/base/dma-coherent.c | 42 ++++++++++++++++++++++++++++++++++++ include/asm-generic/dma-coherent.h | 4 +++- 2 files changed, 45 insertions(+), 1 deletion(-) diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c index bb0025c..1b85949 100644 --- a/drivers/base/dma-coherent.c +++ b/drivers/base/dma-coherent.c @@ -10,6 +10,7 @@ struct dma_coherent_mem { void *virt_base; dma_addr_t device_base; + phys_addr_t pfn_base; int size; int flags; unsigned long *bitmap; @@ -44,6 +45,7 @@ int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr, dev->dma_mem->virt_base = mem_base; dev->dma_mem->device_base = device_addr; + dev->dma_mem->pfn_base = PFN_DOWN(bus_addr); dev->dma_mem->size = pages; dev->dma_mem->flags = flags; @@ -176,3 +178,43 @@ int dma_release_from_coherent(struct device *dev, int order, void *vaddr) return 0; } EXPORT_SYMBOL(dma_release_from_coherent); + +/** + * dma_mmap_from_coherent() - try to mmap the memory allocated from + * per-device coherent memory pool to userspace + * @dev: device from which the memory was allocated + * @vma: vm_area for the userspace memory + * @vaddr: cpu address returned by dma_alloc_from_coherent + * @size: size of the memory buffer allocated by dma_alloc_from_coherent + * + * This checks whether the 
memory was allocated from the per-device + * coherent memory pool and if so, maps that memory to the provided vma. + * + * Returns 1 if we correctly mapped the memory, or 0 if + * dma_release_coherent() should proceed with mapping memory from + * generic pools. + */ +int dma_mmap_from_coherent(struct device *dev, struct vm_area_struct *vma, + void *vaddr, size_t size, int *ret) +{ + struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL; + + if (mem && vaddr >= mem->virt_base && vaddr + size <= + (mem->virt_base + (mem->size << PAGE_SHIFT))) { + unsigned long off = vma->vm_pgoff; + int start = (vaddr - mem->virt_base) >> PAGE_SHIFT; + int user_count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; + int count = size >> PAGE_SHIFT; + + *ret = -ENXIO; + if (off < count && user_count <= count - off) { + unsigned pfn = mem->pfn_base + start + off; + *ret = remap_pfn_range(vma, vma->vm_start, pfn, + user_count << PAGE_SHIFT, + vma->vm_page_prot); + } + return 1; + } + return 0; +} +EXPORT_SYMBOL(dma_mmap_from_coherent); diff --git a/include/asm-generic/dma-coherent.h b/include/asm-generic/dma-coherent.h index 85a3ffa..abfb268 100644 --- a/include/asm-generic/dma-coherent.h +++ b/include/asm-generic/dma-coherent.h @@ -3,13 +3,15 @@ #ifdef CONFIG_HAVE_GENERIC_DMA_COHERENT /* - * These two functions are only for dma allocator. + * These three functions are only for dma allocator. * Don't use them in device drivers. 
*/ int dma_alloc_from_coherent(struct device *dev, ssize_t size, dma_addr_t *dma_handle, void **ret); int dma_release_from_coherent(struct device *dev, int order, void *vaddr); +int dma_mmap_from_coherent(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, size_t size, int *ret); /* * Standard interface */ -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:52 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:52 +0200 Subject: [Linaro-mm-sig] [PATCHv10 02/11] ARM: dma-mapping: use dma_mmap_from_coherent() In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-3-git-send-email-m.szyprowski@samsung.com> Signed-off-by: Marek Szyprowski --- arch/arm/mm/dma-mapping.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index db23ae4..7ec0863 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -401,6 +401,9 @@ static int dma_mmap(struct device *dev, struct vm_area_struct *vma, unsigned long user_size, kern_size; struct arm_vmregion *c; + if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret)) + return ret; + user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:53 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:53 +0200 Subject: [Linaro-mm-sig] [PATCHv10 03/11] ARM: dma-mapping: use pr_* instread of printk In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-4-git-send-email-m.szyprowski@samsung.com> Replace all calls to printk with the pr_* family of functions. 
Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Acked-by: Arnd Bergmann Tested-By: Subash Patel --- arch/arm/mm/dma-mapping.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 7ec0863..322c70a 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -184,14 +184,14 @@ static int __init consistent_init(void) pud = pud_alloc(&init_mm, pgd, base); if (!pud) { - printk(KERN_ERR "%s: no pud tables\n", __func__); + pr_err("%s: no pud tables\n", __func__); ret = -ENOMEM; break; } pmd = pmd_alloc(&init_mm, pud, base); if (!pmd) { - printk(KERN_ERR "%s: no pmd tables\n", __func__); + pr_err("%s: no pmd tables\n", __func__); ret = -ENOMEM; break; } @@ -199,7 +199,7 @@ static int __init consistent_init(void) pte = pte_alloc_kernel(pmd, base); if (!pte) { - printk(KERN_ERR "%s: no pte tables\n", __func__); + pr_err("%s: no pte tables\n", __func__); ret = -ENOMEM; break; } @@ -222,7 +222,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, int bit; if (!consistent_pte) { - printk(KERN_ERR "%s: not initialised\n", __func__); + pr_err("%s: not initialised\n", __func__); dump_stack(); return NULL; } @@ -281,14 +281,14 @@ static void __dma_free_remap(void *cpu_addr, size_t size) c = arm_vmregion_find_remove(&consistent_head, (unsigned long)cpu_addr); if (!c) { - printk(KERN_ERR "%s: trying to free invalid coherent area: %p\n", + pr_err("%s: trying to free invalid coherent area: %p\n", __func__, cpu_addr); dump_stack(); return; } if ((c->vm_end - c->vm_start) != size) { - printk(KERN_ERR "%s: freeing wrong coherent size (%ld != %d)\n", + pr_err("%s: freeing wrong coherent size (%ld != %d)\n", __func__, c->vm_end - c->vm_start, size); dump_stack(); size = c->vm_end - c->vm_start; @@ -310,8 +310,8 @@ static void __dma_free_remap(void *cpu_addr, size_t size) } if (pte_none(pte) || !pte_present(pte)) - printk(KERN_CRIT "%s: bad page in kernel 
page table\n", - __func__); + pr_crit("%s: bad page in kernel page table\n", + __func__); } while (size -= PAGE_SIZE); flush_tlb_kernel_range(c->vm_start, c->vm_end); -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:54 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:54 +0200 Subject: [Linaro-mm-sig] [PATCHv10 04/11] ARM: dma-mapping: introduce DMA_ERROR_CODE constant In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-5-git-send-email-m.szyprowski@samsung.com> Replace all uses of ~0 with DMA_ERROR_CODE, which should make the code easier to read. Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Tested-By: Subash Patel --- arch/arm/common/dmabounce.c | 6 +++--- arch/arm/include/asm/dma-mapping.h | 4 +++- arch/arm/mm/dma-mapping.c | 2 +- 3 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c index 595ecd29..210ad1b 100644 --- a/arch/arm/common/dmabounce.c +++ b/arch/arm/common/dmabounce.c @@ -254,7 +254,7 @@ static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size, if (buf == NULL) { dev_err(dev, "%s: unable to map unsafe buffer %p!\n", __func__, ptr); - return ~0; + return DMA_ERROR_CODE; } dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n", @@ -320,7 +320,7 @@ dma_addr_t __dma_map_page(struct device *dev, struct page *page, ret = needs_bounce(dev, dma_addr, size); if (ret < 0) - return ~0; + return DMA_ERROR_CODE; if (ret == 0) { __dma_page_cpu_to_dev(page, offset, size, dir); @@ -329,7 +329,7 @@ dma_addr_t __dma_map_page(struct device *dev, struct page *page, if (PageHighMem(page)) { dev_err(dev, "DMA buffer bouncing of HIGHMEM pages is not supported\n"); - return ~0; + return DMA_ERROR_CODE; } return map_single(dev, page_address(page) + offset, size, dir); diff --git 
a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index cb3b7c9..6a838da 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -10,6 +10,8 @@ #include #include +#define DMA_ERROR_CODE (~0) + #ifdef __arch_page_to_dma #error Please update to __arch_pfn_to_dma #endif @@ -123,7 +125,7 @@ extern int dma_set_mask(struct device *, u64); */ static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr) { - return dma_addr == ~0; + return dma_addr == DMA_ERROR_CODE; } /* diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 322c70a..e4ac5fc 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -342,7 +342,7 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, */ gfp &= ~(__GFP_COMP); - *handle = ~0; + *handle = DMA_ERROR_CODE; size = PAGE_ALIGN(size); page = __dma_alloc_buffer(dev, size, gfp); -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:55 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:55 +0200 Subject: [Linaro-mm-sig] [PATCHv10 05/11] ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-6-git-send-email-m.szyprowski@samsung.com> This patch removes the need for the offset parameter in dma bounce functions. This is required to let the dma-mapping framework on the ARM architecture use the common, generic dma_map_ops based dma-mapping helpers. Background and more detailed explanation: dma_*_range_* functions have been available since the early days of the dma mapping api. They are the correct way of doing partial syncs on a buffer (usually used by network device drivers). 
This patch changes only the internal implementation of the dma bounce functions to let them tunnel through the dma_map_ops structure. The driver api stays unchanged, so drivers are obliged to call dma_*_range_* functions to keep code clean and easy to understand. The only drawback of this patch is reduced detection of dma api abuse. Let us consider the following code: dma_addr = dma_map_single(dev, ptr, 64, DMA_TO_DEVICE); dma_sync_single_range_for_cpu(dev, dma_addr+16, 0, 32, DMA_TO_DEVICE); Without the patch such code fails, because dma bounce code is unable to find the bounce buffer for the given dma_address. After the patch the above sync call will be equivalent to: dma_sync_single_range_for_cpu(dev, dma_addr, 16, 32, DMA_TO_DEVICE); which succeeds. I don't consider this a real problem, because DMA API abuse should be caught by the debug_dma_* function family. This patch lets us simplify the internal low-level implementation without changing the driver-visible API. Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Tested-By: Subash Patel --- arch/arm/common/dmabounce.c | 13 +++++-- arch/arm/include/asm/dma-mapping.h | 67 ++++++++++++++++++------------------ arch/arm/mm/dma-mapping.c | 4 +-- 3 files changed, 45 insertions(+), 39 deletions(-) diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c index 210ad1b..32e9cc6 100644 --- a/arch/arm/common/dmabounce.c +++ b/arch/arm/common/dmabounce.c @@ -173,7 +173,8 @@ find_safe_buffer(struct dmabounce_device_info *device_info, dma_addr_t safe_dma_ read_lock_irqsave(&device_info->lock, flags); list_for_each_entry(b, &device_info->safe_buffers, node) - if (b->safe_dma_addr == safe_dma_addr) { + if (b->safe_dma_addr <= safe_dma_addr && + b->safe_dma_addr + b->size > safe_dma_addr) { rb = b; break; } @@ -362,9 +363,10 @@ void __dma_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size, EXPORT_SYMBOL(__dma_unmap_page); int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr, - 
unsigned long off, size_t sz, enum dma_data_direction dir) + size_t sz, enum dma_data_direction dir) { struct safe_buffer *buf; + unsigned long off; dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n", __func__, addr, off, sz, dir); @@ -373,6 +375,8 @@ int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr, if (!buf) return 1; + off = addr - buf->safe_dma_addr; + BUG_ON(buf->direction != dir); dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n", @@ -391,9 +395,10 @@ int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr, EXPORT_SYMBOL(dmabounce_sync_for_cpu); int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr, - unsigned long off, size_t sz, enum dma_data_direction dir) + size_t sz, enum dma_data_direction dir) { struct safe_buffer *buf; + unsigned long off; dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n", __func__, addr, off, sz, dir); @@ -402,6 +407,8 @@ int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr, if (!buf) return 1; + off = addr - buf->safe_dma_addr; + BUG_ON(buf->direction != dir); dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n", diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 6a838da..eeddbe2 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -266,19 +266,17 @@ extern void __dma_unmap_page(struct device *, dma_addr_t, size_t, /* * Private functions */ -int dmabounce_sync_for_cpu(struct device *, dma_addr_t, unsigned long, - size_t, enum dma_data_direction); -int dmabounce_sync_for_device(struct device *, dma_addr_t, unsigned long, - size_t, enum dma_data_direction); +int dmabounce_sync_for_cpu(struct device *, dma_addr_t, size_t, enum dma_data_direction); +int dmabounce_sync_for_device(struct device *, dma_addr_t, size_t, enum dma_data_direction); #else static inline int dmabounce_sync_for_cpu(struct device *d, dma_addr_t addr, - unsigned long offset, size_t size, enum 
dma_data_direction dir) + size_t size, enum dma_data_direction dir) { return 1; } static inline int dmabounce_sync_for_device(struct device *d, dma_addr_t addr, - unsigned long offset, size_t size, enum dma_data_direction dir) + size_t size, enum dma_data_direction dir) { return 1; } @@ -401,6 +399,33 @@ static inline void dma_unmap_page(struct device *dev, dma_addr_t handle, __dma_unmap_page(dev, handle, size, dir); } + +static inline void dma_sync_single_for_cpu(struct device *dev, + dma_addr_t handle, size_t size, enum dma_data_direction dir) +{ + BUG_ON(!valid_dma_direction(dir)); + + debug_dma_sync_single_for_cpu(dev, handle, size, dir); + + if (!dmabounce_sync_for_cpu(dev, handle, size, dir)) + return; + + __dma_single_dev_to_cpu(dma_to_virt(dev, handle), size, dir); +} + +static inline void dma_sync_single_for_device(struct device *dev, + dma_addr_t handle, size_t size, enum dma_data_direction dir) +{ + BUG_ON(!valid_dma_direction(dir)); + + debug_dma_sync_single_for_device(dev, handle, size, dir); + + if (!dmabounce_sync_for_device(dev, handle, size, dir)) + return; + + __dma_single_cpu_to_dev(dma_to_virt(dev, handle), size, dir); +} + /** * dma_sync_single_range_for_cpu * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices @@ -423,40 +448,14 @@ static inline void dma_sync_single_range_for_cpu(struct device *dev, dma_addr_t handle, unsigned long offset, size_t size, enum dma_data_direction dir) { - BUG_ON(!valid_dma_direction(dir)); - - debug_dma_sync_single_for_cpu(dev, handle + offset, size, dir); - - if (!dmabounce_sync_for_cpu(dev, handle, offset, size, dir)) - return; - - __dma_single_dev_to_cpu(dma_to_virt(dev, handle) + offset, size, dir); + dma_sync_single_for_cpu(dev, handle + offset, size, dir); } static inline void dma_sync_single_range_for_device(struct device *dev, dma_addr_t handle, unsigned long offset, size_t size, enum dma_data_direction dir) { - BUG_ON(!valid_dma_direction(dir)); - - 
debug_dma_sync_single_for_device(dev, handle + offset, size, dir); - - if (!dmabounce_sync_for_device(dev, handle, offset, size, dir)) - return; - - __dma_single_cpu_to_dev(dma_to_virt(dev, handle) + offset, size, dir); -} - -static inline void dma_sync_single_for_cpu(struct device *dev, - dma_addr_t handle, size_t size, enum dma_data_direction dir) -{ - dma_sync_single_range_for_cpu(dev, handle, 0, size, dir); -} - -static inline void dma_sync_single_for_device(struct device *dev, - dma_addr_t handle, size_t size, enum dma_data_direction dir) -{ - dma_sync_single_range_for_device(dev, handle, 0, size, dir); + dma_sync_single_for_device(dev, handle + offset, size, dir); } /* diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index e4ac5fc..a16993a 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -660,7 +660,7 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int i; for_each_sg(sg, s, nents, i) { - if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s), 0, + if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s), sg_dma_len(s), dir)) continue; @@ -686,7 +686,7 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, int i; for_each_sg(sg, s, nents, i) { - if (!dmabounce_sync_for_device(dev, sg_dma_address(s), 0, + if (!dmabounce_sync_for_device(dev, sg_dma_address(s), sg_dma_len(s), dir)) continue; -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:56 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:56 +0200 Subject: [Linaro-mm-sig] [PATCHv10 06/11] ARM: dma-mapping: use asm-generic/dma-mapping-common.h In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-7-git-send-email-m.szyprowski@samsung.com> This patch modifies dma-mapping implementation on ARM architecture to use common dma_map_ops structure and 
asm-generic/dma-mapping-common.h helpers. Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Tested-By: Subash Patel --- arch/arm/Kconfig | 1 + arch/arm/include/asm/device.h | 1 + arch/arm/include/asm/dma-mapping.h | 196 +++++------------------------------- arch/arm/mm/dma-mapping.c | 148 +++++++++++++++------------ 4 files changed, 115 insertions(+), 231 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 36586dba..c8111c5 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -4,6 +4,7 @@ config ARM select HAVE_AOUT select HAVE_DMA_API_DEBUG select HAVE_IDE if PCI || ISA || PCMCIA + select HAVE_DMA_ATTRS select HAVE_MEMBLOCK select RTC_LIB select SYS_SUPPORTS_APM_EMULATION diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h index 7aa3680..6e2cb0e 100644 --- a/arch/arm/include/asm/device.h +++ b/arch/arm/include/asm/device.h @@ -7,6 +7,7 @@ #define ASMARM_DEVICE_H struct dev_archdata { + struct dma_map_ops *dma_ops; #ifdef CONFIG_DMABOUNCE struct dmabounce_device_info *dmabounce; #endif diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index eeddbe2..6725a08 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -11,6 +11,27 @@ #include #define DMA_ERROR_CODE (~0) +extern struct dma_map_ops arm_dma_ops; + +static inline struct dma_map_ops *get_dma_ops(struct device *dev) +{ + if (dev && dev->archdata.dma_ops) + return dev->archdata.dma_ops; + return &arm_dma_ops; +} + +static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops) +{ + BUG_ON(!dev); + dev->archdata.dma_ops = ops; +} + +#include + +static inline int dma_set_mask(struct device *dev, u64 mask) +{ + return get_dma_ops(dev)->set_dma_mask(dev, mask); +} #ifdef __arch_page_to_dma #error Please update to __arch_pfn_to_dma @@ -119,7 +140,6 @@ static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off, extern int dma_supported(struct device *, u64); extern int 
dma_set_mask(struct device *, u64); - /* * DMA errors are defined by all-bits-set in the DMA address. */ @@ -297,179 +317,17 @@ static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle, } #endif /* CONFIG_DMABOUNCE */ -/** - * dma_map_single - map a single buffer for streaming DMA - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @cpu_addr: CPU direct mapped address of buffer - * @size: size of buffer to map - * @dir: DMA transfer direction - * - * Ensure that any data held in the cache is appropriately discarded - * or written back. - * - * The device owns this memory once this call has completed. The CPU - * can regain ownership by calling dma_unmap_single() or - * dma_sync_single_for_cpu(). - */ -static inline dma_addr_t dma_map_single(struct device *dev, void *cpu_addr, - size_t size, enum dma_data_direction dir) -{ - unsigned long offset; - struct page *page; - dma_addr_t addr; - - BUG_ON(!virt_addr_valid(cpu_addr)); - BUG_ON(!virt_addr_valid(cpu_addr + size - 1)); - BUG_ON(!valid_dma_direction(dir)); - - page = virt_to_page(cpu_addr); - offset = (unsigned long)cpu_addr & ~PAGE_MASK; - addr = __dma_map_page(dev, page, offset, size, dir); - debug_dma_map_page(dev, page, offset, size, dir, addr, true); - - return addr; -} - -/** - * dma_map_page - map a portion of a page for streaming DMA - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @page: page that buffer resides in - * @offset: offset into page for start of buffer - * @size: size of buffer to map - * @dir: DMA transfer direction - * - * Ensure that any data held in the cache is appropriately discarded - * or written back. - * - * The device owns this memory once this call has completed. The CPU - * can regain ownership by calling dma_unmap_page(). 
- */ -static inline dma_addr_t dma_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, enum dma_data_direction dir) -{ - dma_addr_t addr; - - BUG_ON(!valid_dma_direction(dir)); - - addr = __dma_map_page(dev, page, offset, size, dir); - debug_dma_map_page(dev, page, offset, size, dir, addr, false); - - return addr; -} - -/** - * dma_unmap_single - unmap a single buffer previously mapped - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @handle: DMA address of buffer - * @size: size of buffer (same as passed to dma_map_single) - * @dir: DMA transfer direction (same as passed to dma_map_single) - * - * Unmap a single streaming mode DMA translation. The handle and size - * must match what was provided in the previous dma_map_single() call. - * All other usages are undefined. - * - * After this call, reads by the CPU to the buffer are guaranteed to see - * whatever the device wrote there. - */ -static inline void dma_unmap_single(struct device *dev, dma_addr_t handle, - size_t size, enum dma_data_direction dir) -{ - debug_dma_unmap_page(dev, handle, size, dir, true); - __dma_unmap_page(dev, handle, size, dir); -} - -/** - * dma_unmap_page - unmap a buffer previously mapped through dma_map_page() - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @handle: DMA address of buffer - * @size: size of buffer (same as passed to dma_map_page) - * @dir: DMA transfer direction (same as passed to dma_map_page) - * - * Unmap a page streaming mode DMA translation. The handle and size - * must match what was provided in the previous dma_map_page() call. - * All other usages are undefined. - * - * After this call, reads by the CPU to the buffer are guaranteed to see - * whatever the device wrote there. 
- */ -static inline void dma_unmap_page(struct device *dev, dma_addr_t handle, - size_t size, enum dma_data_direction dir) -{ - debug_dma_unmap_page(dev, handle, size, dir, false); - __dma_unmap_page(dev, handle, size, dir); -} - - -static inline void dma_sync_single_for_cpu(struct device *dev, - dma_addr_t handle, size_t size, enum dma_data_direction dir) -{ - BUG_ON(!valid_dma_direction(dir)); - - debug_dma_sync_single_for_cpu(dev, handle, size, dir); - - if (!dmabounce_sync_for_cpu(dev, handle, size, dir)) - return; - - __dma_single_dev_to_cpu(dma_to_virt(dev, handle), size, dir); -} - -static inline void dma_sync_single_for_device(struct device *dev, - dma_addr_t handle, size_t size, enum dma_data_direction dir) -{ - BUG_ON(!valid_dma_direction(dir)); - - debug_dma_sync_single_for_device(dev, handle, size, dir); - - if (!dmabounce_sync_for_device(dev, handle, size, dir)) - return; - - __dma_single_cpu_to_dev(dma_to_virt(dev, handle), size, dir); -} - -/** - * dma_sync_single_range_for_cpu - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @handle: DMA address of buffer - * @offset: offset of region to start sync - * @size: size of region to sync - * @dir: DMA transfer direction (same as passed to dma_map_single) - * - * Make physical memory consistent for a single streaming mode DMA - * translation after a transfer. - * - * If you perform a dma_map_single() but wish to interrogate the - * buffer using the cpu, yet do not wish to teardown the PCI dma - * mapping, you must call this function before doing so. At the - * next point you give the PCI dma address back to the card, you - * must first the perform a dma_sync_for_device, and then the - * device again owns the buffer. 
- */ -static inline void dma_sync_single_range_for_cpu(struct device *dev, - dma_addr_t handle, unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - dma_sync_single_for_cpu(dev, handle + offset, size, dir); -} - -static inline void dma_sync_single_range_for_device(struct device *dev, - dma_addr_t handle, unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - dma_sync_single_for_device(dev, handle + offset, size, dir); -} - /* * The scatter list versions of the above methods. */ -extern int dma_map_sg(struct device *, struct scatterlist *, int, - enum dma_data_direction); -extern void dma_unmap_sg(struct device *, struct scatterlist *, int, +extern int arm_dma_map_sg(struct device *, struct scatterlist *, int, + enum dma_data_direction, struct dma_attrs *attrs); +extern void arm_dma_unmap_sg(struct device *, struct scatterlist *, int, + enum dma_data_direction, struct dma_attrs *attrs); +extern void arm_dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int, enum dma_data_direction); -extern void dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int, +extern void arm_dma_sync_sg_for_device(struct device *, struct scatterlist *, int, enum dma_data_direction); -extern void dma_sync_sg_for_device(struct device *, struct scatterlist *, int, - enum dma_data_direction); - #endif /* __KERNEL__ */ #endif diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index a16993a..70be6e1 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -29,6 +29,85 @@ #include "mm.h" +/** + * arm_dma_map_page - map a portion of a page for streaming DMA + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices + * @page: page that buffer resides in + * @offset: offset into page for start of buffer + * @size: size of buffer to map + * @dir: DMA transfer direction + * + * Ensure that any data held in the cache is appropriately discarded + * or written back. 
+ * + * The device owns this memory once this call has completed. The CPU + * can regain ownership by calling dma_unmap_page(). + */ +static inline dma_addr_t arm_dma_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + return __dma_map_page(dev, page, offset, size, dir); +} + +/** + * arm_dma_unmap_page - unmap a buffer previously mapped through dma_map_page() + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices + * @handle: DMA address of buffer + * @size: size of buffer (same as passed to dma_map_page) + * @dir: DMA transfer direction (same as passed to dma_map_page) + * + * Unmap a page streaming mode DMA translation. The handle and size + * must match what was provided in the previous dma_map_page() call. + * All other usages are undefined. + * + * After this call, reads by the CPU to the buffer are guaranteed to see + * whatever the device wrote there. + */ +static inline void arm_dma_unmap_page(struct device *dev, dma_addr_t handle, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + __dma_unmap_page(dev, handle, size, dir); +} + +static inline void arm_dma_sync_single_for_cpu(struct device *dev, + dma_addr_t handle, size_t size, enum dma_data_direction dir) +{ + unsigned int offset = handle & (PAGE_SIZE - 1); + struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset)); + if (!dmabounce_sync_for_cpu(dev, handle, size, dir)) + return; + + __dma_page_dev_to_cpu(page, offset, size, dir); +} + +static inline void arm_dma_sync_single_for_device(struct device *dev, + dma_addr_t handle, size_t size, enum dma_data_direction dir) +{ + unsigned int offset = handle & (PAGE_SIZE - 1); + struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset)); + if (!dmabounce_sync_for_device(dev, handle, size, dir)) + return; + + __dma_page_cpu_to_dev(page, offset, size, dir); +} + +static int arm_dma_set_mask(struct device *dev, u64 
dma_mask); + +struct dma_map_ops arm_dma_ops = { + .map_page = arm_dma_map_page, + .unmap_page = arm_dma_unmap_page, + .map_sg = arm_dma_map_sg, + .unmap_sg = arm_dma_unmap_sg, + .sync_single_for_cpu = arm_dma_sync_single_for_cpu, + .sync_single_for_device = arm_dma_sync_single_for_device, + .sync_sg_for_cpu = arm_dma_sync_sg_for_cpu, + .sync_sg_for_device = arm_dma_sync_sg_for_device, + .set_dma_mask = arm_dma_set_mask, +}; +EXPORT_SYMBOL(arm_dma_ops); + static u64 get_coherent_dma_mask(struct device *dev) { u64 mask = (u64)arm_dma_limit; @@ -461,47 +540,6 @@ void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr } EXPORT_SYMBOL(dma_free_coherent); -/* - * Make an area consistent for devices. - * Note: Drivers should NOT use this function directly, as it will break - * platforms with CONFIG_DMABOUNCE. - * Use the driver DMA support - see dma-mapping.h (dma_sync_*) - */ -void ___dma_single_cpu_to_dev(const void *kaddr, size_t size, - enum dma_data_direction dir) -{ - unsigned long paddr; - - BUG_ON(!virt_addr_valid(kaddr) || !virt_addr_valid(kaddr + size - 1)); - - dmac_map_area(kaddr, size, dir); - - paddr = __pa(kaddr); - if (dir == DMA_FROM_DEVICE) { - outer_inv_range(paddr, paddr + size); - } else { - outer_clean_range(paddr, paddr + size); - } - /* FIXME: non-speculating: flush on bidirectional mappings? 
*/ -} -EXPORT_SYMBOL(___dma_single_cpu_to_dev); - -void ___dma_single_dev_to_cpu(const void *kaddr, size_t size, - enum dma_data_direction dir) -{ - BUG_ON(!virt_addr_valid(kaddr) || !virt_addr_valid(kaddr + size - 1)); - - /* FIXME: non-speculating: not required */ - /* don't bother invalidating if DMA to device */ - if (dir != DMA_TO_DEVICE) { - unsigned long paddr = __pa(kaddr); - outer_inv_range(paddr, paddr + size); - } - - dmac_unmap_area(kaddr, size, dir); -} -EXPORT_SYMBOL(___dma_single_dev_to_cpu); - static void dma_cache_maint_page(struct page *page, unsigned long offset, size_t size, enum dma_data_direction dir, void (*op)(const void *, size_t, int)) @@ -599,21 +637,18 @@ EXPORT_SYMBOL(___dma_page_dev_to_cpu); * Device ownership issues as mentioned for dma_map_single are the same * here. */ -int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, - enum dma_data_direction dir) +int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, + enum dma_data_direction dir, struct dma_attrs *attrs) { struct scatterlist *s; int i, j; - BUG_ON(!valid_dma_direction(dir)); - for_each_sg(sg, s, nents, i) { s->dma_address = __dma_map_page(dev, sg_page(s), s->offset, s->length, dir); if (dma_mapping_error(dev, s->dma_address)) goto bad_mapping; } - debug_dma_map_sg(dev, sg, nents, nents, dir); return nents; bad_mapping: @@ -621,7 +656,6 @@ int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, __dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir); return 0; } -EXPORT_SYMBOL(dma_map_sg); /** * dma_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg @@ -633,18 +667,15 @@ EXPORT_SYMBOL(dma_map_sg); * Unmap a set of streaming mode DMA translations. Again, CPU access * rules concerning calls here are the same as for dma_unmap_single(). 
*/ -void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, - enum dma_data_direction dir) +void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, + enum dma_data_direction dir, struct dma_attrs *attrs) { struct scatterlist *s; int i; - debug_dma_unmap_sg(dev, sg, nents, dir); - for_each_sg(sg, s, nents, i) __dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir); } -EXPORT_SYMBOL(dma_unmap_sg); /** * dma_sync_sg_for_cpu @@ -653,7 +684,7 @@ EXPORT_SYMBOL(dma_unmap_sg); * @nents: number of buffers to map (returned from dma_map_sg) * @dir: DMA transfer direction (same as was passed to dma_map_sg) */ -void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, +void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir) { struct scatterlist *s; @@ -667,10 +698,7 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, __dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir); } - - debug_dma_sync_sg_for_cpu(dev, sg, nents, dir); } -EXPORT_SYMBOL(dma_sync_sg_for_cpu); /** * dma_sync_sg_for_device @@ -679,7 +707,7 @@ EXPORT_SYMBOL(dma_sync_sg_for_cpu); * @nents: number of buffers to map (returned from dma_map_sg) * @dir: DMA transfer direction (same as was passed to dma_map_sg) */ -void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, +void arm_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir) { struct scatterlist *s; @@ -693,10 +721,7 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir); } - - debug_dma_sync_sg_for_device(dev, sg, nents, dir); } -EXPORT_SYMBOL(dma_sync_sg_for_device); /* * Return whether the given device DMA address mask can be supported @@ -712,7 +737,7 @@ int dma_supported(struct device *dev, u64 mask) } EXPORT_SYMBOL(dma_supported); -int dma_set_mask(struct device *dev, u64 
dma_mask) +static int arm_dma_set_mask(struct device *dev, u64 dma_mask) { if (!dev->dma_mask || !dma_supported(dev, dma_mask)) return -EIO; @@ -723,7 +748,6 @@ int dma_set_mask(struct device *dev, u64 dma_mask) return 0; } -EXPORT_SYMBOL(dma_set_mask); #define PREALLOC_DMA_DEBUG_ENTRIES 4096 -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:57 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:57 +0200 Subject: [Linaro-mm-sig] [PATCHv10 07/11] ARM: dma-mapping: implement dma sg methods on top of any generic dma ops In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-8-git-send-email-m.szyprowski@samsung.com> This patch converts all dma_sg methods to be generic (independent of the current DMA mapping implementation for the ARM architecture). All dma sg operations are now implemented on top of the respective dma_map_page/dma_sync_single_for* operations from the dma_map_ops structure. Before this patch there were custom methods for all scatter/gather related operations. They iterated over the whole scatter list and called cache related operations directly (which in turn checked whether the dma bounce code was in use and called the respective version). This patch changes them not to use that shortcut. Instead it provides a similar loop over the scatter list and calls methods from the device's dma_map_ops structure. This enables us to use device dependent implementations of cache related operations (direct linear or dma bounce) depending on the provided dma_map_ops structure.
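As an illustration only (not part of the posted patch), the idea described above can be modelled in plain userspace C: a generic scatter-list loop that defers every per-entry operation to a table of function pointers and unwinds on failure. All names prefixed with model_ are hypothetical stand-ins invented for this sketch; they are simplified and are not the kernel's struct scatterlist, struct dma_map_ops or DMA_ERROR_CODE.

```c
/* Userspace model only: simplified stand-ins, not kernel types. */
#include <stddef.h>
#include <stdint.h>

typedef uint64_t model_dma_addr_t;
#define MODEL_DMA_ERROR ((model_dma_addr_t)~0ULL)

/* One scatter-list entry: a CPU address, a length and the mapped handle. */
struct model_sg {
	uintptr_t cpu_addr;
	size_t length;
	model_dma_addr_t dma_address;
};

/* Per-device operations, selected by whoever sets up the device. */
struct model_dma_ops {
	model_dma_addr_t (*map_page)(uintptr_t cpu_addr, size_t size);
	void (*unmap_page)(model_dma_addr_t handle, size_t size);
};

/*
 * Generic sg mapping: loop over the list and call ops->map_page for each
 * entry. Returns nents on success; on failure, unmaps the entries that
 * were already mapped and returns 0.
 */
static int model_map_sg(const struct model_dma_ops *ops,
			struct model_sg *sg, int nents)
{
	int i, j;

	for (i = 0; i < nents; i++) {
		sg[i].dma_address = ops->map_page(sg[i].cpu_addr, sg[i].length);
		if (sg[i].dma_address == MODEL_DMA_ERROR)
			goto bad_mapping;
	}
	return nents;

bad_mapping:
	for (j = 0; j < i; j++)
		ops->unmap_page(sg[j].dma_address, sg[j].length);
	return 0;
}

/* One possible backend: an identity ("direct linear") mapping. */
static int model_linear_live;	/* number of currently live mappings */

static model_dma_addr_t model_linear_map(uintptr_t cpu_addr, size_t size)
{
	if (size == 0)
		return MODEL_DMA_ERROR;	/* reject empty entries in this model */
	model_linear_live++;
	return (model_dma_addr_t)cpu_addr;
}

static void model_linear_unmap(model_dma_addr_t handle, size_t size)
{
	(void)handle;
	(void)size;
	model_linear_live--;
}

static const struct model_dma_ops model_linear_ops = {
	.map_page = model_linear_map,
	.unmap_page = model_linear_unmap,
};
```

In this model a second backend (for example one that copies through bounce buffers) would only need to supply its own map_page/unmap_page pair; model_map_sg itself would stay unchanged, which is the property the commit message is after.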
Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Tested-By: Subash Patel --- arch/arm/mm/dma-mapping.c | 43 +++++++++++++++++++------------------------ 1 file changed, 19 insertions(+), 24 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 70be6e1..b50fa57 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -622,7 +622,7 @@ void ___dma_page_dev_to_cpu(struct page *page, unsigned long off, EXPORT_SYMBOL(___dma_page_dev_to_cpu); /** - * dma_map_sg - map a set of SG buffers for streaming mode DMA + * arm_dma_map_sg - map a set of SG buffers for streaming mode DMA * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices * @sg: list of buffers * @nents: number of buffers to map @@ -640,12 +640,13 @@ EXPORT_SYMBOL(___dma_page_dev_to_cpu); int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir, struct dma_attrs *attrs) { + struct dma_map_ops *ops = get_dma_ops(dev); struct scatterlist *s; int i, j; for_each_sg(sg, s, nents, i) { - s->dma_address = __dma_map_page(dev, sg_page(s), s->offset, - s->length, dir); + s->dma_address = ops->map_page(dev, sg_page(s), s->offset, + s->length, dir, attrs); if (dma_mapping_error(dev, s->dma_address)) goto bad_mapping; } @@ -653,12 +654,12 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, bad_mapping: for_each_sg(sg, s, i, j) - __dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir); + ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, attrs); return 0; } /** - * dma_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg + * arm_dma_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices * @sg: list of buffers * @nents: number of buffers to unmap (same as was passed to dma_map_sg) @@ -670,15 +671,17 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, void 
arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir, struct dma_attrs *attrs) { + struct dma_map_ops *ops = get_dma_ops(dev); struct scatterlist *s; + int i; for_each_sg(sg, s, nents, i) - __dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir); + ops->unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, attrs); } /** - * dma_sync_sg_for_cpu + * arm_dma_sync_sg_for_cpu * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices * @sg: list of buffers * @nents: number of buffers to map (returned from dma_map_sg) @@ -687,21 +690,17 @@ void arm_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir) { + struct dma_map_ops *ops = get_dma_ops(dev); struct scatterlist *s; int i; - for_each_sg(sg, s, nents, i) { - if (!dmabounce_sync_for_cpu(dev, sg_dma_address(s), - sg_dma_len(s), dir)) - continue; - - __dma_page_dev_to_cpu(sg_page(s), s->offset, - s->length, dir); - } + for_each_sg(sg, s, nents, i) + ops->sync_single_for_cpu(dev, sg_dma_address(s), s->length, + dir); } /** - * dma_sync_sg_for_device + * arm_dma_sync_sg_for_device * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices * @sg: list of buffers * @nents: number of buffers to map (returned from dma_map_sg) @@ -710,17 +709,13 @@ void arm_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, void arm_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir) { + struct dma_map_ops *ops = get_dma_ops(dev); struct scatterlist *s; int i; - for_each_sg(sg, s, nents, i) { - if (!dmabounce_sync_for_device(dev, sg_dma_address(s), - sg_dma_len(s), dir)) - continue; - - __dma_page_cpu_to_dev(sg_page(s), s->offset, - s->length, dir); - } + for_each_sg(sg, s, nents, i) + ops->sync_single_for_device(dev, sg_dma_address(s), s->length, + dir); } /* -- 
1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:58 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:58 +0200 Subject: [Linaro-mm-sig] [PATCHv10 08/11] ARM: dma-mapping: move all dma bounce code to separate dma ops structure In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-9-git-send-email-m.szyprowski@samsung.com> This patch removes dma bounce hooks from the common dma mapping implementation on ARM architecture and creates a separate set of dma_map_ops for dma bounce devices. Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Tested-By: Subash Patel --- arch/arm/common/dmabounce.c | 62 +++++++++++++++++----- arch/arm/include/asm/dma-mapping.h | 99 +----------------------------------- arch/arm/mm/dma-mapping.c | 79 ++++++++++++++++++++++++---- 3 files changed, 120 insertions(+), 120 deletions(-) diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c index 32e9cc6..813c29d 100644 --- a/arch/arm/common/dmabounce.c +++ b/arch/arm/common/dmabounce.c @@ -308,8 +308,9 @@ static inline void unmap_single(struct device *dev, struct safe_buffer *buf, * substitute the safe buffer for the unsafe one. 
* (basically move the buffer from an unsafe area to a safe one) */ -dma_addr_t __dma_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, enum dma_data_direction dir) +static dma_addr_t dmabounce_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) { dma_addr_t dma_addr; int ret; @@ -324,7 +325,7 @@ dma_addr_t __dma_map_page(struct device *dev, struct page *page, return DMA_ERROR_CODE; if (ret == 0) { - __dma_page_cpu_to_dev(page, offset, size, dir); + arm_dma_ops.sync_single_for_device(dev, dma_addr, size, dir); return dma_addr; } @@ -335,7 +336,6 @@ dma_addr_t __dma_map_page(struct device *dev, struct page *page, return map_single(dev, page_address(page) + offset, size, dir); } -EXPORT_SYMBOL(__dma_map_page); /* * see if a mapped address was really a "safe" buffer and if so, copy @@ -343,8 +343,8 @@ EXPORT_SYMBOL(__dma_map_page); * the safe buffer. (basically return things back to the way they * should be) */ -void __dma_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size, - enum dma_data_direction dir) +static void dmabounce_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size, + enum dma_data_direction dir, struct dma_attrs *attrs) { struct safe_buffer *buf; @@ -353,16 +353,14 @@ void __dma_unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size, buf = find_safe_buffer_dev(dev, dma_addr, __func__); if (!buf) { - __dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, dma_addr)), - dma_addr & ~PAGE_MASK, size, dir); + arm_dma_ops.sync_single_for_cpu(dev, dma_addr, size, dir); return; } unmap_single(dev, buf, size, dir); } -EXPORT_SYMBOL(__dma_unmap_page); -int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr, +static int __dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr, size_t sz, enum dma_data_direction dir) { struct safe_buffer *buf; @@ -392,9 +390,17 @@ int dmabounce_sync_for_cpu(struct device 
*dev, dma_addr_t addr, } return 0; } -EXPORT_SYMBOL(dmabounce_sync_for_cpu); -int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr, +static void dmabounce_sync_for_cpu(struct device *dev, + dma_addr_t handle, size_t size, enum dma_data_direction dir) +{ + if (!__dmabounce_sync_for_cpu(dev, handle, size, dir)) + return; + + arm_dma_ops.sync_single_for_cpu(dev, handle, size, dir); +} + +static int __dmabounce_sync_for_device(struct device *dev, dma_addr_t addr, size_t sz, enum dma_data_direction dir) { struct safe_buffer *buf; @@ -424,7 +430,35 @@ int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr, } return 0; } -EXPORT_SYMBOL(dmabounce_sync_for_device); + +static void dmabounce_sync_for_device(struct device *dev, + dma_addr_t handle, size_t size, enum dma_data_direction dir) +{ + if (!__dmabounce_sync_for_device(dev, handle, size, dir)) + return; + + arm_dma_ops.sync_single_for_device(dev, handle, size, dir); +} + +static int dmabounce_set_mask(struct device *dev, u64 dma_mask) +{ + if (dev->archdata.dmabounce) + return 0; + + return arm_dma_ops.set_dma_mask(dev, dma_mask); +} + +static struct dma_map_ops dmabounce_ops = { + .map_page = dmabounce_map_page, + .unmap_page = dmabounce_unmap_page, + .sync_single_for_cpu = dmabounce_sync_for_cpu, + .sync_single_for_device = dmabounce_sync_for_device, + .map_sg = arm_dma_map_sg, + .unmap_sg = arm_dma_unmap_sg, + .sync_sg_for_cpu = arm_dma_sync_sg_for_cpu, + .sync_sg_for_device = arm_dma_sync_sg_for_device, + .set_dma_mask = dmabounce_set_mask, +}; static int dmabounce_init_pool(struct dmabounce_pool *pool, struct device *dev, const char *name, unsigned long size) @@ -486,6 +520,7 @@ int dmabounce_register_dev(struct device *dev, unsigned long small_buffer_size, #endif dev->archdata.dmabounce = device_info; + set_dma_ops(dev, &dmabounce_ops); dev_info(dev, "dmabounce: registered device\n"); @@ -504,6 +539,7 @@ void dmabounce_unregister_dev(struct device *dev) struct dmabounce_device_info 
*device_info = dev->archdata.dmabounce; dev->archdata.dmabounce = NULL; + set_dma_ops(dev, NULL); if (!device_info) { dev_warn(dev, diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 6725a08..7a7c3c7 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -85,62 +85,6 @@ static inline dma_addr_t virt_to_dma(struct device *dev, void *addr) #endif /* - * The DMA API is built upon the notion of "buffer ownership". A buffer - * is either exclusively owned by the CPU (and therefore may be accessed - * by it) or exclusively owned by the DMA device. These helper functions - * represent the transitions between these two ownership states. - * - * Note, however, that on later ARMs, this notion does not work due to - * speculative prefetches. We model our approach on the assumption that - * the CPU does do speculative prefetches, which means we clean caches - * before transfers and delay cache invalidation until transfer completion. - * - * Private support functions: these are not part of the API and are - * liable to change. Drivers must not use these. 
- */ -static inline void __dma_single_cpu_to_dev(const void *kaddr, size_t size, - enum dma_data_direction dir) -{ - extern void ___dma_single_cpu_to_dev(const void *, size_t, - enum dma_data_direction); - - if (!arch_is_coherent()) - ___dma_single_cpu_to_dev(kaddr, size, dir); -} - -static inline void __dma_single_dev_to_cpu(const void *kaddr, size_t size, - enum dma_data_direction dir) -{ - extern void ___dma_single_dev_to_cpu(const void *, size_t, - enum dma_data_direction); - - if (!arch_is_coherent()) - ___dma_single_dev_to_cpu(kaddr, size, dir); -} - -static inline void __dma_page_cpu_to_dev(struct page *page, unsigned long off, - size_t size, enum dma_data_direction dir) -{ - extern void ___dma_page_cpu_to_dev(struct page *, unsigned long, - size_t, enum dma_data_direction); - - if (!arch_is_coherent()) - ___dma_page_cpu_to_dev(page, off, size, dir); -} - -static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off, - size_t size, enum dma_data_direction dir) -{ - extern void ___dma_page_dev_to_cpu(struct page *, unsigned long, - size_t, enum dma_data_direction); - - if (!arch_is_coherent()) - ___dma_page_dev_to_cpu(page, off, size, dir); -} - -extern int dma_supported(struct device *, u64); -extern int dma_set_mask(struct device *, u64); -/* * DMA errors are defined by all-bits-set in the DMA address. 
*/ static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr) @@ -163,6 +107,8 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size, { } +extern int dma_supported(struct device *dev, u64 mask); + /** * dma_alloc_coherent - allocate consistent memory for DMA * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices @@ -235,7 +181,6 @@ int dma_mmap_writecombine(struct device *, struct vm_area_struct *, extern void __init init_consistent_dma_size(unsigned long size); -#ifdef CONFIG_DMABOUNCE /* * For SA-1111, IXP425, and ADI systems the dma-mapping functions are "magic" * and utilize bounce buffers as needed to work around limited DMA windows. @@ -275,47 +220,7 @@ extern int dmabounce_register_dev(struct device *, unsigned long, */ extern void dmabounce_unregister_dev(struct device *); -/* - * The DMA API, implemented by dmabounce.c. See below for descriptions. - */ -extern dma_addr_t __dma_map_page(struct device *, struct page *, - unsigned long, size_t, enum dma_data_direction); -extern void __dma_unmap_page(struct device *, dma_addr_t, size_t, - enum dma_data_direction); - -/* - * Private functions - */ -int dmabounce_sync_for_cpu(struct device *, dma_addr_t, size_t, enum dma_data_direction); -int dmabounce_sync_for_device(struct device *, dma_addr_t, size_t, enum dma_data_direction); -#else -static inline int dmabounce_sync_for_cpu(struct device *d, dma_addr_t addr, - size_t size, enum dma_data_direction dir) -{ - return 1; -} - -static inline int dmabounce_sync_for_device(struct device *d, dma_addr_t addr, - size_t size, enum dma_data_direction dir) -{ - return 1; -} - -static inline dma_addr_t __dma_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, enum dma_data_direction dir) -{ - __dma_page_cpu_to_dev(page, offset, size, dir); - return pfn_to_dma(dev, page_to_pfn(page)) + offset; -} - -static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle, - size_t 
size, enum dma_data_direction dir) -{ - __dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)), - handle & ~PAGE_MASK, size, dir); -} -#endif /* CONFIG_DMABOUNCE */ /* * The scatter list versions of the above methods. diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index b50fa57..c949668 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -29,6 +29,75 @@ #include "mm.h" +/* + * The DMA API is built upon the notion of "buffer ownership". A buffer + * is either exclusively owned by the CPU (and therefore may be accessed + * by it) or exclusively owned by the DMA device. These helper functions + * represent the transitions between these two ownership states. + * + * Note, however, that on later ARMs, this notion does not work due to + * speculative prefetches. We model our approach on the assumption that + * the CPU does do speculative prefetches, which means we clean caches + * before transfers and delay cache invalidation until transfer completion. + * + * Private support functions: these are not part of the API and are + * liable to change. Drivers must not use these. 
+ */ +static inline void __dma_single_cpu_to_dev(const void *kaddr, size_t size, + enum dma_data_direction dir) +{ + extern void ___dma_single_cpu_to_dev(const void *, size_t, + enum dma_data_direction); + + if (!arch_is_coherent()) + ___dma_single_cpu_to_dev(kaddr, size, dir); +} + +static inline void __dma_single_dev_to_cpu(const void *kaddr, size_t size, + enum dma_data_direction dir) +{ + extern void ___dma_single_dev_to_cpu(const void *, size_t, + enum dma_data_direction); + + if (!arch_is_coherent()) + ___dma_single_dev_to_cpu(kaddr, size, dir); +} + +static inline void __dma_page_cpu_to_dev(struct page *page, unsigned long off, + size_t size, enum dma_data_direction dir) +{ + extern void ___dma_page_cpu_to_dev(struct page *, unsigned long, + size_t, enum dma_data_direction); + + if (!arch_is_coherent()) + ___dma_page_cpu_to_dev(page, off, size, dir); +} + +static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off, + size_t size, enum dma_data_direction dir) +{ + extern void ___dma_page_dev_to_cpu(struct page *, unsigned long, + size_t, enum dma_data_direction); + + if (!arch_is_coherent()) + ___dma_page_dev_to_cpu(page, off, size, dir); +} + + +static inline dma_addr_t __dma_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, enum dma_data_direction dir) +{ + __dma_page_cpu_to_dev(page, offset, size, dir); + return pfn_to_dma(dev, page_to_pfn(page)) + offset; +} + +static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle, + size_t size, enum dma_data_direction dir) +{ + __dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)), + handle & ~PAGE_MASK, size, dir); +} + /** * arm_dma_map_page - map a portion of a page for streaming DMA * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices @@ -76,9 +145,6 @@ static inline void arm_dma_sync_single_for_cpu(struct device *dev, { unsigned int offset = handle & (PAGE_SIZE - 1); struct page *page = pfn_to_page(dma_to_pfn(dev, 
handle-offset)); - if (!dmabounce_sync_for_cpu(dev, handle, size, dir)) - return; - __dma_page_dev_to_cpu(page, offset, size, dir); } @@ -87,9 +153,6 @@ static inline void arm_dma_sync_single_for_device(struct device *dev, { unsigned int offset = handle & (PAGE_SIZE - 1); struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset)); - if (!dmabounce_sync_for_device(dev, handle, size, dir)) - return; - __dma_page_cpu_to_dev(page, offset, size, dir); } @@ -599,7 +662,6 @@ void ___dma_page_cpu_to_dev(struct page *page, unsigned long off, } /* FIXME: non-speculating: flush on bidirectional mappings? */ } -EXPORT_SYMBOL(___dma_page_cpu_to_dev); void ___dma_page_dev_to_cpu(struct page *page, unsigned long off, size_t size, enum dma_data_direction dir) @@ -619,7 +681,6 @@ void ___dma_page_dev_to_cpu(struct page *page, unsigned long off, if (dir != DMA_TO_DEVICE && off == 0 && size >= PAGE_SIZE) set_bit(PG_dcache_clean, &page->flags); } -EXPORT_SYMBOL(___dma_page_dev_to_cpu); /** * arm_dma_map_sg - map a set of SG buffers for streaming mode DMA @@ -737,9 +798,7 @@ static int arm_dma_set_mask(struct device *dev, u64 dma_mask) if (!dev->dma_mask || !dma_supported(dev, dma_mask)) return -EIO; -#ifndef CONFIG_DMABOUNCE *dev->dma_mask = dma_mask; -#endif return 0; } -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:13:59 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:13:59 +0200 Subject: [Linaro-mm-sig] [PATCHv10 09/11] ARM: dma-mapping: remove redundant code and do the cleanup In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-10-git-send-email-m.szyprowski@samsung.com> This patch just performs a global cleanup in DMA mapping implementation for ARM architecture. Some of the tiny helper functions have been moved to the caller code, some have been merged together. 
Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Acked-by: Arnd Bergmann Tested-By: Subash Patel --- arch/arm/mm/dma-mapping.c | 88 +++++++++++++-------------------------------- 1 file changed, 24 insertions(+), 64 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index c949668..dddb406 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -40,64 +40,12 @@ * the CPU does do speculative prefetches, which means we clean caches * before transfers and delay cache invalidation until transfer completion. * - * Private support functions: these are not part of the API and are - * liable to change. Drivers must not use these. */ -static inline void __dma_single_cpu_to_dev(const void *kaddr, size_t size, - enum dma_data_direction dir) -{ - extern void ___dma_single_cpu_to_dev(const void *, size_t, - enum dma_data_direction); - - if (!arch_is_coherent()) - ___dma_single_cpu_to_dev(kaddr, size, dir); -} - -static inline void __dma_single_dev_to_cpu(const void *kaddr, size_t size, - enum dma_data_direction dir) -{ - extern void ___dma_single_dev_to_cpu(const void *, size_t, - enum dma_data_direction); - - if (!arch_is_coherent()) - ___dma_single_dev_to_cpu(kaddr, size, dir); -} - -static inline void __dma_page_cpu_to_dev(struct page *page, unsigned long off, - size_t size, enum dma_data_direction dir) -{ - extern void ___dma_page_cpu_to_dev(struct page *, unsigned long, +static void __dma_page_cpu_to_dev(struct page *, unsigned long, size_t, enum dma_data_direction); - - if (!arch_is_coherent()) - ___dma_page_cpu_to_dev(page, off, size, dir); -} - -static inline void __dma_page_dev_to_cpu(struct page *page, unsigned long off, - size_t size, enum dma_data_direction dir) -{ - extern void ___dma_page_dev_to_cpu(struct page *, unsigned long, +static void __dma_page_dev_to_cpu(struct page *, unsigned long, size_t, enum dma_data_direction); - if (!arch_is_coherent()) - ___dma_page_dev_to_cpu(page, off, size, dir); -} - - 
-static inline dma_addr_t __dma_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, enum dma_data_direction dir) -{ - __dma_page_cpu_to_dev(page, offset, size, dir); - return pfn_to_dma(dev, page_to_pfn(page)) + offset; -} - -static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle, - size_t size, enum dma_data_direction dir) -{ - __dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)), - handle & ~PAGE_MASK, size, dir); -} - /** * arm_dma_map_page - map a portion of a page for streaming DMA * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices @@ -112,11 +60,13 @@ static inline void __dma_unmap_page(struct device *dev, dma_addr_t handle, * The device owns this memory once this call has completed. The CPU * can regain ownership by calling dma_unmap_page(). */ -static inline dma_addr_t arm_dma_map_page(struct device *dev, struct page *page, +static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page, unsigned long offset, size_t size, enum dma_data_direction dir, struct dma_attrs *attrs) { - return __dma_map_page(dev, page, offset, size, dir); + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(page, offset, size, dir); + return pfn_to_dma(dev, page_to_pfn(page)) + offset; } /** @@ -133,27 +83,31 @@ static inline dma_addr_t arm_dma_map_page(struct device *dev, struct page *page, * After this call, reads by the CPU to the buffer are guaranteed to see * whatever the device wrote there. 
*/ -static inline void arm_dma_unmap_page(struct device *dev, dma_addr_t handle, +static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir, struct dma_attrs *attrs) { - __dma_unmap_page(dev, handle, size, dir); + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)), + handle & ~PAGE_MASK, size, dir); } -static inline void arm_dma_sync_single_for_cpu(struct device *dev, +static void arm_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir) { unsigned int offset = handle & (PAGE_SIZE - 1); struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset)); - __dma_page_dev_to_cpu(page, offset, size, dir); + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(page, offset, size, dir); } -static inline void arm_dma_sync_single_for_device(struct device *dev, +static void arm_dma_sync_single_for_device(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir) { unsigned int offset = handle & (PAGE_SIZE - 1); struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset)); - __dma_page_cpu_to_dev(page, offset, size, dir); + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(page, offset, size, dir); } static int arm_dma_set_mask(struct device *dev, u64 dma_mask); @@ -647,7 +601,13 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, } while (left); } -void ___dma_page_cpu_to_dev(struct page *page, unsigned long off, +/* + * Make an area consistent for devices. + * Note: Drivers should NOT use this function directly, as it will break + * platforms with CONFIG_DMABOUNCE. 
+ * Use the driver DMA support - see dma-mapping.h (dma_sync_*) + */ +static void __dma_page_cpu_to_dev(struct page *page, unsigned long off, size_t size, enum dma_data_direction dir) { unsigned long paddr; @@ -663,7 +623,7 @@ void ___dma_page_cpu_to_dev(struct page *page, unsigned long off, /* FIXME: non-speculating: flush on bidirectional mappings? */ } -void ___dma_page_dev_to_cpu(struct page *page, unsigned long off, +static void __dma_page_dev_to_cpu(struct page *page, unsigned long off, size_t size, enum dma_data_direction dir) { unsigned long paddr = page_to_phys(page) + off; -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:14:00 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:14:00 +0200 Subject: [Linaro-mm-sig] [PATCHv10 10/11] ARM: dma-mapping: use alloc, mmap, free from dma_ops In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-11-git-send-email-m.szyprowski@samsung.com> This patch converts the dma_alloc/free/mmap_{coherent,writecombine} functions to use the generic alloc/free/mmap methods from the dma_map_ops structure. A new DMA_ATTR_WRITE_COMBINE DMA attribute has been introduced to implement the writecombine methods.
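As an illustration only (not part of the posted patch), the attribute-based approach can be modelled in userspace C: a single alloc method takes an optional attribute set, and the coherent and writecombine variants become thin wrappers that differ only in the attributes they pass. Every model_ name below is a hypothetical stand-in made up for this sketch, and treating a NULL attribute pointer as "no attributes" is an assumption of the model.

```c
/* Userspace model only: simplified stand-ins, not kernel types. */
#include <stddef.h>

#define MODEL_ATTR_WRITE_COMBINE (1UL << 0)

struct model_attrs {
	unsigned long flags;
};

enum model_prot {
	MODEL_PROT_COHERENT,
	MODEL_PROT_WRITECOMBINE,
};

/* Records what an allocation would look like in this model. */
struct model_buf {
	size_t size;
	enum model_prot prot;
};

/* A NULL attrs pointer is treated as "no attributes set". */
static int model_has_attr(const struct model_attrs *attrs, unsigned long attr)
{
	return attrs != NULL && (attrs->flags & attr) != 0;
}

/* The single generic method: protection is derived from the attributes. */
static int model_alloc(struct model_buf *buf, size_t size,
		       const struct model_attrs *attrs)
{
	if (size == 0)
		return -1;
	buf->size = size;
	buf->prot = model_has_attr(attrs, MODEL_ATTR_WRITE_COMBINE) ?
			MODEL_PROT_WRITECOMBINE : MODEL_PROT_COHERENT;
	return 0;
}

struct model_alloc_ops {
	int (*alloc)(struct model_buf *buf, size_t size,
		     const struct model_attrs *attrs);
};

static const struct model_alloc_ops model_ops = {
	.alloc = model_alloc,
};

/* Coherent variant: same method, no attributes. */
static int model_alloc_coherent(const struct model_alloc_ops *ops,
				struct model_buf *buf, size_t size)
{
	return ops->alloc(buf, size, NULL);
}

/* Writecombine variant: same method, one attribute set. */
static int model_alloc_writecombine(const struct model_alloc_ops *ops,
				    struct model_buf *buf, size_t size)
{
	struct model_attrs attrs = { .flags = MODEL_ATTR_WRITE_COMBINE };

	return ops->alloc(buf, size, &attrs);
}
```

The design point the sketch tries to show is that adding another memory type would mean adding an attribute bit and a wrapper, not another entry in the ops table.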
Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Acked-by: Arnd Bergmann Tested-By: Subash Patel --- arch/arm/common/dmabounce.c | 3 + arch/arm/include/asm/dma-mapping.h | 107 ++++++++++++++++++++++++++---------- arch/arm/mm/dma-mapping.c | 60 ++++++++------------ 3 files changed, 104 insertions(+), 66 deletions(-) diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c index 813c29d..9d7eb53 100644 --- a/arch/arm/common/dmabounce.c +++ b/arch/arm/common/dmabounce.c @@ -449,6 +449,9 @@ static int dmabounce_set_mask(struct device *dev, u64 dma_mask) } static struct dma_map_ops dmabounce_ops = { + .alloc = arm_dma_alloc, + .free = arm_dma_free, + .mmap = arm_dma_mmap, .map_page = dmabounce_map_page, .unmap_page = dmabounce_unmap_page, .sync_single_for_cpu = dmabounce_sync_for_cpu, diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 7a7c3c7..bbef15d 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -5,6 +5,7 @@ #include #include +#include #include #include @@ -110,68 +111,115 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size, extern int dma_supported(struct device *dev, u64 mask); /** - * dma_alloc_coherent - allocate consistent memory for DMA + * arm_dma_alloc - allocate consistent memory for DMA * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices * @size: required memory size * @handle: bus-specific DMA address + * @attrs: optinal attributes that specific mapping properties * - * Allocate some uncached, unbuffered memory for a device for - * performing DMA. This function allocates pages, and will - * return the CPU-viewed address, and sets @handle to be the - * device-viewed address. + * Allocate some memory for a device for performing DMA. This function + * allocates pages, and will return the CPU-viewed address, and sets @handle + * to be the device-viewed address. 
*/ -extern void *dma_alloc_coherent(struct device *, size_t, dma_addr_t *, gfp_t); +extern void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, + gfp_t gfp, struct dma_attrs *attrs); + +#define dma_alloc_coherent(d, s, h, f) dma_alloc_attrs(d, s, h, f, NULL) + +static inline void *dma_alloc_attrs(struct device *dev, size_t size, + dma_addr_t *dma_handle, gfp_t flag, + struct dma_attrs *attrs) +{ + struct dma_map_ops *ops = get_dma_ops(dev); + void *cpu_addr; + BUG_ON(!ops); + + cpu_addr = ops->alloc(dev, size, dma_handle, flag, attrs); + debug_dma_alloc_coherent(dev, size, *dma_handle, cpu_addr); + return cpu_addr; +} /** - * dma_free_coherent - free memory allocated by dma_alloc_coherent + * arm_dma_free - free memory allocated by arm_dma_alloc * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices * @size: size of memory originally requested in dma_alloc_coherent * @cpu_addr: CPU-view address returned from dma_alloc_coherent * @handle: device-view address returned from dma_alloc_coherent + * @attrs: optinal attributes that specific mapping properties * * Free (and unmap) a DMA buffer previously allocated by - * dma_alloc_coherent(). + * arm_dma_alloc(). * * References to memory and mappings associated with cpu_addr/handle * during and after this call executing are illegal. 
*/ -extern void dma_free_coherent(struct device *, size_t, void *, dma_addr_t); +extern void arm_dma_free(struct device *dev, size_t size, void *cpu_addr, + dma_addr_t handle, struct dma_attrs *attrs); + +#define dma_free_coherent(d, s, c, h) dma_free_attrs(d, s, c, h, NULL) + +static inline void dma_free_attrs(struct device *dev, size_t size, + void *cpu_addr, dma_addr_t dma_handle, + struct dma_attrs *attrs) +{ + struct dma_map_ops *ops = get_dma_ops(dev); + BUG_ON(!ops); + + debug_dma_free_coherent(dev, size, cpu_addr, dma_handle); + ops->free(dev, size, cpu_addr, dma_handle, attrs); +} /** - * dma_mmap_coherent - map a coherent DMA allocation into user space + * arm_dma_mmap - map a coherent DMA allocation into user space * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices * @vma: vm_area_struct describing requested user mapping * @cpu_addr: kernel CPU-view address returned from dma_alloc_coherent * @handle: device-view address returned from dma_alloc_coherent * @size: size of memory originally requested in dma_alloc_coherent + * @attrs: optinal attributes that specific mapping properties * * Map a coherent DMA buffer previously allocated by dma_alloc_coherent * into user space. The coherent DMA buffer must not be freed by the * driver until the user space mapping has been released. */ -int dma_mmap_coherent(struct device *, struct vm_area_struct *, - void *, dma_addr_t, size_t); +extern int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs); +#define dma_mmap_coherent(d, v, c, h, s) dma_mmap_attrs(d, v, c, h, s, NULL) -/** - * dma_alloc_writecombine - allocate writecombining memory for DMA - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @size: required memory size - * @handle: bus-specific DMA address - * - * Allocate some uncached, buffered memory for a device for - * performing DMA. 
This function allocates pages, and will - * return the CPU-viewed address, and sets @handle to be the - * device-viewed address. - */ -extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *, - gfp_t); +static inline int dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, + size_t size, struct dma_attrs *attrs) +{ + struct dma_map_ops *ops = get_dma_ops(dev); + BUG_ON(!ops); + return ops->mmap(dev, vma, cpu_addr, dma_addr, size, attrs); +} -#define dma_free_writecombine(dev,size,cpu_addr,handle) \ - dma_free_coherent(dev,size,cpu_addr,handle) +static inline void *dma_alloc_writecombine(struct device *dev, size_t size, + dma_addr_t *dma_handle, gfp_t flag) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs); + return dma_alloc_attrs(dev, size, dma_handle, flag, &attrs); +} -int dma_mmap_writecombine(struct device *, struct vm_area_struct *, - void *, dma_addr_t, size_t); +static inline void dma_free_writecombine(struct device *dev, size_t size, + void *cpu_addr, dma_addr_t dma_handle) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs); + return dma_free_attrs(dev, size, cpu_addr, dma_handle, &attrs); +} + +static inline int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size) +{ + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs); + return dma_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, &attrs); +} /* * This can be called during boot to increase the size of the consistent @@ -180,7 +228,6 @@ int dma_mmap_writecombine(struct device *, struct vm_area_struct *, */ extern void __init init_consistent_dma_size(unsigned long size); - /* * For SA-1111, IXP425, and ADI systems the dma-mapping functions are "magic" * and utilize bounce buffers as needed to work around limited DMA windows. 
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index dddb406..2501866 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -113,6 +113,9 @@ static void arm_dma_sync_single_for_device(struct device *dev, static int arm_dma_set_mask(struct device *dev, u64 dma_mask); struct dma_map_ops arm_dma_ops = { + .alloc = arm_dma_alloc, + .free = arm_dma_free, + .mmap = arm_dma_mmap, .map_page = arm_dma_map_page, .unmap_page = arm_dma_unmap_page, .map_sg = arm_dma_map_sg, @@ -415,10 +418,19 @@ static void __dma_free_remap(void *cpu_addr, size_t size) arm_vmregion_free(&consistent_head, c); } +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) +{ + prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? + pgprot_writecombine(prot) : + pgprot_dmacoherent(prot); + return prot; +} + #else /* !CONFIG_MMU */ #define __dma_alloc_remap(page, size, gfp, prot, c) page_address(page) #define __dma_free_remap(addr, size) do { } while (0) +#define __get_dma_pgprot(attrs, prot) __pgprot(0) #endif /* CONFIG_MMU */ @@ -462,41 +474,33 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, * Allocate DMA-coherent memory space and return both the kernel remapped * virtual and bus address for that space. */ -void * -dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp) +void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, + gfp_t gfp, struct dma_attrs *attrs) { + pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); void *memory; if (dma_alloc_from_coherent(dev, size, handle, &memory)) return memory; - return __dma_alloc(dev, size, handle, gfp, - pgprot_dmacoherent(pgprot_kernel), + return __dma_alloc(dev, size, handle, gfp, prot, __builtin_return_address(0)); } -EXPORT_SYMBOL(dma_alloc_coherent); /* - * Allocate a writecombining region, in much the same way as - * dma_alloc_coherent above. + * Create userspace mapping for the DMA-coherent memory. 
*/ -void * -dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp) -{ - return __dma_alloc(dev, size, handle, gfp, - pgprot_writecombine(pgprot_kernel), - __builtin_return_address(0)); -} -EXPORT_SYMBOL(dma_alloc_writecombine); - -static int dma_mmap(struct device *dev, struct vm_area_struct *vma, - void *cpu_addr, dma_addr_t dma_addr, size_t size) +int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs) { int ret = -ENXIO; #ifdef CONFIG_MMU unsigned long user_size, kern_size; struct arm_vmregion *c; + vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); + if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret)) return ret; @@ -521,27 +525,12 @@ static int dma_mmap(struct device *dev, struct vm_area_struct *vma, return ret; } -int dma_mmap_coherent(struct device *dev, struct vm_area_struct *vma, - void *cpu_addr, dma_addr_t dma_addr, size_t size) -{ - vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot); - return dma_mmap(dev, vma, cpu_addr, dma_addr, size); -} -EXPORT_SYMBOL(dma_mmap_coherent); - -int dma_mmap_writecombine(struct device *dev, struct vm_area_struct *vma, - void *cpu_addr, dma_addr_t dma_addr, size_t size) -{ - vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); - return dma_mmap(dev, vma, cpu_addr, dma_addr, size); -} -EXPORT_SYMBOL(dma_mmap_writecombine); - /* * free a page as defined by the above mapping. * Must not be called with IRQs disabled. 
*/ -void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle) +void arm_dma_free(struct device *dev, size_t size, void *cpu_addr, + dma_addr_t handle, struct dma_attrs *attrs) { WARN_ON(irqs_disabled()); @@ -555,7 +544,6 @@ void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr __dma_free_buffer(pfn_to_page(dma_to_pfn(dev, handle)), size); } -EXPORT_SYMBOL(dma_free_coherent); static void dma_cache_maint_page(struct page *page, unsigned long offset, size_t size, enum dma_data_direction dir, -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 13:14:01 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 15:14:01 +0200 Subject: [Linaro-mm-sig] [PATCHv10 11/11] ARM: dma-mapping: add support for IOMMU mapper In-Reply-To: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> References: <1337260441-8121-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337260441-8121-12-git-send-email-m.szyprowski@samsung.com> This patch adds a complete implementation of the DMA-mapping API for devices which have IOMMU support. This implementation tries to optimize dma address space usage by remapping all possible physical memory chunks into a single dma address space chunk. DMA address space is managed with a bitmap kept in the dma_iommu_mapping structure stored in device->archdata. Platform setup code has to initialize the parameters of the dma address space (base address, size, allocation precision order) with the arm_iommu_create_mapping() function. To reduce the size of the bitmap, all allocations are aligned to the specified order of base 4 KiB pages. dma_alloc_* functions allocate physical memory in chunks, each obtained with the alloc_pages() function, to avoid failing if the physical memory gets fragmented. In the worst case the allocated buffer is composed of 4 KiB page chunks.
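As a rough illustration of the bitmap bookkeeping described above, here is a minimal user-space sketch in C. It is not code from this patch: struct iova_space, iova_alloc(), iova_free() and the SKETCH_* constants are hypothetical names, the bitmap is a single 64-bit word, and the extra alignment step that the patch applies to large requests is left out. It only shows how a request size is rounded up to slots of (1 << order) pages and how a slot index maps back to a DMA address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                          /* 4 KiB base pages */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_IOVA_ERROR UINT64_MAX                   /* hypothetical error value */

/* Hypothetical stand-in for the address-space part of dma_iommu_mapping. */
struct iova_space {
	uint64_t base;       /* first DMA address covered by the bitmap */
	unsigned int order;  /* one bit covers (1 << order) pages */
	unsigned int bits;   /* number of usable bits, at most 64 in this sketch */
	uint64_t bitmap;     /* set bit = slot in use */
};

/* Round a byte size up to pages, then up to whole bitmap slots. */
static size_t iova_slots_for(const struct iova_space *s, size_t size)
{
	size_t pages = (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;

	return (pages + ((size_t)1 << s->order) - 1) >> s->order;
}

/* Mask with 'count' consecutive bits set, starting at bit 'start'. */
static uint64_t slot_mask(unsigned int start, size_t count)
{
	uint64_t run = (count >= 64) ? UINT64_MAX : (((uint64_t)1 << count) - 1);

	return run << start;
}

/* First-fit search for a free run of slots; returns the DMA address. */
static uint64_t iova_alloc(struct iova_space *s, size_t size)
{
	size_t count = iova_slots_for(s, size);

	assert(s->bits <= 64);
	if (count == 0 || count > s->bits)
		return SKETCH_IOVA_ERROR;

	for (unsigned int start = 0; start + count <= s->bits; start++) {
		uint64_t mask = slot_mask(start, count);

		if ((s->bitmap & mask) == 0) {
			s->bitmap |= mask;
			return s->base +
			       ((uint64_t)start << (s->order + SKETCH_PAGE_SHIFT));
		}
	}
	return SKETCH_IOVA_ERROR;
}

/* Clear the slots that an earlier iova_alloc() of the same size marked. */
static void iova_free(struct iova_space *s, uint64_t addr, size_t size)
{
	unsigned int start =
		(unsigned int)((addr - s->base) >> (s->order + SKETCH_PAGE_SHIFT));

	s->bitmap &= ~slot_mask(start, iova_slots_for(s, size));
}
```

With order 1, for example, a 12 KiB request occupies two slots, that is 16 KiB of DMA address space. A larger order shrinks the bitmap at the cost of this kind of rounding.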
The dma_map_sg() function minimizes the total number of dma address space chunks by merging physical memory chunks into one larger dma address space chunk. If the requested chunk (scatter list entry) boundaries match physical page boundaries, most calls to dma_map_sg() will result in creating only one chunk in the dma address space. dma_map_page() simply creates a mapping for the given page(s) in the dma address space. All dma functions also perform the required cache operations, like their counterparts from the arm linear physical memory mapping version. This patch contains code and fixes kindly provided by: - Krishna Reddy, - Andrzej Pietrasiewicz, - Hiroshi DOYU Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Reviewed-by: Konrad Rzeszutek Wilk Tested-By: Subash Patel --- arch/arm/Kconfig | 8 + arch/arm/include/asm/device.h | 3 + arch/arm/include/asm/dma-iommu.h | 34 ++ arch/arm/mm/dma-mapping.c | 712 +++++++++++++++++++++++++++++++++++++- arch/arm/mm/vmregion.h | 2 +- 5 files changed, 747 insertions(+), 12 deletions(-) create mode 100644 arch/arm/include/asm/dma-iommu.h diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index c8111c5..97478a5 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -46,6 +46,14 @@ config ARM config ARM_HAS_SG_CHAIN bool +config NEED_SG_DMA_LENGTH + bool + +config ARM_DMA_USE_IOMMU + select NEED_SG_DMA_LENGTH + select ARM_HAS_SG_CHAIN + bool + config HAVE_PWM bool diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h index 6e2cb0e..b69c0d3 100644 --- a/arch/arm/include/asm/device.h +++ b/arch/arm/include/asm/device.h @@ -14,6 +14,9 @@ struct dev_archdata { #ifdef CONFIG_IOMMU_API void *iommu; /* private IOMMU data */ #endif +#ifdef CONFIG_ARM_DMA_USE_IOMMU + struct dma_iommu_mapping *mapping; +#endif }; struct omap_device; diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h new file mode 100644 index 0000000..799b094 --- /dev/null +++ b/arch/arm/include/asm/dma-iommu.h @@ -0,0
+1,34 @@ +#ifndef ASMARM_DMA_IOMMU_H +#define ASMARM_DMA_IOMMU_H + +#ifdef __KERNEL__ + +#include +#include +#include +#include + +struct dma_iommu_mapping { + /* iommu specific data */ + struct iommu_domain *domain; + + void *bitmap; + size_t bits; + unsigned int order; + dma_addr_t base; + + spinlock_t lock; + struct kref kref; +}; + +struct dma_iommu_mapping * +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size, + int order); + +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping); + +int arm_iommu_attach_device(struct device *dev, + struct dma_iommu_mapping *mapping); + +#endif /* __KERNEL__ */ +#endif diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 2501866..3ac4760 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -19,6 +19,8 @@ #include #include #include +#include +#include #include #include @@ -26,6 +28,7 @@ #include #include #include +#include #include "mm.h" @@ -155,6 +158,21 @@ static u64 get_coherent_dma_mask(struct device *dev) return mask; } +static void __dma_clear_buffer(struct page *page, size_t size) +{ + void *ptr; + /* + * Ensure that the allocated pages are zeroed, and that any data + * lurking in the kernel direct-mapped region is invalidated. + */ + ptr = page_address(page); + if (ptr) { + memset(ptr, 0, size); + dmac_flush_range(ptr, ptr + size); + outer_flush_range(__pa(ptr), __pa(ptr) + size); + } +} + /* * Allocate a DMA buffer for 'dev' of size 'size' using the * specified gfp mask. Note that 'size' must be page aligned. 
@@ -163,7 +181,6 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf { unsigned long order = get_order(size); struct page *page, *p, *e; - void *ptr; u64 mask = get_coherent_dma_mask(dev); #ifdef CONFIG_DMA_API_DEBUG @@ -192,14 +209,7 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e; p++) __free_page(p); - /* - * Ensure that the allocated pages are zeroed, and that any data - * lurking in the kernel direct-mapped region is invalidated. - */ - ptr = page_address(page); - memset(ptr, 0, size); - dmac_flush_range(ptr, ptr + size); - outer_flush_range(__pa(ptr), __pa(ptr) + size); + __dma_clear_buffer(page, size); return page; } @@ -348,7 +358,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); pte = consistent_pte[idx] + off; - c->vm_pages = page; + c->priv = page; do { BUG_ON(!pte_none(*pte)); @@ -509,13 +519,14 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma, c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); if (c) { unsigned long off = vma->vm_pgoff; + struct page *pages = c->priv; kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT; if (off < kern_size && user_size <= (kern_size - off)) { ret = remap_pfn_range(vma, vma->vm_start, - page_to_pfn(c->vm_pages) + off, + page_to_pfn(pages) + off, user_size << PAGE_SHIFT, vma->vm_page_prot); } @@ -654,6 +665,9 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, int i, j; for_each_sg(sg, s, nents, i) { +#ifdef CONFIG_NEED_SG_DMA_LENGTH + s->dma_length = s->length; +#endif s->dma_address = ops->map_page(dev, sg_page(s), s->offset, s->length, dir, attrs); if (dma_mapping_error(dev, s->dma_address)) @@ -762,3 +776,679 @@ static int __init dma_debug_do_init(void) return 0; } fs_initcall(dma_debug_do_init); + +#ifdef CONFIG_ARM_DMA_USE_IOMMU + 
+/* IOMMU */ + +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, + size_t size) +{ + unsigned int order = get_order(size); + unsigned int align = 0; + unsigned int count, start; + unsigned long flags; + + count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) + + (1 << mapping->order) - 1) >> mapping->order; + + if (order > mapping->order) + align = (1 << (order - mapping->order)) - 1; + + spin_lock_irqsave(&mapping->lock, flags); + start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0, + count, align); + if (start > mapping->bits) { + spin_unlock_irqrestore(&mapping->lock, flags); + return DMA_ERROR_CODE; + } + + bitmap_set(mapping->bitmap, start, count); + spin_unlock_irqrestore(&mapping->lock, flags); + + return mapping->base + (start << (mapping->order + PAGE_SHIFT)); +} + +static inline void __free_iova(struct dma_iommu_mapping *mapping, + dma_addr_t addr, size_t size) +{ + unsigned int start = (addr - mapping->base) >> + (mapping->order + PAGE_SHIFT); + unsigned int count = ((size >> PAGE_SHIFT) + + (1 << mapping->order) - 1) >> mapping->order; + unsigned long flags; + + spin_lock_irqsave(&mapping->lock, flags); + bitmap_clear(mapping->bitmap, start, count); + spin_unlock_irqrestore(&mapping->lock, flags); +} + +static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp) +{ + struct page **pages; + int count = size >> PAGE_SHIFT; + int array_size = count * sizeof(struct page *); + int i = 0; + + if (array_size <= PAGE_SIZE) + pages = kzalloc(array_size, gfp); + else + pages = vzalloc(array_size); + if (!pages) + return NULL; + + while (count) { + int j, order = __ffs(count); + + pages[i] = alloc_pages(gfp | __GFP_NOWARN, order); + while (!pages[i] && order) + pages[i] = alloc_pages(gfp | __GFP_NOWARN, --order); + if (!pages[i]) + goto error; + + if (order) + split_page(pages[i], order); + j = 1 << order; + while (--j) + pages[i + j] = pages[i] + j; + + __dma_clear_buffer(pages[i], PAGE_SIZE << order); + i += 
1 << order; + count -= 1 << order; + } + + return pages; +error: + while (--i) + if (pages[i]) + __free_pages(pages[i], 0); + if (array_size < PAGE_SIZE) + kfree(pages); + else + vfree(pages); + return NULL; +} + +static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size) +{ + int count = size >> PAGE_SHIFT; + int array_size = count * sizeof(struct page *); + int i; + for (i = 0; i < count; i++) + if (pages[i]) + __free_pages(pages[i], 0); + if (array_size < PAGE_SIZE) + kfree(pages); + else + vfree(pages); + return 0; +} + +/* + * Create a CPU mapping for a specified pages + */ +static void * +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) +{ + struct arm_vmregion *c; + size_t align; + size_t count = size >> PAGE_SHIFT; + int bit; + + if (!consistent_pte[0]) { + pr_err("%s: not initialised\n", __func__); + dump_stack(); + return NULL; + } + + /* + * Align the virtual region allocation - maximum alignment is + * a section size, minimum is a page size. This helps reduce + * fragmentation of the DMA space, and also prevents allocations + * smaller than a section from crossing a section boundary. + */ + bit = fls(size - 1); + if (bit > SECTION_SHIFT) + bit = SECTION_SHIFT; + align = 1 << bit; + + /* + * Allocate a virtual address in the consistent mapping region. 
+ */ + c = arm_vmregion_alloc(&consistent_head, align, size, + gfp & ~(__GFP_DMA | __GFP_HIGHMEM), NULL); + if (c) { + pte_t *pte; + int idx = CONSISTENT_PTE_INDEX(c->vm_start); + int i = 0; + u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); + + pte = consistent_pte[idx] + off; + c->priv = pages; + + do { + BUG_ON(!pte_none(*pte)); + + set_pte_ext(pte, mk_pte(pages[i], prot), 0); + pte++; + off++; + i++; + if (off >= PTRS_PER_PTE) { + off = 0; + pte = consistent_pte[++idx]; + } + } while (i < count); + + dsb(); + + return (void *)c->vm_start; + } + return NULL; +} + +/* + * Create a mapping in device IO address space for specified pages + */ +static dma_addr_t +__iommu_create_mapping(struct device *dev, struct page **pages, size_t size) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; + dma_addr_t dma_addr, iova; + int i, ret = DMA_ERROR_CODE; + + dma_addr = __alloc_iova(mapping, size); + if (dma_addr == DMA_ERROR_CODE) + return dma_addr; + + iova = dma_addr; + for (i = 0; i < count; ) { + unsigned int next_pfn = page_to_pfn(pages[i]) + 1; + phys_addr_t phys = page_to_phys(pages[i]); + unsigned int len, j; + + for (j = i + 1; j < count; j++, next_pfn++) + if (page_to_pfn(pages[j]) != next_pfn) + break; + + len = (j - i) << PAGE_SHIFT; + ret = iommu_map(mapping->domain, iova, phys, len, 0); + if (ret < 0) + goto fail; + iova += len; + i = j; + } + return dma_addr; +fail: + iommu_unmap(mapping->domain, dma_addr, iova-dma_addr); + __free_iova(mapping, dma_addr, size); + return DMA_ERROR_CODE; +} + +static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + + /* + * add optional in-page offset from iova to size and align + * result to page size + */ + size = PAGE_ALIGN((iova & ~PAGE_MASK) + size); + iova &= PAGE_MASK; + + iommu_unmap(mapping->domain, iova, size); + __free_iova(mapping, iova, size); 
+ return 0; +} + +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, + dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) +{ + pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); + struct page **pages; + void *addr = NULL; + + *handle = DMA_ERROR_CODE; + size = PAGE_ALIGN(size); + + pages = __iommu_alloc_buffer(dev, size, gfp); + if (!pages) + return NULL; + + *handle = __iommu_create_mapping(dev, pages, size); + if (*handle == DMA_ERROR_CODE) + goto err_buffer; + + addr = __iommu_alloc_remap(pages, size, gfp, prot); + if (!addr) + goto err_mapping; + + return addr; + +err_mapping: + __iommu_remove_mapping(dev, *handle, size); +err_buffer: + __iommu_free_buffer(dev, pages, size); + return NULL; +} + +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs) +{ + struct arm_vmregion *c; + + vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); + c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); + + if (c) { + struct page **pages = c->priv; + + unsigned long uaddr = vma->vm_start; + unsigned long usize = vma->vm_end - vma->vm_start; + int i = 0; + + do { + int ret; + + ret = vm_insert_page(vma, uaddr, pages[i++]); + if (ret) { + pr_err("Remapping memory, error: %d\n", ret); + return ret; + } + + uaddr += PAGE_SIZE; + usize -= PAGE_SIZE; + } while (usize > 0); + } + return 0; +} + +/* + * free a page as defined by the above mapping. + * Must not be called with IRQs disabled. 
+ */ +void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, + dma_addr_t handle, struct dma_attrs *attrs) +{ + struct arm_vmregion *c; + size = PAGE_ALIGN(size); + + c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); + if (c) { + struct page **pages = c->priv; + __dma_free_remap(cpu_addr, size); + __iommu_remove_mapping(dev, handle, size); + __iommu_free_buffer(dev, pages, size); + } +} + +/* + * Map a part of the scatter-gather list into contiguous io address space + */ +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, + size_t size, dma_addr_t *handle, + enum dma_data_direction dir) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + dma_addr_t iova, iova_base; + int ret = 0; + unsigned int count; + struct scatterlist *s; + + size = PAGE_ALIGN(size); + *handle = DMA_ERROR_CODE; + + iova_base = iova = __alloc_iova(mapping, size); + if (iova == DMA_ERROR_CODE) + return -ENOMEM; + + for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) { + phys_addr_t phys = page_to_phys(sg_page(s)); + unsigned int len = PAGE_ALIGN(s->offset + s->length); + + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir); + + ret = iommu_map(mapping->domain, iova, phys, len, 0); + if (ret < 0) + goto fail; + count += len >> PAGE_SHIFT; + iova += len; + } + *handle = iova_base; + + return 0; +fail: + iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE); + __free_iova(mapping, iova_base, size); + return ret; +} + +/** + * arm_iommu_map_sg - map a set of SG buffers for streaming mode DMA + * @dev: valid struct device pointer + * @sg: list of buffers + * @nents: number of buffers to map + * @dir: DMA transfer direction + * + * Map a set of buffers described by scatterlist in streaming mode for DMA. + * The scatter gather list elements are merged together (if possible) and + * tagged with the appropriate dma address and length. 
They are obtained via + * sg_dma_{address,length}. + */ +int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + struct scatterlist *s = sg, *dma = sg, *start = sg; + int i, count = 0; + unsigned int offset = s->offset; + unsigned int size = s->offset + s->length; + unsigned int max = dma_get_max_seg_size(dev); + + for (i = 1; i < nents; i++) { + s = sg_next(s); + + s->dma_address = DMA_ERROR_CODE; + s->dma_length = 0; + + if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) { + if (__map_sg_chunk(dev, start, size, &dma->dma_address, + dir) < 0) + goto bad_mapping; + + dma->dma_address += offset; + dma->dma_length = size - offset; + + size = offset = s->offset; + start = s; + dma = sg_next(dma); + count += 1; + } + size += s->length; + } + if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0) + goto bad_mapping; + + dma->dma_address += offset; + dma->dma_length = size - offset; + + return count+1; + +bad_mapping: + for_each_sg(sg, s, count, i) + __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s)); + return 0; +} + +/** + * arm_iommu_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg + * @dev: valid struct device pointer + * @sg: list of buffers + * @nents: number of buffers to unmap (same as was passed to dma_map_sg) + * @dir: DMA transfer direction (same as was passed to dma_map_sg) + * + * Unmap a set of streaming mode DMA translations. Again, CPU access + * rules concerning calls here are the same as for dma_unmap_single(). 
+ */ +void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + struct scatterlist *s; + int i; + + for_each_sg(sg, s, nents, i) { + if (sg_dma_len(s)) + __iommu_remove_mapping(dev, sg_dma_address(s), + sg_dma_len(s)); + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(sg_page(s), s->offset, + s->length, dir); + } +} + +/** + * arm_iommu_sync_sg_for_cpu + * @dev: valid struct device pointer + * @sg: list of buffers + * @nents: number of buffers to map (returned from dma_map_sg) + * @dir: DMA transfer direction (same as was passed to dma_map_sg) + */ +void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction dir) +{ + struct scatterlist *s; + int i; + + for_each_sg(sg, s, nents, i) + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir); + +} + +/** + * arm_iommu_sync_sg_for_device + * @dev: valid struct device pointer + * @sg: list of buffers + * @nents: number of buffers to map (returned from dma_map_sg) + * @dir: DMA transfer direction (same as was passed to dma_map_sg) + */ +void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction dir) +{ + struct scatterlist *s; + int i; + + for_each_sg(sg, s, nents, i) + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir); +} + + +/** + * arm_iommu_map_page + * @dev: valid struct device pointer + * @page: page that buffer resides in + * @offset: offset into page for start of buffer + * @size: size of buffer to map + * @dir: DMA transfer direction + * + * IOMMU aware version of arm_dma_map_page() + */ +static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + dma_addr_t dma_addr; + int ret, len 
= PAGE_ALIGN(size + offset); + + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(page, offset, size, dir); + + dma_addr = __alloc_iova(mapping, len); + if (dma_addr == DMA_ERROR_CODE) + return dma_addr; + + ret = iommu_map(mapping->domain, dma_addr, page_to_phys(page), len, 0); + if (ret < 0) + goto fail; + + return dma_addr + offset; +fail: + __free_iova(mapping, dma_addr, len); + return DMA_ERROR_CODE; +} + +/** + * arm_iommu_unmap_page + * @dev: valid struct device pointer + * @handle: DMA address of buffer + * @size: size of buffer (same as passed to dma_map_page) + * @dir: DMA transfer direction (same as passed to dma_map_page) + * + * IOMMU aware version of arm_dma_unmap_page() + */ +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + dma_addr_t iova = handle & PAGE_MASK; + struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); + int offset = handle & ~PAGE_MASK; + int len = PAGE_ALIGN(size + offset); + + if (!iova) + return; + + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(page, offset, size, dir); + + iommu_unmap(mapping->domain, iova, len); + __free_iova(mapping, iova, len); +} + +static void arm_iommu_sync_single_for_cpu(struct device *dev, + dma_addr_t handle, size_t size, enum dma_data_direction dir) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + dma_addr_t iova = handle & PAGE_MASK; + struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); + unsigned int offset = handle & ~PAGE_MASK; + + if (!iova) + return; + + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(page, offset, size, dir); +} + +static void arm_iommu_sync_single_for_device(struct device *dev, + dma_addr_t handle, size_t size, enum dma_data_direction dir) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + dma_addr_t iova = handle & PAGE_MASK; + struct page 
*page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); + unsigned int offset = handle & ~PAGE_MASK; + + if (!iova) + return; + + __dma_page_cpu_to_dev(page, offset, size, dir); +} + +struct dma_map_ops iommu_ops = { + .alloc = arm_iommu_alloc_attrs, + .free = arm_iommu_free_attrs, + .mmap = arm_iommu_mmap_attrs, + + .map_page = arm_iommu_map_page, + .unmap_page = arm_iommu_unmap_page, + .sync_single_for_cpu = arm_iommu_sync_single_for_cpu, + .sync_single_for_device = arm_iommu_sync_single_for_device, + + .map_sg = arm_iommu_map_sg, + .unmap_sg = arm_iommu_unmap_sg, + .sync_sg_for_cpu = arm_iommu_sync_sg_for_cpu, + .sync_sg_for_device = arm_iommu_sync_sg_for_device, +}; + +/** + * arm_iommu_create_mapping + * @bus: pointer to the bus holding the client device (for IOMMU calls) + * @base: start address of the valid IO address space + * @size: size of the valid IO address space + * @order: accuracy of the IO addresses allocations + * + * Creates a mapping structure which holds information about used/unused + * IO address ranges, which is required to perform memory allocation and + * mapping with IOMMU aware functions. + * + * The client device need to be attached to the mapping with + * arm_iommu_attach_device function. 
+ */ +struct dma_iommu_mapping * +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size, + int order) +{ + unsigned int count = size >> (PAGE_SHIFT + order); + unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long); + struct dma_iommu_mapping *mapping; + int err = -ENOMEM; + + if (!count) + return ERR_PTR(-EINVAL); + + mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL); + if (!mapping) + goto err; + + mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL); + if (!mapping->bitmap) + goto err2; + + mapping->base = base; + mapping->bits = BITS_PER_BYTE * bitmap_size; + mapping->order = order; + spin_lock_init(&mapping->lock); + + mapping->domain = iommu_domain_alloc(bus); + if (!mapping->domain) + goto err3; + + kref_init(&mapping->kref); + return mapping; +err3: + kfree(mapping->bitmap); +err2: + kfree(mapping); +err: + return ERR_PTR(err); +} + +static void release_iommu_mapping(struct kref *kref) +{ + struct dma_iommu_mapping *mapping = + container_of(kref, struct dma_iommu_mapping, kref); + + iommu_domain_free(mapping->domain); + kfree(mapping->bitmap); + kfree(mapping); +} + +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping) +{ + if (mapping) + kref_put(&mapping->kref, release_iommu_mapping); +} + +/** + * arm_iommu_attach_device + * @dev: valid struct device pointer + * @mapping: io address space mapping structure (returned from + * arm_iommu_create_mapping) + * + * Attaches specified io address space mapping to the provided device, + * this replaces the dma operations (dma_map_ops pointer) with the + * IOMMU aware version. More than one client might be attached to + * the same io address space mapping. 
+ */ +int arm_iommu_attach_device(struct device *dev, + struct dma_iommu_mapping *mapping) +{ + int err; + + err = iommu_attach_device(mapping->domain, dev); + if (err) + return err; + + kref_get(&mapping->kref); + dev->archdata.mapping = mapping; + set_dma_ops(dev, &iommu_ops); + + pr_info("Attached IOMMU controller to %s device.\n", dev_name(dev)); + return 0; +} + +#endif diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h index 162be66..bf312c3 100644 --- a/arch/arm/mm/vmregion.h +++ b/arch/arm/mm/vmregion.h @@ -17,7 +17,7 @@ struct arm_vmregion { struct list_head vm_list; unsigned long vm_start; unsigned long vm_end; - struct page *vm_pages; + void *priv; int vm_active; const void *caller; }; -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 15:44:00 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 17:44:00 +0200 Subject: [Linaro-mm-sig] [PATCHv2 updated] ARM: dma-mapping: remove custom consistent dma region Message-ID: <1337269440-10225-1-git-send-email-m.szyprowski@samsung.com> This patch changes the dma-mapping subsystem to use generic vmalloc areas for all consistent dma allocations. This increases the total size limit of the consistent allocations and removes platform hacks and a lot of duplicated code. Atomic allocations are served from a special pool preallocated at boot, because vmalloc areas cannot be reliably created in atomic context. Signed-off-by: Marek Szyprowski --- Hello, This is an updated version of the patch posted in the following thread: http://www.spinics.net/lists/kernel/msg1342885.html This one has been rebased onto the ARM DMA-mapping redesign patches and includes a part for the IOMMU-aware ARM DMA-mapping implementation.
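The split between atomic and sleeping allocations mentioned in the description can be pictured with a small user-space sketch in C. This is only an assumption about the general shape of such a scheme, not the code of this patch: pool_alloc(), coherent_alloc(), coherent_free(), the may_sleep flag and the pool size are all hypothetical, and malloc() merely stands in for "allocate pages and remap them into a vmalloc area".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE  4096u
#define SKETCH_POOL_PAGES 64u     /* 256 KiB, an arbitrary size for this sketch */

/* Hypothetical pool, reserved up front so atomic callers never create mappings. */
static unsigned char pool_mem[SKETCH_POOL_PAGES * SKETCH_PAGE_SIZE];
static bool pool_page_used[SKETCH_POOL_PAGES];

static size_t pages_for(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* True if 'p' points into the preallocated pool. */
static bool pool_owns(const void *p)
{
	uintptr_t addr = (uintptr_t)p;
	uintptr_t start = (uintptr_t)pool_mem;

	return addr >= start && addr < start + sizeof(pool_mem);
}

/* First-fit search for a run of free pool pages; never blocks. */
static void *pool_alloc(size_t size)
{
	size_t need = pages_for(size);

	if (need == 0 || need > SKETCH_POOL_PAGES)
		return NULL;

	for (size_t start = 0; start + need <= SKETCH_POOL_PAGES; start++) {
		size_t run = 0;

		while (run < need && !pool_page_used[start + run])
			run++;
		if (run == need) {
			for (size_t i = 0; i < need; i++)
				pool_page_used[start + i] = true;
			return pool_mem + start * SKETCH_PAGE_SIZE;
		}
	}
	return NULL;
}

/* Atomic callers get pool memory; others take the (stand-in) remap path. */
static void *coherent_alloc(size_t size, bool may_sleep)
{
	assert(size > 0);
	if (!may_sleep)
		return pool_alloc(size);
	return malloc(size);   /* placeholder for allocate + vmalloc-area remap */
}

/* Return memory to whichever path handed it out. */
static void coherent_free(void *p, size_t size)
{
	if (!pool_owns(p)) {
		free(p);
		return;
	}

	size_t start = (size_t)((unsigned char *)p - pool_mem) / SKETCH_PAGE_SIZE;

	for (size_t i = 0; i < pages_for(size); i++)
		pool_page_used[start + i] = false;
}
```

The point of the sketch is the trade-off: the pool has a fixed size chosen at boot (the coherent_pool= parameter added below plays that role in the patch), so atomic allocations can fail once it is exhausted, while sleeping allocations are limited only by the general allocator.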
The ARM DMA-mapping redesign patches are available in the following thread: http://www.spinics.net/lists/arm-kernel/msg175729.html Best regards Marek Szyprowski Samsung Poland R&D Center --- Documentation/kernel-parameters.txt | 4 + arch/arm/include/asm/dma-mapping.h | 2 +- arch/arm/mm/dma-mapping.c | 497 ++++++++++++++++------------------- 3 files changed, 228 insertions(+), 275 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index c1601e5..ba58f50 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -515,6 +515,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. a hypervisor. Default: yes + coherent_pool=nn[KMG] [ARM,KNL] + Sets the size of memory pool for coherent, atomic dma + allocations. + code_bytes [X86] How many bytes of object code to print in an oops report. Range: 0 - 8192 diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index bbef15d..80777d87 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -226,7 +226,7 @@ static inline int dma_mmap_writecombine(struct device *dev, struct vm_area_struc * DMA region above it's default value of 2MB. It must be called before the * memory allocator is initialised, i.e. before any core_initcall. 
*/ -extern void __init init_consistent_dma_size(unsigned long size); +static inline void init_consistent_dma_size(unsigned long size) { } /* * For SA-1111, IXP425, and ADI systems the dma-mapping functions are "magic" diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 3ac4760..2e98403 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -228,204 +229,170 @@ static void __dma_free_buffer(struct page *page, size_t size) } #ifdef CONFIG_MMU - -#define CONSISTENT_OFFSET(x) (((unsigned long)(x) - consistent_base) >> PAGE_SHIFT) -#define CONSISTENT_PTE_INDEX(x) (((unsigned long)(x) - consistent_base) >> PMD_SHIFT) - -/* - * These are the page tables (2MB each) covering uncached, DMA consistent allocations - */ -static pte_t **consistent_pte; - -#define DEFAULT_CONSISTENT_DMA_SIZE SZ_2M - -unsigned long consistent_base = CONSISTENT_END - DEFAULT_CONSISTENT_DMA_SIZE; - -void __init init_consistent_dma_size(unsigned long size) -{ - unsigned long base = CONSISTENT_END - ALIGN(size, SZ_2M); - - BUG_ON(consistent_pte); /* Check we're called before DMA region init */ - BUG_ON(base < VMALLOC_END); - - /* Grow region to accommodate specified size */ - if (base < consistent_base) - consistent_base = base; -} - -#include "vmregion.h" - -static struct arm_vmregion_head consistent_head = { - .vm_lock = __SPIN_LOCK_UNLOCKED(&consistent_head.vm_lock), - .vm_list = LIST_HEAD_INIT(consistent_head.vm_list), - .vm_end = CONSISTENT_END, -}; - #ifdef CONFIG_HUGETLB_PAGE #error ARM Coherent DMA allocator does not (yet) support huge TLB #endif -/* - * Initialise the consistent memory allocation. 
- */ -static int __init consistent_init(void) -{ - int ret = 0; - pgd_t *pgd; - pud_t *pud; - pmd_t *pmd; - pte_t *pte; - int i = 0; - unsigned long base = consistent_base; - unsigned long num_ptes = (CONSISTENT_END - base) >> PMD_SHIFT; - - consistent_pte = kmalloc(num_ptes * sizeof(pte_t), GFP_KERNEL); - if (!consistent_pte) { - pr_err("%s: no memory\n", __func__); - return -ENOMEM; - } - - pr_debug("DMA memory: 0x%08lx - 0x%08lx:\n", base, CONSISTENT_END); - consistent_head.vm_start = base; - - do { - pgd = pgd_offset(&init_mm, base); - - pud = pud_alloc(&init_mm, pgd, base); - if (!pud) { - pr_err("%s: no pud tables\n", __func__); - ret = -ENOMEM; - break; - } - - pmd = pmd_alloc(&init_mm, pud, base); - if (!pmd) { - pr_err("%s: no pmd tables\n", __func__); - ret = -ENOMEM; - break; - } - WARN_ON(!pmd_none(*pmd)); - - pte = pte_alloc_kernel(pmd, base); - if (!pte) { - pr_err("%s: no pte tables\n", __func__); - ret = -ENOMEM; - break; - } - - consistent_pte[i++] = pte; - base += PMD_SIZE; - } while (base < CONSISTENT_END); - - return ret; -} - -core_initcall(consistent_init); - static void * __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, const void *caller) { - struct arm_vmregion *c; - size_t align; - int bit; + struct vm_struct *area; + unsigned long addr; - if (!consistent_pte) { - pr_err("%s: not initialised\n", __func__); + area = get_vm_area_caller(size, VM_DMA | VM_USERMAP, caller); + if (!area) + return NULL; + addr = (unsigned long)area->addr; + area->phys_addr = __pfn_to_phys(page_to_pfn(page)); + + if (ioremap_page_range(addr, addr + size, area->phys_addr, prot)) { + vunmap((void *)addr); + return NULL; + } + return (void *)addr; +} + +static void __dma_free_remap(void *cpu_addr, size_t size) +{ + struct vm_struct *area; + + read_lock(&vmlist_lock); + area = find_vm_area(cpu_addr); + if (!area) { + pr_err("%s: trying to free invalid coherent area: %p\n", + __func__, cpu_addr); + dump_stack(); + read_unlock(&vmlist_lock); + 
return; + } + unmap_kernel_range((unsigned long)cpu_addr, size); + read_unlock(&vmlist_lock); + vunmap(cpu_addr); +} + +struct dma_pool { + size_t size; + spinlock_t lock; + unsigned long *bitmap; + unsigned long count; + void *vaddr; + struct page *page; +}; + +static struct dma_pool atomic_pool = { + .size = SZ_256K, +}; + +static int __init early_coherent_pool(char *p) +{ + atomic_pool.size = memparse(p, &p); + return 0; +} +early_param("coherent_pool", early_coherent_pool); + +/* + * Initialise the coherent pool for atomic allocations. + */ +static int __init atomic_pool_init(void) +{ + struct dma_pool *pool = &atomic_pool; + pgprot_t prot = pgprot_dmacoherent(pgprot_kernel); + unsigned long count = pool->size >> PAGE_SHIFT; + gfp_t gfp = GFP_KERNEL | GFP_DMA; + unsigned long *bitmap; + struct page *page; + void *ptr; + int bitmap_size = BITS_TO_LONGS(count) * sizeof(long); + + bitmap = kzalloc(bitmap_size, GFP_KERNEL); + if (!bitmap) + goto no_bitmap; + + page = __dma_alloc_buffer(NULL, pool->size, gfp); + if (!page) + goto no_page; + + ptr = __dma_alloc_remap(page, pool->size, gfp, prot, NULL); + if (ptr) { + spin_lock_init(&pool->lock); + pool->vaddr = ptr; + pool->page = page; + pool->bitmap = bitmap; + pool->count = count; + pr_info("DMA: preallocated %u KiB pool for atomic coherent allocations\n", + (unsigned)pool->size / 1024); + return 0; + } + + __dma_free_buffer(page, pool->size); +no_page: + kfree(bitmap); +no_bitmap: + pr_err("DMA: failed to allocate %u KiB pool for atomic coherent allocation\n", + (unsigned)pool->size / 1024); + return -ENOMEM; +} +core_initcall(atomic_pool_init); + +static void *__alloc_from_pool(size_t size, struct page **ret_page) +{ + struct dma_pool *pool = &atomic_pool; + unsigned int count = size >> PAGE_SHIFT; + unsigned int pageno; + unsigned long flags; + void *ptr = NULL; + size_t align; + + if (!pool->vaddr) { + pr_err("%s: coherent pool not initialised!\n", __func__); dump_stack(); return NULL; } /* - * Align the 
virtual region allocation - maximum alignment is - * a section size, minimum is a page size. This helps reduce - * fragmentation of the DMA space, and also prevents allocations - * smaller than a section from crossing a section boundary. + * Align the region allocation - allocations from pool are rather + * small, so align them to their order in pages, minimum is a page + * size. This helps reduce fragmentation of the DMA space. */ - bit = fls(size - 1); - if (bit > SECTION_SHIFT) - bit = SECTION_SHIFT; - align = 1 << bit; + align = PAGE_SIZE << get_order(size); - /* - * Allocate a virtual address in the consistent mapping region. - */ - c = arm_vmregion_alloc(&consistent_head, align, size, - gfp & ~(__GFP_DMA | __GFP_HIGHMEM), caller); - if (c) { - pte_t *pte; - int idx = CONSISTENT_PTE_INDEX(c->vm_start); - u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); - - pte = consistent_pte[idx] + off; - c->priv = page; - - do { - BUG_ON(!pte_none(*pte)); - - set_pte_ext(pte, mk_pte(page, prot), 0); - page++; - pte++; - off++; - if (off >= PTRS_PER_PTE) { - off = 0; - pte = consistent_pte[++idx]; - } - } while (size -= PAGE_SIZE); - - dsb(); - - return (void *)c->vm_start; + spin_lock_irqsave(&pool->lock, flags); + pageno = bitmap_find_next_zero_area(pool->bitmap, pool->count, + 0, count, (1 << align) - 1); + if (pageno < pool->count) { + bitmap_set(pool->bitmap, pageno, count); + ptr = pool->vaddr + PAGE_SIZE * pageno; + *ret_page = pool->page + pageno; } - return NULL; + spin_unlock_irqrestore(&pool->lock, flags); + + return ptr; } -static void __dma_free_remap(void *cpu_addr, size_t size) +static int __free_from_pool(void *start, size_t size) { - struct arm_vmregion *c; - unsigned long addr; - pte_t *ptep; - int idx; - u32 off; + struct dma_pool *pool = &atomic_pool; + unsigned long pageno, count; + unsigned long flags; - c = arm_vmregion_find_remove(&consistent_head, (unsigned long)cpu_addr); - if (!c) { - pr_err("%s: trying to free invalid coherent area: 
%p\n", - __func__, cpu_addr); + if (start < pool->vaddr || start > pool->vaddr + pool->size) + return 0; + + if (start + size > pool->vaddr + pool->size) { + pr_err("%s: freeing wrong coherent size from pool\n", __func__); dump_stack(); - return; + return 0; } - if ((c->vm_end - c->vm_start) != size) { - pr_err("%s: freeing wrong coherent size (%ld != %d)\n", - __func__, c->vm_end - c->vm_start, size); - dump_stack(); - size = c->vm_end - c->vm_start; - } + pageno = (start - pool->vaddr) >> PAGE_SHIFT; + count = size >> PAGE_SHIFT; - idx = CONSISTENT_PTE_INDEX(c->vm_start); - off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); - ptep = consistent_pte[idx] + off; - addr = c->vm_start; - do { - pte_t pte = ptep_get_and_clear(&init_mm, addr, ptep); + spin_lock_irqsave(&pool->lock, flags); + bitmap_clear(pool->bitmap, pageno, count); + spin_unlock_irqrestore(&pool->lock, flags); - ptep++; - addr += PAGE_SIZE; - off++; - if (off >= PTRS_PER_PTE) { - off = 0; - ptep = consistent_pte[++idx]; - } - - if (pte_none(pte) || !pte_present(pte)) - pr_crit("%s: bad page in kernel page table\n", - __func__); - } while (size -= PAGE_SIZE); - - flush_tlb_kernel_range(c->vm_start, c->vm_end); - - arm_vmregion_free(&consistent_head, c); + return 1; } static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) @@ -441,6 +408,8 @@ static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) #define __dma_alloc_remap(page, size, gfp, prot, c) page_address(page) #define __dma_free_remap(addr, size) do { } while (0) #define __get_dma_pgprot(attrs, prot) __pgprot(0) +#define __alloc_from_pool(size, ret_page) NULL +#define __free_from_pool(addr, size) 0 #endif /* CONFIG_MMU */ @@ -463,6 +432,16 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, *handle = DMA_ERROR_CODE; size = PAGE_ALIGN(size); + /* + * Atomic allocations need special handling + */ + if (gfp & GFP_ATOMIC && !arch_is_coherent()) { + addr = 
__alloc_from_pool(size, &page); + if (addr) + *handle = pfn_to_dma(dev, page_to_pfn(page)); + return addr; + } + page = __dma_alloc_buffer(dev, size, gfp); if (!page) return NULL; @@ -506,30 +485,21 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma, { int ret = -ENXIO; #ifdef CONFIG_MMU - unsigned long user_size, kern_size; - struct arm_vmregion *c; + unsigned long user_count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; + unsigned long pfn = dma_to_pfn(dev, dma_addr); + unsigned long off = vma->vm_pgoff; vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret)) return ret; - user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; - - c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); - if (c) { - unsigned long off = vma->vm_pgoff; - struct page *pages = c->priv; - - kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT; - - if (off < kern_size && - user_size <= (kern_size - off)) { - ret = remap_pfn_range(vma, vma->vm_start, - page_to_pfn(pages) + off, - user_size << PAGE_SHIFT, - vma->vm_page_prot); - } + if (off < count && user_count <= (count - off)) { + ret = remap_pfn_range(vma, vma->vm_start, + pfn + off, + user_count << PAGE_SHIFT, + vma->vm_page_prot); } #endif /* CONFIG_MMU */ @@ -543,13 +513,16 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma, void arm_dma_free(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, struct dma_attrs *attrs) { - WARN_ON(irqs_disabled()); - if (dma_release_from_coherent(dev, get_order(size), cpu_addr)) return; size = PAGE_ALIGN(size); + if (__free_from_pool(cpu_addr, size)) + return; + + WARN_ON(irqs_disabled()); + if (!arch_is_coherent()) __dma_free_remap(cpu_addr, size); @@ -769,9 +742,6 @@ static int arm_dma_set_mask(struct device *dev, u64 dma_mask) static int __init dma_debug_do_init(void) { -#ifdef CONFIG_MMU - 
arm_vmregion_create_proc("dma-mappings", &consistent_head); -#endif dma_debug_init(PREALLOC_DMA_DEBUG_ENTRIES); return 0; } @@ -888,61 +858,30 @@ static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t s * Create a CPU mapping for a specified pages */ static void * -__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot, + const void *caller) { - struct arm_vmregion *c; - size_t align; - size_t count = size >> PAGE_SHIFT; - int bit; + unsigned int i, count = size >> PAGE_SHIFT; + struct vm_struct *area; + unsigned long p; - if (!consistent_pte[0]) { - pr_err("%s: not initialised\n", __func__); - dump_stack(); + area = get_vm_area_caller(size, VM_DMA | VM_USERMAP, caller); + if (!area) return NULL; + + area->pages = pages; + p = (unsigned long)area->addr; + + for (i = 0; i < count; i++) { + phys_addr_t phys = __pfn_to_phys(page_to_pfn(pages[i])); + if (ioremap_page_range(p, p + PAGE_SIZE, phys, prot)) + goto err; + p += PAGE_SIZE; } - - /* - * Align the virtual region allocation - maximum alignment is - * a section size, minimum is a page size. This helps reduce - * fragmentation of the DMA space, and also prevents allocations - * smaller than a section from crossing a section boundary. - */ - bit = fls(size - 1); - if (bit > SECTION_SHIFT) - bit = SECTION_SHIFT; - align = 1 << bit; - - /* - * Allocate a virtual address in the consistent mapping region. 
- */ - c = arm_vmregion_alloc(&consistent_head, align, size, - gfp & ~(__GFP_DMA | __GFP_HIGHMEM), NULL); - if (c) { - pte_t *pte; - int idx = CONSISTENT_PTE_INDEX(c->vm_start); - int i = 0; - u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); - - pte = consistent_pte[idx] + off; - c->priv = pages; - - do { - BUG_ON(!pte_none(*pte)); - - set_pte_ext(pte, mk_pte(pages[i], prot), 0); - pte++; - off++; - i++; - if (off >= PTRS_PER_PTE) { - off = 0; - pte = consistent_pte[++idx]; - } - } while (i < count); - - dsb(); - - return (void *)c->vm_start; - } + return area->addr; +err: + unmap_kernel_range((unsigned long)area->addr, size); + vunmap(area->addr); return NULL; } @@ -1001,6 +940,17 @@ static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t si return 0; } +static struct page **__iommu_get_pages(void *cpu_addr) +{ + struct vm_struct *area; + read_lock(&vmlist_lock); + area = find_vm_area(cpu_addr); + read_unlock(&vmlist_lock); + if (area) + return area->pages; + return NULL; +} + static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) { @@ -1019,7 +969,8 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, if (*handle == DMA_ERROR_CODE) goto err_buffer; - addr = __iommu_alloc_remap(pages, size, gfp, prot); + addr = __iommu_alloc_remap(pages, size, gfp, prot, + __builtin_return_address(0)); if (!addr) goto err_mapping; @@ -1036,31 +987,25 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void *cpu_addr, dma_addr_t dma_addr, size_t size, struct dma_attrs *attrs) { - struct arm_vmregion *c; + unsigned long uaddr = vma->vm_start; + unsigned long usize = vma->vm_end - vma->vm_start; + struct page **pages = __iommu_get_pages(cpu_addr); vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); - c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); - if (c) { - struct page **pages = c->priv; + if (!pages) 
+ return -ENXIO; - unsigned long uaddr = vma->vm_start; - unsigned long usize = vma->vm_end - vma->vm_start; - int i = 0; + do { + int ret = vm_insert_page(vma, uaddr, *pages++); + if (ret) { + pr_err("Remapping memory failed: %d\n", ret); + return ret; + } + uaddr += PAGE_SIZE; + usize -= PAGE_SIZE; + } while (usize > 0); - do { - int ret; - - ret = vm_insert_page(vma, uaddr, pages[i++]); - if (ret) { - pr_err("Remapping memory, error: %d\n", ret); - return ret; - } - - uaddr += PAGE_SIZE; - usize -= PAGE_SIZE; - } while (usize > 0); - } return 0; } @@ -1071,16 +1016,20 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, struct dma_attrs *attrs) { - struct arm_vmregion *c; + struct page **pages = __iommu_get_pages(cpu_addr); size = PAGE_ALIGN(size); - c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); - if (c) { - struct page **pages = c->priv; - __dma_free_remap(cpu_addr, size); - __iommu_remove_mapping(dev, handle, size); - __iommu_free_buffer(dev, pages, size); + if (!pages) { + pr_err("%s: trying to free invalid coherent area: %p\n", + __func__, cpu_addr); + dump_stack(); + return; } + + unmap_kernel_range((unsigned long)cpu_addr, size); + vunmap(cpu_addr); + __iommu_remove_mapping(dev, handle, size); + __iommu_free_buffer(dev, pages, size); } /* -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 16:53:03 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 18:53:03 +0200 Subject: [Linaro-mm-sig] [PATCH/RFC 0/3] ARM: DMA-mapping: new extensions for buffer sharing Message-ID: <1337273586-11089-1-git-send-email-m.szyprowski@samsung.com> Hello, This patch series introduces a new features to DMA mapping subsystem to let drivers share the allocated buffers (preferably using recently introduced dma_buf framework) easy and efficient. The first extension is DMA_ATTR_NO_KERNEL_MAPPING attribute. 
It is intended for use with the dma_{alloc, mmap, free}_attrs functions. It tells the dma-mapping core that the driver will not use a kernel mapping for the allocated buffer at all, so the core can skip creating it. This saves precious kernel virtual address space. Such a buffer can still be accessed from userspace after calling dma_mmap_attrs() for it (a typical use case for multimedia buffers). The value returned by dma_alloc_attrs() with this attribute should be treated as a DMA cookie, which needs to be passed to the dma_mmap_attrs() and dma_free_attrs() functions.

The second extension is required to let drivers share the buffers allocated by the DMA-mapping subsystem. Right now the driver gets a DMA address of the allocated buffer and the kernel virtual mapping for it. If it wants to share the buffer with another device (= map it into that device's DMA address space), it usually hacks around kernel virtual addresses to get pointers to pages, or assumes that both devices share the DMA address space. Both solutions are just hacks for special cases, which should be avoided in the final version of buffer sharing. To solve this issue in a generic way, a new call has been added to the DMA mapping API: dma_get_sgtable(). It allocates a scatter-list which describes the allocated buffer and lets the driver(s) use it with other device(s) by calling dma_map_sg() on it.

The proposed patches have been generated on top of the ARM DMA-mapping redesign patch series on Linux v3.4-rc7. They are also available on the following GIT branch:

git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.4-rc7-arm-dma-v10-ext

with all required patches on top of the vanilla v3.4-rc7 kernel.
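As a rough illustration of the cookie contract described above, here is a minimal userspace sketch. It is not kernel code: model_alloc(), model_free(), model_lookup(), struct model_buf and MODEL_NO_KERNEL_MAPPING are hypothetical names invented for this example. The only point it tries to show is that, with the attribute set, the returned value is an opaque descriptor rather than a CPU pointer, and that the free path must be given the same attribute to interpret it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_NO_KERNEL_MAPPING 0x1u
#define MODEL_MAX_BUFS 8

struct model_buf {
    void *backing;  /* stands in for the allocated pages */
    void *mapping;  /* stands in for the kernel mapping, NULL if skipped */
};

/* Mapped buffers are tracked so they can be found again by address. */
static struct model_buf *live[MODEL_MAX_BUFS];

/* Translate the value handed to the caller back into a descriptor. */
static struct model_buf *model_lookup(void *ret, unsigned int attrs)
{
    size_t i;

    if (attrs & MODEL_NO_KERNEL_MAPPING)
        return ret;             /* the cookie is the descriptor itself */
    for (i = 0; i < MODEL_MAX_BUFS; i++)
        if (live[i] && live[i]->mapping == ret)
            return live[i];
    return NULL;
}

/* Returns a usable CPU pointer, or an opaque cookie if mapping is skipped. */
static void *model_alloc(size_t size, unsigned int attrs)
{
    struct model_buf *buf = calloc(1, sizeof(*buf));
    size_t i;

    if (!buf)
        return NULL;
    buf->backing = calloc(1, size);
    if (!buf->backing)
        goto err_buf;
    if (attrs & MODEL_NO_KERNEL_MAPPING)
        return buf;             /* callers must never dereference this */

    for (i = 0; i < MODEL_MAX_BUFS; i++) {
        if (!live[i]) {
            buf->mapping = buf->backing;    /* "create" the mapping */
            live[i] = buf;
            return buf->mapping;
        }
    }
    free(buf->backing);
err_buf:
    free(buf);
    return NULL;
}

/* Must be called with the same attrs that were used for allocation. */
static int model_free(void *ret, unsigned int attrs)
{
    struct model_buf *buf = model_lookup(ret, attrs);
    size_t i;

    if (!buf)
        return -1;
    for (i = 0; i < MODEL_MAX_BUFS; i++)
        if (live[i] == buf)
            live[i] = NULL;
    free(buf->backing);
    free(buf);
    return 0;
}
```

In this model, passing a cookie to model_free() without the flag simply fails the lookup, while passing a mapped pointer with the flag would misinterpret it as a descriptor. That is the reason the attribute has to accompany every call made on the same buffer.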
Best regards Marek Szyprowski Samsung Poland R&D Center Patch summary: Marek Szyprowski (3): common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attribute ARM: dma-mapping: add support for dma_get_sgtable() Documentation/DMA-attributes.txt | 18 +++++++++++++ arch/arm/include/asm/dma-mapping.h | 12 +++++++++ arch/arm/mm/dma-mapping.c | 51 ++++++++++++++++++++++++++++++++---- include/linux/dma-attrs.h | 1 + include/linux/dma-mapping.h | 3 +++ 5 files changed, 80 insertions(+), 5 deletions(-) -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 16:53:04 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 18:53:04 +0200 Subject: [Linaro-mm-sig] [PATCH 1/3] common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute In-Reply-To: <1337273586-11089-1-git-send-email-m.szyprowski@samsung.com> References: <1337273586-11089-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337273586-11089-2-git-send-email-m.szyprowski@samsung.com> This patch adds the DMA_ATTR_NO_KERNEL_MAPPING attribute, which lets the platform avoid creating a kernel virtual mapping for the allocated buffer. On some architectures creating such a mapping is a non-trivial task and consumes very limited resources (like kernel virtual address space or dma consistent address space). Buffers allocated with this attribute can only be passed to user space by calling dma_mmap_attrs(). Signed-off-by: Marek Szyprowski --- Documentation/DMA-attributes.txt | 18 ++++++++++++++++++ include/linux/dma-attrs.h | 1 + 2 files changed, 19 insertions(+) diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt index 5c72eed..725580d 100644 --- a/Documentation/DMA-attributes.txt +++ b/Documentation/DMA-attributes.txt @@ -49,3 +49,21 @@ DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either consistent or non-consistent memory as it sees fit.
By using this API, you are guaranteeing to the platform that you have all the correct and necessary sync points for this memory in the driver. + +DMA_ATTR_NO_KERNEL_MAPPING +-------------------------- + +DMA_ATTR_NO_KERNEL_MAPPING lets the platform avoid creating a kernel +virtual mapping for the allocated buffer. On some architectures creating +such a mapping is a non-trivial task and consumes very limited resources +(like kernel virtual address space or dma consistent address space). +Buffers allocated with this attribute can only be passed to user space +by calling dma_mmap_attrs(). By using this API, you are guaranteeing +that you won't dereference the pointer returned by dma_alloc_attrs(). You +can treat it as a cookie that must be passed to dma_mmap_attrs() and +dma_free_attrs(). Make sure that both of these also get this attribute +set on each call. + +Since it is optional for platforms to implement +DMA_ATTR_NO_KERNEL_MAPPING, those that do not will simply ignore the +attribute and exhibit default behavior.
diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h index 547ab56..a37c10c 100644 --- a/include/linux/dma-attrs.h +++ b/include/linux/dma-attrs.h @@ -15,6 +15,7 @@ enum dma_attr { DMA_ATTR_WEAK_ORDERING, DMA_ATTR_WRITE_COMBINE, DMA_ATTR_NON_CONSISTENT, + DMA_ATTR_NO_KERNEL_MAPPING, DMA_ATTR_MAX, }; -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 16:53:05 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 18:53:05 +0200 Subject: [Linaro-mm-sig] [PATCH 2/3] ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attribute In-Reply-To: <1337273586-11089-1-git-send-email-m.szyprowski@samsung.com> References: <1337273586-11089-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337273586-11089-3-git-send-email-m.szyprowski@samsung.com> This patch adds support for the DMA_ATTR_NO_KERNEL_MAPPING attribute for IOMMU allocations, which lets drivers save precious kernel virtual address space for large buffers that are intended to be accessed only from userspace. This patch is heavily based on initial work kindly provided by Abhinav Kochhar.
Signed-off-by: Marek Szyprowski --- arch/arm/mm/dma-mapping.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 2e98403..23d0ace 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -940,9 +940,13 @@ static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t si return 0; } -static struct page **__iommu_get_pages(void *cpu_addr) +static struct page **__iommu_get_pages(void *cpu_addr, struct dma_attrs *attrs) { struct vm_struct *area; + + if (dma_get_attr(DMA_ATTR_NO_KERNEL_MAPPING, attrs)) + return cpu_addr; + read_lock(&vmlist_lock); area = find_vm_area(cpu_addr); read_unlock(&vmlist_lock); @@ -969,6 +973,9 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, if (*handle == DMA_ERROR_CODE) goto err_buffer; + if (dma_get_attr(DMA_ATTR_NO_KERNEL_MAPPING, attrs)) + return pages; + addr = __iommu_alloc_remap(pages, size, gfp, prot, __builtin_return_address(0)); if (!addr) @@ -989,7 +996,7 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, { unsigned long uaddr = vma->vm_start; unsigned long usize = vma->vm_end - vma->vm_start; - struct page **pages = __iommu_get_pages(cpu_addr); + struct page **pages = __iommu_get_pages(cpu_addr, attrs); vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); @@ -1016,7 +1023,7 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, struct dma_attrs *attrs) { - struct page **pages = __iommu_get_pages(cpu_addr); + struct page **pages = __iommu_get_pages(cpu_addr, attrs); size = PAGE_ALIGN(size); if (!pages) { @@ -1026,8 +1033,11 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, return; } - unmap_kernel_range((unsigned long)cpu_addr, size); - vunmap(cpu_addr); + if 
(!dma_get_attr(DMA_ATTR_NO_KERNEL_MAPPING, attrs)) { + unmap_kernel_range((unsigned long)cpu_addr, size); + vunmap(cpu_addr); + } + __iommu_remove_mapping(dev, handle, size); __iommu_free_buffer(dev, pages, size); } -- 1.7.10.1 From m.szyprowski at samsung.com Thu May 17 16:53:06 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 17 May 2012 18:53:06 +0200 Subject: [Linaro-mm-sig] [PATCH 3/3] ARM: dma-mapping: add support for dma_get_sgtable() In-Reply-To: <1337273586-11089-1-git-send-email-m.szyprowski@samsung.com> References: <1337273586-11089-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <1337273586-11089-4-git-send-email-m.szyprowski@samsung.com> This patch adds the dma_get_sgtable() function, which is required to let drivers share the buffers allocated by the DMA-mapping subsystem. Right now the driver gets a DMA address of the allocated buffer and the kernel virtual mapping for it. If it wants to share the buffer with another device (= map it into that device's DMA address space), it usually hacks around kernel virtual addresses to get pointers to pages, or assumes that both devices share the DMA address space. Both solutions are just hacks for special cases, which should be avoided in the final version of buffer sharing. To solve this issue in a generic way, a new call has been added to the DMA mapping API: dma_get_sgtable(). It allocates a scatter-list which describes the allocated buffer and lets the driver(s) use it with other device(s) by calling dma_map_sg() on it. This patch adds this extension only to the ARM architecture, mainly to demonstrate the buffer sharing. I plan to provide some generic implementations for other architectures once this idea gets accepted.
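To make the idea of "a scatter-list which describes the allocated buffer" a bit more concrete, here is a small userspace sketch. It is only a model under stated assumptions: struct segment and pages_to_segments() are hypothetical names, page frame numbers stand in for struct page pointers, and merging physically adjacent pages into one entry is a choice made for this example rather than a statement about what the kernel helpers do. A buffer from the linear allocator then collapses to a single entry, while a buffer assembled from scattered pages needs several.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one scatter-list entry: a run of adjacent pages. */
struct segment {
    unsigned long first_pfn;    /* first page frame number of the run */
    size_t npages;              /* number of pages in the run */
};

/*
 * Describe a buffer given as an array of page frame numbers as a list of
 * segments, merging pages that directly follow the previous one.
 * Returns the number of segments written, or -1 if 'out' is too small.
 */
static int pages_to_segments(const unsigned long *pfns, size_t count,
                             struct segment *out, size_t max_segs)
{
    size_t nsegs = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (nsegs > 0 &&
            out[nsegs - 1].first_pfn + out[nsegs - 1].npages == pfns[i]) {
            out[nsegs - 1].npages++;    /* extends the current run */
            continue;
        }
        if (nsegs == max_segs)
            return -1;                  /* no room for another entry */
        out[nsegs].first_pfn = pfns[i];
        out[nsegs].npages = 1;
        nsegs++;
    }
    return (int)nsegs;
}
```

A second device would then map each segment into its own address space, which is the role dma_map_sg() plays in the description above.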
Signed-off-by: Marek Szyprowski --- arch/arm/include/asm/dma-mapping.h | 12 ++++++++++++ arch/arm/mm/dma-mapping.c | 31 +++++++++++++++++++++++++++++++ include/linux/dma-mapping.h | 3 +++ 3 files changed, 46 insertions(+) diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 80777d87..2e37778 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -221,6 +221,15 @@ static inline int dma_mmap_writecombine(struct device *dev, struct vm_area_struc return dma_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, &attrs); } +static inline int dma_get_sgtable(struct device *dev, struct sg_table *sgt, + void *cpu_addr, dma_addr_t dma_addr, + size_t size, struct dma_attrs *attrs) +{ + struct dma_map_ops *ops = get_dma_ops(dev); + BUG_ON(!ops); + return ops->get_sgtable(dev, sgt, cpu_addr, dma_addr, size, attrs); +} + /* * This can be called during boot to increase the size of the consistent * DMA region above it's default value of 2MB. 
It must be called before the @@ -280,6 +289,9 @@ extern void arm_dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int, enum dma_data_direction); extern void arm_dma_sync_sg_for_device(struct device *, struct scatterlist *, int, enum dma_data_direction); +extern int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs); #endif /* __KERNEL__ */ #endif diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 23d0ace..b140440 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -120,6 +120,7 @@ struct dma_map_ops arm_dma_ops = { .alloc = arm_dma_alloc, .free = arm_dma_free, .mmap = arm_dma_mmap, + .get_sgtable = arm_dma_get_sgtable, .map_page = arm_dma_map_page, .unmap_page = arm_dma_unmap_page, .map_sg = arm_dma_map_sg, @@ -529,6 +530,21 @@ void arm_dma_free(struct device *dev, size_t size, void *cpu_addr, __dma_free_buffer(pfn_to_page(dma_to_pfn(dev, handle)), size); } +int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt, + void *cpu_addr, dma_addr_t handle, size_t size, + struct dma_attrs *attrs) +{ + struct page *page = pfn_to_page(dma_to_pfn(dev, handle)); + int ret; + + ret = sg_alloc_table(sgt, 1, GFP_KERNEL); + if (unlikely(ret)) + return ret; + + sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0); + return 0; +} + static void dma_cache_maint_page(struct page *page, unsigned long offset, size_t size, enum dma_data_direction dir, void (*op)(const void *, size_t, int)) @@ -1042,6 +1058,20 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, __iommu_free_buffer(dev, pages, size); } +static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt, + void *cpu_addr, dma_addr_t dma_addr, + size_t size, struct dma_attrs *attrs) +{ + unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; + struct page **pages = __iommu_get_pages(cpu_addr, attrs); + + if (!pages) + return -ENXIO; + + return 
sg_alloc_table_from_pages(sgt, pages, count, 0, size, + GFP_KERNEL); +} + /* * Map a part of the scatter-gather list into contiguous io address space */ @@ -1301,6 +1331,7 @@ struct dma_map_ops iommu_ops = { .alloc = arm_iommu_alloc_attrs, .free = arm_iommu_free_attrs, .mmap = arm_iommu_mmap_attrs, + .get_sgtable = arm_iommu_get_sgtable, .map_page = arm_iommu_map_page, .unmap_page = arm_iommu_unmap_page, diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index dfc099e..94af418 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -18,6 +18,9 @@ struct dma_map_ops { int (*mmap)(struct device *, struct vm_area_struct *, void *, dma_addr_t, size_t, struct dma_attrs *attrs); + int (*get_sgtable)(struct device *dev, struct sg_table *sgt, void *, + dma_addr_t, size_t, struct dma_attrs *attrs); + dma_addr_t (*map_page)(struct device *dev, struct page *page, unsigned long offset, size_t size, enum dma_data_direction dir, -- 1.7.10.1 From sumit.semwal at linaro.org Fri May 18 04:07:43 2012 From: sumit.semwal at linaro.org (Sumit Semwal) Date: Fri, 18 May 2012 09:37:43 +0530 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add vmap interface (v2) In-Reply-To: <1337250710-31113-1-git-send-email-airlied@gmail.com> References: <1337250710-31113-1-git-send-email-airlied@gmail.com> Message-ID: Hi Dave, On 17 May 2012 16:01, Dave Airlie wrote: > From: Dave Airlie > > The main requirement I have for this interface is for scanning out > using the USB gpu devices. Since these devices have to read the > framebuffer on updates and linearly compress it, using kmaps > is a major overhead for every update. > > v2: fix warn issues pointed out by Sylwester Nawrocki. > > Signed-off-by: Dave Airlie > --- > ?drivers/base/dma-buf.c ?| ? 34 ++++++++++++++++++++++++++++++++++ > ?include/linux/dma-buf.h | ? 
14 ++++++++++++++
>  2 files changed, 48 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
> index 07cbbc6..750f92c 100644
> --- a/drivers/base/dma-buf.c
> +++ b/drivers/base/dma-buf.c
> @@ -406,3 +406,37 @@ void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num,
>                dmabuf->ops->kunmap(dmabuf, page_num, vaddr);
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_kunmap);
> +
> +/**
> + * dma_buf_vmap - Create virtual mapping for the buffer object into kernel address space. The same restrictions as for vmap and friends apply.
> + * @dma_buf:   [in]    buffer to vmap
> + *
> + * This call may fail due to lack of virtual mapping address space.
> + * These calls are optional in drivers. The intended use for them
> + * is for mapping objects linear in kernel space for high use objects.
> + * Please attempt to use kmap/kunmap before thinking about these interfaces.
> + */
> +void *dma_buf_vmap(struct dma_buf *dmabuf)
> +{
> +       if (WARN_ON(!dmabuf))
> +               return NULL;
> +
> +       if (dmabuf->ops->vmap)
> +               return dmabuf->ops->vmap(dmabuf);
> +       return NULL;
> +}
> +EXPORT_SYMBOL(dma_buf_vmap);

I am afraid we don't yet have a clear consensus on the usage of EXPORT_SYMBOL - till it is done, I would prefer that we use EXPORT_SYMBOL_GPL for consistency. Once we reach agreement, we can change them all in one go if required.

> +

--
Thanks and regards,
~Sumit.
From sumit.semwal at linaro.org Fri May 18 04:12:12 2012 From: sumit.semwal at linaro.org (Sumit Semwal) Date: Fri, 18 May 2012 09:42:12 +0530 Subject: [Linaro-mm-sig] [PATCH] dma-buf: mmap support In-Reply-To: References: <1335258532-20739-1-git-send-email-daniel.vetter@ffwll.ch> Message-ID: Hi Daniel, Rob, On 11 May 2012 21:00, Rob Clark wrote: > On Tue, Apr 24, 2012 at 4:08 AM, Daniel Vetter wrote: >> Compared to Rob Clark's RFC I've ditched the prepare/finish hooks >> >> Cc: Rob Clark >> Cc: Rebecca Schultz Zavin >> Signed-Off-by: Daniel Vetter > > Acked-by: Rob Clark Thanks, applied to my for-next. Sorry, I was away for some time for medical reasons, hence the delay. > Best regards, ~Sumit. From hdoyu at nvidia.com Fri May 18 06:10:25 2012 From: hdoyu at nvidia.com (Hiroshi DOYU) Date: Fri, 18 May 2012 09:10:25 +0300 Subject: [Linaro-mm-sig] [RFC 0/2] dma-mapping: Introduce new IOVA API with address specified Message-ID: <1337321427-27748-1-git-send-email-hdoyu@nvidia.com> Hello, The following patchset is our enhancement to the upstream DMA mapping API (v9). It introduces a new variant of the IOVA API that takes a specified IOVA address. The current upstream DMA mapping API cannot request a specific IOVA address at allocation time, and we need to be able to do so. This is necessary because some HWAs require specific addresses, for example the AVP vector, and because some data buffer alignments can improve performance given H/W constraints.
Hiroshi DOYU (2): dma-mapping: Export arm_iommu_{alloc,free}_iova() functions dma-mapping: Enable IOVA mapping with specific address arch/arm/include/asm/dma-iommu.h | 31 ++++++ arch/arm/include/asm/dma-mapping.h | 1 + arch/arm/mm/dma-mapping.c | 181 +++++++++++++++++++++++++++++------- 3 files changed, 180 insertions(+), 33 deletions(-) -- 1.7.5.4 From hdoyu at nvidia.com Fri May 18 06:10:26 2012 From: hdoyu at nvidia.com (Hiroshi DOYU) Date: Fri, 18 May 2012 09:10:26 +0300 Subject: [Linaro-mm-sig] [RFC 1/2] dma-mapping: Export arm_iommu_{alloc, free}_iova() functions In-Reply-To: <1337321427-27748-1-git-send-email-hdoyu@nvidia.com> References: <1337321427-27748-1-git-send-email-hdoyu@nvidia.com> Message-ID: <1337321427-27748-2-git-send-email-hdoyu@nvidia.com> Export __{alloc,free}_iova() as arm_iommu_{alloc,free}_iova(). In some cases IOVA allocation and mapping have to be done separately, especially for performance reasons. This patch lets client modules {alloc,free} IOVA space themselves, without backing that area with actual pages.
Signed-off-by: Hiroshi DOYU --- arch/arm/include/asm/dma-iommu.h | 4 ++++ arch/arm/mm/dma-mapping.c | 31 +++++++++++++++++++++++++++++++ 2 files changed, 35 insertions(+), 0 deletions(-) diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h index 799b094..2595928 100644 --- a/arch/arm/include/asm/dma-iommu.h +++ b/arch/arm/include/asm/dma-iommu.h @@ -30,5 +30,9 @@ void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping); int arm_iommu_attach_device(struct device *dev, struct dma_iommu_mapping *mapping); +dma_addr_t arm_iommu_alloc_iova(struct device *dev, size_t size); + +void arm_iommu_free_iova(struct device *dev, dma_addr_t addr, size_t size); + #endif /* __KERNEL__ */ #endif diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index afb5e7a..bca1715 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -1041,6 +1041,21 @@ static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, return mapping->base + (start << (mapping->order + PAGE_SHIFT)); } +/** + * arm_iommu_alloc_iova + * @dev: valid struct device pointer + * @size: size of buffer to allocate + * + * Allocate IOVA address range + */ +dma_addr_t arm_iommu_alloc_iova(struct device *dev, size_t size) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + + return __alloc_iova(mapping, size); +} +EXPORT_SYMBOL_GPL(arm_iommu_alloc_iova); + static inline void __free_iova(struct dma_iommu_mapping *mapping, dma_addr_t addr, size_t size) { @@ -1055,6 +1070,22 @@ static inline void __free_iova(struct dma_iommu_mapping *mapping, spin_unlock_irqrestore(&mapping->lock, flags); } +/** + * arm_iommu_free_iova + * @dev: valid struct device pointer + * @iova: iova address being free'ed + * @size: size of buffer to allocate + * + * Free IOVA address range + */ +void arm_iommu_free_iova(struct device *dev, dma_addr_t addr, size_t size) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + + __free_iova(mapping, addr, 
size); +} +EXPORT_SYMBOL_GPL(arm_iommu_free_iova); + static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp) { struct page **pages; -- 1.7.5.4 From hdoyu at nvidia.com Fri May 18 06:10:27 2012 From: hdoyu at nvidia.com (Hiroshi DOYU) Date: Fri, 18 May 2012 09:10:27 +0300 Subject: [Linaro-mm-sig] [RFC 2/2] dma-mapping: Enable IOVA mapping with specific address In-Reply-To: <1337321427-27748-1-git-send-email-hdoyu@nvidia.com> References: <1337321427-27748-1-git-send-email-hdoyu@nvidia.com> Message-ID: <1337321427-27748-3-git-send-email-hdoyu@nvidia.com> Enable IOVA (un)mapping at a specific IOVA address, independent of allocating/freeing the IOVA area, by introducing the following dma_(un)map_page_*at*() functions: dma_map_page_at() dma_unmap_page_at() These create a mapping between a pre-allocated IOVA and a page, and remove just the mapping, leaving the IOVA itself allocated. At mapping time, dma_map_page_at() also checks whether the IOVA is already reserved. There are also versions with the "arm_iommu_" prefix, which behave exactly the same as the above.
Signed-off-by: Hiroshi DOYU --- arch/arm/include/asm/dma-iommu.h | 29 +++++++- arch/arm/include/asm/dma-mapping.h | 1 + arch/arm/mm/dma-mapping.c | 158 +++++++++++++++++++++++++++--------- 3 files changed, 150 insertions(+), 38 deletions(-) diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h index 2595928..99eba3d 100644 --- a/arch/arm/include/asm/dma-iommu.h +++ b/arch/arm/include/asm/dma-iommu.h @@ -30,9 +30,36 @@ void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping); int arm_iommu_attach_device(struct device *dev, struct dma_iommu_mapping *mapping); -dma_addr_t arm_iommu_alloc_iova(struct device *dev, size_t size); +dma_addr_t arm_iommu_alloc_iova_at(struct device *dev, dma_addr_t addr, + size_t size); + +static inline dma_addr_t arm_iommu_alloc_iova(struct device *dev, size_t size) +{ + return arm_iommu_alloc_iova_at(dev, DMA_ANON_ADDR, size); +} void arm_iommu_free_iova(struct device *dev, dma_addr_t addr, size_t size); +dma_addr_t arm_iommu_map_page_at(struct device *dev, struct page *page, + dma_addr_t addr, unsigned long offset, size_t size, + enum dma_data_direction dir, struct dma_attrs *attrs); + +static inline dma_addr_t dma_map_page_at(struct device *d, struct page *p, + dma_addr_t a, size_t o, size_t s, + enum dma_data_direction r) +{ + return arm_iommu_map_page_at(d, p, a, o, s, r, 0); +} + +void arm_iommu_unmap_page_at(struct device *dev, dma_addr_t handle, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs); + +static inline void dma_unmap_page_at(struct device *d, dma_addr_t a, size_t s, + enum dma_data_direction r) +{ + return arm_iommu_unmap_page_at(d, a, s, r, 0); +} + #endif /* __KERNEL__ */ #endif diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index bbef15d..b73eb73 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -12,6 +12,7 @@ #include #define DMA_ERROR_CODE (~0) +#define DMA_ANON_ADDR (~0) extern 
struct dma_map_ops arm_dma_ops; static inline struct dma_map_ops *get_dma_ops(struct device *dev) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index bca1715..b98e668 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -1013,48 +1013,65 @@ fs_initcall(dma_debug_do_init); /* IOMMU */ -static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, - size_t size) +static dma_addr_t __alloc_iova_at(struct dma_iommu_mapping *mapping, + dma_addr_t iova, size_t size) { unsigned int order = get_order(size); unsigned int align = 0; - unsigned int count, start; + unsigned int count, start, orig = 0; unsigned long flags; + bool anon = (iova == DMA_ANON_ADDR) ? true : false; count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping->order; - if (order > mapping->order) + if (anon && (order > mapping->order)) align = (1 << (order - mapping->order)) - 1; spin_lock_irqsave(&mapping->lock, flags); - start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0, - count, align); - if (start > mapping->bits) { - spin_unlock_irqrestore(&mapping->lock, flags); - return DMA_ERROR_CODE; - } + if (!anon) + orig = (iova - mapping->base) >> (mapping->order + PAGE_SHIFT); + + start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, + orig, count, align); + if (start > mapping->bits) + goto not_found; + + if (!anon && (orig != start)) + goto not_found; bitmap_set(mapping->bitmap, start, count); spin_unlock_irqrestore(&mapping->lock, flags); return mapping->base + (start << (mapping->order + PAGE_SHIFT)); + +not_found: + spin_unlock_irqrestore(&mapping->lock, flags); + return DMA_ERROR_CODE; +} + +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, + size_t size) +{ + return __alloc_iova_at(mapping, DMA_ANON_ADDR, size); } /** - * arm_iommu_alloc_iova + * arm_iommu_alloc_iova_at * @dev: valid struct device pointer + * @iova: iova address being requested. 
Set DMA_ANON_ADDR for arbitral * @size: size of buffer to allocate * * Allocate IOVA address range */ -dma_addr_t arm_iommu_alloc_iova(struct device *dev, size_t size) +dma_addr_t arm_iommu_alloc_iova_at(struct device *dev, dma_addr_t iova, + size_t size) { struct dma_iommu_mapping *mapping = dev->archdata.mapping; - return __alloc_iova(mapping, size); + return __alloc_iova_at(mapping, iova, size); } -EXPORT_SYMBOL_GPL(arm_iommu_alloc_iova); +EXPORT_SYMBOL_GPL(arm_iommu_alloc_iova_at); static inline void __free_iova(struct dma_iommu_mapping *mapping, dma_addr_t addr, size_t size) @@ -1507,6 +1524,41 @@ void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg, __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir); } +static dma_addr_t __arm_iommu_map_page_at(struct device *dev, struct page *page, + dma_addr_t req, unsigned long offset, size_t size, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + dma_addr_t dma_addr; + int ret, len = PAGE_ALIGN(size + offset); + + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(page, offset, size, dir); + + dma_addr = __alloc_iova_at(mapping, req, len); + if (dma_addr == DMA_ERROR_CODE) { + if (req == DMA_ANON_ADDR) + return DMA_ERROR_CODE; + /* + * Verified that iova(req) is reserved in advance if + * @req is specified. 
+ */ + dma_addr = req; + } + + if (req != DMA_ANON_ADDR) + BUG_ON(dma_addr != req); + + ret = iommu_map(mapping->domain, dma_addr, page_to_phys(page), len, 0); + if (ret < 0) + goto fail; + + return dma_addr + offset; +fail: + if (req == DMA_ANON_ADDR) + __free_iova(mapping, dma_addr, len); + return DMA_ERROR_CODE; +} /** * arm_iommu_map_page @@ -1522,25 +1574,47 @@ static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page, unsigned long offset, size_t size, enum dma_data_direction dir, struct dma_attrs *attrs) { - struct dma_iommu_mapping *mapping = dev->archdata.mapping; - dma_addr_t dma_addr; - int ret, len = PAGE_ALIGN(size + offset); + return __arm_iommu_map_page_at(dev, page, DMA_ANON_ADDR, + offset, size, dir, attrs); +} - if (!arch_is_coherent()) - __dma_page_cpu_to_dev(page, offset, size, dir); +/** + * arm_iommu_map_page_at + * @dev: valid struct device pointer + * @page: page that buffer resides in + * @req: iova address being requested. Set DMA_ANON_ADDR for arbitral + * @offset: offset into page for start of buffer + * @size: size of buffer to map + * @dir: DMA transfer direction + * + * The version with a specified iova address of arm_iommu_map_page(). 
+ */ +dma_addr_t arm_iommu_map_page_at(struct device *dev, struct page *page, + dma_addr_t req, unsigned long offset, size_t size, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + return __arm_iommu_map_page_at(dev, page, req, offset, size, dir, + attrs); +} +EXPORT_SYMBOL_GPL(arm_iommu_map_page_at); - dma_addr = __alloc_iova(mapping, len); - if (dma_addr == DMA_ERROR_CODE) - return dma_addr; +static inline int __arm_iommu_unmap_page(struct device *dev, dma_addr_t handle, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + dma_addr_t iova = handle & PAGE_MASK; + struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); + int offset = handle & ~PAGE_MASK; + int len = PAGE_ALIGN(size + offset); - ret = iommu_map(mapping->domain, dma_addr, page_to_phys(page), len, 0); - if (ret < 0) - goto fail; + if (!iova) + return -EINVAL; - return dma_addr + offset; -fail: - __free_iova(mapping, dma_addr, len); - return DMA_ERROR_CODE; + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(page, offset, size, dir); + + iommu_unmap(mapping->domain, iova, len); + return 0; } /** @@ -1558,20 +1632,30 @@ static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle, { struct dma_iommu_mapping *mapping = dev->archdata.mapping; dma_addr_t iova = handle & PAGE_MASK; - struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); - int offset = handle & ~PAGE_MASK; int len = PAGE_ALIGN(size + offset); - if (!iova) + if (__arm_iommu_unmap_page(dev, handle, size, dir, attrs)) return; - - if (!arch_is_coherent()) - __dma_page_dev_to_cpu(page, offset, size, dir); - - iommu_unmap(mapping->domain, iova, len); __free_iova(mapping, iova, len); } +/** + * arm_iommu_unmap_page_at + * @dev: valid struct device pointer + * @handle: DMA address of buffer + * @size: size of buffer (same as passed to dma_map_page) + * @dir: DMA transfer direction (same as passed to dma_map_page) + * + * The version without freeing iova of 
arm_iommu_unmap_page(). + */ +void arm_iommu_unmap_page_at(struct device *dev, dma_addr_t handle, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + __arm_iommu_unmap_page(dev, handle, size, dir, attrs); +} +EXPORT_SYMBOL_GPL(arm_iommu_unmap_page_at); + static void arm_iommu_sync_single_for_cpu(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir) { -- 1.7.5.4 From marcin.slusarz at gmail.com Thu May 17 17:32:01 2012 From: marcin.slusarz at gmail.com (Marcin Slusarz) Date: Thu, 17 May 2012 19:32:01 +0200 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add vmap interface (v2) In-Reply-To: References: <1337250710-31113-1-git-send-email-airlied@gmail.com> Message-ID: <20120517173201.GA3209@joi.lan> On Thu, May 17, 2012 at 06:32:19AM -0600, Rob Clark wrote: > On Thu, May 17, 2012 at 4:31 AM, Dave Airlie wrote: > > From: Dave Airlie > > > > The main requirement I have for this interface is for scanning out > > using the USB gpu devices. Since these devices have to read the > > framebuffer on updates and linearly compress it, using kmaps > > is a major overhead for every update. > > > > v2: fix warn issues pointed out by Sylwester Nawrocki. > > > > Signed-off-by: Dave Airlie > > --- > > ?drivers/base/dma-buf.c ?| ? 34 ++++++++++++++++++++++++++++++++++ > > ?include/linux/dma-buf.h | ? 14 ++++++++++++++ > > ?2 files changed, 48 insertions(+), 0 deletions(-) > > > > diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c > > index 07cbbc6..750f92c 100644 > > --- a/drivers/base/dma-buf.c > > +++ b/drivers/base/dma-buf.c > > @@ -406,3 +406,37 @@ void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num, > > ? ? ? ? ? ? ? ?dmabuf->ops->kunmap(dmabuf, page_num, vaddr); > > ?} > > ?EXPORT_SYMBOL_GPL(dma_buf_kunmap); > > + > > +/** > > + * dma_buf_vmap - Create virtual mapping for the buffer object into kernel address space. The same restrictions as for vmap and friends apply. > > + * @dma_buf: ? [in] ? 
?buffer to vmap > > + * > > + * This call may fail due to lack of virtual mapping address space. > > + * These calls are optional in drivers. The intended use for them > > + * is for mapping objects linear in kernel space for high use objects. > > + * Please attempt to use kmap/kunmap before thinking about these interfaces. > > + */ > > +void *dma_buf_vmap(struct dma_buf *dmabuf) > > +{ > > + ? ? ? if (WARN_ON(!dmabuf)) > > + ? ? ? ? ? ? ? return NULL; > > + > > + ? ? ? if (dmabuf->ops->vmap) > > + ? ? ? ? ? ? ? return dmabuf->ops->vmap(dmabuf); > > + ? ? ? return NULL; > > +} > > +EXPORT_SYMBOL(dma_buf_vmap); > > + > > +/** > > + * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. > > + * @dma_buf: ? [in] ? ?buffer to vmap > > + */ > > +void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) > > +{ > > + ? ? ? if (WARN_ON(!dmabuf)) > > + ? ? ? ? ? ? ? return; > > + > > + ? ? ? if (dmabuf->ops->vunmap) > > + ? ? ? ? ? ? ? dmabuf->ops->vunmap(dmabuf, vaddr); > > +} > > +EXPORT_SYMBOL(dma_buf_vunmap); > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > > index 3efbfc2..b92b6de 100644 > > --- a/include/linux/dma-buf.h > > +++ b/include/linux/dma-buf.h > > @@ -92,6 +92,9 @@ struct dma_buf_ops { > > ? ? ? ?void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *); > > ? ? ? ?void *(*kmap)(struct dma_buf *, unsigned long); > > ? ? ? ?void (*kunmap)(struct dma_buf *, unsigned long, void *); > > + > > + ? ? ? void *(*vmap)(struct dma_buf *); > > + ? ? ? 
void (*vunmap)(struct dma_buf *, void *vaddr); > > ?}; > > > > ?/** > > @@ -167,6 +170,9 @@ void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long); > > ?void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *); > > ?void *dma_buf_kmap(struct dma_buf *, unsigned long); > > ?void dma_buf_kunmap(struct dma_buf *, unsigned long, void *); > > + > > +void *dma_buf_vmap(struct dma_buf *); > > +void dma_buf_vunmap(struct dma_buf *, void *vaddr); > > ?#else > > > > ?static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > > @@ -248,6 +254,14 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?unsigned long pnum, void *vaddr) > > ?{ > > ?} > > + > > +static inline void *dma_buf_vmap(struct dma_buf *) > > +{ > > +} > > + > > +static inline void dma_buf_vunmap(struct dma_buf *, void *vaddr); > > +{ > > +} > > I think these two will cause compile issues for > !CONFIG_DMA_SHARED_BUFFER case due to no parameter name for first arg. And because of ";" between ) and { :) Marcin From airlied at gmail.com Fri May 18 18:44:01 2012 From: airlied at gmail.com (Dave Airlie) Date: Fri, 18 May 2012 19:44:01 +0100 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add vmap interface (v3) Message-ID: <1337366641-6584-1-git-send-email-airlied@gmail.com> From: Dave Airlie The main requirement I have for this interface is for scanning out using the USB gpu devices. Since these devices have to read the framebuffer on updates and linearly compress it, using kmaps is a major overhead for every update. v2: fix warn issues pointed out by Sylwester Nawrocki. 
v3: fix compile !CONFIG_DMA_SHARED_BUFFER and add _GPL for now Signed-off-by: Dave Airlie --- drivers/base/dma-buf.c | 34 ++++++++++++++++++++++++++++++++++ include/linux/dma-buf.h | 14 ++++++++++++++ 2 files changed, 48 insertions(+), 0 deletions(-) diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c index 07cbbc6..0d8197e 100644 --- a/drivers/base/dma-buf.c +++ b/drivers/base/dma-buf.c @@ -406,3 +406,37 @@ void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num, dmabuf->ops->kunmap(dmabuf, page_num, vaddr); } EXPORT_SYMBOL_GPL(dma_buf_kunmap); + +/** + * dma_buf_vmap - Create virtual mapping for the buffer object into kernel address space. The same restrictions as for vmap and friends apply. + * @dma_buf: [in] buffer to vmap + * + * This call may fail due to lack of virtual mapping address space. + * These calls are optional in drivers. The intended use for them + * is for mapping objects linear in kernel space for high use objects. + * Please attempt to use kmap/kunmap before thinking about these interfaces. + */ +void *dma_buf_vmap(struct dma_buf *dmabuf) +{ + if (WARN_ON(!dmabuf)) + return NULL; + + if (dmabuf->ops->vmap) + return dmabuf->ops->vmap(dmabuf); + return NULL; +} +EXPORT_SYMBOL_GPL(dma_buf_vmap); + +/** + * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. 
+ * @dma_buf: [in] buffer to vmap + */ +void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) +{ + if (WARN_ON(!dmabuf)) + return; + + if (dmabuf->ops->vunmap) + dmabuf->ops->vunmap(dmabuf, vaddr); +} +EXPORT_SYMBOL_GPL(dma_buf_vunmap); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 3efbfc2..d8c2865 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -92,6 +92,9 @@ struct dma_buf_ops { void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *); void *(*kmap)(struct dma_buf *, unsigned long); void (*kunmap)(struct dma_buf *, unsigned long, void *); + + void *(*vmap)(struct dma_buf *); + void (*vunmap)(struct dma_buf *, void *vaddr); }; /** @@ -167,6 +170,9 @@ void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long); void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *); void *dma_buf_kmap(struct dma_buf *, unsigned long); void dma_buf_kunmap(struct dma_buf *, unsigned long, void *); + +void *dma_buf_vmap(struct dma_buf *); +void dma_buf_vunmap(struct dma_buf *, void *vaddr); #else static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, @@ -248,6 +254,14 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long pnum, void *vaddr) { } + +static inline void *dma_buf_vmap(struct dma_buf *dmabuf) +{ +} + +static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) +{ +} #endif /* CONFIG_DMA_SHARED_BUFFER */ #endif /* __DMA_BUF_H__ */ -- 1.7.6 From rob.clark at linaro.org Sat May 19 07:18:28 2012 From: rob.clark at linaro.org (Rob Clark) Date: Sat, 19 May 2012 01:18:28 -0600 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add vmap interface (v3) In-Reply-To: <1337366641-6584-1-git-send-email-airlied@gmail.com> References: <1337366641-6584-1-git-send-email-airlied@gmail.com> Message-ID: On Fri, May 18, 2012 at 12:44 PM, Dave Airlie wrote: > From: Dave Airlie > > The main requirement I have for this interface is for scanning out > using the USB gpu devices. 
Since these devices have to read the > framebuffer on updates and linearly compress it, using kmaps > is a major overhead for every update. > > v2: fix warn issues pointed out by Sylwester Nawrocki. > > v3: fix compile !CONFIG_DMA_SHARED_BUFFER and add _GPL for now > > Signed-off-by: Dave Airlie Reviewed-by: Rob Clark > --- > ?drivers/base/dma-buf.c ?| ? 34 ++++++++++++++++++++++++++++++++++ > ?include/linux/dma-buf.h | ? 14 ++++++++++++++ > ?2 files changed, 48 insertions(+), 0 deletions(-) > > diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c > index 07cbbc6..0d8197e 100644 > --- a/drivers/base/dma-buf.c > +++ b/drivers/base/dma-buf.c > @@ -406,3 +406,37 @@ void dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long page_num, > ? ? ? ? ? ? ? ?dmabuf->ops->kunmap(dmabuf, page_num, vaddr); > ?} > ?EXPORT_SYMBOL_GPL(dma_buf_kunmap); > + > +/** > + * dma_buf_vmap - Create virtual mapping for the buffer object into kernel address space. The same restrictions as for vmap and friends apply. > + * @dma_buf: ? [in] ? ?buffer to vmap > + * > + * This call may fail due to lack of virtual mapping address space. > + * These calls are optional in drivers. The intended use for them > + * is for mapping objects linear in kernel space for high use objects. > + * Please attempt to use kmap/kunmap before thinking about these interfaces. > + */ > +void *dma_buf_vmap(struct dma_buf *dmabuf) > +{ > + ? ? ? if (WARN_ON(!dmabuf)) > + ? ? ? ? ? ? ? return NULL; > + > + ? ? ? if (dmabuf->ops->vmap) > + ? ? ? ? ? ? ? return dmabuf->ops->vmap(dmabuf); > + ? ? ? return NULL; > +} > +EXPORT_SYMBOL_GPL(dma_buf_vmap); > + > +/** > + * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. > + * @dma_buf: ? [in] ? ?buffer to vmap > + */ > +void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) > +{ > + ? ? ? if (WARN_ON(!dmabuf)) > + ? ? ? ? ? ? ? return; > + > + ? ? ? if (dmabuf->ops->vunmap) > + ? ? ? ? ? ? ? 
dmabuf->ops->vunmap(dmabuf, vaddr); > +} > +EXPORT_SYMBOL_GPL(dma_buf_vunmap); > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index 3efbfc2..d8c2865 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -92,6 +92,9 @@ struct dma_buf_ops { > ? ? ? ?void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *); > ? ? ? ?void *(*kmap)(struct dma_buf *, unsigned long); > ? ? ? ?void (*kunmap)(struct dma_buf *, unsigned long, void *); > + > + ? ? ? void *(*vmap)(struct dma_buf *); > + ? ? ? void (*vunmap)(struct dma_buf *, void *vaddr); > ?}; > > ?/** > @@ -167,6 +170,9 @@ void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long); > ?void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *); > ?void *dma_buf_kmap(struct dma_buf *, unsigned long); > ?void dma_buf_kunmap(struct dma_buf *, unsigned long, void *); > + > +void *dma_buf_vmap(struct dma_buf *); > +void dma_buf_vunmap(struct dma_buf *, void *vaddr); > ?#else > > ?static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > @@ -248,6 +254,14 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
?unsigned long pnum, void *vaddr) > ?{ > ?} > + > +static inline void *dma_buf_vmap(struct dma_buf *dmabuf) > +{ > +} > + > +static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) > +{ > +} > ?#endif /* CONFIG_DMA_SHARED_BUFFER */ > > ?#endif /* __DMA_BUF_H__ */ > -- > 1.7.6 > > _______________________________________________ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel From sumit.semwal at linaro.org Sun May 20 07:13:15 2012 From: sumit.semwal at linaro.org (Sumit Semwal) Date: Sun, 20 May 2012 12:43:15 +0530 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add vmap interface (v3) In-Reply-To: References: <1337366641-6584-1-git-send-email-airlied@gmail.com> Message-ID: Hi Dave, On 19 May 2012 12:48, Rob Clark wrote: > On Fri, May 18, 2012 at 12:44 PM, Dave Airlie wrote: >> From: Dave Airlie >> >> The main requirement I have for this interface is for scanning out >> using the USB gpu devices. Since these devices have to read the >> framebuffer on updates and linearly compress it, using kmaps >> is a major overhead for every update. Thanks; rebased and applied to my for-next. Would you care to write a small documentation patch like we have for all the other interfaces? >> Best regards, ~Sumit. From tom.cooksey at arm.com Mon May 21 08:42:27 2012 From: tom.cooksey at arm.com (Tom Cooksey) Date: Mon, 21 May 2012 09:42:27 +0100 Subject: [Linaro-mm-sig] New "xf86-video-armsoc" DDX driver Message-ID: <000001cd372d$acb213a0$06163ae0$@cooksey@arm.com> Hi All, For the last few months we (ARM MPD... "The Mali guys") have been working on getting X.Org up and running with Mali T6xx (ARM's next-generation GPU IP). The approach is very similar (well identical I think) to how things work on OMAP: We use a DRM driver to manage the display controller via KMS. 
The KMS driver also allocates both scan-out and pixmap/back buffers via the DRM_IOCTL_MODE_CREATE_DUMB ioctl which is internally implemented with GEM. When returning buffers to DRI clients, the x-server uses flink to get a global handle to a buffer which it passes back to the DRI client (in our case the Mali-T600 X11 EGL winsys). The client then uses the new PRIME ioctls to export the GEM buffer it received from the x-server to a dma_buf fd. This fd is then passed into the T6xx kernel driver via our own job dispatch user/kernel API (we're not using DRM for driving the GPU, only the display controller). Note: ARM doesn't generally provide the display controller IP block, so this is really for our customers/Linaro to develop, though we do have something hacked up for ARM's own PL111 display controller on our Versatile Express development platform which we'll be open sourcing/up-streaming asap via Linaro. We believe most ARM SoCs are likely to work the same way, at least those with 3rd-party GPU IP blocks/drivers (so everyone except Qualcomm & nVidia). As mentioned, this is certainly how the OMAP integration works. As such, we've taken the OMAP DDX driver Rob Clark wrote and hacked on it to make it work for Mali. The patch is actually relatively small, which is not really too surprising as all the driver is doing is allocating buffers and managing a display controller via a device-agnostic interface (KMS). All the device-specific code is kept in the DRM driver and the client GLES/EGL library. Given that the DDX driver doesn't contain any device-specific code, we're going to take the OMAP DDX as a baseline and try and make it more generic. Our immediate goals are to make it work on our own Versatile Express development platform and on Samsung's Exynos 5250 SoC, however our hope is to have a single DDX driver which can cover OMAP, Exynos, ST-E's Nova/Thor platforms and probably others too. It's even been suggested it could work with Mesa's sw backend(?). 
Anyway, the DDX is very much a work-in-progress and is still heavily branded OMAP, even though it's working (almost) perfectly on VExpress & Exynos too (re-branding isn't too high up our priority list at the moment). We are actively developing this driver and will be doing so in a public git repository hosted by Linaro. We will not be maintaining any private repository behind ARM's firewall or anything like that - you'll see what we see. The first patches have now been pushed, so if anyone's interested in seeing what we have so far or wants to track development, the tree is here: http://git.linaro.org/gitweb?p=arm/xorg/driver/xf86-video-armsoc.git;a=summary Note: When we originally spoke to Rob Clark about this, he suggested we take the already-generic xf86-video-modesetting and just add the dri2 code to it. This is indeed how we started out, however as we progressed it became clear that the majority of the code we wanted was in the omap driver and we were having to work fairly hard to keep some of the original modesetting code. This is why we've now changed tactic and just forked the OMAP driver, something Rob is more than happy for us to do. One thing the DDX driver isn't doing yet is making use of 2D hw blocks. In the short-term, we will simply create a branch off of the "generic" master for each SoC and add 2D hardware support there. We do however want a more permanent solution which doesn't need a separate branch per SoC. Some of the suggested solutions are: * Add a new generic DRM ioctl API for larger 2D operations (I would imagine small blits/blends would be done in SW). * Use SW rendering for everything other than solid blits and use v4l2's blitting API for those (importing/exporting buffers to be blitted using dma_buf). The theory here is that most UIs are rendered with GLES and so you only need 2D hardware for blits. I think we'll prototype this approach on Exynos.
* Define a new x-server sub-module interface to allow a separate .so 2D driver to be loaded (this is the approach the current OMAP DDX uses). We are hoping someone might have some advice & suggestions on how to proceed with regards to 2D. We're also very interested in any feedback, both on the DDX driver specifically and on the approach we're taking in general. Cheers, Tom From airlied at gmail.com Mon May 21 08:55:06 2012 From: airlied at gmail.com (Dave Airlie) Date: Mon, 21 May 2012 09:55:06 +0100 Subject: [Linaro-mm-sig] New "xf86-video-armsoc" DDX driver In-Reply-To: <4fba0034.e1d9440a.7f33.0bfeSMTPIN_ADDED@mx.google.com> References: <4fba0034.e1d9440a.7f33.0bfeSMTPIN_ADDED@mx.google.com> Message-ID: > > For the last few months we (ARM MPD... "The Mali guys") have been working on > getting X.Org up and running with Mali T6xx (ARM's next-generation GPU IP). > The approach is very similar (well identical I think) to how things work on > OMAP: We use a DRM driver to manage the display controller via KMS. The KMS > driver also allocates both scan-out and pixmap/back buffers via the > DRM_IOCTL_MODE_CREATE_DUMB ioctl which is internally implemented with GEM. > When returning buffers to DRI clients, the x-server uses flink to get a > global handle to a buffer which it passes back to the DRI client (in our > case the Mali-T600 X11 EGL winsys). The client then uses the new PRIME > ioctls to export the GEM buffer it received from the x-server to a dma_buf > fd. This fd is then passed into the T6xx kernel driver via our own job > dispatch user/kernel API (we're not using DRM for driving the GPU, only the > display controller). So using dumb in this way is probably a bit of an abuse, since dumb is defined to provide buffers that are not to be used by acceleration hw. When we allocate dumb buffers, we can't know what special hw layouts (tiling etc) are required for optimal accel performance. The logic to work that out is rarely generic.
> > http://git.linaro.org/gitweb?p=arm/xorg/driver/xf86-video-armsoc.git;a=summary > > Note: When we originally spoke to Rob Clark about this, he suggested we take > the already-generic xf86-video-modesetting and just add the dri2 code to it. > This is indeed how we started out, however as we progressed it became clear > that the majority of the code we wanted was in the omap driver and were > having to work fairly hard to keep some of the original modesetting code. > This is why we've now changed tactic and just forked the OMAP driver, > something Rob is more than happy for us to do. It does seem like porting to -modesetting, and maybe cleaning up modesetting if it needs it. The modesetting driver is pretty much just a make-it-work port of the radeon/nouveau/intel "shared" code. > One thing the DDX driver isn't doing yet is making use of 2D hw blocks. In > the short-term, we will simply create a branch off of the "generic" master > for each SoC and add 2D hardware support there. We do however want a more > permanent solution which doesn't need a separate branch per SoC. Some of the > suggested solutions are: > > * Add a new generic DRM ioctl API for larger 2D operations (I would imagine > small blits/blends would be done in SW). Not going to happen; again, the hw isn't generic in this area: some hw requires 3D engines to do 2D ops etc. There are limitations on some hw with overlaps etc., and finally it breaks the rule about generic ioctls for acceleration operations. > * Use SW rendering for everything other than solid blits and use v4l2's > blitting API for those (importing/exporting buffers to be blitted using > dma_buf). The theory here is that most UIs are rendered with GLES and so you > only need 2D hardware for blits. I think we'll prototype this approach on > Exynos. Seems a bit over the top. > * Define a new x-server sub-module interface to allow a separate .so 2D > driver to be loaded (this is the approach the current OMAP DDX uses). 
This seems the sanest. I don't have time this week to review the code, but I'll try and take a look when time permits. Dave. From daniel at ffwll.ch Mon May 21 09:03:45 2012 From: daniel at ffwll.ch (Daniel Vetter) Date: Mon, 21 May 2012 11:03:45 +0200 Subject: [Linaro-mm-sig] New "xf86-video-armsoc" DDX driver In-Reply-To: References: <4fba0034.e1d9440a.7f33.0bfeSMTPIN_ADDED@mx.google.com> Message-ID: <20120521090328.GA4970@phenom.ffwll.local> On Mon, May 21, 2012 at 09:55:06AM +0100, Dave Airlie wrote: > > * Define a new x-server sub-module interface to allow a separate .so 2D > > driver to be loaded (this is the approach the current OMAP DDX uses). > > This seems the sanest. Or go the intel glamor route and stitch together somewhat generic 2d accel code on top of GL. That should give you reasonable (albeit likely not stellar) X render performance. -Daniel -- Daniel Vetter Mail: daniel at ffwll.ch Mobile: +41 (0)79 365 57 48 From minchan at kernel.org Tue May 22 06:58:39 2012 From: minchan at kernel.org (Minchan Kim) Date: Tue, 22 May 2012 15:58:39 +0900 Subject: [Linaro-mm-sig] [PATCHv2 1/4] mm: vmalloc: use const void * for caller argument In-Reply-To: <1337252085-22039-2-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-2-git-send-email-m.szyprowski@samsung.com> Message-ID: <4FBB391F.8030402@kernel.org> On 05/17/2012 07:54 PM, Marek Szyprowski wrote: > 'const void *' is a safer type for caller function type. This patch > updates all references to caller function type. 
> > Signed-off-by: Marek Szyprowski > Reviewed-by: Kyungmin Park Reviewed-by: Minchan Kim -- Kind regards, Minchan Kim From minchan at kernel.org Tue May 22 07:01:14 2012 From: minchan at kernel.org (Minchan Kim) Date: Tue, 22 May 2012 16:01:14 +0900 Subject: [Linaro-mm-sig] [PATCHv2 2/4] mm: vmalloc: export find_vm_area() function In-Reply-To: <1337252085-22039-3-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-3-git-send-email-m.szyprowski@samsung.com> Message-ID: <4FBB39BA.3000601@kernel.org> On 05/17/2012 07:54 PM, Marek Szyprowski wrote: > find_vm_area() function is useful for other core subsystems (like > dma-mapping) to get access to vm_area internals. > > Signed-off-by: Marek Szyprowski > Reviewed-by: Kyungmin Park From this patch alone we can't tell how you intend to use this function. It would be better to fold this patch into [4/4]. > --- > include/linux/vmalloc.h | 1 + > mm/vmalloc.c | 10 +++++++++- > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h > index 2e28f4d..6071e91 100644 > --- a/include/linux/vmalloc.h > +++ b/include/linux/vmalloc.h > @@ -93,6 +93,7 @@ extern struct vm_struct *__get_vm_area_caller(unsigned long size, > unsigned long start, unsigned long end, > const void *caller); > extern struct vm_struct *remove_vm_area(const void *addr); > +extern struct vm_struct *find_vm_area(const void *addr); > > extern int map_vm_area(struct vm_struct *area, pgprot_t prot, > struct page ***pages); > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 8bc7f3ef..8cb7f22 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -1402,7 +1402,15 @@ struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags, > -1, GFP_KERNEL, caller); > } > > -static struct vm_struct *find_vm_area(const void *addr) > +/** > + * find_vm_area - find a continuous kernel virtual area > + * @addr: base address > + * > + * Search for the 
kernel VM area starting at @addr, and return it. > + * It is up to the caller to do all required locking to keep the returned > + * pointer valid. > + */ > +struct vm_struct *find_vm_area(const void *addr) > { > struct vmap_area *va; > -- Kind regards, Minchan Kim From minchan at kernel.org Tue May 22 07:07:45 2012 From: minchan at kernel.org (Minchan Kim) Date: Tue, 22 May 2012 16:07:45 +0900 Subject: [Linaro-mm-sig] [PATCHv2 3/4] mm: vmalloc: add VM_DMA flag to indicate areas used by dma-mapping framework In-Reply-To: <1337252085-22039-4-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-4-git-send-email-m.szyprowski@samsung.com> Message-ID: <4FBB3B41.8010102@kernel.org> On 05/17/2012 07:54 PM, Marek Szyprowski wrote: > Add new type of vm_area intended to be used for consistent mappings > created by dma-mapping framework. > > Signed-off-by: Marek Szyprowski > Reviewed-by: Kyungmin Park > --- > include/linux/vmalloc.h | 1 + > mm/vmalloc.c | 3 +++ > 2 files changed, 4 insertions(+) > > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h > index 6071e91..8a9555a 100644 > --- a/include/linux/vmalloc.h > +++ b/include/linux/vmalloc.h > @@ -14,6 +14,7 @@ struct vm_area_struct; /* vma defining user mapping in mm_types.h */ > #define VM_USERMAP 0x00000008 /* suitable for remap_vmalloc_range */ > #define VM_VPAGES 0x00000010 /* buffer for pages was vmalloc'ed */ > #define VM_UNLIST 0x00000020 /* vm_struct is not listed in vmlist */ > +#define VM_DMA 0x00000040 /* used by dma-mapping framework */ > /* bits [20..32] reserved for arch specific ioremap internals */ > > /* > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 8cb7f22..9c13bab 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2582,6 +2582,9 @@ static int s_show(struct seq_file *m, void *p) > if (v->flags & VM_IOREMAP) > seq_printf(m, " ioremap"); > > + if (v->flags & VM_DMA) > + seq_printf(m, " dma"); > + Hmm, VM_DMA 
would become a generic flag? AFAIU, VM_DMA would probably be used only on the ARM arch. Of course, this isn't a performance-sensitive part, but there is no reason to check it on architectures other than ARM, either. I suggest the following: #ifdef CONFIG_ARM #define VM_DMA 0x00000040 #else #define VM_DMA 0x0 #endif That way the check could be removed at compile time. > if (v->flags & VM_ALLOC) > seq_printf(m, " vmalloc"); > -- Kind regards, Minchan Kim From minchan at kernel.org Tue May 22 07:33:22 2012 From: minchan at kernel.org (Minchan Kim) Date: Tue, 22 May 2012 16:33:22 +0900 Subject: [Linaro-mm-sig] [PATCHv2 4/4] ARM: dma-mapping: remove custom consistent dma region In-Reply-To: <1337252085-22039-5-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-5-git-send-email-m.szyprowski@samsung.com> Message-ID: <4FBB4142.2070709@kernel.org> On 05/17/2012 07:54 PM, Marek Szyprowski wrote: > This patch changes dma-mapping subsystem to use generic vmalloc areas > for all consistent dma allocations. This increases the total size limit > of the consistent allocations and removes platform hacks and a lot of > duplicated code. > I like this patch very much! There are just small nitpicks below. > Atomic allocations are served from special pool preallocated on boot, > becasue vmalloc areas cannot be reliably created in atomic context. typo because > > Signed-off-by: Marek Szyprowski > --- > Documentation/kernel-parameters.txt | 4 + > arch/arm/include/asm/dma-mapping.h | 2 +- > arch/arm/mm/dma-mapping.c | 360 ++++++++++++++++------------------- > 3 files changed, 171 insertions(+), 195 deletions(-) > > diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt > index c1601e5..ba58f50 100644 > --- a/Documentation/kernel-parameters.txt > +++ b/Documentation/kernel-parameters.txt > @@ -515,6 +515,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. > a hypervisor. 
> Default: yes > > + coherent_pool=nn[KMG] [ARM,KNL] > + Sets the size of memory pool for coherent, atomic dma > + allocations. > + > code_bytes [X86] How many bytes of object code to print > in an oops report. > Range: 0 - 8192 > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h > index cb3b7c9..92b0afb 100644 > --- a/arch/arm/include/asm/dma-mapping.h > +++ b/arch/arm/include/asm/dma-mapping.h > @@ -210,7 +210,7 @@ int dma_mmap_writecombine(struct device *, struct vm_area_struct *, > * DMA region above it's default value of 2MB. It must be called before the > * memory allocator is initialised, i.e. before any core_initcall. > */ > -extern void __init init_consistent_dma_size(unsigned long size); > +static inline void init_consistent_dma_size(unsigned long size) { } > > > #ifdef CONFIG_DMABOUNCE > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index db23ae4..3be4de2 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -19,6 +19,8 @@ > #include > #include > #include > +#include > +#include > > #include > #include > @@ -119,210 +121,178 @@ static void __dma_free_buffer(struct page *page, size_t size) > } > > #ifdef CONFIG_MMU > - > -#define CONSISTENT_OFFSET(x) (((unsigned long)(x) - consistent_base) >> PAGE_SHIFT) > -#define CONSISTENT_PTE_INDEX(x) (((unsigned long)(x) - consistent_base) >> PMD_SHIFT) > - > -/* > - * These are the page tables (2MB each) covering uncached, DMA consistent allocations > - */ > -static pte_t **consistent_pte; > - > -#define DEFAULT_CONSISTENT_DMA_SIZE SZ_2M > - > -unsigned long consistent_base = CONSISTENT_END - DEFAULT_CONSISTENT_DMA_SIZE; > - > -void __init init_consistent_dma_size(unsigned long size) > -{ > - unsigned long base = CONSISTENT_END - ALIGN(size, SZ_2M); > - > - BUG_ON(consistent_pte); /* Check we're called before DMA region init */ > - BUG_ON(base < VMALLOC_END); > - > - /* Grow region to accommodate specified size */ > - if (base < 
consistent_base) > - consistent_base = base; > -} > - > -#include "vmregion.h" > - > -static struct arm_vmregion_head consistent_head = { > - .vm_lock = __SPIN_LOCK_UNLOCKED(&consistent_head.vm_lock), > - .vm_list = LIST_HEAD_INIT(consistent_head.vm_list), > - .vm_end = CONSISTENT_END, > -}; > - > #ifdef CONFIG_HUGETLB_PAGE > #error ARM Coherent DMA allocator does not (yet) support huge TLB > #endif > > -/* > - * Initialise the consistent memory allocation. > - */ > -static int __init consistent_init(void) > -{ > - int ret = 0; > - pgd_t *pgd; > - pud_t *pud; > - pmd_t *pmd; > - pte_t *pte; > - int i = 0; > - unsigned long base = consistent_base; > - unsigned long num_ptes = (CONSISTENT_END - base) >> PMD_SHIFT; > - > - consistent_pte = kmalloc(num_ptes * sizeof(pte_t), GFP_KERNEL); > - if (!consistent_pte) { > - pr_err("%s: no memory\n", __func__); > - return -ENOMEM; > - } > - > - pr_debug("DMA memory: 0x%08lx - 0x%08lx:\n", base, CONSISTENT_END); > - consistent_head.vm_start = base; > - > - do { > - pgd = pgd_offset(&init_mm, base); > - > - pud = pud_alloc(&init_mm, pgd, base); > - if (!pud) { > - printk(KERN_ERR "%s: no pud tables\n", __func__); > - ret = -ENOMEM; > - break; > - } > - > - pmd = pmd_alloc(&init_mm, pud, base); > - if (!pmd) { > - printk(KERN_ERR "%s: no pmd tables\n", __func__); > - ret = -ENOMEM; > - break; > - } > - WARN_ON(!pmd_none(*pmd)); > - > - pte = pte_alloc_kernel(pmd, base); > - if (!pte) { > - printk(KERN_ERR "%s: no pte tables\n", __func__); > - ret = -ENOMEM; > - break; > - } > - > - consistent_pte[i++] = pte; > - base += PMD_SIZE; > - } while (base < CONSISTENT_END); > - > - return ret; > -} > - > -core_initcall(consistent_init); > - > static void * > __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, > const void *caller) > { > - struct arm_vmregion *c; > - size_t align; > - int bit; > + struct vm_struct *area; > + unsigned long addr; > > - if (!consistent_pte) { > - printk(KERN_ERR "%s: not 
initialised\n", __func__); > + area = get_vm_area_caller(size, VM_DMA | VM_USERMAP, caller); Out of curiosity. Do we always map dma area into user's address space? > + if (!area) > + return NULL; > + addr = (unsigned long)area->addr; > + area->phys_addr = __pfn_to_phys(page_to_pfn(page)); > + > + if (ioremap_page_range(addr, addr + size, area->phys_addr, prot)) { > + vunmap((void *)addr); > + return NULL; > + } > + return (void *)addr; > +} > + > +static void __dma_free_remap(void *cpu_addr, size_t size) > +{ > + struct vm_struct *area; > + > + read_lock(&vmlist_lock); Why do we need vmlist_lock? > + area = find_vm_area(cpu_addr); find_vm_area only checks vmalloced regions so we need more check. if (!area || !(area->flags & VM_DMA)) > + if (!area) { > + pr_err("%s: trying to free invalid coherent area: %p\n", > + __func__, cpu_addr); > + dump_stack(); > + read_unlock(&vmlist_lock); > + return; > + } > + unmap_kernel_range((unsigned long)cpu_addr, size); > + read_unlock(&vmlist_lock); > + vunmap(cpu_addr); > +} > + > +struct dma_pool { > + size_t size; > + spinlock_t lock; > + unsigned long *bitmap; > + unsigned long count; Nitpick. What does count mean? nr_pages? > + void *vaddr; > + struct page *page; > +}; > + > +static struct dma_pool atomic_pool = { > + .size = SZ_256K, > +}; AFAIUC, we could set it to 2M but you are reducing it to 256K. What's the justification for that default value? > + > +static int __init early_coherent_pool(char *p) > +{ > + atomic_pool.size = memparse(p, &p); > + return 0; > +} > +early_param("coherent_pool", early_coherent_pool); > + > +/* > + * Initialise the coherent pool for atomic allocations. 
> + */ > +static int __init atomic_pool_init(void) > +{ > + struct dma_pool *pool = &atomic_pool; > + pgprot_t prot = pgprot_dmacoherent(pgprot_kernel); > + unsigned long count = pool->size >> PAGE_SHIFT; > + gfp_t gfp = GFP_KERNEL | GFP_DMA; > + unsigned long *bitmap; > + struct page *page; > + void *ptr; > + int bitmap_size = BITS_TO_LONGS(count) * sizeof(long); > + > + bitmap = kzalloc(bitmap_size, GFP_KERNEL); > + if (!bitmap) > + goto no_bitmap; > + > + page = __dma_alloc_buffer(NULL, pool->size, gfp); > + if (!page) > + goto no_page; > + > + ptr = __dma_alloc_remap(page, pool->size, gfp, prot, NULL); > + if (ptr) { > + spin_lock_init(&pool->lock); > + pool->vaddr = ptr; > + pool->page = page; > + pool->bitmap = bitmap; > + pool->count = count; > + pr_info("DMA: preallocated %u KiB pool for atomic coherent allocations\n", > + (unsigned)pool->size / 1024); > + return 0; > + } > + > + __dma_free_buffer(page, pool->size); > +no_page: > + kfree(bitmap); > +no_bitmap: > + pr_err("DMA: failed to allocate %u KiB pool for atomic coherent allocation\n", > + (unsigned)pool->size / 1024); > + return -ENOMEM; > +} > +core_initcall(atomic_pool_init); > + > +static void *__alloc_from_pool(size_t size, struct page **ret_page) > +{ > + struct dma_pool *pool = &atomic_pool; > + unsigned int count = size >> PAGE_SHIFT; > + unsigned int pageno; > + unsigned long flags; > + void *ptr = NULL; > + size_t align; > + > + if (!pool->vaddr) { > + pr_err("%s: coherent pool not initialised!\n", __func__); > dump_stack(); > return NULL; > } > > /* > - * Align the virtual region allocation - maximum alignment is > - * a section size, minimum is a page size. This helps reduce > - * fragmentation of the DMA space, and also prevents allocations > - * smaller than a section from crossing a section boundary. > + * Align the region allocation - allocations from pool are rather > + * small, so align them to their order in pages, minimum is a page > + * size. 
This helps reduce fragmentation of the DMA space. > */ > - bit = fls(size - 1); > - if (bit > SECTION_SHIFT) > - bit = SECTION_SHIFT; > - align = 1 << bit; > + align = PAGE_SIZE << get_order(size); > > - /* > - * Allocate a virtual address in the consistent mapping region. > - */ > - c = arm_vmregion_alloc(&consistent_head, align, size, > - gfp & ~(__GFP_DMA | __GFP_HIGHMEM), caller); > - if (c) { > - pte_t *pte; > - int idx = CONSISTENT_PTE_INDEX(c->vm_start); > - u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); > - > - pte = consistent_pte[idx] + off; > - c->vm_pages = page; > - > - do { > - BUG_ON(!pte_none(*pte)); > - > - set_pte_ext(pte, mk_pte(page, prot), 0); > - page++; > - pte++; > - off++; > - if (off >= PTRS_PER_PTE) { > - off = 0; > - pte = consistent_pte[++idx]; > - } > - } while (size -= PAGE_SIZE); > - > - dsb(); > - > - return (void *)c->vm_start; > + spin_lock_irqsave(&pool->lock, flags); > + pageno = bitmap_find_next_zero_area(pool->bitmap, pool->count, > + 0, count, (1 << align) - 1); > + if (pageno < pool->count) { > + bitmap_set(pool->bitmap, pageno, count); > + ptr = pool->vaddr + PAGE_SIZE * pageno; > + *ret_page = pool->page + pageno; > } > - return NULL; > + spin_unlock_irqrestore(&pool->lock, flags); > + > + return ptr; > } > > -static void __dma_free_remap(void *cpu_addr, size_t size) > +static int __free_from_pool(void *start, size_t size) > { > - struct arm_vmregion *c; > - unsigned long addr; > - pte_t *ptep; > - int idx; > - u32 off; > + struct dma_pool *pool = &atomic_pool; > + unsigned long pageno, count; > + unsigned long flags; > > - c = arm_vmregion_find_remove(&consistent_head, (unsigned long)cpu_addr); > - if (!c) { > - printk(KERN_ERR "%s: trying to free invalid coherent area: %p\n", > - __func__, cpu_addr); > + if (start < pool->vaddr || start > pool->vaddr + pool->size) > + return 0; > + > + if (start + size > pool->vaddr + pool->size) { > + pr_err("%s: freeing wrong coherent size from pool\n", __func__); > 
dump_stack(); > - return; > + return 0; > } > > - if ((c->vm_end - c->vm_start) != size) { > - printk(KERN_ERR "%s: freeing wrong coherent size (%ld != %d)\n", > - __func__, c->vm_end - c->vm_start, size); > - dump_stack(); > - size = c->vm_end - c->vm_start; > - } > + pageno = (start - pool->vaddr) >> PAGE_SHIFT; > + count = size >> PAGE_SHIFT; > > - idx = CONSISTENT_PTE_INDEX(c->vm_start); > - off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); > - ptep = consistent_pte[idx] + off; > - addr = c->vm_start; > - do { > - pte_t pte = ptep_get_and_clear(&init_mm, addr, ptep); > + spin_lock_irqsave(&pool->lock, flags); > + bitmap_clear(pool->bitmap, pageno, count); > + spin_unlock_irqrestore(&pool->lock, flags); > > - ptep++; > - addr += PAGE_SIZE; > - off++; > - if (off >= PTRS_PER_PTE) { > - off = 0; > - ptep = consistent_pte[++idx]; > - } > - > - if (pte_none(pte) || !pte_present(pte)) > - printk(KERN_CRIT "%s: bad page in kernel page table\n", > - __func__); > - } while (size -= PAGE_SIZE); > - > - flush_tlb_kernel_range(c->vm_start, c->vm_end); > - > - arm_vmregion_free(&consistent_head, c); > + return 1; > } > > #else /* !CONFIG_MMU */ > > #define __dma_alloc_remap(page, size, gfp, prot, c) page_address(page) > #define __dma_free_remap(addr, size) do { } while (0) > +#define __alloc_from_pool(size, ret_page) NULL > +#define __free_from_pool(addr, size) 0 > > #endif /* CONFIG_MMU */ > > @@ -345,6 +315,16 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, > *handle = ~0; > size = PAGE_ALIGN(size); > > + /* > + * Atomic allocations need special handling > + */ > + if (gfp & GFP_ATOMIC && !arch_is_coherent()) { > + addr = __alloc_from_pool(size, &page); > + if (addr) > + *handle = pfn_to_dma(dev, page_to_pfn(page)); > + return addr; > + } > + > page = __dma_alloc_buffer(dev, size, gfp); > if (!page) > return NULL; > @@ -398,24 +378,16 @@ static int dma_mmap(struct device *dev, struct vm_area_struct *vma, > { > int ret = -ENXIO; > 
#ifdef CONFIG_MMU > - unsigned long user_size, kern_size; > - struct arm_vmregion *c; > + unsigned long user_count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > + unsigned long pfn = dma_to_pfn(dev, dma_addr); > + unsigned long off = vma->vm_pgoff; > > - user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; > - > - c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); > - if (c) { > - unsigned long off = vma->vm_pgoff; > - > - kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT; > - > - if (off < kern_size && > - user_size <= (kern_size - off)) { > - ret = remap_pfn_range(vma, vma->vm_start, > - page_to_pfn(c->vm_pages) + off, > - user_size << PAGE_SHIFT, > - vma->vm_page_prot); > - } > + if (off < count && user_count <= (count - off)) { > + ret = remap_pfn_range(vma, vma->vm_start, > + pfn + off, > + user_count << PAGE_SHIFT, > + vma->vm_page_prot); > } > #endif /* CONFIG_MMU */ > > @@ -444,13 +416,16 @@ EXPORT_SYMBOL(dma_mmap_writecombine); > */ > void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle) > { > - WARN_ON(irqs_disabled()); > - > if (dma_release_from_coherent(dev, get_order(size), cpu_addr)) > return; > > size = PAGE_ALIGN(size); > > + if (__free_from_pool(cpu_addr, size)) > + return; > + > + WARN_ON(irqs_disabled()); > + > if (!arch_is_coherent()) > __dma_free_remap(cpu_addr, size); > > @@ -726,9 +701,6 @@ EXPORT_SYMBOL(dma_set_mask); > > static int __init dma_debug_do_init(void) > { > -#ifdef CONFIG_MMU > - arm_vmregion_create_proc("dma-mappings", &consistent_head); > -#endif > dma_debug_init(PREALLOC_DMA_DEBUG_ENTRIES); > return 0; > } -- Kind regards, Minchan Kim From m.szyprowski at samsung.com Tue May 22 07:40:17 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Tue, 22 May 2012 09:40:17 +0200 Subject: [Linaro-mm-sig] [GIT PULL] CMA and ARM DMA-mapping updates for v3.5 Message-ID: 
<1337672417-10065-1-git-send-email-m.szyprowski@samsung.com> Hi Linus, I would like to ask you to pull the Contiguous Memory Allocator (CMA) and ARM DMA-mapping framework updates for v3.5. The following changes since commit 76e10d158efb6d4516018846f60c2ab5501900bc: Linux 3.4 (2012-05-20 15:29:13 -0700) with the top-most commit 0f51596bd39a5c928307ffcffc9ba07f90f42a8b Merge branch 'for-next-arm-dma' into for-linus are available in the git repository at: git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git for-linus These patches contain two major updates for the DMA mapping subsystem (mainly for the ARM architecture). The first one is the Contiguous Memory Allocator (CMA), which makes it possible for device drivers to allocate big contiguous chunks of memory after the system has booted. The main difference from similar frameworks is that CMA allows the memory region reserved for big chunk allocations to be transparently reused as system memory, so no memory is wasted when no big chunk is allocated. Once an alloc request is issued, the framework migrates system pages to create space for the required big chunk of physically contiguous memory. For more information one can refer to these nice LWN articles: 'A reworked contiguous memory allocator': http://lwn.net/Articles/447405/ 'CMA and ARM': http://lwn.net/Articles/450286/ 'A deep dive into CMA': http://lwn.net/Articles/486301/ and the following thread with the patches and links to all previous versions: https://lkml.org/lkml/2012/4/3/204 The main client for this new framework is the ARM DMA-mapping subsystem. The second part provides a complete redesign of the ARM DMA-mapping subsystem. The core implementation has been changed to use the common struct dma_map_ops based infrastructure with the recent updates for new dma attributes merged in v3.4-rc2. This allows more than one implementation of the dma-mapping calls to be used, and lets them be changed/selected on a per struct device basis. 
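To make the "selected per struct device" idea concrete, here is a minimal user-space sketch of the pattern. It is not the kernel code: struct model_dma_ops is a one-callback stand-in for struct dma_map_ops, all model_* names are hypothetical, and the "offset" implementation is an arbitrary placeholder for an alternative backend such as a bounce-buffer or IOMMU mapper.

```c
#include <stddef.h>

/* Simplified stand-in for struct dma_map_ops: a single callback only. */
struct model_dma_ops {
    const char *name;
    unsigned long (*map)(unsigned long phys);
};

/* Simplified stand-in for struct device. */
struct model_device {
    const struct model_dma_ops *dma_ops; /* NULL means "use the default" */
};

/* Two illustrative implementations of the same call. */
static unsigned long map_direct(unsigned long phys) { return phys; }
static unsigned long map_offset(unsigned long phys) { return phys + 0x1000; }

static const struct model_dma_ops direct_ops = { "direct", map_direct };
static const struct model_dma_ops offset_ops = { "offset", map_offset };

/* Pick the ops table attached to the device, else fall back to the default. */
static const struct model_dma_ops *model_get_dma_ops(const struct model_device *dev)
{
    if (dev && dev->dma_ops)
        return dev->dma_ops;
    return &direct_ops;
}

/* Generic entry point: callers never name a specific implementation. */
static unsigned long model_dma_map(const struct model_device *dev, unsigned long phys)
{
    return model_get_dma_ops(dev)->map(phys);
}
```

A driver-facing call such as model_dma_map() stays the same for every device; only the ops pointer stored in the device decides which implementation runs.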
The first client of this new infrastructure is the dmabounce implementation, which has been completely cut out of the core, common code. The last patch of this redesign update introduces a new, experimental implementation of dma-mapping calls on top of the generic IOMMU framework. This lets ARM sub-platforms transparently use an IOMMU for DMA-mapping calls if one provides the required IOMMU hardware. For more information please refer to the following thread: http://www.spinics.net/lists/arm-kernel/msg175729.html The last patch merges changes from both updates and provides a resolution for the conflicts which cannot be avoided when patches have been applied on the same files (mainly arch/arm/mm/dma-mapping.c). Thanks! Best regards Marek Szyprowski Samsung Poland R&D Center Patch summary: Marek Szyprowski (17): common: add dma_mmap_from_coherent() function ARM: dma-mapping: use dma_mmap_from_coherent() ARM: dma-mapping: use pr_* instread of printk ARM: dma-mapping: introduce DMA_ERROR_CODE constant ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops ARM: dma-mapping: use asm-generic/dma-mapping-common.h ARM: dma-mapping: implement dma sg methods on top of any generic dma ops ARM: dma-mapping: move all dma bounce code to separate dma ops structure ARM: dma-mapping: remove redundant code and do the cleanup ARM: dma-mapping: use alloc, mmap, free from dma_ops ARM: dma-mapping: add support for IOMMU mapper mm: extract reclaim code from __alloc_pages_direct_reclaim() mm: trigger page reclaim in alloc_contig_range() to stabilise watermarks drivers: add Contiguous Memory Allocator X86: integrate CMA with DMA-mapping subsystem ARM: integrate CMA with DMA-mapping subsystem Merge branch 'for-next-arm-dma' into for-linus Mel Gorman (1): mm: Serialize access to min_free_kbytes Michal Nazarewicz (9): mm: page_alloc: remove trailing whitespace mm: compaction: introduce isolate_migratepages_range() mm: compaction: introduce map_pages() mm: compaction: introduce 
isolate_freepages_range() mm: compaction: export some of the functions mm: page_alloc: introduce alloc_contig_range() mm: page_alloc: change fallbacks array handling mm: mmzone: MIGRATE_CMA migration type added mm: page_isolation: MIGRATE_CMA isolation functions added Minchan Kim (1): cma: fix migration mode Vitaly Andrianov (1): ARM: dma-mapping: use PMD size for section unmap Documentation/kernel-parameters.txt | 9 + arch/Kconfig | 3 + arch/arm/Kconfig | 11 + arch/arm/common/dmabounce.c | 84 ++- arch/arm/include/asm/device.h | 4 + arch/arm/include/asm/dma-contiguous.h | 15 + arch/arm/include/asm/dma-iommu.h | 34 + arch/arm/include/asm/dma-mapping.h | 407 +++-------- arch/arm/include/asm/mach/map.h | 1 + arch/arm/kernel/setup.c | 9 +- arch/arm/mm/dma-mapping.c | 1348 ++++++++++++++++++++++++++++----- arch/arm/mm/init.c | 23 +- arch/arm/mm/mm.h | 3 + arch/arm/mm/mmu.c | 31 +- arch/arm/mm/vmregion.h | 2 +- arch/x86/Kconfig | 1 + arch/x86/include/asm/dma-contiguous.h | 13 + arch/x86/include/asm/dma-mapping.h | 5 + arch/x86/kernel/pci-dma.c | 18 +- arch/x86/kernel/pci-nommu.c | 8 +- arch/x86/kernel/setup.c | 2 + drivers/base/Kconfig | 89 +++ drivers/base/Makefile | 1 + drivers/base/dma-coherent.c | 42 + drivers/base/dma-contiguous.c | 401 ++++++++++ include/asm-generic/dma-coherent.h | 4 +- include/asm-generic/dma-contiguous.h | 28 + include/linux/device.h | 4 + include/linux/dma-contiguous.h | 110 +++ include/linux/gfp.h | 12 + include/linux/mmzone.h | 47 +- include/linux/page-isolation.h | 18 +- mm/Kconfig | 2 +- mm/Makefile | 3 +- mm/compaction.c | 418 +++++++---- mm/internal.h | 33 + mm/memory-failure.c | 2 +- mm/memory_hotplug.c | 6 +- mm/page_alloc.c | 409 +++++++++-- mm/page_isolation.c | 15 +- mm/vmstat.c | 3 + 41 files changed, 2898 insertions(+), 780 deletions(-) create mode 100644 arch/arm/include/asm/dma-contiguous.h create mode 100644 arch/arm/include/asm/dma-iommu.h create mode 100644 arch/x86/include/asm/dma-contiguous.h create mode 100644 
drivers/base/dma-contiguous.c create mode 100644 include/asm-generic/dma-contiguous.h create mode 100644 include/linux/dma-contiguous.h From 21cnbao at gmail.com Tue May 22 08:27:49 2012 From: 21cnbao at gmail.com (Barry Song) Date: Tue, 22 May 2012 16:27:49 +0800 Subject: [Linaro-mm-sig] [GIT PULL] CMA and ARM DMA-mapping updates for v3.5 In-Reply-To: <1337672417-10065-1-git-send-email-m.szyprowski@samsung.com> References: <1337672417-10065-1-git-send-email-m.szyprowski@samsung.com> Message-ID: Hi Marek, 2012/5/22 Marek Szyprowski : > Hi Linus, > > I would like to ask for pulling Contiguous Memory Allocator (CMA) and > ARM DMA-mapping framework updates for v3.5. > > The following changes since commit 76e10d158efb6d4516018846f60c2ab5501900bc: > > Linux 3.4 (2012-05-20 15:29:13 -0700) > > with the top-most commit 0f51596bd39a5c928307ffcffc9ba07f90f42a8b > > Merge branch 'for-next-arm-dma' into for-linus > > are available in the git repository at: > git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git for-linus > > > Patch summary: > > Marek Szyprowski (17): > common: add dma_mmap_from_coherent() function > ARM: dma-mapping: use dma_mmap_from_coherent() > ARM: dma-mapping: use pr_* instread of printk > ARM: dma-mapping: introduce DMA_ERROR_CODE constant > ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops > ARM: dma-mapping: use asm-generic/dma-mapping-common.h > ARM: dma-mapping: implement dma sg methods on top of any generic dma ops > ARM: dma-mapping: move all dma bounce code to separate dma ops structure > ARM: dma-mapping: remove redundant code and do the cleanup > ARM: dma-mapping: use alloc, mmap, free from dma_ops > ARM: dma-mapping: add support for IOMMU mapper > mm: extract reclaim code from __alloc_pages_direct_reclaim() > mm: trigger page reclaim in alloc_contig_range() to stabilise watermarks > drivers: add Contiguous Memory Allocator > 
X86: integrate CMA with DMA-mapping subsystem > ARM: integrate CMA with DMA-mapping subsystem > Merge branch 'for-next-arm-dma' into for-linus > > Mel Gorman (1): > mm: Serialize access to min_free_kbytes > > Michal Nazarewicz (9): > mm: page_alloc: remove trailing whitespace > mm: compaction: introduce isolate_migratepages_range() > mm: compaction: introduce map_pages() > mm: compaction: introduce isolate_freepages_range() > mm: compaction: export some of the functions > mm: page_alloc: introduce alloc_contig_range() > mm: page_alloc: change fallbacks array handling > mm: mmzone: MIGRATE_CMA migration type added > mm: page_isolation: MIGRATE_CMA isolation functions added > > Minchan Kim (1): > cma: fix migration mode > > Vitaly Andrianov (1): > ARM: dma-mapping: use PMD size for section unmap Would you please add the CMA test module to this patchset too? http://lists.infradead.org/pipermail/linux-arm-kernel/2012-March/088412.html mm: cma: add a simple kernel module as the helper to test CMA Signed-off-by: Barry Song Reviewed-by: Michal Nazarewicz Thanks barry From t.stanislaws at samsung.com Tue May 22 11:51:46 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Tue, 22 May 2012 13:51:46 +0200 Subject: [Linaro-mm-sig] [RFC 05/13] v4l: vb2-dma-contig: add support for DMABUF exporting In-Reply-To: <1813415.rG2izL3i2h@avalon> References: <1334063447-16824-1-git-send-email-t.stanislaws@samsung.com> <3143149.ZCeOjVLXgh@avalon> <4F8FEC04.3030700@samsung.com> <1813415.rG2izL3i2h@avalon> Message-ID: <4FBB7DD2.10108@samsung.com> Hi Laurent, Sorry for the late reply. Thank you very much for noticing the issue. 
>>>> +static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv) >>>> +{ >>>> + struct vb2_dc_buf *buf = buf_priv; >>>> + struct dma_buf *dbuf; >>>> + >>>> + if (buf->dma_buf) >>>> + return buf->dma_buf; >>> >>> Can't there be a race condition here if the user closes the DMABUF file >>> handle before vb2 core calls dma_buf_fd() ? >> >> The user cannot access the file until it is associated with a file >> descriptor. How can the user close it? Could you give me a more detailed >> description of this potential race condition? > > Let's assume the V4L2 buffer has already been exported once. buf->dma_buf is > set to a non-NULL value, and the application has an open file handle for the > buffer. The application then tries to export the buffer a second time. > vb2_dc_get_dmabuf() gets called, checks buf->dma_buf and returns it as it's > non-NULL. Right after that, before the vb2 core calls dma_buf_fd() on the > struct dma_buf, the application closes the file handle to the exported buffer. > The struct dma_buf object gets freed, as the reference count drops to 0. I am not sure if reference counter drops to 0 in this case. Notice that after first dma_buf_export the dma_buf's refcnt is 1, after first dma_buf_fd it goes to 2. After closing a file handle the refcnt drops back to 1 so the file (and dma_buf) is not released. Therefore IMO there no dangling pointer issue. However it looks that there is a circular reference between vb2_dc_buf and dma_buf. vb2_dc_buf::dma_buf is pointing to dma_buf with reference taken at dma_buf_export. On the other hand the dma_buf->priv is pointing to vb2_dc_buf with reference taken at atomic_inc(&buf->refcount) in vb2_dc_get_dmabuf. The circular reference leads into resource leakage. The problem could be fixed by dropping caching of dma_buf pointer. The new dma_buf would be created and exported at every export event. The dma_buf_put would be called in vb2_expbuf just after successful dma_buf_fd. 
Do you have any ideas how I could deal with resource leakage/dangling problems
without creating a new dma_buf instance at every export event?

Regards,
Tomasz Stanislawski

> vb2 core will then try to call dma_buf_fd() on a dma_buf object that has been
> freed.
>
>>>> +	/* dmabuf keeps reference to vb2 buffer */
>>>> +	atomic_inc(&buf->refcount);
>>>> +	dbuf = dma_buf_export(buf, &vb2_dc_dmabuf_ops, buf->size, 0);
>>>> +	if (IS_ERR(dbuf)) {
>>>> +		atomic_dec(&buf->refcount);
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	buf->dma_buf = dbuf;
>>>> +
>>>> +	return dbuf;
>>>> +}
>

From airlied at gmail.com Tue May 22 12:34:38 2012
From: airlied at gmail.com (Dave Airlie)
Date: Tue, 22 May 2012 13:34:38 +0100
Subject: [Linaro-mm-sig] [PATCH] dma-buf: add initial vmap documentation
Message-ID: <1337690078-5277-1-git-send-email-airlied@gmail.com>

From: Dave Airlie

Signed-off-by: Dave Airlie
---
 Documentation/dma-buf-sharing.txt | 11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
index 3bbd5c5..98e9fa0 100644
--- a/Documentation/dma-buf-sharing.txt
+++ b/Documentation/dma-buf-sharing.txt
@@ -300,6 +300,17 @@ Access to a dma_buf from the kernel context involves three steps:
    Note that these calls need to always succeed. The exporter needs to complete
    any preparations that might fail in begin_cpu_access.
 
+   For some circumstances the overhead of kmap can be too high, a vmap interface
+   is introduced. This interface shouldn't be used very carefully, as vmalloc
+   space is a limited resources on many architectures.
+
+   Interfaces:
+      void *dma_buf_vmap(struct dma_buf *dmabuf)
+      void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
+
+   This call can fail if there is no vmap support in the exporter, or if it
+   runs out of vmalloc space. Fallback to kmap should be implemented.
+
 3. Finish access

    When the importer is done accessing the range specified in begin_cpu_access,
-- 
1.7.6

From chris at chris-wilson.co.uk Tue May 22 12:44:13 2012
From: chris at chris-wilson.co.uk (Chris Wilson)
Date: Tue, 22 May 2012 13:44:13 +0100
Subject: [Linaro-mm-sig] [PATCH] dma-buf: add initial vmap documentation
In-Reply-To: <1337690078-5277-1-git-send-email-airlied@gmail.com>
References: <1337690078-5277-1-git-send-email-airlied@gmail.com>
Message-ID: <1337690665_180141@CP5-2952>

On Tue, 22 May 2012 13:34:38 +0100, Dave Airlie wrote:
> From: Dave Airlie

Inline comment for one sentence that made no sense.

> Signed-off-by: Dave Airlie
> ---
>  Documentation/dma-buf-sharing.txt | 11 +++++++++++
>  1 files changed, 11 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
> index 3bbd5c5..98e9fa0 100644
> --- a/Documentation/dma-buf-sharing.txt
> +++ b/Documentation/dma-buf-sharing.txt
> @@ -300,6 +300,17 @@ Access to a dma_buf from the kernel context involves three steps:
>     Note that these calls need to always succeed. The exporter needs to complete
>     any preparations that might fail in begin_cpu_access.
>
> +   For some circumstances the overhead of kmap can be too high, a vmap interface
> +   is introduced. This interface shouldn't be used very carefully, as vmalloc

This interface should be used carefully. Sparingly would also be
appropriate, though in less regular usage than carefully.

> +   space is a limited resources on many architectures.
> +
> +   Interfaces:
> +      void *dma_buf_vmap(struct dma_buf *dmabuf)
> +      void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
> +
> +   This call can fail if there is no vmap support in the exporter, or if it
> +   runs out of vmalloc space. Fallback to kmap should be implemented.
> +
>  3.
Finish access -- Chris Wilson, Intel Open Source Technology Centre From m.szyprowski at samsung.com Tue May 22 12:50:20 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Tue, 22 May 2012 14:50:20 +0200 Subject: [Linaro-mm-sig] [PATCHv2 4/4] ARM: dma-mapping: remove custom consistent dma region In-Reply-To: <4FBB4142.2070709@kernel.org> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-5-git-send-email-m.szyprowski@samsung.com> <4FBB4142.2070709@kernel.org> Message-ID: <000001cd3819$722aae30$56800a90$%szyprowski@samsung.com> Hello, On Tuesday, May 22, 2012 9:33 AM Minchan Kim wrote: > On 05/17/2012 07:54 PM, Marek Szyprowski wrote: > > > This patch changes dma-mapping subsystem to use generic vmalloc areas > > for all consistent dma allocations. This increases the total size limit > > of the consistent allocations and removes platform hacks and a lot of > > duplicated code. > > > > I like this patch very much! Thanks! > There are just small nitpicks below. > > > Atomic allocations are served from special pool preallocated on boot, > > becasue vmalloc areas cannot be reliably created in atomic context. > > > typo because > > > > > > Signed-off-by: Marek Szyprowski > > --- > > Documentation/kernel-parameters.txt | 4 + > > arch/arm/include/asm/dma-mapping.h | 2 +- > > arch/arm/mm/dma-mapping.c | 360 ++++++++++++++++------------------- > > 3 files changed, 171 insertions(+), 195 deletions(-) > > > > diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt > > index c1601e5..ba58f50 100644 > > --- a/Documentation/kernel-parameters.txt > > +++ b/Documentation/kernel-parameters.txt > > @@ -515,6 +515,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. > > a hypervisor. > > Default: yes > > > > + coherent_pool=nn[KMG] [ARM,KNL] > > + Sets the size of memory pool for coherent, atomic dma > > + allocations. 
> > + > > code_bytes [X86] How many bytes of object code to print > > in an oops report. > > Range: 0 - 8192 > > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h > > index cb3b7c9..92b0afb 100644 > > --- a/arch/arm/include/asm/dma-mapping.h > > +++ b/arch/arm/include/asm/dma-mapping.h > > @@ -210,7 +210,7 @@ int dma_mmap_writecombine(struct device *, struct vm_area_struct *, > > * DMA region above it's default value of 2MB. It must be called before the > > * memory allocator is initialised, i.e. before any core_initcall. > > */ > > -extern void __init init_consistent_dma_size(unsigned long size); > > +static inline void init_consistent_dma_size(unsigned long size) { } > > > > > > #ifdef CONFIG_DMABOUNCE > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > > index db23ae4..3be4de2 100644 > > --- a/arch/arm/mm/dma-mapping.c > > +++ b/arch/arm/mm/dma-mapping.c > > @@ -19,6 +19,8 @@ > > #include > > #include > > #include > > +#include > > +#include > > > > #include > > #include > > @@ -119,210 +121,178 @@ static void __dma_free_buffer(struct page *page, size_t size) > > } > > > > #ifdef CONFIG_MMU > > - > > -#define CONSISTENT_OFFSET(x) (((unsigned long)(x) - consistent_base) >> PAGE_SHIFT) > > -#define CONSISTENT_PTE_INDEX(x) (((unsigned long)(x) - consistent_base) >> PMD_SHIFT) > > - > > -/* > > - * These are the page tables (2MB each) covering uncached, DMA consistent allocations > > - */ > > -static pte_t **consistent_pte; > > - > > -#define DEFAULT_CONSISTENT_DMA_SIZE SZ_2M > > - > > -unsigned long consistent_base = CONSISTENT_END - DEFAULT_CONSISTENT_DMA_SIZE; > > - > > -void __init init_consistent_dma_size(unsigned long size) > > -{ > > - unsigned long base = CONSISTENT_END - ALIGN(size, SZ_2M); > > - > > - BUG_ON(consistent_pte); /* Check we're called before DMA region init */ > > - BUG_ON(base < VMALLOC_END); > > - > > - /* Grow region to accommodate specified size */ > > - if (base < consistent_base) > > 
- consistent_base = base; > > -} > > - > > -#include "vmregion.h" > > - > > -static struct arm_vmregion_head consistent_head = { > > - .vm_lock = __SPIN_LOCK_UNLOCKED(&consistent_head.vm_lock), > > - .vm_list = LIST_HEAD_INIT(consistent_head.vm_list), > > - .vm_end = CONSISTENT_END, > > -}; > > - > > #ifdef CONFIG_HUGETLB_PAGE > > #error ARM Coherent DMA allocator does not (yet) support huge TLB > > #endif > > > > -/* > > - * Initialise the consistent memory allocation. > > - */ > > -static int __init consistent_init(void) > > -{ > > - int ret = 0; > > - pgd_t *pgd; > > - pud_t *pud; > > - pmd_t *pmd; > > - pte_t *pte; > > - int i = 0; > > - unsigned long base = consistent_base; > > - unsigned long num_ptes = (CONSISTENT_END - base) >> PMD_SHIFT; > > - > > - consistent_pte = kmalloc(num_ptes * sizeof(pte_t), GFP_KERNEL); > > - if (!consistent_pte) { > > - pr_err("%s: no memory\n", __func__); > > - return -ENOMEM; > > - } > > - > > - pr_debug("DMA memory: 0x%08lx - 0x%08lx:\n", base, CONSISTENT_END); > > - consistent_head.vm_start = base; > > - > > - do { > > - pgd = pgd_offset(&init_mm, base); > > - > > - pud = pud_alloc(&init_mm, pgd, base); > > - if (!pud) { > > - printk(KERN_ERR "%s: no pud tables\n", __func__); > > - ret = -ENOMEM; > > - break; > > - } > > - > > - pmd = pmd_alloc(&init_mm, pud, base); > > - if (!pmd) { > > - printk(KERN_ERR "%s: no pmd tables\n", __func__); > > - ret = -ENOMEM; > > - break; > > - } > > - WARN_ON(!pmd_none(*pmd)); > > - > > - pte = pte_alloc_kernel(pmd, base); > > - if (!pte) { > > - printk(KERN_ERR "%s: no pte tables\n", __func__); > > - ret = -ENOMEM; > > - break; > > - } > > - > > - consistent_pte[i++] = pte; > > - base += PMD_SIZE; > > - } while (base < CONSISTENT_END); > > - > > - return ret; > > -} > > - > > -core_initcall(consistent_init); > > - > > static void * > > __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, > > const void *caller) > > { > > - struct arm_vmregion *c; > > - size_t align; > 
> - int bit; > > + struct vm_struct *area; > > + unsigned long addr; > > > > - if (!consistent_pte) { > > - printk(KERN_ERR "%s: not initialised\n", __func__); > > + area = get_vm_area_caller(size, VM_DMA | VM_USERMAP, caller); > > Out of curiosity. > Do we always map dma area into user's address space? Nope, but there is always such possibility that the driver calls dma_mmap_*() and lets user space to access the memory allocated for dma. > > + if (!area) > > + return NULL; > > + addr = (unsigned long)area->addr; > > + area->phys_addr = __pfn_to_phys(page_to_pfn(page)); > > + > > + if (ioremap_page_range(addr, addr + size, area->phys_addr, prot)) { > > + vunmap((void *)addr); > > > + return NULL; > > > + } > > + return (void *)addr; > > +} > > + > > +static void __dma_free_remap(void *cpu_addr, size_t size) > > +{ > > + struct vm_struct *area; > > + > > + read_lock(&vmlist_lock); > > > Why do we need vmlist_lock? In fact, this one is not really needed here. I've copied it from arch/arm/mm/ioremap.c and then replaced search loop with find_vm_area(). Now I see that find_vm_area() uses internal locks in find_vmap_area(), so the global vmlist_lock is not needed here. > > + area = find_vm_area(cpu_addr); > > > find_vm_area only checks vmalloced regions so we need more check. > > if (!area || !(area->flags & VM_DMA)) > > > + if (!area) { > > + pr_err("%s: trying to free invalid coherent area: %p\n", > > + __func__, cpu_addr); > > + dump_stack(); > > + read_unlock(&vmlist_lock); > > + return; > > + } > > + unmap_kernel_range((unsigned long)cpu_addr, size); > > + read_unlock(&vmlist_lock); > > + vunmap(cpu_addr); > > +} > > + > > +struct dma_pool { > > + size_t size; > > + spinlock_t lock; > > + unsigned long *bitmap; > > + unsigned long count; > > > Nitpick. What does count mean? > nr_pages? Right, I will rename it to nr_pages as it is much better name. 
> > + void *vaddr; > > + struct page *page; > > +}; > > + > > +static struct dma_pool atomic_pool = { > > + .size = SZ_256K, > > +}; > > > AFAIUC, we could set it to 2M but you are reducing it to 256K. > What's the justification for that default value? I want to reduce memory waste. This atomic_pool is very rarely used (non-atomic allocations don't use this pool, kernel mappings are created on fly for them). The original consistent dma size on ARM was set to 2MiB, but it covered both atomic and non-atomic allocations. Some time ago (in the context of CMA/Contiguous Memory Allocator in Cambourne during Linaro MM meeting) we have discussed the idea of pool for the atomic allocations and the conclusion was that the 1/8 of the original consistent dma size is probably more than enough. This value can be adjusted later if really needed or set with kernel boot parameter for some rare systems that needs more. > > + > > +static int __init early_coherent_pool(char *p) > > +{ > > + atomic_pool.size = memparse(p, &p); > > + return 0; > > +} > > +early_param("coherent_pool", early_coherent_pool); > > + > > +/* > > + * Initialise the coherent pool for atomic allocations. 
> > + */ > > +static int __init atomic_pool_init(void) > > +{ > > + struct dma_pool *pool = &atomic_pool; > > + pgprot_t prot = pgprot_dmacoherent(pgprot_kernel); > > + unsigned long count = pool->size >> PAGE_SHIFT; > > + gfp_t gfp = GFP_KERNEL | GFP_DMA; > > + unsigned long *bitmap; > > + struct page *page; > > + void *ptr; > > + int bitmap_size = BITS_TO_LONGS(count) * sizeof(long); > > + > > + bitmap = kzalloc(bitmap_size, GFP_KERNEL); > > + if (!bitmap) > > + goto no_bitmap; > > + > > + page = __dma_alloc_buffer(NULL, pool->size, gfp); > > + if (!page) > > + goto no_page; > > + > > + ptr = __dma_alloc_remap(page, pool->size, gfp, prot, NULL); > > + if (ptr) { > > + spin_lock_init(&pool->lock); > > + pool->vaddr = ptr; > > + pool->page = page; > > + pool->bitmap = bitmap; > > + pool->count = count; > > + pr_info("DMA: preallocated %u KiB pool for atomic coherent allocations\n", > > + (unsigned)pool->size / 1024); > > + return 0; > > + } > > + > > + __dma_free_buffer(page, pool->size); > > +no_page: > > + kfree(bitmap); > > +no_bitmap: > > + pr_err("DMA: failed to allocate %u KiB pool for atomic coherent allocation\n", > > + (unsigned)pool->size / 1024); > > + return -ENOMEM; > > +} > > +core_initcall(atomic_pool_init); > > + > > +static void *__alloc_from_pool(size_t size, struct page **ret_page) > > +{ > > + struct dma_pool *pool = &atomic_pool; > > + unsigned int count = size >> PAGE_SHIFT; > > + unsigned int pageno; > > + unsigned long flags; > > + void *ptr = NULL; > > + size_t align; > > + > > + if (!pool->vaddr) { > > + pr_err("%s: coherent pool not initialised!\n", __func__); > > dump_stack(); > > return NULL; > > } > > > > /* > > - * Align the virtual region allocation - maximum alignment is > > - * a section size, minimum is a page size. This helps reduce > > - * fragmentation of the DMA space, and also prevents allocations > > - * smaller than a section from crossing a section boundary. 
> > + * Align the region allocation - allocations from pool are rather > > + * small, so align them to their order in pages, minimum is a page > > + * size. This helps reduce fragmentation of the DMA space. > > */ > > - bit = fls(size - 1); > > - if (bit > SECTION_SHIFT) > > - bit = SECTION_SHIFT; > > - align = 1 << bit; > > + align = PAGE_SIZE << get_order(size); > > > > - /* > > - * Allocate a virtual address in the consistent mapping region. > > - */ > > - c = arm_vmregion_alloc(&consistent_head, align, size, > > - gfp & ~(__GFP_DMA | __GFP_HIGHMEM), caller); > > - if (c) { > > - pte_t *pte; > > - int idx = CONSISTENT_PTE_INDEX(c->vm_start); > > - u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); > > - > > - pte = consistent_pte[idx] + off; > > - c->vm_pages = page; > > - > > - do { > > - BUG_ON(!pte_none(*pte)); > > - > > - set_pte_ext(pte, mk_pte(page, prot), 0); > > - page++; > > - pte++; > > - off++; > > - if (off >= PTRS_PER_PTE) { > > - off = 0; > > - pte = consistent_pte[++idx]; > > - } > > - } while (size -= PAGE_SIZE); > > - > > - dsb(); > > - > > - return (void *)c->vm_start; > > + spin_lock_irqsave(&pool->lock, flags); > > + pageno = bitmap_find_next_zero_area(pool->bitmap, pool->count, > > + 0, count, (1 << align) - 1); > > + if (pageno < pool->count) { > > + bitmap_set(pool->bitmap, pageno, count); > > + ptr = pool->vaddr + PAGE_SIZE * pageno; > > + *ret_page = pool->page + pageno; > > } > > - return NULL; > > + spin_unlock_irqrestore(&pool->lock, flags); > > + > > + return ptr; > > } > > > > -static void __dma_free_remap(void *cpu_addr, size_t size) > > +static int __free_from_pool(void *start, size_t size) > > { > > - struct arm_vmregion *c; > > - unsigned long addr; > > - pte_t *ptep; > > - int idx; > > - u32 off; > > + struct dma_pool *pool = &atomic_pool; > > + unsigned long pageno, count; > > + unsigned long flags; > > > > - c = arm_vmregion_find_remove(&consistent_head, (unsigned long)cpu_addr); > > - if (!c) { > > - 
printk(KERN_ERR "%s: trying to free invalid coherent area: %p\n", > > - __func__, cpu_addr); > > + if (start < pool->vaddr || start > pool->vaddr + pool->size) > > + return 0; > > + > > + if (start + size > pool->vaddr + pool->size) { > > + pr_err("%s: freeing wrong coherent size from pool\n", __func__); > > dump_stack(); > > - return; > > + return 0; > > } > > > > - if ((c->vm_end - c->vm_start) != size) { > > - printk(KERN_ERR "%s: freeing wrong coherent size (%ld != %d)\n", > > - __func__, c->vm_end - c->vm_start, size); > > - dump_stack(); > > - size = c->vm_end - c->vm_start; > > - } > > + pageno = (start - pool->vaddr) >> PAGE_SHIFT; > > + count = size >> PAGE_SHIFT; > > > > - idx = CONSISTENT_PTE_INDEX(c->vm_start); > > - off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); > > - ptep = consistent_pte[idx] + off; > > - addr = c->vm_start; > > - do { > > - pte_t pte = ptep_get_and_clear(&init_mm, addr, ptep); > > + spin_lock_irqsave(&pool->lock, flags); > > + bitmap_clear(pool->bitmap, pageno, count); > > + spin_unlock_irqrestore(&pool->lock, flags); > > > > - ptep++; > > - addr += PAGE_SIZE; > > - off++; > > - if (off >= PTRS_PER_PTE) { > > - off = 0; > > - ptep = consistent_pte[++idx]; > > - } > > - > > - if (pte_none(pte) || !pte_present(pte)) > > - printk(KERN_CRIT "%s: bad page in kernel page table\n", > > - __func__); > > - } while (size -= PAGE_SIZE); > > - > > - flush_tlb_kernel_range(c->vm_start, c->vm_end); > > - > > - arm_vmregion_free(&consistent_head, c); > > + return 1; > > } > > > > #else /* !CONFIG_MMU */ > > > > #define __dma_alloc_remap(page, size, gfp, prot, c) page_address(page) > > #define __dma_free_remap(addr, size) do { } while (0) > > +#define __alloc_from_pool(size, ret_page) NULL > > +#define __free_from_pool(addr, size) 0 > > > > #endif /* CONFIG_MMU */ > > > > @@ -345,6 +315,16 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t > gfp, > > *handle = ~0; > > size = PAGE_ALIGN(size); > > > > + /* > > + * 
Atomic allocations need special handling > > + */ > > + if (gfp & GFP_ATOMIC && !arch_is_coherent()) { > > + addr = __alloc_from_pool(size, &page); > > + if (addr) > > + *handle = pfn_to_dma(dev, page_to_pfn(page)); > > + return addr; > > + } > > + > > page = __dma_alloc_buffer(dev, size, gfp); > > if (!page) > > return NULL; > > @@ -398,24 +378,16 @@ static int dma_mmap(struct device *dev, struct vm_area_struct *vma, > > { > > int ret = -ENXIO; > > #ifdef CONFIG_MMU > > - unsigned long user_size, kern_size; > > - struct arm_vmregion *c; > > + unsigned long user_count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; > > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > + unsigned long pfn = dma_to_pfn(dev, dma_addr); > > + unsigned long off = vma->vm_pgoff; > > > > - user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; > > - > > - c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); > > - if (c) { > > - unsigned long off = vma->vm_pgoff; > > - > > - kern_size = (c->vm_end - c->vm_start) >> PAGE_SHIFT; > > - > > - if (off < kern_size && > > - user_size <= (kern_size - off)) { > > - ret = remap_pfn_range(vma, vma->vm_start, > > - page_to_pfn(c->vm_pages) + off, > > - user_size << PAGE_SHIFT, > > - vma->vm_page_prot); > > - } > > + if (off < count && user_count <= (count - off)) { > > + ret = remap_pfn_range(vma, vma->vm_start, > > + pfn + off, > > + user_count << PAGE_SHIFT, > > + vma->vm_page_prot); > > } > > #endif /* CONFIG_MMU */ > > > > @@ -444,13 +416,16 @@ EXPORT_SYMBOL(dma_mmap_writecombine); > > */ > > void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle) > > { > > - WARN_ON(irqs_disabled()); > > - > > if (dma_release_from_coherent(dev, get_order(size), cpu_addr)) > > return; > > > > size = PAGE_ALIGN(size); > > > > + if (__free_from_pool(cpu_addr, size)) > > + return; > > + > > + WARN_ON(irqs_disabled()); > > + > > if (!arch_is_coherent()) > > __dma_free_remap(cpu_addr, size); > > > > @@ -726,9 
+701,6 @@ EXPORT_SYMBOL(dma_set_mask);
> >
> >  static int __init dma_debug_do_init(void)
> >  {
> > -#ifdef CONFIG_MMU
> > -	arm_vmregion_create_proc("dma-mappings", &consistent_head);
> > -#endif
> >  	dma_debug_init(PREALLOC_DMA_DEBUG_ENTRIES);
> >  	return 0;
> >  }

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center

From t.stanislaws at samsung.com Tue May 22 13:47:12 2012
From: t.stanislaws at samsung.com (Tomasz Stanislawski)
Date: Tue, 22 May 2012 15:47:12 +0200
Subject: [Linaro-mm-sig] [PATCH] dma-buf: add get_dma_buf()
In-Reply-To: <1331913881-13105-1-git-send-email-rob.clark@linaro.org>
References: <1331913881-13105-1-git-send-email-rob.clark@linaro.org>
Message-ID: <4FBB98E0.8040600@samsung.com>

Hi,
I think I discovered an interesting issue with dma_buf.
I found out that dma_buf_fd does not increase reference
count for dma_buf::file. This leads to potential kernel
crash triggered by user space. Please, take a look at
the scenario below:

The application spawns two threads. One of them is exporting DMABUF.

      Thread I         |   Thread II       | Comments
-----------------------+-------------------+-----------------------------------
dbuf = dma_buf_export  |                   | dma_buf is created, refcount is 1
fd = dma_buf_fd(dbuf)  |                   | assume fd is set to 42, refcount is still 1
                       |      close(42)    | The file descriptor is closed asynchronously, dbuf's refcount drops to 0
                       |  dma_buf_release  | dbuf structure is freed, dbuf becomes a dangling pointer
int size = dbuf->size; |                   | the dbuf is dereferenced, causing a kernel crash
-----------------------+-------------------+-----------------------------------

I think that the problem could be fixed in two ways.
a) forcing driver developer to call get_dma_buf just before calling dma_buf_fd.
b) increasing dma_buf->file's reference count at dma_buf_fd

I prefer solution (b) because it preserves symmetry between dma_buf_fd and close.
I mean that dma_buf_fd increases reference count, close decreases it.

What is your opinion about the issue?
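
The reference counting in the scenario above can be sketched in plain
user-space C. This is only a rough, single-threaded model under stated
assumptions: struct model_buf and the model_* helpers are hypothetical
stand-ins invented for illustration, not the kernel's dma_buf API, and the
model reproduces only the counting, not the actual race between the threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct dma_buf; not the kernel type. */
struct model_buf {
	int refcount;	/* models the reference count of dma_buf::file */
	bool freed;	/* set once the last reference has been dropped */
};

/* Models dma_buf_export(): the new buffer starts with one reference. */
static void model_export(struct model_buf *b)
{
	b->refcount = 1;
	b->freed = false;
}

/* Models get_dma_buf(): take one extra reference. */
static void model_get(struct model_buf *b)
{
	b->refcount++;
}

/* Models dma_buf_put() or close(fd): drop one reference, free on zero. */
static void model_put(struct model_buf *b)
{
	assert(b->refcount > 0);
	if (--b->refcount == 0)
		b->freed = true;
}

/*
 * Models dma_buf_fd(). With take_ref == false the fd simply inherits the
 * caller's reference (the behaviour described in the report). With
 * take_ref == true the fd gets a reference of its own (option b).
 */
static void model_fd(struct model_buf *b, bool take_ref)
{
	if (take_ref)
		model_get(b);
}

/*
 * Replays export -> fd -> close(fd) and reports whether the pointer still
 * held by the exporter would be dangling afterwards.
 */
bool exporter_pointer_dangles(bool take_ref)
{
	struct model_buf b;

	model_export(&b);
	model_fd(&b, take_ref);
	model_put(&b);		/* user space calls close(42) */
	return b.freed;
}
```

In this model exporter_pointer_dangles(false) reports a dangling pointer,
while exporter_pointer_dangles(true) does not, because the exporter still
holds the reference it obtained at export time and must drop it itself later.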
Regards,
Tomasz Stanislawski


On 03/16/2012 05:04 PM, Rob Clark wrote:
> From: Rob Clark
>
> Works in a similar way to get_file(), and is needed in cases such as
> when the exporter needs to also keep a reference to the dmabuf (that
> is later released with a dma_buf_put()), and possibly other similar
> cases.
>
> Signed-off-by: Rob Clark
> ---

From daniel at ffwll.ch Tue May 22 14:32:34 2012
From: daniel at ffwll.ch (Daniel Vetter)
Date: Tue, 22 May 2012 16:32:34 +0200
Subject: [Linaro-mm-sig] [PATCH] dma-buf: add get_dma_buf()
In-Reply-To: <4FBB98E0.8040600@samsung.com>
References: <1331913881-13105-1-git-send-email-rob.clark@linaro.org> <4FBB98E0.8040600@samsung.com>
Message-ID: <20120522143234.GC4629@phenom.ffwll.local>

On Tue, May 22, 2012 at 03:47:12PM +0200, Tomasz Stanislawski wrote:
> Hi,
> I think I discovered an interesting issue with dma_buf.
> I found out that dma_buf_fd does not increase reference
> count for dma_buf::file. This leads to potential kernel
> crash triggered by user space. Please, take a look at
> the scenario below:
>
> The application spawns two threads. One of them is exporting DMABUF.
>
>       Thread I         |   Thread II       | Comments
> -----------------------+-------------------+-----------------------------------
> dbuf = dma_buf_export  |                   | dma_buf is created, refcount is 1
> fd = dma_buf_fd(dbuf)  |                   | assume fd is set to 42, refcount is still 1
>                        |      close(42)    | The file descriptor is closed asynchronously, dbuf's refcount drops to 0
>                        |  dma_buf_release  | dbuf structure is freed, dbuf becomes a dangling pointer
> int size = dbuf->size; |                   | the dbuf is dereferenced, causing a kernel crash
> -----------------------+-------------------+-----------------------------------
>
> I think that the problem could be fixed in two ways.
> a) forcing driver developer to call get_dma_buf just before calling dma_buf_fd.
> b) increasing dma_buf->file's reference count at dma_buf_fd
>
> I prefer solution (b) because it preserves symmetry between dma_buf_fd and close.
> I mean that dma_buf_fd increases reference count, close decreases it.
>
> What is your opinion about the issue?

I guess most exporters would like to hang onto the exported dma_buf a bit
and hence need a reference (e.g. to cache the dma_buf as long as the
underlying buffer object exists). So I guess we can change the semantics
of dma_buf_fd from transferring the reference you currently have (and
hence forbidding any further access by the caller) to grabbing a reference
of its own for the fd that is created.
-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48

From t.stanislaws at samsung.com Tue May 22 15:00:31 2012
From: t.stanislaws at samsung.com (Tomasz Stanislawski)
Date: Tue, 22 May 2012 17:00:31 +0200
Subject: [Linaro-mm-sig] [PATCH] dma-buf: add get_dma_buf()
In-Reply-To: <20120522143234.GC4629@phenom.ffwll.local>
References: <1331913881-13105-1-git-send-email-rob.clark@linaro.org> <4FBB98E0.8040600@samsung.com> <20120522143234.GC4629@phenom.ffwll.local>
Message-ID: <4FBBAA0F.6090503@samsung.com>

On 05/22/2012 04:32 PM, Daniel Vetter wrote:
> On Tue, May 22, 2012 at 03:47:12PM +0200, Tomasz Stanislawski wrote:
>> Hi,
>> I think I discovered an interesting issue with dma_buf.
>> I found out that dma_buf_fd does not increase reference
>> count for dma_buf::file. This leads to potential kernel
>> crash triggered by user space. Please, take a look at
>> the scenario below:
>>
>> The application spawns two threads. One of them is exporting DMABUF.
>>
>>       Thread I         |   Thread II       | Comments
>> -----------------------+-------------------+-----------------------------------
>> dbuf = dma_buf_export  |                   | dma_buf is created, refcount is 1
>> fd = dma_buf_fd(dbuf)  |                   | assume fd is set to 42, refcount is still 1
>>                        |      close(42)    | The file descriptor is closed asynchronously, dbuf's refcount drops to 0
>>                        |  dma_buf_release  | dbuf structure is freed, dbuf becomes a dangling pointer
>> int size = dbuf->size; |                   | the dbuf is dereferenced, causing a kernel crash
>> -----------------------+-------------------+-----------------------------------
>>
>> I think that the problem could be fixed in two ways.
>> a) forcing driver developer to call get_dma_buf just before calling dma_buf_fd.
>> b) increasing dma_buf->file's reference count at dma_buf_fd
>>
>> I prefer solution (b) because it preserves symmetry between dma_buf_fd and close.
>> I mean that dma_buf_fd increases reference count, close decreases it.
>>
>> What is your opinion about the issue?
>
> I guess most exporters would like to hang onto the exported dma_buf a bit
> and hence need a reference (e.g. to cache the dma_buf as long as the
> underlying buffer object exists). So I guess we can change the semantics
> of dma_buf_fd from transferring the reference you currently have (and
> hence forbidding any further access by the caller) to grabbing a reference
> of its own for the fd that is created.
> -Daniel

Hi Daniel,
Would it be simpler, safer and more intuitive if dma_buf_fd increased
dmabuf->file's reference counter?
Regards, Tomasz Stanislawski From daniel at ffwll.ch Tue May 22 15:05:29 2012 From: daniel at ffwll.ch (Daniel Vetter) Date: Tue, 22 May 2012 17:05:29 +0200 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add get_dma_buf() In-Reply-To: <4FBBAA0F.6090503@samsung.com> References: <1331913881-13105-1-git-send-email-rob.clark@linaro.org> <4FBB98E0.8040600@samsung.com> <20120522143234.GC4629@phenom.ffwll.local> <4FBBAA0F.6090503@samsung.com> Message-ID: On Tue, May 22, 2012 at 5:00 PM, Tomasz Stanislawski wrote: > On 05/22/2012 04:32 PM, Daniel Vetter wrote: >> On Tue, May 22, 2012 at 03:47:12PM +0200, Tomasz Stanislawski wrote: >>> Hi, >>> I think I discovered an interesting issue with dma_buf. >>> I found out that dma_buf_fd does not increase reference >>> count for dma_buf::file. This leads to potential kernel >>> crash triggered by user space. Please, take a look on >>> the scenario below: >>> >>> The applications spawns two thread. One of them is exporting DMABUF. >>> >>> ? ? ? Thread I ? ? ? ? | ? Thread II ? ? ? | Comments >>> -----------------------+-------------------+----------------------------------- >>> dbuf = dma_buf_export ?| ? ? ? ? ? ? ? ? ? | dma_buf is creates, refcount is 1 >>> fd = dma_buf_fd(dbuf) ?| ? ? ? ? ? ? ? ? ? | assume fd is set to 42, refcount is still 1 >>> ? ? ? ? ? ? ? ? ? ? ? ?| ? ? ?close(42) ? ?| The file descriptor is closed asynchronously, dbuf's refcount drops to 0 >>> ? ? ? ? ? ? ? ? ? ? ? ?| ?dma_buf_release ?| dbuf structure is freed, dbuf becomes a dangling pointer >>> int size = dbuf->size; | ? ? ? ? ? ? ? ? ? | the dbuf is dereferenced, causing a kernel crash >>> -----------------------+-------------------+----------------------------------- >>> >>> I think that the problem could be fixed in two ways. >>> a) forcing driver developer to call get_dma_buf just before calling dma_buf_fd. 
>>> b) increasing dma_buf->file's reference count at dma_buf_fd >>> >>> I prefer solution (b) because it prevents symmetry between dma_buf_fd and close. >>> I mean that dma_buf_fd increases reference count, close decreases it. >>> >>> What is your opinion about the issue? >> >> I guess most exporters would like to hang onto the exported dma_buf a bit >> and hence need a reference (e.g. to cache the dma_buf as long as the >> underlying buffer object exists). So I guess we can change the semantics >> of dma_buf_fd from transferring the reference you currently have (and >> hence forbidding any further access by the caller) to grabbing a reference >> of it's on for the fd that is created. >> -Daniel > > Hi Daniel, > Would it be simpler, safer and more intuitive if dma_buf_fd increased > dmabuf->file's reference counter? That's actually what I wanted to say. Message seems to have been lost in transit ;-) -Daniel -- Daniel Vetter daniel.vetter at ffwll.ch - +41 (0) 79 364 57 48 - http://blog.ffwll.ch From airlied at gmail.com Tue May 22 15:13:15 2012 From: airlied at gmail.com (Dave Airlie) Date: Tue, 22 May 2012 16:13:15 +0100 Subject: [Linaro-mm-sig] [PATCH] dma-buf: add get_dma_buf() In-Reply-To: References: <1331913881-13105-1-git-send-email-rob.clark@linaro.org> <4FBB98E0.8040600@samsung.com> <20120522143234.GC4629@phenom.ffwll.local> <4FBBAA0F.6090503@samsung.com> Message-ID: On Tue, May 22, 2012 at 4:05 PM, Daniel Vetter wrote: > On Tue, May 22, 2012 at 5:00 PM, Tomasz Stanislawski > wrote: >> On 05/22/2012 04:32 PM, Daniel Vetter wrote: >>> On Tue, May 22, 2012 at 03:47:12PM +0200, Tomasz Stanislawski wrote: >>>> Hi, >>>> I think I discovered an interesting issue with dma_buf. >>>> I found out that dma_buf_fd does not increase reference >>>> count for dma_buf::file. This leads to potential kernel >>>> crash triggered by user space. Please, take a look on >>>> the scenario below: >>>> >>>> The applications spawns two thread. One of them is exporting DMABUF. 
>>>>
>>>>       Thread I         |   Thread II       | Comments
>>>> -----------------------+-------------------+-----------------------------------
>>>> dbuf = dma_buf_export  |                   | dma_buf is creates, refcount is 1
>>>> fd = dma_buf_fd(dbuf)  |                   | assume fd is set to 42, refcount is still 1
>>>>                        |      close(42)    | The file descriptor is closed asynchronously, dbuf's refcount drops to 0
>>>>                        |  dma_buf_release  | dbuf structure is freed, dbuf becomes a dangling pointer
>>>> int size = dbuf->size; |                   | the dbuf is dereferenced, causing a kernel crash
>>>> -----------------------+-------------------+-----------------------------------
>>>>
>>>> I think that the problem could be fixed in two ways.
>>>> a) forcing driver developer to call get_dma_buf just before calling dma_buf_fd.
>>>> b) increasing dma_buf->file's reference count at dma_buf_fd
>>>>
>>>> I prefer solution (b) because it prevents symmetry between dma_buf_fd and close.
>>>> I mean that dma_buf_fd increases reference count, close decreases it.
>>>>
>>>> What is your opinion about the issue?
>>>
>>> I guess most exporters would like to hang onto the exported dma_buf a bit
>>> and hence need a reference (e.g. to cache the dma_buf as long as the
>>> underlying buffer object exists). So I guess we can change the semantics
>>> of dma_buf_fd from transferring the reference you currently have (and
>>> hence forbidding any further access by the caller) to grabbing a reference
>>> of it's on for the fd that is created.
>>> -Daniel
>>
>> Hi Daniel,
>> Would it be simpler, safer and more intuitive if dma_buf_fd increased
>> dmabuf->file's reference counter?
>
> That's actually what I wanted to say. Message seems to have been lost
> in transit ;-)

Now I've thought about it and Tomasz has pointed it out I agree,

Now we just have to work out when to drop that reference, which I
don't see anyone addressing :-)

I love lifetime rules.

Dave.

From rob.clark at linaro.org Tue May 22 17:37:48 2012
From: rob.clark at linaro.org (Rob Clark)
Date: Tue, 22 May 2012 11:37:48 -0600
Subject: [Linaro-mm-sig] [PATCH] dma-buf: add get_dma_buf()
In-Reply-To:
References: <1331913881-13105-1-git-send-email-rob.clark@linaro.org> <4FBB98E0.8040600@samsung.com> <20120522143234.GC4629@phenom.ffwll.local> <4FBBAA0F.6090503@samsung.com>
Message-ID:

On Tue, May 22, 2012 at 9:13 AM, Dave Airlie wrote:
> On Tue, May 22, 2012 at 4:05 PM, Daniel Vetter wrote:
>> On Tue, May 22, 2012 at 5:00 PM, Tomasz Stanislawski
>> wrote:
>>> On 05/22/2012 04:32 PM, Daniel Vetter wrote:
>>>> On Tue, May 22, 2012 at 03:47:12PM +0200, Tomasz Stanislawski wrote:
>>>>> Hi,
>>>>> I think I discovered an interesting issue with dma_buf.
>>>>> I found out that dma_buf_fd does not increase reference
>>>>> count for dma_buf::file. This leads to potential kernel
>>>>> crash triggered by user space. Please, take a look on
>>>>> the scenario below:
>>>>>
>>>>> The applications spawns two thread. One of them is exporting DMABUF.
>>>>>
>>>>>       Thread I         |   Thread II       | Comments
>>>>> -----------------------+-------------------+-----------------------------------
>>>>> dbuf = dma_buf_export  |                   | dma_buf is creates, refcount is 1
>>>>> fd = dma_buf_fd(dbuf)  |                   | assume fd is set to 42, refcount is still 1
>>>>>                        |      close(42)    | The file descriptor is closed asynchronously, dbuf's refcount drops to 0
>>>>>                        |  dma_buf_release  | dbuf structure is freed, dbuf becomes a dangling pointer
>>>>> int size = dbuf->size; |                   | the dbuf is dereferenced, causing a kernel crash
>>>>> -----------------------+-------------------+-----------------------------------
>>>>>
>>>>> I think that the problem could be fixed in two ways.
>>>>> a) forcing driver developer to call get_dma_buf just before calling dma_buf_fd.
>>>>> b) increasing dma_buf->file's reference count at dma_buf_fd
>>>>>
>>>>> I prefer solution (b) because it prevents symmetry between dma_buf_fd and close.
>>>>> I mean that dma_buf_fd increases reference count, close decreases it.
>>>>>
>>>>> What is your opinion about the issue?
>>>>
>>>> I guess most exporters would like to hang onto the exported dma_buf a bit
>>>> and hence need a reference (e.g. to cache the dma_buf as long as the
>>>> underlying buffer object exists). So I guess we can change the semantics
>>>> of dma_buf_fd from transferring the reference you currently have (and
>>>> hence forbidding any further access by the caller) to grabbing a reference
>>>> of it's on for the fd that is created.
>>>> -Daniel
>>>
>>> Hi Daniel,
>>> Would it be simpler, safer and more intuitive if dma_buf_fd increased
>>> dmabuf->file's reference counter?
>>
>> That's actually what I wanted to say. Message seems to have been lost
>> in transit ;-)
>
> Now I've thought about it and Tomasz has pointed it out I agree,
>
> Now we just have to work out when to drop that reference, which I
> don't see anyone addressing :-)
>
> I love lifetime rules.

I think in the GEM case, where we keep a ref in obj->export_dma_buf,
we can drop the extra ref to the dmabuf (if we decide dma_buf_fd()
increases the refcnt), as long as we be sure to NULL out
obj->export_dma_buf from dmabuf_ops->release (which is unfortunately
in each driver at the moment).. This way obj->export_dma_buf behaves
as a sort of weak-reference, not causing a memory leak.

BR,
-R

> Dave.
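
The lifetime rule discussed in this thread can be sketched in plain user-space C. This is a toy model under stated assumptions, not kernel code: struct toy_buf, toy_export(), toy_fd(), toy_close() and toy_put() are hypothetical stand-ins for struct dma_buf, dma_buf_export(), dma_buf_fd(), close() and dma_buf_put(). It assumes option (b) from the thread, where creating the fd takes its own reference, and it is deliberately not thread-safe (a real implementation would rely on the atomic file refcount).

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for struct dma_buf; only the fields needed here. */
struct toy_buf {
	int refcount;
	size_t size;
	bool *released; /* set to true when the last reference is dropped */
};

/* Models dma_buf_export(): the caller gets the initial reference. */
struct toy_buf *toy_export(size_t size, bool *released)
{
	struct toy_buf *buf = malloc(sizeof(*buf));

	if (!buf)
		return NULL;
	buf->refcount = 1;
	buf->size = size;
	buf->released = released;
	*released = false;
	return buf;
}

/* Models dma_buf_put(): drops one reference, frees on the last one. */
void toy_put(struct toy_buf *buf)
{
	if (--buf->refcount == 0) {
		*buf->released = true;
		free(buf);
	}
}

/* Models option (b): the fd takes its own reference on the buffer. */
int toy_fd(struct toy_buf *buf)
{
	buf->refcount++;
	return 42; /* arbitrary fd number, as in the scenario above */
}

/* Models close(fd): drops only the reference owned by the fd. */
void toy_close(struct toy_buf *buf)
{
	toy_put(buf);
}
```

In this model the exporter can still read buf->size after toy_close(), because its own reference is released only by its own toy_put(). Under the older transfer semantics, toy_fd() would not increment the count, and toy_close() would free the buffer while the exporter still held a pointer to it.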
From sumit.semwal at ti.com Wed May 23 10:08:48 2012 From: sumit.semwal at ti.com (Sumit Semwal) Date: Wed, 23 May 2012 15:38:48 +0530 Subject: [Linaro-mm-sig] [PATCH] dma-buf: minor documentation fixes. Message-ID: <1337767728-32236-1-git-send-email-sumit.semwal@ti.com> Some minor inline documentation fixes for gaps resulting from new patches. Signed-off-by: Sumit Semwal --- drivers/base/dma-buf.c | 9 +++++---- include/linux/dma-buf.h | 3 +++ 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c index c3c88b0..24e88fe 100644 --- a/drivers/base/dma-buf.c +++ b/drivers/base/dma-buf.c @@ -429,7 +429,7 @@ EXPORT_SYMBOL_GPL(dma_buf_kunmap); /** * dma_buf_mmap - Setup up a userspace mmap with the given vma - * @dma_buf: [in] buffer that should back the vma + * @dmabuf: [in] buffer that should back the vma * @vma: [in] vma for the mmap * @pgoff: [in] offset in pages where this mmap should start within the * dma-buf buffer. @@ -470,8 +470,9 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma, EXPORT_SYMBOL_GPL(dma_buf_mmap); /** - * dma_buf_vmap - Create virtual mapping for the buffer object into kernel address space. Same restrictions as for vmap and friends apply. - * @dma_buf: [in] buffer to vmap + * dma_buf_vmap - Create virtual mapping for the buffer object into kernel + * address space. Same restrictions as for vmap and friends apply. + * @dmabuf: [in] buffer to vmap * * This call may fail due to lack of virtual mapping address space. * These calls are optional in drivers. The intended use for them @@ -491,7 +492,7 @@ EXPORT_SYMBOL_GPL(dma_buf_vmap); /** * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. 
- * @dma_buf: [in] buffer to vmap + * @dmabuf: [in] buffer to vunmap */ void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) { diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index a02b1ff..eb48f38 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -65,6 +65,9 @@ struct dma_buf_attachment; * mapping needs to be coherent - if the exporter doesn't directly * support this, it needs to fake coherency by shooting down any ptes * when transitioning away from the cpu domain. + * @vmap: [optional] creates a virtual mapping for the buffer into kernel + * address space. Same restrictions as for vmap and friends apply. + * @vunmap: [optional] unmaps a vmap from the buffer */ struct dma_buf_ops { int (*attach)(struct dma_buf *, struct device *, -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:14 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:14 +0200 Subject: [Linaro-mm-sig] [PATCHv6 00/13] Integration of videobuf2 with dmabuf Message-ID: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Hello everyone, This patchset adds support for DMABUF [2] importing to V4L2 stack. The support for DMABUF exporting was moved to separate patchset due to dependency on patches for DMA mapping redesign by Marek Szyprowski [4]. 

v6:
- fixed missing entry in v4l2_memory_names
- fixed a bug occurring after get_user_pages failure
- fixed a bug caused by using invalid vma for get_user_pages
- prepare/finish no longer call dma_sync for dmabuf buffers

v5:
- removed change of importer/exporter behaviour
- fixed vb2_dc_pages_to_sgt based on Laurent's hints
- changed pin/unpin words to lock/unlock in Doc

v4:
- rebased on mainline 3.4-rc2
- included missing importing support for s5p-fimc and s5p-tv
- added patch for changing map/unmap for importers
- fixes to Documentation part
- coding style fixes
- pairing {map/unmap}_dmabuf in vb2-core
- fixing variable types and semantics of arguments in videobuf2-dma-contig.c

v3:
- rebased on mainline 3.4-rc1
- split 'code refactor' patch to multiple smaller patches
- squashed fixes to Sumit's patches
- patchset is no longer dependent on 'DMA mapping redesign'
- separated path for handling IO and non-IO mappings
- add documentation for DMABUF importing to V4L
- removed all DMABUF exporter related code
- removed usage of dma_get_pages extension

v2:
- extended VIDIOC_EXPBUF argument from integer memoffset to struct v4l2_exportbuffer
- added patch that breaks DMABUF spec on (un)map_attachment callbacks but allows to work with existing implementation of DMABUF prime in DRM
- all dma-contig code refactoring patches were squashed
- bugfixes

v1: List of changes since [1].
- support for DMA API extension dma_get_pages; the function is used to retrieve pages used to create DMA mapping.
- small fixes/code cleanup to videobuf2
- added prepare and finish callbacks to vb2 allocators, used to keep consistency between DMA and CPU access to the memory (by Marek Szyprowski)
- support for exporting of DMABUF buffer in V4L2 and Videobuf2, originated from [3].
- support for dma-buf exporting in vb2-dma-contig allocator - support for DMABUF for s5p-tv and s5p-fimc (capture interface) drivers, originated from [3] - changed handling for userptr buffers (by Marek Szyprowski, Andrzej Pietrasiewicz) - let mmap method to use dma_mmap_writecombine call (by Marek Szyprowski) [1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/42966/focus=42968 [2] https://lkml.org/lkml/2011/12/26/29 [3] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/36354/focus=36355 [4] http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819 Laurent Pinchart (2): v4l: vb2-dma-contig: Shorten vb2_dma_contig prefix to vb2_dc v4l: vb2-dma-contig: Reorder functions Marek Szyprowski (2): v4l: vb2: add prepare/finish callbacks to allocators v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator Sumit Semwal (4): v4l: Add DMABUF as a memory type v4l: vb2: add support for shared buffer (dma_buf) v4l: vb: remove warnings about MEMORY_DMABUF v4l: vb2-dma-contig: add support for dma_buf importing Tomasz Stanislawski (5): Documentation: media: description of DMABUF importing in V4L2 v4l: vb2-dma-contig: Remove unneeded allocation context structure v4l: vb2-dma-contig: add support for scatterlist in userptr mode v4l: s5p-tv: mixer: support for dmabuf importing v4l: s5p-fimc: support for dmabuf importing Documentation/DocBook/media/v4l/compat.xml | 4 + Documentation/DocBook/media/v4l/io.xml | 179 +++++++ .../DocBook/media/v4l/vidioc-create-bufs.xml | 1 + Documentation/DocBook/media/v4l/vidioc-qbuf.xml | 15 + Documentation/DocBook/media/v4l/vidioc-reqbufs.xml | 45 +- drivers/media/video/s5p-fimc/Kconfig | 1 + drivers/media/video/s5p-fimc/fimc-capture.c | 2 +- drivers/media/video/s5p-tv/Kconfig | 1 + drivers/media/video/s5p-tv/mixer_video.c | 2 +- drivers/media/video/v4l2-ioctl.c | 1 + drivers/media/video/videobuf-core.c | 4 + drivers/media/video/videobuf2-core.c | 207 +++++++- 
drivers/media/video/videobuf2-dma-contig.c | 520 +++++++++++++++++--- include/linux/videodev2.h | 7 + include/media/videobuf2-core.h | 34 ++ 15 files changed, 924 insertions(+), 99 deletions(-) -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:18 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:18 +0200 Subject: [Linaro-mm-sig] [PATCHv6 04/13] v4l: vb: remove warnings about MEMORY_DMABUF In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-5-git-send-email-t.stanislaws@samsung.com> From: Sumit Semwal Adding DMABUF memory type causes videobuf to complain about not using it in some switch cases. This patch removes these warnings. Signed-off-by: Sumit Semwal Acked-by: Laurent Pinchart --- drivers/media/video/videobuf-core.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/media/video/videobuf-core.c b/drivers/media/video/videobuf-core.c index ffdf59c..3e3e55f 100644 --- a/drivers/media/video/videobuf-core.c +++ b/drivers/media/video/videobuf-core.c @@ -335,6 +335,9 @@ static void videobuf_status(struct videobuf_queue *q, struct v4l2_buffer *b, case V4L2_MEMORY_OVERLAY: b->m.offset = vb->boff; break; + case V4L2_MEMORY_DMABUF: + /* DMABUF is not handled in videobuf framework */ + break; } b->flags = 0; @@ -411,6 +414,7 @@ int __videobuf_mmap_setup(struct videobuf_queue *q, break; case V4L2_MEMORY_USERPTR: case V4L2_MEMORY_OVERLAY: + case V4L2_MEMORY_DMABUF: /* nothing */ break; } -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:15 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:15 +0200 Subject: [Linaro-mm-sig] [PATCHv6 01/13] v4l: Add DMABUF as a memory type In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: 
<1337775027-9489-2-git-send-email-t.stanislaws@samsung.com> From: Sumit Semwal Adds DMABUF memory type to v4l framework. Also adds the related file descriptor in v4l2_plane and v4l2_buffer. Signed-off-by: Tomasz Stanislawski [original work in the PoC for buffer sharing] Signed-off-by: Sumit Semwal Signed-off-by: Sumit Semwal Acked-by: Laurent Pinchart --- drivers/media/video/v4l2-ioctl.c | 1 + include/linux/videodev2.h | 7 +++++++ 2 files changed, 8 insertions(+) diff --git a/drivers/media/video/v4l2-ioctl.c b/drivers/media/video/v4l2-ioctl.c index 91be4e8..31fc2ad 100644 --- a/drivers/media/video/v4l2-ioctl.c +++ b/drivers/media/video/v4l2-ioctl.c @@ -175,6 +175,7 @@ static const char *v4l2_memory_names[] = { [V4L2_MEMORY_MMAP] = "mmap", [V4L2_MEMORY_USERPTR] = "userptr", [V4L2_MEMORY_OVERLAY] = "overlay", + [V4L2_MEMORY_DMABUF] = "dmabuf", }; #define prt_names(a, arr) ((((a) >= 0) && ((a) < ARRAY_SIZE(arr))) ? \ diff --git a/include/linux/videodev2.h b/include/linux/videodev2.h index 370d111..51b20f4 100644 --- a/include/linux/videodev2.h +++ b/include/linux/videodev2.h @@ -185,6 +185,7 @@ enum v4l2_memory { V4L2_MEMORY_MMAP = 1, V4L2_MEMORY_USERPTR = 2, V4L2_MEMORY_OVERLAY = 3, + V4L2_MEMORY_DMABUF = 4, }; /* see also http://vektor.theorem.ca/graphics/ycbcr/ */ @@ -591,6 +592,8 @@ struct v4l2_requestbuffers { * should be passed to mmap() called on the video node) * @userptr: when memory is V4L2_MEMORY_USERPTR, a userspace pointer * pointing to this plane + * @fd: when memory is V4L2_MEMORY_DMABUF, a userspace file + * descriptor associated with this plane * @data_offset: offset in the plane to the start of data; usually 0, * unless there is a header in front of the data * @@ -605,6 +608,7 @@ struct v4l2_plane { union { __u32 mem_offset; unsigned long userptr; + int fd; } m; __u32 data_offset; __u32 reserved[11]; @@ -629,6 +633,8 @@ struct v4l2_plane { * (or a "cookie" that should be passed to mmap() as offset) * @userptr: for non-multiplanar buffers with memory 
== V4L2_MEMORY_USERPTR; * a userspace pointer pointing to this buffer + * @fd: for non-multiplanar buffers with memory == V4L2_MEMORY_DMABUF; + * a userspace file descriptor associated with this buffer * @planes: for multiplanar buffers; userspace pointer to the array of plane * info structs for this buffer * @length: size in bytes of the buffer (NOT its payload) for single-plane @@ -655,6 +661,7 @@ struct v4l2_buffer { __u32 offset; unsigned long userptr; struct v4l2_plane *planes; + int fd; } m; __u32 length; __u32 input; -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:20 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:20 +0200 Subject: [Linaro-mm-sig] [PATCHv6 06/13] v4l: vb2-dma-contig: Remove unneeded allocation context structure In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-7-git-send-email-t.stanislaws@samsung.com> vb2-dma-contig returns a vb2_dc_conf structure instance as the vb2 allocation context. That structure only stores a pointer to the physical device. Remove it and use the device pointer directly as the allocation context. 
Signed-off-by: Tomasz Stanislawski Acked-by: Laurent Pinchart --- drivers/media/video/videobuf2-dma-contig.c | 29 +++++++--------------------- 1 file changed, 7 insertions(+), 22 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index a05784f..a019cd1 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -18,12 +18,8 @@ #include #include -struct vb2_dc_conf { - struct device *dev; -}; - struct vb2_dc_buf { - struct vb2_dc_conf *conf; + struct device *dev; void *vaddr; dma_addr_t dma_addr; unsigned long size; @@ -36,23 +32,21 @@ static void vb2_dc_put(void *buf_priv); static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) { - struct vb2_dc_conf *conf = alloc_ctx; + struct device *dev = alloc_ctx; struct vb2_dc_buf *buf; buf = kzalloc(sizeof *buf, GFP_KERNEL); if (!buf) return ERR_PTR(-ENOMEM); - buf->vaddr = dma_alloc_coherent(conf->dev, size, &buf->dma_addr, - GFP_KERNEL); + buf->vaddr = dma_alloc_coherent(dev, size, &buf->dma_addr, GFP_KERNEL); if (!buf->vaddr) { - dev_err(conf->dev, "dma_alloc_coherent of size %ld failed\n", - size); + dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size); kfree(buf); return ERR_PTR(-ENOMEM); } - buf->conf = conf; + buf->dev = dev; buf->size = size; buf->handler.refcount = &buf->refcount; @@ -69,7 +63,7 @@ static void vb2_dc_put(void *buf_priv) struct vb2_dc_buf *buf = buf_priv; if (atomic_dec_and_test(&buf->refcount)) { - dma_free_coherent(buf->conf->dev, buf->size, buf->vaddr, + dma_free_coherent(buf->dev, buf->size, buf->vaddr, buf->dma_addr); kfree(buf); } @@ -163,21 +157,12 @@ EXPORT_SYMBOL_GPL(vb2_dma_contig_memops); void *vb2_dma_contig_init_ctx(struct device *dev) { - struct vb2_dc_conf *conf; - - conf = kzalloc(sizeof *conf, GFP_KERNEL); - if (!conf) - return ERR_PTR(-ENOMEM); - - conf->dev = dev; - - return conf; + return dev; } EXPORT_SYMBOL_GPL(vb2_dma_contig_init_ctx); void 
vb2_dma_contig_cleanup_ctx(void *alloc_ctx) { - kfree(alloc_ctx); } EXPORT_SYMBOL_GPL(vb2_dma_contig_cleanup_ctx); -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:16 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:16 +0200 Subject: [Linaro-mm-sig] [PATCHv6 02/13] Documentation: media: description of DMABUF importing in V4L2 In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-3-git-send-email-t.stanislaws@samsung.com> This patch adds description and usage examples for importing DMABUF file descriptor in V4L2. Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park CC: linux-doc at vger.kernel.org --- Documentation/DocBook/media/v4l/compat.xml | 4 + Documentation/DocBook/media/v4l/io.xml | 179 ++++++++++++++++++++ .../DocBook/media/v4l/vidioc-create-bufs.xml | 1 + Documentation/DocBook/media/v4l/vidioc-qbuf.xml | 15 ++ Documentation/DocBook/media/v4l/vidioc-reqbufs.xml | 45 ++--- 5 files changed, 223 insertions(+), 21 deletions(-) diff --git a/Documentation/DocBook/media/v4l/compat.xml b/Documentation/DocBook/media/v4l/compat.xml index ea42ef8..07a311f 100644 --- a/Documentation/DocBook/media/v4l/compat.xml +++ b/Documentation/DocBook/media/v4l/compat.xml @@ -2587,6 +2587,10 @@ ioctls. V4L2_CID_AUTO_FOCUS_AREA control. + + Importing DMABUF file descriptors as a new IO method described + in . + diff --git a/Documentation/DocBook/media/v4l/io.xml b/Documentation/DocBook/media/v4l/io.xml index fd6aca2..521f699 100644 --- a/Documentation/DocBook/media/v4l/io.xml +++ b/Documentation/DocBook/media/v4l/io.xml @@ -472,6 +472,162 @@ rest should be evident. +
+ Streaming I/O (DMA buffer importing) + + + Experimental + This is an experimental + interface and may change in the future. + + +The DMABUF framework provides a generic mean for sharing buffers between + multiple devices. Device drivers that support DMABUF can export a DMA buffer +to userspace as a file descriptor (known as the exporter role), import a DMA +buffer from userspace using a file descriptor previously exported for a +different or the same device (known as the importer role), or both. This +section describes the DMABUF importer role API in V4L2. + +Input and output devices support the streaming I/O method when the +V4L2_CAP_STREAMING flag in the +capabilities field of &v4l2-capability; returned by +the &VIDIOC-QUERYCAP; ioctl is set. Whether importing DMA buffers through +DMABUF file descriptors is supported is determined by calling the +&VIDIOC-REQBUFS; ioctl with the memory type set to +V4L2_MEMORY_DMABUF. + + This I/O method is dedicated for sharing DMA buffers between V4L and +other APIs. Buffers (planes) are allocated by a driver on behalf of the +application, and exported to the application as file descriptors using an API +specific to the allocator driver. Only those file descriptor are exchanged, +these files and meta-information are passed in &v4l2-buffer; (or in +&v4l2-plane; in the multi-planar API case). The driver must be switched into +DMABUF I/O mode by calling the &VIDIOC-REQBUFS; with the desired buffer type. +No buffers (planes) are allocated beforehand, consequently they are not indexed +and cannot be queried like mapped buffers with the +VIDIOC_QUERYBUF ioctl. 
+ + + Initiating streaming I/O with DMABUF file descriptors + + +&v4l2-requestbuffers; reqbuf; + +memset (&reqbuf, 0, sizeof (reqbuf)); +reqbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; +reqbuf.memory = V4L2_MEMORY_DMABUF; + +if (ioctl (fd, &VIDIOC-REQBUFS;, &reqbuf) == -1) { + if (errno == EINVAL) + printf ("Video capturing or DMABUF streaming is not supported\n"); + else + perror ("VIDIOC_REQBUFS"); + + exit (EXIT_FAILURE); +} + + + + Buffer (plane) file is passed on the fly with the &VIDIOC-QBUF; +ioctl. In case of multiplanar buffers, every plane can be associated with a +different DMABUF descriptor.Although buffers are commonly cycled, applications +can pass different DMABUF descriptor at each VIDIOC_QBUF +call. + + + Queueing DMABUF using single plane API + + +int buffer_queue(int v4lfd, int index, int dmafd) +{ + &v4l2-buffer; buf; + + memset(&buf, 0, sizeof buf); + buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; + buf.memory = V4L2_MEMORY_DMABUF; + buf.index = index; + buf.m.fd = dmafd; + + if (ioctl (v4lfd, &VIDIOC-QBUF;, &buf) == -1) { + perror ("VIDIOC_QBUF"); + return -1; + } + + return 0; +} + + + + + Queueing DMABUF using multi plane API + + +int buffer_queue_mp(int v4lfd, int index, int dmafd[], int n_planes) +{ + &v4l2-buffer; buf; + &v4l2-plane; planes[VIDEO_MAX_PLANES]; + int i; + + memset(&buf, 0, sizeof buf); + buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; + buf.memory = V4L2_MEMORY_DMABUF; + buf.index = index; + buf.m.planes = planes; + buf.length = n_planes; + + memset(&planes, 0, sizeof planes); + + for (i = 0; i < n_planes; ++i) + buf.m.planes[i].m.fd = dmafd[i]; + + if (ioctl (v4lfd, &VIDIOC-QBUF;, &buf) == -1) { + perror ("VIDIOC_QBUF"); + return -1; + } + + return 0; +} + + + + Filled or displayed buffers are dequeued with the +&VIDIOC-DQBUF; ioctl. The driver can unlock the buffer at any +time between the completion of the DMA and this ioctl. 
The memory is +also unlocked when &VIDIOC-STREAMOFF; is called, &VIDIOC-REQBUFS;, or +when the device is closed. + + For capturing applications it is customary to enqueue a +number of empty buffers, to start capturing and enter the read loop. +Here the application waits until a filled buffer can be dequeued, and +re-enqueues the buffer when the data is no longer needed. Output +applications fill and enqueue buffers, when enough buffers are stacked +up output is started. In the write loop, when the application +runs out of free buffers it must wait until an empty buffer can be +dequeued and reused. Two methods exist to suspend execution of the +application until one or more buffers can be dequeued. By default +VIDIOC_DQBUF blocks when no buffer is in the +outgoing queue. When the O_NONBLOCK flag was +given to the &func-open; function, VIDIOC_DQBUF +returns immediately with an &EAGAIN; when no buffer is available. The +&func-select; or &func-poll; function are always available. + + To start and stop capturing or output applications call the +&VIDIOC-STREAMON; and &VIDIOC-STREAMOFF; ioctls. Note that +VIDIOC_STREAMOFF removes all buffers from both queues and +unlocks all buffers as a side effect. Since there is no notion of doing +anything "now" on a multitasking system, if an application needs to synchronize +with another event it should examine the &v4l2-buffer; +timestamp of captured buffers, or set the field +before enqueuing buffers for output. + + Drivers implementing DMABUF importing I/O must support the +VIDIOC_REQBUFS, VIDIOC_QBUF, +VIDIOC_DQBUF, VIDIOC_STREAMON and +VIDIOC_STREAMOFF ioctl, the select() +and poll() function. + +
+
Asynchronous I/O @@ -673,6 +829,14 @@ memory, set by the application. See for details. v4l2_buffer structure. + + int + fd + For the single-plane API and when +memory is V4L2_MEMORY_DMABUF this +is the file descriptor associated with a DMABUF buffer. + + __u32 length @@ -748,6 +912,15 @@ should set this to 0. + + int + fd + When the memory type in the containing &v4l2-buffer; is + V4L2_MEMORY_DMABUF, this is a file + descriptor associated with a DMABUF buffer, similar to the + fd field in &v4l2-buffer;. + + __u32 data_offset @@ -982,6 +1155,12 @@ pointer I/O. 3 [to do] + + V4L2_MEMORY_DMABUF + 4 + The buffer is used for DMA shared +buffer I/O. + diff --git a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml index 765549f..fee66ac 100644 --- a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml +++ b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml @@ -104,6 +104,7 @@ information. memory Applications set this field to V4L2_MEMORY_MMAP or +V4L2_MEMORY_DMABUF or V4L2_MEMORY_USERPTR. See diff --git a/Documentation/DocBook/media/v4l/vidioc-qbuf.xml b/Documentation/DocBook/media/v4l/vidioc-qbuf.xml index 9caa49a..cb5f5ff 100644 --- a/Documentation/DocBook/media/v4l/vidioc-qbuf.xml +++ b/Documentation/DocBook/media/v4l/vidioc-qbuf.xml @@ -112,6 +112,21 @@ they cannot be swapped out to disk. Buffers remain locked until dequeued, until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the device is closed. + To enqueue a DMABUF buffer applications +set the memory field to +V4L2_MEMORY_DMABUF and the m.fd +to a file descriptor associated with a DMABUF buffer. When the multi-planar API is +used, the m.fd fields of the passed array of &v4l2-plane; +have to be used instead. When VIDIOC_QBUF is called with a +pointer to this structure the driver sets the +V4L2_BUF_FLAG_QUEUED flag and clears the +V4L2_BUF_FLAG_MAPPED and +V4L2_BUF_FLAG_DONE flags in the +flags field, or it returns an error code.
This +ioctl locks the buffer. Buffers remain locked until dequeued, +until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the +device is closed. + Applications call the VIDIOC_DQBUF ioctl to dequeue a filled (capturing) or displayed (output) buffer from the driver's outgoing queue. They just set the diff --git a/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml b/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml index d7c9505..13381ea 100644 --- a/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml +++ b/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml @@ -48,28 +48,30 @@ Description - This ioctl is used to initiate memory -mapped or user pointer -I/O. Memory mapped buffers are located in device memory and must be -allocated with this ioctl before they can be mapped into the -application's address space. User buffers are allocated by -applications themselves, and this ioctl is merely used to switch the -driver into user pointer I/O mode and to setup some internal structures. +This ioctl is used to initiate memory mapped, +user pointer or DMABUF based I/O. Memory mapped buffers are located in +device memory and must be allocated with this ioctl before they can be mapped +into the application's address space. User buffers are allocated by +applications themselves, and this ioctl is merely used to switch the driver +into user pointer I/O mode and to setup some internal structures. +Similarly, DMABUF buffers are allocated by applications through a device +driver, and this ioctl only configures the driver into DMABUF I/O mode without +performing any direct allocation. - To allocate device buffers applications initialize all -fields of the v4l2_requestbuffers structure. -They set the type field to the respective -stream or buffer type, the count field to -the desired number of buffers, memory -must be set to the requested I/O method and the reserved array -must be zeroed. 
When the ioctl -is called with a pointer to this structure the driver will attempt to allocate -the requested number of buffers and it stores the actual number -allocated in the count field. It can be -smaller than the number requested, even zero, when the driver runs out -of free memory. A larger number is also possible when the driver requires -more buffers to function correctly. For example video output requires at least two buffers, -one displayed and one filled by the application. + To allocate device buffers applications initialize all fields of the +v4l2_requestbuffers structure. They set the +type field to the respective stream or buffer type, +the count field to the desired number of buffers, +memory must be set to the requested I/O method and +the reserved array must be zeroed. When the ioctl is +called with a pointer to this structure the driver will attempt to allocate the +requested number of buffers and it stores the actual number allocated in the +count field. It can be smaller than the number +requested, even zero, when the driver runs out of free memory. A larger number +is also possible when the driver requires more buffers to function correctly. +For example video output requires at least two buffers, one displayed and one +filled by the application. When the I/O method is not supported the ioctl returns an &EINVAL;. @@ -103,6 +105,7 @@ as the &v4l2-format; type field. See memory Applications set this field to V4L2_MEMORY_MMAP or +V4L2_MEMORY_DMABUF or V4L2_MEMORY_USERPTR. See . 
-- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:17 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:17 +0200 Subject: [Linaro-mm-sig] [PATCHv6 03/13] v4l: vb2: add support for shared buffer (dma_buf) In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-4-git-send-email-t.stanislaws@samsung.com> From: Sumit Semwal This patch adds support for DMABUF memory type in videobuf2. It calls relevant APIs of dma_buf for v4l reqbuf / qbuf / dqbuf operations. For this version, the support is for videobuf2 as a user of the shared buffer; so the allocation of the buffer is done outside of V4L2. [A sample allocator of dma-buf shared buffer is given at [1]] [1]: Rob Clark's DRM: https://github.com/robclark/kernel-omap4/commits/drmplane-dmabuf Signed-off-by: Tomasz Stanislawski [original work in the PoC for buffer sharing] Signed-off-by: Sumit Semwal Signed-off-by: Sumit Semwal Acked-by: Laurent Pinchart --- drivers/media/video/videobuf2-core.c | 196 +++++++++++++++++++++++++++++++++- include/media/videobuf2-core.h | 27 +++++ 2 files changed, 219 insertions(+), 4 deletions(-) diff --git a/drivers/media/video/videobuf2-core.c b/drivers/media/video/videobuf2-core.c index 9d4e9ed..f43cfa4 100644 --- a/drivers/media/video/videobuf2-core.c +++ b/drivers/media/video/videobuf2-core.c @@ -109,6 +109,36 @@ static void __vb2_buf_userptr_put(struct vb2_buffer *vb) } /** + * __vb2_plane_dmabuf_put() - release memory associated with + * a DMABUF shared plane + */ +static void __vb2_plane_dmabuf_put(struct vb2_queue *q, struct vb2_plane *p) +{ + if (!p->mem_priv) + return; + + if (p->dbuf_mapped) + call_memop(q, unmap_dmabuf, p->mem_priv); + + call_memop(q, detach_dmabuf, p->mem_priv); + dma_buf_put(p->dbuf); + memset(p, 0, sizeof *p); +} + +/** + * __vb2_buf_dmabuf_put() - release memory associated with + * a DMABUF shared 
buffer + */ +static void __vb2_buf_dmabuf_put(struct vb2_buffer *vb) +{ + struct vb2_queue *q = vb->vb2_queue; + unsigned int plane; + + for (plane = 0; plane < vb->num_planes; ++plane) + __vb2_plane_dmabuf_put(q, &vb->planes[plane]); +} + +/** * __setup_offsets() - setup unique offsets ("cookies") for every plane in * every buffer on the queue */ @@ -230,6 +260,8 @@ static void __vb2_free_mem(struct vb2_queue *q, unsigned int buffers) /* Free MMAP buffers or release USERPTR buffers */ if (q->memory == V4L2_MEMORY_MMAP) __vb2_buf_mem_free(vb); + else if (q->memory == V4L2_MEMORY_DMABUF) + __vb2_buf_dmabuf_put(vb); else __vb2_buf_userptr_put(vb); } @@ -352,6 +384,12 @@ static int __fill_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b) */ memcpy(b->m.planes, vb->v4l2_planes, b->length * sizeof(struct v4l2_plane)); + + if (q->memory == V4L2_MEMORY_DMABUF) { + unsigned int plane; + for (plane = 0; plane < vb->num_planes; ++plane) + b->m.planes[plane].m.fd = 0; + } } else { /* * We use length and offset in v4l2_planes array even for @@ -363,6 +401,8 @@ static int __fill_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b) b->m.offset = vb->v4l2_planes[0].m.mem_offset; else if (q->memory == V4L2_MEMORY_USERPTR) b->m.userptr = vb->v4l2_planes[0].m.userptr; + else if (q->memory == V4L2_MEMORY_DMABUF) + b->m.fd = 0; } /* @@ -454,6 +494,20 @@ static int __verify_mmap_ops(struct vb2_queue *q) } /** + * __verify_dmabuf_ops() - verify that all memory operations required for + * DMABUF queue type have been provided + */ +static int __verify_dmabuf_ops(struct vb2_queue *q) +{ + if (!(q->io_modes & VB2_DMABUF) || !q->mem_ops->attach_dmabuf || + !q->mem_ops->detach_dmabuf || !q->mem_ops->map_dmabuf || + !q->mem_ops->unmap_dmabuf) + return -EINVAL; + + return 0; +} + +/** * vb2_reqbufs() - Initiate streaming * @q: videobuf2 queue * @req: struct passed from userspace to vidioc_reqbufs handler in driver @@ -486,8 +540,9 @@ int vb2_reqbufs(struct vb2_queue *q, struct 
v4l2_requestbuffers *req) return -EBUSY; } - if (req->memory != V4L2_MEMORY_MMAP - && req->memory != V4L2_MEMORY_USERPTR) { + if (req->memory != V4L2_MEMORY_MMAP && + req->memory != V4L2_MEMORY_DMABUF && + req->memory != V4L2_MEMORY_USERPTR) { dprintk(1, "reqbufs: unsupported memory type\n"); return -EINVAL; } @@ -516,6 +571,11 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req) return -EINVAL; } + if (req->memory == V4L2_MEMORY_DMABUF && __verify_dmabuf_ops(q)) { + dprintk(1, "reqbufs: DMABUF for current setup unsupported\n"); + return -EINVAL; + } + if (req->count == 0 || q->num_buffers != 0 || q->memory != req->memory) { /* * We already have buffers allocated, so first check if they @@ -622,8 +682,9 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create) return -EBUSY; } - if (create->memory != V4L2_MEMORY_MMAP - && create->memory != V4L2_MEMORY_USERPTR) { + if (create->memory != V4L2_MEMORY_MMAP && + create->memory != V4L2_MEMORY_USERPTR && + create->memory != V4L2_MEMORY_DMABUF) { dprintk(1, "%s(): unsupported memory type\n", __func__); return -EINVAL; } @@ -647,6 +708,11 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create) return -EINVAL; } + if (create->memory == V4L2_MEMORY_DMABUF && __verify_dmabuf_ops(q)) { + dprintk(1, "%s(): DMABUF for current setup unsupported\n", __func__); + return -EINVAL; + } + if (q->num_buffers == VIDEO_MAX_FRAME) { dprintk(1, "%s(): maximum number of buffers already allocated\n", __func__); @@ -842,6 +908,14 @@ static int __fill_vb2_buffer(struct vb2_buffer *vb, const struct v4l2_buffer *b, b->m.planes[plane].length; } } + if (b->memory == V4L2_MEMORY_DMABUF) { + for (plane = 0; plane < vb->num_planes; ++plane) { + v4l2_planes[plane].bytesused = + b->m.planes[plane].bytesused; + v4l2_planes[plane].m.fd = + b->m.planes[plane].m.fd; + } + } } else { /* * Single-planar buffers do not use planes array, @@ -856,6 +930,10 @@ static int __fill_vb2_buffer(struct 
vb2_buffer *vb, const struct v4l2_buffer *b, v4l2_planes[0].m.userptr = b->m.userptr; v4l2_planes[0].length = b->length; } + + if (b->memory == V4L2_MEMORY_DMABUF) + v4l2_planes[0].m.fd = b->m.fd; + } vb->v4l2_buf.field = b->field; @@ -960,6 +1038,100 @@ static int __qbuf_mmap(struct vb2_buffer *vb, const struct v4l2_buffer *b) } /** + * __qbuf_dmabuf() - handle qbuf of a DMABUF buffer + */ +static int __qbuf_dmabuf(struct vb2_buffer *vb, const struct v4l2_buffer *b) +{ + struct v4l2_plane planes[VIDEO_MAX_PLANES]; + struct vb2_queue *q = vb->vb2_queue; + void *mem_priv; + unsigned int plane; + int ret; + int write = !V4L2_TYPE_IS_OUTPUT(q->type); + + /* Verify and copy relevant information provided by the userspace */ + ret = __fill_vb2_buffer(vb, b, planes); + if (ret) + return ret; + + for (plane = 0; plane < vb->num_planes; ++plane) { + struct dma_buf *dbuf = dma_buf_get(planes[plane].m.fd); + + if (IS_ERR_OR_NULL(dbuf)) { + dprintk(1, "qbuf: invalid dmabuf fd for " + "plane %d\n", plane); + ret = -EINVAL; + goto err; + } + + /* Skip the plane if already verified */ + if (dbuf == vb->planes[plane].dbuf) { + planes[plane].length = dbuf->size; + dma_buf_put(dbuf); + continue; + } + + dprintk(3, "qbuf: buffer description for plane %d changed, " + "reattaching dma buf\n", plane); + + /* Release previously acquired memory if present */ + __vb2_plane_dmabuf_put(q, &vb->planes[plane]); + + /* Acquire each plane's memory */ + mem_priv = call_memop(q, attach_dmabuf, q->alloc_ctx[plane], + dbuf, q->plane_sizes[plane], write); + if (IS_ERR(mem_priv)) { + dprintk(1, "qbuf: failed acquiring dmabuf " + "memory for plane %d\n", plane); + ret = PTR_ERR(mem_priv); + goto err; + } + + planes[plane].length = dbuf->size; + vb->planes[plane].dbuf = dbuf; + vb->planes[plane].mem_priv = mem_priv; + } + + /* TODO: This pins the buffer(s) with dma_buf_map_attachment()).. but + * really we want to do this just before the DMA, not while queueing + * the buffer(s).. 
+ */ + for (plane = 0; plane < vb->num_planes; ++plane) { + ret = call_memop(q, map_dmabuf, vb->planes[plane].mem_priv); + if (ret) { + dprintk(1, "qbuf: failed mapping dmabuf " + "memory for plane %d\n", plane); + goto err; + } + vb->planes[plane].dbuf_mapped = 1; + } + + /* + * Call driver-specific initialization on the newly acquired buffer, + * if provided. + */ + ret = call_qop(q, buf_init, vb); + if (ret) { + dprintk(1, "qbuf: buffer initialization failed\n"); + goto err; + } + + /* + * Now that everything is in order, copy relevant information + * provided by userspace. + */ + for (plane = 0; plane < vb->num_planes; ++plane) + vb->v4l2_planes[plane] = planes[plane]; + + return 0; +err: + /* In case of errors, release planes that were already acquired */ + __vb2_buf_dmabuf_put(vb); + + return ret; +} + +/** * __enqueue_in_driver() - enqueue a vb2_buffer in driver for processing */ static void __enqueue_in_driver(struct vb2_buffer *vb) @@ -983,6 +1155,9 @@ static int __buf_prepare(struct vb2_buffer *vb, const struct v4l2_buffer *b) case V4L2_MEMORY_USERPTR: ret = __qbuf_userptr(vb, b); break; + case V4L2_MEMORY_DMABUF: + ret = __qbuf_dmabuf(vb, b); + break; default: WARN(1, "Invalid queue type\n"); ret = -EINVAL; @@ -1338,6 +1513,19 @@ int vb2_dqbuf(struct vb2_queue *q, struct v4l2_buffer *b, bool nonblocking) return ret; } + /* TODO: this unpins the buffer(dma_buf_unmap_attachment()).. but + * really we want to do this just after DMA, not when the + * buffer is dequeued.. 
+ */ + if (q->memory == V4L2_MEMORY_DMABUF) { + unsigned int i; + + for (i = 0; i < vb->num_planes; ++i) { + call_memop(q, unmap_dmabuf, vb->planes[i].mem_priv); + vb->planes[i].dbuf_mapped = 0; + } + } + switch (vb->state) { case VB2_BUF_STATE_DONE: dprintk(3, "dqbuf: Returning done buffer\n"); diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h index a15d1f1..859bbaf 100644 --- a/include/media/videobuf2-core.h +++ b/include/media/videobuf2-core.h @@ -16,6 +16,7 @@ #include #include #include +#include struct vb2_alloc_ctx; struct vb2_fileio_data; @@ -41,6 +42,20 @@ struct vb2_fileio_data; * argument to other ops in this structure * @put_userptr: inform the allocator that a USERPTR buffer will no longer * be used + * @attach_dmabuf: attach a shared struct dma_buf for a hardware operation; + * used for DMABUF memory types; alloc_ctx is the alloc context + * dbuf is the shared dma_buf; returns NULL on failure; + * allocator private per-buffer structure on success; + * this needs to be used for further accesses to the buffer + * @detach_dmabuf: inform the exporter of the buffer that the current DMABUF + * buffer is no longer used; the buf_priv argument is the + * allocator private per-buffer structure previously returned + * from the attach_dmabuf callback + * @map_dmabuf: request for access to the dmabuf from allocator; the allocator + * of dmabuf is informed that this driver is going to use the + * dmabuf + * @unmap_dmabuf: releases access control to the dmabuf - allocator is notified + * that this driver is done using the dmabuf for now * @vaddr: return a kernel virtual address to a given memory buffer * associated with the passed private structure or NULL if no * such mapping exists @@ -56,6 +71,8 @@ struct vb2_fileio_data; * Required ops for USERPTR types: get_userptr, put_userptr. * Required ops for MMAP types: alloc, put, num_users, mmap. 
* Required ops for read/write access types: alloc, put, num_users, vaddr + * Required ops for DMABUF types: attach_dmabuf, detach_dmabuf, map_dmabuf, + * unmap_dmabuf. */ struct vb2_mem_ops { void *(*alloc)(void *alloc_ctx, unsigned long size); @@ -65,6 +82,12 @@ struct vb2_mem_ops { unsigned long size, int write); void (*put_userptr)(void *buf_priv); + void *(*attach_dmabuf)(void *alloc_ctx, struct dma_buf *dbuf, + unsigned long size, int write); + void (*detach_dmabuf)(void *buf_priv); + int (*map_dmabuf)(void *buf_priv); + void (*unmap_dmabuf)(void *buf_priv); + void *(*vaddr)(void *buf_priv); void *(*cookie)(void *buf_priv); @@ -75,6 +98,8 @@ struct vb2_mem_ops { struct vb2_plane { void *mem_priv; + struct dma_buf *dbuf; + unsigned int dbuf_mapped; }; /** @@ -83,12 +108,14 @@ struct vb2_plane { * @VB2_USERPTR: driver supports USERPTR with streaming API * @VB2_READ: driver supports read() style access * @VB2_WRITE: driver supports write() style access + * @VB2_DMABUF: driver supports DMABUF with streaming API */ enum vb2_io_modes { VB2_MMAP = (1 << 0), VB2_USERPTR = (1 << 1), VB2_READ = (1 << 2), VB2_WRITE = (1 << 3), + VB2_DMABUF = (1 << 4), }; /** -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:19 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:19 +0200 Subject: [Linaro-mm-sig] [PATCHv6 05/13] v4l: vb2-dma-contig: Shorten vb2_dma_contig prefix to vb2_dc In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-6-git-send-email-t.stanislaws@samsung.com> From: Laurent Pinchart Signed-off-by: Laurent Pinchart --- drivers/media/video/videobuf2-dma-contig.c | 36 ++++++++++++++-------------- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 4b71326..a05784f 100644 --- 
a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -32,9 +32,9 @@ struct vb2_dc_buf { struct vb2_vmarea_handler handler; }; -static void vb2_dma_contig_put(void *buf_priv); +static void vb2_dc_put(void *buf_priv); -static void *vb2_dma_contig_alloc(void *alloc_ctx, unsigned long size) +static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) { struct vb2_dc_conf *conf = alloc_ctx; struct vb2_dc_buf *buf; @@ -56,7 +56,7 @@ static void *vb2_dma_contig_alloc(void *alloc_ctx, unsigned long size) buf->size = size; buf->handler.refcount = &buf->refcount; - buf->handler.put = vb2_dma_contig_put; + buf->handler.put = vb2_dc_put; buf->handler.arg = buf; atomic_inc(&buf->refcount); @@ -64,7 +64,7 @@ static void *vb2_dma_contig_alloc(void *alloc_ctx, unsigned long size) return buf; } -static void vb2_dma_contig_put(void *buf_priv) +static void vb2_dc_put(void *buf_priv) { struct vb2_dc_buf *buf = buf_priv; @@ -75,14 +75,14 @@ static void vb2_dma_contig_put(void *buf_priv) } } -static void *vb2_dma_contig_cookie(void *buf_priv) +static void *vb2_dc_cookie(void *buf_priv) { struct vb2_dc_buf *buf = buf_priv; return &buf->dma_addr; } -static void *vb2_dma_contig_vaddr(void *buf_priv) +static void *vb2_dc_vaddr(void *buf_priv) { struct vb2_dc_buf *buf = buf_priv; if (!buf) @@ -91,14 +91,14 @@ static void *vb2_dma_contig_vaddr(void *buf_priv) return buf->vaddr; } -static unsigned int vb2_dma_contig_num_users(void *buf_priv) +static unsigned int vb2_dc_num_users(void *buf_priv) { struct vb2_dc_buf *buf = buf_priv; return atomic_read(&buf->refcount); } -static int vb2_dma_contig_mmap(void *buf_priv, struct vm_area_struct *vma) +static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) { struct vb2_dc_buf *buf = buf_priv; @@ -111,7 +111,7 @@ static int vb2_dma_contig_mmap(void *buf_priv, struct vm_area_struct *vma) &vb2_common_vm_ops, &buf->handler); } -static void *vb2_dma_contig_get_userptr(void *alloc_ctx, unsigned long 
vaddr, +static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, unsigned long size, int write) { struct vb2_dc_buf *buf; @@ -138,7 +138,7 @@ static void *vb2_dma_contig_get_userptr(void *alloc_ctx, unsigned long vaddr, return buf; } -static void vb2_dma_contig_put_userptr(void *mem_priv) +static void vb2_dc_put_userptr(void *mem_priv) { struct vb2_dc_buf *buf = mem_priv; @@ -150,14 +150,14 @@ static void vb2_dma_contig_put_userptr(void *mem_priv) } const struct vb2_mem_ops vb2_dma_contig_memops = { - .alloc = vb2_dma_contig_alloc, - .put = vb2_dma_contig_put, - .cookie = vb2_dma_contig_cookie, - .vaddr = vb2_dma_contig_vaddr, - .mmap = vb2_dma_contig_mmap, - .get_userptr = vb2_dma_contig_get_userptr, - .put_userptr = vb2_dma_contig_put_userptr, - .num_users = vb2_dma_contig_num_users, + .alloc = vb2_dc_alloc, + .put = vb2_dc_put, + .cookie = vb2_dc_cookie, + .vaddr = vb2_dc_vaddr, + .mmap = vb2_dc_mmap, + .get_userptr = vb2_dc_get_userptr, + .put_userptr = vb2_dc_put_userptr, + .num_users = vb2_dc_num_users, }; EXPORT_SYMBOL_GPL(vb2_dma_contig_memops); -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:24 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:24 +0200 Subject: [Linaro-mm-sig] [PATCHv6 10/13] v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-11-git-send-email-t.stanislaws@samsung.com> From: Marek Szyprowski Add prepare/finish callbacks to vb2-dma-contig allocator. 
Signed-off-by: Marek Szyprowski --- drivers/media/video/videobuf2-dma-contig.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 068ec11..bb32e7e 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -150,6 +150,28 @@ static unsigned int vb2_dc_num_users(void *buf_priv) return atomic_read(&buf->refcount); } +static void vb2_dc_prepare(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + struct sg_table *sgt = buf->dma_sgt; + + if (!sgt) + return; + + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); +} + +static void vb2_dc_finish(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + struct sg_table *sgt = buf->dma_sgt; + + if (!sgt) + return; + + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); +} + /*********************************************/ /* callbacks for MMAP buffers */ /*********************************************/ @@ -403,6 +425,8 @@ const struct vb2_mem_ops vb2_dma_contig_memops = { .mmap = vb2_dc_mmap, .get_userptr = vb2_dc_get_userptr, .put_userptr = vb2_dc_put_userptr, + .prepare = vb2_dc_prepare, + .finish = vb2_dc_finish, .num_users = vb2_dc_num_users, }; EXPORT_SYMBOL_GPL(vb2_dma_contig_memops); -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:21 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:21 +0200 Subject: [Linaro-mm-sig] [PATCHv6 07/13] v4l: vb2-dma-contig: Reorder functions In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-8-git-send-email-t.stanislaws@samsung.com> From: Laurent Pinchart Group functions by buffer type. 
Signed-off-by: Laurent Pinchart --- drivers/media/video/videobuf2-dma-contig.c | 92 ++++++++++++++++------------ 1 file changed, 54 insertions(+), 38 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index a019cd1..42c6431 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -21,14 +21,56 @@ struct vb2_dc_buf { struct device *dev; void *vaddr; - dma_addr_t dma_addr; unsigned long size; - struct vm_area_struct *vma; - atomic_t refcount; + dma_addr_t dma_addr; + + /* MMAP related */ struct vb2_vmarea_handler handler; + atomic_t refcount; + + /* USERPTR related */ + struct vm_area_struct *vma; }; -static void vb2_dc_put(void *buf_priv); +/*********************************************/ +/* callbacks for all buffers */ +/*********************************************/ + +static void *vb2_dc_cookie(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + + return &buf->dma_addr; +} + +static void *vb2_dc_vaddr(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + + return buf->vaddr; +} + +static unsigned int vb2_dc_num_users(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + + return atomic_read(&buf->refcount); +} + +/*********************************************/ +/* callbacks for MMAP buffers */ +/*********************************************/ + +static void vb2_dc_put(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + + if (!atomic_dec_and_test(&buf->refcount)) + return; + + dma_free_coherent(buf->dev, buf->size, buf->vaddr, buf->dma_addr); + kfree(buf); +} static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) { @@ -58,40 +100,6 @@ static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) return buf; } -static void vb2_dc_put(void *buf_priv) -{ - struct vb2_dc_buf *buf = buf_priv; - - if (atomic_dec_and_test(&buf->refcount)) { - dma_free_coherent(buf->dev, buf->size, buf->vaddr, - buf->dma_addr); - kfree(buf); - } -} 
- -static void *vb2_dc_cookie(void *buf_priv) -{ - struct vb2_dc_buf *buf = buf_priv; - - return &buf->dma_addr; -} - -static void *vb2_dc_vaddr(void *buf_priv) -{ - struct vb2_dc_buf *buf = buf_priv; - if (!buf) - return NULL; - - return buf->vaddr; -} - -static unsigned int vb2_dc_num_users(void *buf_priv) -{ - struct vb2_dc_buf *buf = buf_priv; - - return atomic_read(&buf->refcount); -} - static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) { struct vb2_dc_buf *buf = buf_priv; @@ -105,6 +113,10 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) &vb2_common_vm_ops, &buf->handler); } +/*********************************************/ +/* callbacks for USERPTR buffers */ +/*********************************************/ + static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, unsigned long size, int write) { @@ -143,6 +155,10 @@ static void vb2_dc_put_userptr(void *mem_priv) kfree(buf); } +/*********************************************/ +/* DMA CONTIG exported functions */ +/*********************************************/ + const struct vb2_mem_ops vb2_dma_contig_memops = { .alloc = vb2_dc_alloc, .put = vb2_dc_put, -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:26 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:26 +0200 Subject: [Linaro-mm-sig] [PATCHv6 12/13] v4l: s5p-tv: mixer: support for dmabuf importing In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-13-git-send-email-t.stanislaws@samsung.com> This patch enhances s5p-tv with support for DMABUF importing via V4L2_MEMORY_DMABUF memory type. 
Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/s5p-tv/Kconfig | 1 + drivers/media/video/s5p-tv/mixer_video.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/media/video/s5p-tv/Kconfig b/drivers/media/video/s5p-tv/Kconfig index f248b28..2e80126 100644 --- a/drivers/media/video/s5p-tv/Kconfig +++ b/drivers/media/video/s5p-tv/Kconfig @@ -10,6 +10,7 @@ config VIDEO_SAMSUNG_S5P_TV bool "Samsung TV driver for S5P platform (experimental)" depends on PLAT_S5P && PM_RUNTIME depends on EXPERIMENTAL + select DMA_SHARED_BUFFER default n ---help--- Say Y here to enable selecting the TV output devices for diff --git a/drivers/media/video/s5p-tv/mixer_video.c b/drivers/media/video/s5p-tv/mixer_video.c index 33fde2a..cff974a 100644 --- a/drivers/media/video/s5p-tv/mixer_video.c +++ b/drivers/media/video/s5p-tv/mixer_video.c @@ -1078,7 +1078,7 @@ struct mxr_layer *mxr_base_layer_create(struct mxr_device *mdev, layer->vb_queue = (struct vb2_queue) { .type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE, - .io_modes = VB2_MMAP | VB2_USERPTR, + .io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF, .drv_priv = layer, .buf_struct_size = sizeof(struct mxr_buffer), .ops = &mxr_video_qops, -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:23 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:23 +0200 Subject: [Linaro-mm-sig] [PATCHv6 09/13] v4l: vb2: add prepare/finish callbacks to allocators In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-10-git-send-email-t.stanislaws@samsung.com> From: Marek Szyprowski This patch adds support for prepare/finish callbacks in VB2 allocators. These callbacks are used for buffer flushing.
Signed-off-by: Marek Szyprowski Acked-by: Laurent Pinchart --- drivers/media/video/videobuf2-core.c | 11 +++++++++++ include/media/videobuf2-core.h | 7 +++++++ 2 files changed, 18 insertions(+) diff --git a/drivers/media/video/videobuf2-core.c b/drivers/media/video/videobuf2-core.c index f43cfa4..d60ed25 100644 --- a/drivers/media/video/videobuf2-core.c +++ b/drivers/media/video/videobuf2-core.c @@ -845,6 +845,7 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state) { struct vb2_queue *q = vb->vb2_queue; unsigned long flags; + unsigned int plane; if (vb->state != VB2_BUF_STATE_ACTIVE) return; @@ -855,6 +856,10 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state) dprintk(4, "Done processing on buffer %d, state: %d\n", vb->v4l2_buf.index, vb->state); + /* sync buffers */ + for (plane = 0; plane < vb->num_planes; ++plane) + call_memop(q, finish, vb->planes[plane].mem_priv); + /* Add the buffer to the done buffers list */ spin_lock_irqsave(&q->done_lock, flags); vb->state = state; @@ -1137,9 +1142,15 @@ err: static void __enqueue_in_driver(struct vb2_buffer *vb) { struct vb2_queue *q = vb->vb2_queue; + unsigned int plane; vb->state = VB2_BUF_STATE_ACTIVE; atomic_inc(&q->queued_count); + + /* sync buffers */ + for (plane = 0; plane < vb->num_planes; ++plane) + call_memop(q, prepare, vb->planes[plane].mem_priv); + q->ops->buf_queue(vb); } diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h index 859bbaf..d079f92 100644 --- a/include/media/videobuf2-core.h +++ b/include/media/videobuf2-core.h @@ -56,6 +56,10 @@ struct vb2_fileio_data; * dmabuf * @unmap_dmabuf: releases access control to the dmabuf - allocator is notified * that this driver is done using the dmabuf for now + * @prepare: called every time the buffer is passed from userspace to the + * driver, useful for cache synchronisation, optional + * @finish: called every time the buffer is passed back from the driver + * to the userspace, also
optional * @vaddr: return a kernel virtual address to a given memory buffer * associated with the passed private structure or NULL if no * such mapping exists @@ -82,6 +86,9 @@ struct vb2_mem_ops { unsigned long size, int write); void (*put_userptr)(void *buf_priv); + void (*prepare)(void *buf_priv); + void (*finish)(void *buf_priv); + void *(*attach_dmabuf)(void *alloc_ctx, struct dma_buf *dbuf, unsigned long size, int write); void (*detach_dmabuf)(void *buf_priv); -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:27 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:27 +0200 Subject: [Linaro-mm-sig] [PATCHv6 13/13] v4l: s5p-fimc: support for dmabuf importing In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-14-git-send-email-t.stanislaws@samsung.com> This patch enhances s5p-fimc with support for DMABUF importing via V4L2_MEMORY_DMABUF memory type. Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park Acked-by: Sylwester Nawrocki --- drivers/media/video/s5p-fimc/Kconfig | 1 + drivers/media/video/s5p-fimc/fimc-capture.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/media/video/s5p-fimc/Kconfig b/drivers/media/video/s5p-fimc/Kconfig index a564f7e..3106026 100644 --- a/drivers/media/video/s5p-fimc/Kconfig +++ b/drivers/media/video/s5p-fimc/Kconfig @@ -14,6 +14,7 @@ config VIDEO_S5P_FIMC depends on I2C select VIDEOBUF2_DMA_CONTIG select V4L2_MEM2MEM_DEV + select DMA_SHARED_BUFFER help This is a V4L2 driver for Samsung S5P and EXYNOS4 SoC camera host interface and video postprocessor (FIMC and FIMC-LITE) devices. 
diff --git a/drivers/media/video/s5p-fimc/fimc-capture.c b/drivers/media/video/s5p-fimc/fimc-capture.c index 3545745..cd27e33 100644 --- a/drivers/media/video/s5p-fimc/fimc-capture.c +++ b/drivers/media/video/s5p-fimc/fimc-capture.c @@ -1609,7 +1609,7 @@ static int fimc_register_capture_device(struct fimc_dev *fimc, q = &fimc->vid_cap.vbq; memset(q, 0, sizeof(*q)); q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; - q->io_modes = VB2_MMAP | VB2_USERPTR; + q->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF; q->drv_priv = fimc->vid_cap.ctx; q->ops = &fimc_capture_qops; q->mem_ops = &vb2_dma_contig_memops; -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:22 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:22 +0200 Subject: [Linaro-mm-sig] [PATCHv6 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-9-git-send-email-t.stanislaws@samsung.com> This patch introduces the use of dma_map_sg to map the memory behind a userspace pointer to a device as a dma-contiguous mapping. This patch contains some of the code kindly provided by Marek Szyprowski, Kamil Debski and Andrzej Pietrasiewicz. Kind thanks for bug reports from Laurent Pinchart and Seung-Woo Kim.
Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park Acked-by: Laurent Pinchart --- drivers/media/video/videobuf2-dma-contig.c | 268 ++++++++++++++++++++++++++-- 1 file changed, 252 insertions(+), 16 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 42c6431..068ec11 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -11,6 +11,8 @@ */ #include +#include +#include #include #include @@ -23,6 +25,8 @@ struct vb2_dc_buf { void *vaddr; unsigned long size; dma_addr_t dma_addr; + enum dma_data_direction dma_dir; + struct sg_table *dma_sgt; /* MMAP related */ struct vb2_vmarea_handler handler; @@ -33,6 +37,95 @@ struct vb2_dc_buf { }; /*********************************************/ +/* scatterlist table functions */ +/*********************************************/ + +static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, + unsigned int n_pages, unsigned long offset, unsigned long size) +{ + struct sg_table *sgt; + unsigned int chunks; + unsigned int i; + unsigned int cur_page; + int ret; + struct scatterlist *s; + + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); + if (!sgt) + return ERR_PTR(-ENOMEM); + + /* compute number of chunks */ + chunks = 1; + for (i = 1; i < n_pages; ++i) + if (pages[i] != pages[i - 1] + 1) + ++chunks; + + ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); + if (ret) { + kfree(sgt); + return ERR_PTR(-ENOMEM); + } + + /* merging chunks and putting them into the scatterlist */ + cur_page = 0; + for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { + unsigned long chunk_size; + unsigned int j; + + for (j = cur_page + 1; j < n_pages; ++j) + if (pages[j] != pages[j - 1] + 1) + break; + + chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset; + sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); + size -= chunk_size; + offset = 0; + cur_page = j; + } + + return sgt; +} + +static void vb2_dc_release_sgtable(struct 
sg_table *sgt) +{ + sg_free_table(sgt); + kfree(sgt); +} + +static void vb2_dc_sgt_foreach_page(struct sg_table *sgt, + void (*cb)(struct page *pg)) +{ + struct scatterlist *s; + unsigned int i; + + for_each_sg(sgt->sgl, s, sgt->nents, i) { + struct page *page = sg_page(s); + unsigned int n_pages = PAGE_ALIGN(s->offset + s->length) + >> PAGE_SHIFT; + unsigned int j; + + for (j = 0; j < n_pages; ++j, ++page) + cb(page); + } +} + +static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt) +{ + struct scatterlist *s; + dma_addr_t expected = sg_dma_address(sgt->sgl); + unsigned int i; + unsigned long size = 0; + + for_each_sg(sgt->sgl, s, sgt->nents, i) { + if (sg_dma_address(s) != expected) + break; + expected = sg_dma_address(s) + sg_dma_len(s); + size += sg_dma_len(s); + } + return size; +} + +/*********************************************/ /* callbacks for all buffers */ /*********************************************/ @@ -117,42 +210,185 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) /* callbacks for USERPTR buffers */ /*********************************************/ +static inline int vma_is_io(struct vm_area_struct *vma) +{ + return !!(vma->vm_flags & (VM_IO | VM_PFNMAP)); +} + +static int vb2_dc_get_user_pages(unsigned long start, struct page **pages, + int n_pages, struct vm_area_struct *vma, int write) +{ + if (vma_is_io(vma)) { + unsigned int i; + + for (i = 0; i < n_pages; ++i, start += PAGE_SIZE) { + unsigned long pfn; + int ret = follow_pfn(vma, start, &pfn); + + if (ret) { + printk(KERN_ERR "no page for address %lu\n", + start); + return ret; + } + pages[i] = pfn_to_page(pfn); + } + } else { + int n; + + n = get_user_pages(current, current->mm, start & PAGE_MASK, + n_pages, write, 1, pages, NULL); + /* negative error means that no page was pinned */ + n = max(n, 0); + if (n != n_pages) { + printk(KERN_ERR "got only %d of %d user pages\n", + n, n_pages); + while (n) + put_page(pages[--n]); + return -EFAULT; + } + } + + 
return 0; +} + +static void vb2_dc_put_dirty_page(struct page *page) +{ + set_page_dirty_lock(page); + put_page(page); +} + +static void vb2_dc_put_userptr(void *buf_priv) +{ + struct vb2_dc_buf *buf = buf_priv; + struct sg_table *sgt = buf->dma_sgt; + + dma_unmap_sg(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir); + if (!vma_is_io(buf->vma)) + vb2_dc_sgt_foreach_page(sgt, vb2_dc_put_dirty_page); + + vb2_dc_release_sgtable(sgt); + vb2_put_vma(buf->vma); + kfree(buf); +} + static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, - unsigned long size, int write) + unsigned long size, int write) { struct vb2_dc_buf *buf; + unsigned long start; + unsigned long end; + unsigned long offset; + struct page **pages; + int n_pages; + int ret = 0; struct vm_area_struct *vma; - dma_addr_t dma_addr = 0; - int ret; + struct sg_table *sgt; + unsigned long contig_size; buf = kzalloc(sizeof *buf, GFP_KERNEL); if (!buf) return ERR_PTR(-ENOMEM); - ret = vb2_get_contig_userptr(vaddr, size, &vma, &dma_addr); + buf->dev = alloc_ctx; + buf->dma_dir = write ? 
DMA_FROM_DEVICE : DMA_TO_DEVICE; + + start = vaddr & PAGE_MASK; + offset = vaddr & ~PAGE_MASK; + end = PAGE_ALIGN(vaddr + size); + n_pages = (end - start) >> PAGE_SHIFT; + + pages = kmalloc(n_pages * sizeof pages[0], GFP_KERNEL); + if (!pages) { + ret = -ENOMEM; + printk(KERN_ERR "failed to allocate pages table\n"); + goto fail_buf; + } + + /* current->mm->mmap_sem is taken by videobuf2 core */ + vma = find_vma(current->mm, vaddr); + if (!vma) { + printk(KERN_ERR "no vma for address %lu\n", vaddr); + ret = -EFAULT; + goto fail_pages; + } + + if (vma->vm_end < vaddr + size) { + printk(KERN_ERR "vma at %lu is too small for %lu bytes\n", + vaddr, size); + ret = -EFAULT; + goto fail_pages; + } + + buf->vma = vb2_get_vma(vma); + if (!buf->vma) { + printk(KERN_ERR "failed to copy vma\n"); + ret = -ENOMEM; + goto fail_pages; + } + + /* extract page list from userspace mapping */ + ret = vb2_dc_get_user_pages(start, pages, n_pages, vma, write); if (ret) { - printk(KERN_ERR "Failed acquiring VMA for vaddr 0x%08lx\n", - vaddr); - kfree(buf); - return ERR_PTR(ret); + printk(KERN_ERR "failed to get user pages\n"); + goto fail_vma; + } + + sgt = vb2_dc_pages_to_sgt(pages, n_pages, offset, size); + if (IS_ERR(sgt)) { + printk(KERN_ERR "failed to create scatterlist table\n"); + ret = -ENOMEM; + goto fail_get_user_pages; } + /* pages are no longer needed */ + kfree(pages); + pages = NULL; + + sgt->nents = dma_map_sg(buf->dev, sgt->sgl, sgt->orig_nents, + buf->dma_dir); + if (sgt->nents <= 0) { + printk(KERN_ERR "failed to map scatterlist\n"); + ret = -EIO; + goto fail_sgt; + } + + contig_size = vb2_dc_get_contiguous_size(sgt); + if (contig_size < size) { + printk(KERN_ERR "contiguous mapping is too small %lu/%lu\n", + contig_size, size); + ret = -EFAULT; + goto fail_map_sg; + } + + buf->dma_addr = sg_dma_address(sgt->sgl); buf->size = size; - buf->dma_addr = dma_addr; - buf->vma = vma; + buf->dma_sgt = sgt; return buf; -} -static void vb2_dc_put_userptr(void *mem_priv) -{ - struct 
vb2_dc_buf *buf = mem_priv; +fail_map_sg: + dma_unmap_sg(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); - if (!buf) - return; +fail_sgt: + if (!vma_is_io(buf->vma)) + vb2_dc_sgt_foreach_page(sgt, put_page); + vb2_dc_release_sgtable(sgt); +fail_get_user_pages: + if (pages && !vma_is_io(buf->vma)) + while (n_pages) + put_page(pages[--n_pages]); + +fail_vma: vb2_put_vma(buf->vma); + +fail_pages: + kfree(pages); /* kfree is NULL-proof */ + +fail_buf: kfree(buf); + + return ERR_PTR(ret); } /*********************************************/ -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 12:10:25 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 14:10:25 +0200 Subject: [Linaro-mm-sig] [PATCHv6 11/13] v4l: vb2-dma-contig: add support for dma_buf importing In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337775027-9489-12-git-send-email-t.stanislaws@samsung.com> From: Sumit Semwal This patch makes changes for adding dma-contig as a dma_buf user. It provides function implementations for the {attach, detach, map, unmap}_dmabuf() mem_ops of DMABUF memory type. Signed-off-by: Sumit Semwal Signed-off-by: Sumit Semwal [author of the original patch] Signed-off-by: Tomasz Stanislawski [integration with refactored dma-contig allocator] Acked-by: Laurent Pinchart --- drivers/media/video/videobuf2-dma-contig.c | 119 +++++++++++++++++++++++++++- 1 file changed, 117 insertions(+), 2 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index bb32e7e..9c213bc 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -10,6 +10,7 @@ * the Free Software Foundation. 
*/ +#include #include #include #include @@ -34,6 +35,9 @@ struct vb2_dc_buf { /* USERPTR related */ struct vm_area_struct *vma; + + /* DMABUF related */ + struct dma_buf_attachment *db_attach; }; /*********************************************/ @@ -155,7 +159,8 @@ static void vb2_dc_prepare(void *buf_priv) struct vb2_dc_buf *buf = buf_priv; struct sg_table *sgt = buf->dma_sgt; - if (!sgt) + /* DMABUF exporter will flush the cache for us */ + if (!sgt || buf->db_attach) return; dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); @@ -166,7 +171,8 @@ static void vb2_dc_finish(void *buf_priv) struct vb2_dc_buf *buf = buf_priv; struct sg_table *sgt = buf->dma_sgt; - if (!sgt) + /* DMABUF exporter will flush the cache for us */ + if (!sgt || buf->db_attach) return; dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); @@ -414,6 +420,111 @@ fail_buf: } /*********************************************/ +/* callbacks for DMABUF buffers */ +/*********************************************/ + +static int vb2_dc_map_dmabuf(void *mem_priv) +{ + struct vb2_dc_buf *buf = mem_priv; + struct sg_table *sgt; + unsigned long contig_size; + + if (WARN_ON(!buf->db_attach)) { + printk(KERN_ERR "trying to pin a non attached buffer\n"); + return -EINVAL; + } + + if (WARN_ON(buf->dma_sgt)) { + printk(KERN_ERR "dmabuf buffer is already pinned\n"); + return 0; + } + + /* get the associated scatterlist for this buffer */ + sgt = dma_buf_map_attachment(buf->db_attach, buf->dma_dir); + if (IS_ERR_OR_NULL(sgt)) { + printk(KERN_ERR "Error getting dmabuf scatterlist\n"); + return -EINVAL; + } + + /* checking if dmabuf is big enough to store contiguous chunk */ + contig_size = vb2_dc_get_contiguous_size(sgt); + if (contig_size < buf->size) { + printk(KERN_ERR "contiguous chunk is too small %lu/%lu b\n", + contig_size, buf->size); + dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); + return -EFAULT; + } + + buf->dma_addr = sg_dma_address(sgt->sgl); + buf->dma_sgt 
= sgt; + + return 0; +} + +static void vb2_dc_unmap_dmabuf(void *mem_priv) +{ + struct vb2_dc_buf *buf = mem_priv; + struct sg_table *sgt = buf->dma_sgt; + + if (WARN_ON(!buf->db_attach)) { + printk(KERN_ERR "trying to unpin a not attached buffer\n"); + return; + } + + if (WARN_ON(!sgt)) { + printk(KERN_ERR "dmabuf buffer is already unpinned\n"); + return; + } + + dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); + + buf->dma_addr = 0; + buf->dma_sgt = NULL; +} + +static void vb2_dc_detach_dmabuf(void *mem_priv) +{ + struct vb2_dc_buf *buf = mem_priv; + + /* if vb2 works correctly you should never detach mapped buffer */ + if (WARN_ON(buf->dma_addr)) + vb2_dc_unmap_dmabuf(buf); + + /* detach this attachment */ + dma_buf_detach(buf->db_attach->dmabuf, buf->db_attach); + kfree(buf); +} + +static void *vb2_dc_attach_dmabuf(void *alloc_ctx, struct dma_buf *dbuf, + unsigned long size, int write) +{ + struct vb2_dc_buf *buf; + struct dma_buf_attachment *dba; + + if (dbuf->size < size) + return ERR_PTR(-EFAULT); + + buf = kzalloc(sizeof *buf, GFP_KERNEL); + if (!buf) + return ERR_PTR(-ENOMEM); + + buf->dev = alloc_ctx; + /* create attachment for the dmabuf with the user device */ + dba = dma_buf_attach(dbuf, buf->dev); + if (IS_ERR(dba)) { + printk(KERN_ERR "failed to attach dmabuf\n"); + kfree(buf); + return dba; + } + + buf->dma_dir = write ? 
DMA_FROM_DEVICE : DMA_TO_DEVICE; + buf->size = size; + buf->db_attach = dba; + + return buf; +} + +/*********************************************/ /* DMA CONTIG exported functions */ /*********************************************/ @@ -427,6 +538,10 @@ const struct vb2_mem_ops vb2_dma_contig_memops = { .put_userptr = vb2_dc_put_userptr, .prepare = vb2_dc_prepare, .finish = vb2_dc_finish, + .map_dmabuf = vb2_dc_map_dmabuf, + .unmap_dmabuf = vb2_dc_unmap_dmabuf, + .attach_dmabuf = vb2_dc_attach_dmabuf, + .detach_dmabuf = vb2_dc_detach_dmabuf, .num_users = vb2_dc_num_users, }; EXPORT_SYMBOL_GPL(vb2_dma_contig_memops); -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:23 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:23 +0200 Subject: [Linaro-mm-sig] [PATCH 00/12] Support for dmabuf exporting for videobuf2 Message-ID: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Hello everyone, This patchset adds support for DMABUF exporting to the V4L2 stack. The latest support for DMABUF importing was posted in [1]. The exporter part is dependent on the DMA mapping redesign [2], which is not yet merged into mainline. Therefore it is posted as a separate patchset. Moreover, some patches depend on the vmap extension for DMABUF by Dave Airlie [3] and on the sg_alloc_table_from_pages function [4]. 
Changelog:

v0: (RFC)
- updated setup of VIDIOC_EXPBUF ioctl
- doc updates
- introduced workaround to avoid using dma_get_pages,
- removed caching of exported dmabuf to avoid existence of circular
  reference between dmabuf and vb2_dc_buf or resource leakage
- removed all 'change behaviour' patches
- initial support for exporting in s5p-mfc driver
- removal of vb2_mmap_pfn_range that is no longer used
- use sg_alloc_table_from_pages instead of creating sglist in vb2_dc code
- move attachment allocation to exporter's attach callback

[1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/48730
[2] http://thread.gmane.org/gmane.linux.kernel.cross-arch/14098
[3] http://permalink.gmane.org/gmane.comp.video.dri.devel/69302
[4]

This patchset is rebased on 3.4-rc1 plus the following patchsets:

Marek Szyprowski (1):
  v4l: vb2-dma-contig: let mmap method to use dma_mmap_coherent call

Tomasz Stanislawski (11):
  v4l: add buffer exporting via dmabuf
  v4l: vb2: add buffer exporting via dmabuf
  v4l: vb2-dma-contig: add setup of sglist for MMAP buffers
  v4l: vb2-dma-contig: add support for DMABUF exporting
  v4l: vb2-dma-contig: add vmap/kmap for dmabuf exporting
  v4l: s5p-fimc: support for dmabuf exporting
  v4l: s5p-tv: mixer: support for dmabuf exporting
  v4l: s5p-mfc: support for dmabuf exporting
  v4l: vb2: remove vb2_mmap_pfn_range function
  v4l: vb2-dma-contig: use sg_alloc_table_from_pages function
  v4l: vb2-dma-contig: Move allocation of dbuf attachment to attach cb

 drivers/media/video/s5p-fimc/fimc-capture.c |    9 +
 drivers/media/video/s5p-mfc/s5p_mfc_dec.c   |   13 ++
 drivers/media/video/s5p-mfc/s5p_mfc_enc.c   |   13 ++
 drivers/media/video/s5p-tv/mixer_video.c    |   10 +
 drivers/media/video/v4l2-compat-ioctl32.c   |    1 +
 drivers/media/video/v4l2-dev.c              |    1 +
 drivers/media/video/v4l2-ioctl.c            |    6 +
 drivers/media/video/videobuf2-core.c        |   67 ++++++
 drivers/media/video/videobuf2-dma-contig.c  |  323 ++++++++++++++++++++++-----
 drivers/media/video/videobuf2-memops.c      |   40 ----
include/linux/videodev2.h | 26 +++ include/media/v4l2-ioctl.h | 2 + include/media/videobuf2-core.h | 2 + include/media/videobuf2-memops.h | 5 - 14 files changed, 411 insertions(+), 107 deletions(-) -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:24 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:24 +0200 Subject: [Linaro-mm-sig] [PATCH 01/12] v4l: vb2-dma-contig: let mmap method to use dma_mmap_coherent call In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-2-git-send-email-t.stanislaws@samsung.com> From: Marek Szyprowski Let mmap method to use dma_mmap_coherent call. This patch depends on DMA mapping redesign patches because the usage of dma_mmap_coherent breaks dma-contig allocator for architectures other than ARM and AVR. Signed-off-by: Marek Szyprowski --- drivers/media/video/videobuf2-dma-contig.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 9c213bc..52b4f59 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -224,14 +224,38 @@ static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) { struct vb2_dc_buf *buf = buf_priv; + int ret; if (!buf) { printk(KERN_ERR "No buffer to map\n"); return -EINVAL; } - return vb2_mmap_pfn_range(vma, buf->dma_addr, buf->size, - &vb2_common_vm_ops, &buf->handler); + /* + * dma_mmap_* uses vm_pgoff as in-buffer offset, but we want to + * map whole buffer + */ + vma->vm_pgoff = 0; + + ret = dma_mmap_coherent(buf->dev, vma, buf->vaddr, + buf->dma_addr, buf->size); + + if (ret) { + printk(KERN_ERR "Remapping memory failed, error: %d\n", ret); + return ret; + } + + vma->vm_flags |= 
VM_DONTEXPAND | VM_RESERVED; + vma->vm_private_data = &buf->handler; + vma->vm_ops = &vb2_common_vm_ops; + + vma->vm_ops->open(vma); + + printk(KERN_DEBUG "%s: mapped dma addr 0x%08lx at 0x%08lx, size %ld\n", + __func__, (unsigned long)buf->dma_addr, vma->vm_start, + buf->size); + + return 0; } /*********************************************/ -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:28 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:28 +0200 Subject: [Linaro-mm-sig] [PATCH 05/12] v4l: vb2-dma-contig: add support for DMABUF exporting In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-6-git-send-email-t.stanislaws@samsung.com> This patch adds support for exporting a dma-contig buffer using DMABUF interface. Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/videobuf2-dma-contig.c | 119 ++++++++++++++++++++++++++++ 1 file changed, 119 insertions(+) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index ae656be..b5826e0 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -325,6 +325,124 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) } /*********************************************/ +/* DMABUF ops for exporters */ +/*********************************************/ + +struct vb2_dc_attachment { + struct sg_table sgt; + enum dma_data_direction dir; +}; + +static int vb2_dc_dmabuf_ops_attach(struct dma_buf *dbuf, struct device *dev, + struct dma_buf_attachment *dbuf_attach) +{ + /* nothing to be done */ + return 0; +} + +static void vb2_dc_dmabuf_ops_detach(struct dma_buf *dbuf, + struct dma_buf_attachment *db_attach) +{ + struct vb2_dc_attachment *attach = db_attach->priv; + struct sg_table *sgt; + + if (!attach) + return; + 
+ sgt = &attach->sgt; + + dma_unmap_sg(db_attach->dev, sgt->sgl, sgt->nents, attach->dir); + sg_free_table(sgt); + kfree(attach); + db_attach->priv = NULL; +} + +static struct sg_table *vb2_dc_dmabuf_ops_map( + struct dma_buf_attachment *db_attach, enum dma_data_direction dir) +{ + struct dma_buf *dbuf = db_attach->dmabuf; + struct vb2_dc_buf *buf = dbuf->priv; + struct vb2_dc_attachment *attach = db_attach->priv; + struct sg_table *sgt; + struct scatterlist *rd, *wr; + int i, ret; + + /* return previously mapped sg table */ + if (attach) + return &attach->sgt; + + attach = kzalloc(sizeof *attach, GFP_KERNEL); + if (!attach) + return ERR_PTR(-ENOMEM); + + sgt = &attach->sgt; + attach->dir = dir; + + /* copying the buf->base_sgt to attachment */ + ret = sg_alloc_table(sgt, buf->sgt_base->orig_nents, GFP_KERNEL); + if (ret) { + kfree(attach); + return ERR_PTR(-ENOMEM); + } + + rd = buf->sgt_base->sgl; + wr = sgt->sgl; + for (i = 0; i < sgt->orig_nents; ++i) { + sg_set_page(wr, sg_page(rd), rd->length, rd->offset); + rd = sg_next(rd); + wr = sg_next(wr); + } + + /* mapping new sglist to the client */ + ret = dma_map_sg(db_attach->dev, sgt->sgl, sgt->orig_nents, dir); + if (ret <= 0) { + printk(KERN_ERR "failed to map scatterlist\n"); + sg_free_table(sgt); + kfree(attach); + return ERR_PTR(-EIO); + } + + db_attach->priv = attach; + + return sgt; +} + +static void vb2_dc_dmabuf_ops_unmap(struct dma_buf_attachment *db_attach, + struct sg_table *sgt, enum dma_data_direction dir) +{ + /* nothing to be done here */ +} + +static void vb2_dc_dmabuf_ops_release(struct dma_buf *dbuf) +{ + /* drop reference obtained in vb2_dc_get_dmabuf */ + vb2_dc_put(dbuf->priv); +} + +static struct dma_buf_ops vb2_dc_dmabuf_ops = { + .attach = vb2_dc_dmabuf_ops_attach, + .detach = vb2_dc_dmabuf_ops_detach, + .map_dma_buf = vb2_dc_dmabuf_ops_map, + .unmap_dma_buf = vb2_dc_dmabuf_ops_unmap, + .release = vb2_dc_dmabuf_ops_release, +}; + +static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv) 
+{ + struct vb2_dc_buf *buf = buf_priv; + struct dma_buf *dbuf; + + dbuf = dma_buf_export(buf, &vb2_dc_dmabuf_ops, buf->size, 0); + if (IS_ERR(dbuf)) + return NULL; + + /* dmabuf keeps reference to vb2 buffer */ + atomic_inc(&buf->refcount); + + return dbuf; +} + +/*********************************************/ /* callbacks for USERPTR buffers */ /*********************************************/ @@ -621,6 +739,7 @@ static void *vb2_dc_attach_dmabuf(void *alloc_ctx, struct dma_buf *dbuf, const struct vb2_mem_ops vb2_dma_contig_memops = { .alloc = vb2_dc_alloc, .put = vb2_dc_put, + .get_dmabuf = vb2_dc_get_dmabuf, .cookie = vb2_dc_cookie, .vaddr = vb2_dc_vaddr, .mmap = vb2_dc_mmap, -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:31 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:31 +0200 Subject: [Linaro-mm-sig] [PATCH 08/12] v4l: s5p-tv: mixer: support for dmabuf exporting In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-9-git-send-email-t.stanislaws@samsung.com> This patch enhances s5p-tv with support for DMABUF exporting via VIDIOC_EXPBUF ioctl. 
Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/s5p-tv/mixer_video.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/media/video/s5p-tv/mixer_video.c b/drivers/media/video/s5p-tv/mixer_video.c index cff974a..d8def5b 100644 --- a/drivers/media/video/s5p-tv/mixer_video.c +++ b/drivers/media/video/s5p-tv/mixer_video.c @@ -697,6 +697,15 @@ static int mxr_dqbuf(struct file *file, void *priv, struct v4l2_buffer *p) return vb2_dqbuf(&layer->vb_queue, p, file->f_flags & O_NONBLOCK); } +static int mxr_expbuf(struct file *file, void *priv, + struct v4l2_exportbuffer *eb) +{ + struct mxr_layer *layer = video_drvdata(file); + + mxr_dbg(layer->mdev, "%s:%d\n", __func__, __LINE__); + return vb2_expbuf(&layer->vb_queue, eb); +} + static int mxr_streamon(struct file *file, void *priv, enum v4l2_buf_type i) { struct mxr_layer *layer = video_drvdata(file); @@ -724,6 +733,7 @@ static const struct v4l2_ioctl_ops mxr_ioctl_ops = { .vidioc_querybuf = mxr_querybuf, .vidioc_qbuf = mxr_qbuf, .vidioc_dqbuf = mxr_dqbuf, + .vidioc_expbuf = mxr_expbuf, /* Streaming control */ .vidioc_streamon = mxr_streamon, .vidioc_streamoff = mxr_streamoff, -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:25 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:25 +0200 Subject: [Linaro-mm-sig] [PATCH 02/12] v4l: add buffer exporting via dmabuf In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-3-git-send-email-t.stanislaws@samsung.com> This patch adds an extension to the V4L2 API. It allows exporting an mmap buffer as a file descriptor. A new ioctl, VIDIOC_EXPBUF, is added. It takes the buffer offset used by mmap and returns a file descriptor on success. 
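
From userspace, the ioctl described above might be driven roughly as follows. This is only a sketch under stated assumptions: the request layout and the ioctl number are copied from this patch and redefined locally under the hypothetical names struct expbuf_req and EXPBUF_IOCTL, so the example builds without patched headers, and the helpers expbuf_prepare and expbuf_export are illustrative and not part of the patchset.

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* O_CLOEXEC */
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Local copy of the struct v4l2_exportbuffer layout proposed in this
 * patch. Assumption: the layout is taken from the patch as posted and
 * may not match any header installed on the build machine.
 */
struct expbuf_req {
	uint32_t fd;           /* filled in by the driver on success */
	uint32_t reserved0;    /* must be zero */
	uint32_t mem_offset;   /* m.offset as returned by VIDIOC_QUERYBUF */
	uint32_t flags;        /* only O_CLOEXEC is accepted */
	uint32_t reserved[12]; /* must be zero */
};

/* Same encoding as the proposed VIDIOC_EXPBUF: _IOWR('V', 16, ...). */
#define EXPBUF_IOCTL _IOWR('V', 16, struct expbuf_req)

/* Zero the request (including the reserved fields) and set the inputs. */
static void expbuf_prepare(struct expbuf_req *req, uint32_t mem_offset)
{
	memset(req, 0, sizeof(*req));
	req->mem_offset = mem_offset;
	req->flags = O_CLOEXEC;
}

/* Returns the DMABUF file descriptor, or -1 with errno set on failure. */
static int expbuf_export(int video_fd, uint32_t mem_offset)
{
	struct expbuf_req req;

	expbuf_prepare(&req, mem_offset);
	if (ioctl(video_fd, EXPBUF_IOCTL, &req) < 0)
		return -1;

	return (int)req.fd;
}
```

A caller would pass the offset that VIDIOC_QUERYBUF reported for the buffer (or plane) and could then hand the returned descriptor to another device that imports DMABUF.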
Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/v4l2-compat-ioctl32.c | 1 + drivers/media/video/v4l2-dev.c | 1 + drivers/media/video/v4l2-ioctl.c | 6 ++++++ include/linux/videodev2.h | 26 ++++++++++++++++++++++++++ include/media/v4l2-ioctl.h | 2 ++ 5 files changed, 36 insertions(+) diff --git a/drivers/media/video/v4l2-compat-ioctl32.c b/drivers/media/video/v4l2-compat-ioctl32.c index 5327ad3..45159d9 100644 --- a/drivers/media/video/v4l2-compat-ioctl32.c +++ b/drivers/media/video/v4l2-compat-ioctl32.c @@ -954,6 +954,7 @@ long v4l2_compat_ioctl32(struct file *file, unsigned int cmd, unsigned long arg) case VIDIOC_S_FBUF32: case VIDIOC_OVERLAY32: case VIDIOC_QBUF32: + case VIDIOC_EXPBUF: case VIDIOC_DQBUF32: case VIDIOC_STREAMON32: case VIDIOC_STREAMOFF32: diff --git a/drivers/media/video/v4l2-dev.c b/drivers/media/video/v4l2-dev.c index 5ccbd46..6bf6307 100644 --- a/drivers/media/video/v4l2-dev.c +++ b/drivers/media/video/v4l2-dev.c @@ -597,6 +597,7 @@ static void determine_valid_ioctls(struct video_device *vdev) SET_VALID_IOCTL(ops, VIDIOC_REQBUFS, vidioc_reqbufs); SET_VALID_IOCTL(ops, VIDIOC_QUERYBUF, vidioc_querybuf); SET_VALID_IOCTL(ops, VIDIOC_QBUF, vidioc_qbuf); + SET_VALID_IOCTL(ops, VIDIOC_EXPBUF, vidioc_expbuf); SET_VALID_IOCTL(ops, VIDIOC_DQBUF, vidioc_dqbuf); SET_VALID_IOCTL(ops, VIDIOC_OVERLAY, vidioc_overlay); SET_VALID_IOCTL(ops, VIDIOC_G_FBUF, vidioc_g_fbuf); diff --git a/drivers/media/video/v4l2-ioctl.c b/drivers/media/video/v4l2-ioctl.c index 31fc2ad..a73b14e 100644 --- a/drivers/media/video/v4l2-ioctl.c +++ b/drivers/media/video/v4l2-ioctl.c @@ -212,6 +212,7 @@ static struct v4l2_ioctl_info v4l2_ioctls[] = { IOCTL_INFO(VIDIOC_S_FBUF, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_OVERLAY, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_QBUF, 0), + IOCTL_INFO(VIDIOC_EXPBUF, 0), IOCTL_INFO(VIDIOC_DQBUF, 0), IOCTL_INFO(VIDIOC_STREAMON, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_STREAMOFF, INFO_FL_PRIO), @@ -957,6 +958,11 @@ static long 
__video_do_ioctl(struct file *file, dbgbuf(cmd, vfd, p); break; } + case VIDIOC_EXPBUF: + { + ret = ops->vidioc_expbuf(file, fh, arg); + break; + } case VIDIOC_DQBUF: { struct v4l2_buffer *p = arg; diff --git a/include/linux/videodev2.h b/include/linux/videodev2.h index 51b20f4..e8893a5 100644 --- a/include/linux/videodev2.h +++ b/include/linux/videodev2.h @@ -684,6 +684,31 @@ struct v4l2_buffer { #define V4L2_BUF_FLAG_NO_CACHE_INVALIDATE 0x0800 #define V4L2_BUF_FLAG_NO_CACHE_CLEAN 0x1000 +/** + * struct v4l2_exportbuffer - export of video buffer as DMABUF file descriptor + * + * @fd: file descriptor associated with DMABUF (set by driver) + * @mem_offset: buffer memory offset as returned by VIDIOC_QUERYBUF in struct + * v4l2_buffer::m.offset (for single-plane formats) or + * v4l2_plane::m.offset (for multi-planar formats) + * @flags: flags for newly created file, currently only O_CLOEXEC is + * supported, refer to manual of open syscall for more details + * + * Contains data used for exporting a video buffer as DMABUF file descriptor. + * The buffer is identified by a 'cookie' returned by VIDIOC_QUERYBUF + * (identical to the cookie used to mmap() the buffer to userspace). All + * reserved fields must be set to zero. The field reserved0 is expected to + * become a structure 'type' allowing an alternative layout of the structure + * content. Therefore this field should not be used for any other extensions. 
+ */ +struct v4l2_exportbuffer { + __u32 fd; + __u32 reserved0; + __u32 mem_offset; + __u32 flags; + __u32 reserved[12]; +}; + /* * O V E R L A Y P R E V I E W */ @@ -2553,6 +2578,7 @@ struct v4l2_create_buffers { #define VIDIOC_S_FBUF _IOW('V', 11, struct v4l2_framebuffer) #define VIDIOC_OVERLAY _IOW('V', 14, int) #define VIDIOC_QBUF _IOWR('V', 15, struct v4l2_buffer) +#define VIDIOC_EXPBUF _IOWR('V', 16, struct v4l2_exportbuffer) #define VIDIOC_DQBUF _IOWR('V', 17, struct v4l2_buffer) #define VIDIOC_STREAMON _IOW('V', 18, int) #define VIDIOC_STREAMOFF _IOW('V', 19, int) diff --git a/include/media/v4l2-ioctl.h b/include/media/v4l2-ioctl.h index d8b76f7..ccd1faa 100644 --- a/include/media/v4l2-ioctl.h +++ b/include/media/v4l2-ioctl.h @@ -119,6 +119,8 @@ struct v4l2_ioctl_ops { int (*vidioc_reqbufs) (struct file *file, void *fh, struct v4l2_requestbuffers *b); int (*vidioc_querybuf)(struct file *file, void *fh, struct v4l2_buffer *b); int (*vidioc_qbuf) (struct file *file, void *fh, struct v4l2_buffer *b); + int (*vidioc_expbuf) (struct file *file, void *fh, + struct v4l2_exportbuffer *e); int (*vidioc_dqbuf) (struct file *file, void *fh, struct v4l2_buffer *b); int (*vidioc_create_bufs)(struct file *file, void *fh, struct v4l2_create_buffers *b); -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:26 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:26 +0200 Subject: [Linaro-mm-sig] [PATCH 03/12] v4l: vb2: add buffer exporting via dmabuf In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-4-git-send-email-t.stanislaws@samsung.com> This patch adds an extension to videobuf2-core. It allows exporting an mmap buffer as a file descriptor. 
Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/videobuf2-core.c | 67 ++++++++++++++++++++++++++++++++++ include/media/videobuf2-core.h | 2 + 2 files changed, 69 insertions(+) diff --git a/drivers/media/video/videobuf2-core.c b/drivers/media/video/videobuf2-core.c index d60ed25..923165a 100644 --- a/drivers/media/video/videobuf2-core.c +++ b/drivers/media/video/videobuf2-core.c @@ -1730,6 +1730,73 @@ static int __find_plane_by_offset(struct vb2_queue *q, unsigned long off, } /** + * vb2_expbuf() - Export a buffer as a file descriptor + * @q: videobuf2 queue + * @eb: export buffer structure passed from userspace to vidioc_expbuf + * handler in driver + * + * The return values from this function are intended to be directly returned + * from vidioc_expbuf handler in driver. + */ +int vb2_expbuf(struct vb2_queue *q, struct v4l2_exportbuffer *eb) +{ + struct vb2_buffer *vb = NULL; + struct vb2_plane *vb_plane; + unsigned int buffer, plane; + int ret; + struct dma_buf *dbuf; + + if (q->memory != V4L2_MEMORY_MMAP) { + dprintk(1, "Queue is not currently set up for mmap\n"); + return -EINVAL; + } + + if (!q->mem_ops->get_dmabuf) { + dprintk(1, "Queue does not support DMA buffer exporting\n"); + return -EINVAL; + } + + if (eb->flags & ~O_CLOEXEC) { + dprintk(1, "Queue does support only O_CLOEXEC flag\n"); + return -EINVAL; + } + + /* + * Find the plane corresponding to the offset passed by userspace. 
+ */ + ret = __find_plane_by_offset(q, eb->mem_offset, &buffer, &plane); + if (ret) { + dprintk(1, "invalid offset %u\n", eb->mem_offset); + return ret; + } + + vb = q->bufs[buffer]; + vb_plane = &vb->planes[plane]; + + dbuf = call_memop(q, get_dmabuf, vb_plane->mem_priv); + if (IS_ERR_OR_NULL(dbuf)) { + dprintk(1, "Failed to export buffer %d, plane %d\n", + buffer, plane); + return -EINVAL; + } + + ret = dma_buf_fd(dbuf, eb->flags); + if (ret < 0) { + dprintk(3, "buffer %d, plane %d failed to export (%d)\n", + buffer, plane, ret); + dma_buf_put(dbuf); + return ret; + } + + dprintk(3, "buffer %d, plane %d exported as %d descriptor\n", + buffer, plane, ret); + eb->fd = ret; + + return 0; +} +EXPORT_SYMBOL_GPL(vb2_expbuf); + +/** * vb2_mmap() - map video buffers into application address space * @q: videobuf2 queue * @vma: vma passed to the mmap file operation handler in the driver diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h index d079f92..fe01f95 100644 --- a/include/media/videobuf2-core.h +++ b/include/media/videobuf2-core.h @@ -81,6 +81,7 @@ struct vb2_fileio_data; struct vb2_mem_ops { void *(*alloc)(void *alloc_ctx, unsigned long size); void (*put)(void *buf_priv); + struct dma_buf *(*get_dmabuf)(void *buf_priv); void *(*get_userptr)(void *alloc_ctx, unsigned long vaddr, unsigned long size, int write); @@ -350,6 +351,7 @@ int vb2_queue_init(struct vb2_queue *q); void vb2_queue_release(struct vb2_queue *q); int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b); +int vb2_expbuf(struct vb2_queue *q, struct v4l2_exportbuffer *eb); int vb2_dqbuf(struct vb2_queue *q, struct v4l2_buffer *b, bool nonblocking); int vb2_streamon(struct vb2_queue *q, enum v4l2_buf_type type); -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:30 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:30 +0200 Subject: [Linaro-mm-sig] [PATCH 07/12] v4l: s5p-fimc: support for dmabuf exporting In-Reply-To: 
<1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-8-git-send-email-t.stanislaws@samsung.com> This patch enhances s5p-fimc with support for DMABUF exporting via VIDIOC_EXPBUF ioctl. Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/s5p-fimc/fimc-capture.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/media/video/s5p-fimc/fimc-capture.c b/drivers/media/video/s5p-fimc/fimc-capture.c index cd27e33..52c9b36 100644 --- a/drivers/media/video/s5p-fimc/fimc-capture.c +++ b/drivers/media/video/s5p-fimc/fimc-capture.c @@ -1101,6 +1101,14 @@ static int fimc_cap_qbuf(struct file *file, void *priv, return vb2_qbuf(&fimc->vid_cap.vbq, buf); } +static int fimc_cap_expbuf(struct file *file, void *priv, + struct v4l2_exportbuffer *eb) +{ + struct fimc_dev *fimc = video_drvdata(file); + + return vb2_expbuf(&fimc->vid_cap.vbq, eb); +} + static int fimc_cap_dqbuf(struct file *file, void *priv, struct v4l2_buffer *buf) { @@ -1225,6 +1233,7 @@ static const struct v4l2_ioctl_ops fimc_capture_ioctl_ops = { .vidioc_qbuf = fimc_cap_qbuf, .vidioc_dqbuf = fimc_cap_dqbuf, + .vidioc_expbuf = fimc_cap_expbuf, .vidioc_prepare_buf = fimc_cap_prepare_buf, .vidioc_create_bufs = fimc_cap_create_bufs, -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:32 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:32 +0200 Subject: [Linaro-mm-sig] [PATCH 09/12] v4l: s5p-mfc: support for dmabuf exporting In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-10-git-send-email-t.stanislaws@samsung.com> This patch enhances s5p-mfc with support for DMABUF exporting via VIDIOC_EXPBUF ioctl. 
Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park CC: Kamil Debski --- drivers/media/video/s5p-mfc/s5p_mfc_dec.c | 13 +++++++++++++ drivers/media/video/s5p-mfc/s5p_mfc_enc.c | 13 +++++++++++++ 2 files changed, 26 insertions(+) diff --git a/drivers/media/video/s5p-mfc/s5p_mfc_dec.c b/drivers/media/video/s5p-mfc/s5p_mfc_dec.c index c25ec02..e1ebc76 100644 --- a/drivers/media/video/s5p-mfc/s5p_mfc_dec.c +++ b/drivers/media/video/s5p-mfc/s5p_mfc_dec.c @@ -564,6 +564,18 @@ static int vidioc_dqbuf(struct file *file, void *priv, struct v4l2_buffer *buf) return -EINVAL; } +/* Export DMA buffer */ +static int vidioc_expbuf(struct file *file, void *priv, + struct v4l2_exportbuffer *eb) +{ + struct s5p_mfc_ctx *ctx = fh_to_ctx(priv); + + if (eb->mem_offset < DST_QUEUE_OFF_BASE) + return vb2_expbuf(&ctx->vq_src, eb); + else + return vb2_expbuf(&ctx->vq_dst, eb); +} + /* Stream on */ static int vidioc_streamon(struct file *file, void *priv, enum v4l2_buf_type type) @@ -739,6 +751,7 @@ static const struct v4l2_ioctl_ops s5p_mfc_dec_ioctl_ops = { .vidioc_querybuf = vidioc_querybuf, .vidioc_qbuf = vidioc_qbuf, .vidioc_dqbuf = vidioc_dqbuf, + .vidioc_expbuf = vidioc_expbuf, .vidioc_streamon = vidioc_streamon, .vidioc_streamoff = vidioc_streamoff, .vidioc_g_crop = vidioc_g_crop, diff --git a/drivers/media/video/s5p-mfc/s5p_mfc_enc.c b/drivers/media/video/s5p-mfc/s5p_mfc_enc.c index acedb20..887f1aa 100644 --- a/drivers/media/video/s5p-mfc/s5p_mfc_enc.c +++ b/drivers/media/video/s5p-mfc/s5p_mfc_enc.c @@ -1141,6 +1141,18 @@ static int vidioc_dqbuf(struct file *file, void *priv, struct v4l2_buffer *buf) return -EINVAL; } +/* Export DMA buffer */ +static int vidioc_expbuf(struct file *file, void *priv, + struct v4l2_exportbuffer *eb) +{ + struct s5p_mfc_ctx *ctx = fh_to_ctx(priv); + + if (eb->mem_offset < DST_QUEUE_OFF_BASE) + return vb2_expbuf(&ctx->vq_src, eb); + else + return vb2_expbuf(&ctx->vq_dst, eb); +} + /* Stream on */ static int vidioc_streamon(struct file 
*file, void *priv, enum v4l2_buf_type type) @@ -1486,6 +1498,7 @@ static const struct v4l2_ioctl_ops s5p_mfc_enc_ioctl_ops = { .vidioc_querybuf = vidioc_querybuf, .vidioc_qbuf = vidioc_qbuf, .vidioc_dqbuf = vidioc_dqbuf, + .vidioc_expbuf = vidioc_expbuf, .vidioc_streamon = vidioc_streamon, .vidioc_streamoff = vidioc_streamoff, .vidioc_s_parm = vidioc_s_parm, -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:34 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:34 +0200 Subject: [Linaro-mm-sig] [PATCH 11/12] v4l: vb2-dma-contig: use sg_alloc_table_from_pages function In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-12-git-send-email-t.stanislaws@samsung.com> This patch makes use of sg_alloc_table_from_pages to simplify handling of sg tables. Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/videobuf2-dma-contig.c | 90 ++++++++-------------------- 1 file changed, 25 insertions(+), 65 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 59ee81c..b5caf1d 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -32,7 +32,7 @@ struct vb2_dc_buf { /* MMAP related */ struct vb2_vmarea_handler handler; atomic_t refcount; - struct sg_table *sgt_base; + struct sg_table sgt_base; /* USERPTR related */ struct vm_area_struct *vma; @@ -45,57 +45,6 @@ struct vb2_dc_buf { /* scatterlist table functions */ /*********************************************/ -static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, - unsigned int n_pages, unsigned long offset, unsigned long size) -{ - struct sg_table *sgt; - unsigned int chunks; - unsigned int i; - unsigned int cur_page; - int ret; - struct scatterlist *s; - - sgt = kzalloc(sizeof *sgt, GFP_KERNEL); - 
if (!sgt) - return ERR_PTR(-ENOMEM); - - /* compute number of chunks */ - chunks = 1; - for (i = 1; i < n_pages; ++i) - if (pages[i] != pages[i - 1] + 1) - ++chunks; - - ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); - if (ret) { - kfree(sgt); - return ERR_PTR(-ENOMEM); - } - - /* merging chunks and putting them into the scatterlist */ - cur_page = 0; - for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { - unsigned long chunk_size; - unsigned int j; - - for (j = cur_page + 1; j < n_pages; ++j) - if (pages[j] != pages[j - 1] + 1) - break; - - chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset; - sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); - size -= chunk_size; - offset = 0; - cur_page = j; - } - - return sgt; -} - -static void vb2_dc_release_sgtable(struct sg_table *sgt) -{ - sg_free_table(sgt); - kfree(sgt); -} static void vb2_dc_sgt_foreach_page(struct sg_table *sgt, void (*cb)(struct page *pg)) @@ -190,7 +139,7 @@ static void vb2_dc_put(void *buf_priv) if (!atomic_dec_and_test(&buf->refcount)) return; - vb2_dc_release_sgtable(buf->sgt_base); + sg_free_table(&buf->sgt_base); dma_free_coherent(buf->dev, buf->size, buf->vaddr, buf->dma_addr); kfree(buf); } @@ -254,9 +203,9 @@ static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) goto fail_pages; } - buf->sgt_base = vb2_dc_pages_to_sgt(pages, n_pages, 0, size); - if (IS_ERR(buf->sgt_base)) { - ret = PTR_ERR(buf->sgt_base); + ret = sg_alloc_table_from_pages(&buf->sgt_base, + pages, n_pages, 0, size, GFP_KERNEL); + if (ret) { dev_err(dev, "failed to prepare sg table\n"); goto fail_pages; } @@ -379,13 +328,13 @@ static struct sg_table *vb2_dc_dmabuf_ops_map( attach->dir = dir; /* copying the buf->base_sgt to attachment */ - ret = sg_alloc_table(sgt, buf->sgt_base->orig_nents, GFP_KERNEL); + ret = sg_alloc_table(sgt, buf->sgt_base.orig_nents, GFP_KERNEL); if (ret) { kfree(attach); return ERR_PTR(-ENOMEM); } - rd = buf->sgt_base->sgl; + rd = buf->sgt_base.sgl; wr = sgt->sgl; for (i = 0; i < 
sgt->orig_nents; ++i) { sg_set_page(wr, sg_page(rd), rd->length, rd->offset); @@ -519,7 +468,8 @@ static void vb2_dc_put_userptr(void *buf_priv) if (!vma_is_io(buf->vma)) vb2_dc_sgt_foreach_page(sgt, vb2_dc_put_dirty_page); - vb2_dc_release_sgtable(sgt); + sg_free_table(sgt); + kfree(sgt); vb2_put_vma(buf->vma); kfree(buf); } @@ -586,13 +536,20 @@ static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, goto fail_vma; } - sgt = vb2_dc_pages_to_sgt(pages, n_pages, offset, size); - if (IS_ERR(sgt)) { - printk(KERN_ERR "failed to create scatterlist table\n"); + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); + if (!sgt) { + printk(KERN_ERR "failed to allocate sg table\n"); ret = -ENOMEM; goto fail_get_user_pages; } + ret = sg_alloc_table_from_pages(sgt, pages, n_pages, + offset, size, GFP_KERNEL); + if (ret) { + printk(KERN_ERR "failed to initialize sg table\n"); + goto fail_sgt; + } + /* pages are no longer needed */ kfree(pages); pages = NULL; @@ -602,7 +559,7 @@ static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, if (sgt->nents <= 0) { printk(KERN_ERR "failed to map scatterlist\n"); ret = -EIO; - goto fail_sgt; + goto fail_sgt_init; } contig_size = vb2_dc_get_contiguous_size(sgt); @@ -622,10 +579,13 @@ static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr, fail_map_sg: dma_unmap_sg(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); -fail_sgt: +fail_sgt_init: if (!vma_is_io(buf->vma)) vb2_dc_sgt_foreach_page(sgt, put_page); - vb2_dc_release_sgtable(sgt); + sg_free_table(sgt); + +fail_sgt: + kfree(sgt); fail_get_user_pages: if (pages && !vma_is_io(buf->vma)) -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:27 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:27 +0200 Subject: [Linaro-mm-sig] [PATCH 04/12] v4l: vb2-dma-contig: add setup of sglist for MMAP buffers In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: 
<1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-5-git-send-email-t.stanislaws@samsung.com> This patch adds the setup of sglist list for MMAP buffers. It is needed for buffer exporting via DMABUF mechanism. Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/videobuf2-dma-contig.c | 70 +++++++++++++++++++++++++++- 1 file changed, 68 insertions(+), 2 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index 52b4f59..ae656be 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -32,6 +32,7 @@ struct vb2_dc_buf { /* MMAP related */ struct vb2_vmarea_handler handler; atomic_t refcount; + struct sg_table *sgt_base; /* USERPTR related */ struct vm_area_struct *vma; @@ -189,14 +190,37 @@ static void vb2_dc_put(void *buf_priv) if (!atomic_dec_and_test(&buf->refcount)) return; + vb2_dc_release_sgtable(buf->sgt_base); dma_free_coherent(buf->dev, buf->size, buf->vaddr, buf->dma_addr); kfree(buf); } +static int vb2_dc_kaddr_to_pages(unsigned long kaddr, + struct page **pages, unsigned int n_pages) +{ + unsigned int i; + unsigned long pfn; + struct vm_area_struct vma = { + .vm_flags = VM_IO | VM_PFNMAP, + .vm_mm = current->mm, + }; + + for (i = 0; i < n_pages; ++i, kaddr += PAGE_SIZE) { + if (follow_pfn(&vma, kaddr, &pfn)) + break; + pages[i] = pfn_to_page(pfn); + } + + return i; +} + static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) { struct device *dev = alloc_ctx; struct vb2_dc_buf *buf; + int ret = -ENOMEM; + int n_pages; + struct page **pages = NULL; buf = kzalloc(sizeof *buf, GFP_KERNEL); if (!buf) @@ -205,10 +229,41 @@ static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) buf->vaddr = dma_alloc_coherent(dev, size, &buf->dma_addr, GFP_KERNEL); if (!buf->vaddr) { dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size); - kfree(buf); - return 
ERR_PTR(-ENOMEM); + goto fail_buf; + } + + WARN_ON((unsigned long)buf->vaddr & ~PAGE_MASK); + WARN_ON(buf->dma_addr & ~PAGE_MASK); + + n_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; + + pages = kmalloc(n_pages * sizeof pages[0], GFP_KERNEL); + if (!pages) { + dev_err(dev, "failed to alloc page table\n"); + goto fail_dma; + } + + ret = vb2_dc_kaddr_to_pages((unsigned long)buf->vaddr, pages, n_pages); + if (ret < 0) { + dev_err(dev, "failed to get buffer pages from DMA API\n"); + goto fail_pages; + } + if (ret != n_pages) { + ret = -EFAULT; + dev_err(dev, "failed to get all pages from DMA API\n"); + goto fail_pages; + } + + buf->sgt_base = vb2_dc_pages_to_sgt(pages, n_pages, 0, size); + if (IS_ERR(buf->sgt_base)) { + ret = PTR_ERR(buf->sgt_base); + dev_err(dev, "failed to prepare sg table\n"); + goto fail_pages; } + /* pages are no longer needed */ + kfree(pages); + buf->dev = dev; buf->size = size; @@ -219,6 +274,17 @@ static void *vb2_dc_alloc(void *alloc_ctx, unsigned long size) atomic_inc(&buf->refcount); return buf; + +fail_pages: + kfree(pages); + +fail_dma: + dma_free_coherent(dev, size, buf->vaddr, buf->dma_addr); + +fail_buf: + kfree(buf); + + return ERR_PTR(ret); } static int vb2_dc_mmap(void *buf_priv, struct vm_area_struct *vma) -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:29 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:29 +0200 Subject: [Linaro-mm-sig] [PATCH 06/12] v4l: vb2-dma-contig: add vmap/kmap for dmabuf exporting In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-7-git-send-email-t.stanislaws@samsung.com> This patch adds support for vmap and kmap callbacks for DMABUF exporter. 
Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/videobuf2-dma-contig.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index b5826e0..59ee81c 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -419,11 +419,28 @@ static void vb2_dc_dmabuf_ops_release(struct dma_buf *dbuf) vb2_dc_put(dbuf->priv); } +static void *vb2_dc_dmabuf_ops_kmap(struct dma_buf *dbuf, unsigned long pgnum) +{ + struct vb2_dc_buf *buf = dbuf->priv; + + return buf->vaddr + pgnum * PAGE_SIZE; +} + +static void *vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf) +{ + struct vb2_dc_buf *buf = dbuf->priv; + + return buf->vaddr; +} + static struct dma_buf_ops vb2_dc_dmabuf_ops = { .attach = vb2_dc_dmabuf_ops_attach, .detach = vb2_dc_dmabuf_ops_detach, .map_dma_buf = vb2_dc_dmabuf_ops_map, .unmap_dma_buf = vb2_dc_dmabuf_ops_unmap, + .kmap = vb2_dc_dmabuf_ops_kmap, + .kmap_atomic = vb2_dc_dmabuf_ops_kmap, + .vmap = vb2_dc_dmabuf_ops_vmap, .release = vb2_dc_dmabuf_ops_release, }; -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:33 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:33 +0200 Subject: [Linaro-mm-sig] [PATCH 10/12] v4l: vb2: remove vb2_mmap_pfn_range function In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-11-git-send-email-t.stanislaws@samsung.com> This patch removes vb2_mmap_pfn_range from videobuf2 helpers. The function is no longer used in vb2 code. 
Suggested-by: Laurent Pinchart Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/videobuf2-memops.c | 40 -------------------------------- include/media/videobuf2-memops.h | 5 ---- 2 files changed, 45 deletions(-) diff --git a/drivers/media/video/videobuf2-memops.c b/drivers/media/video/videobuf2-memops.c index 504cd4c..81c1ad8 100644 --- a/drivers/media/video/videobuf2-memops.c +++ b/drivers/media/video/videobuf2-memops.c @@ -137,46 +137,6 @@ int vb2_get_contig_userptr(unsigned long vaddr, unsigned long size, EXPORT_SYMBOL_GPL(vb2_get_contig_userptr); /** - * vb2_mmap_pfn_range() - map physical pages to userspace - * @vma: virtual memory region for the mapping - * @paddr: starting physical address of the memory to be mapped - * @size: size of the memory to be mapped - * @vm_ops: vm operations to be assigned to the created area - * @priv: private data to be associated with the area - * - * Returns 0 on success. - */ -int vb2_mmap_pfn_range(struct vm_area_struct *vma, unsigned long paddr, - unsigned long size, - const struct vm_operations_struct *vm_ops, - void *priv) -{ - int ret; - - size = min_t(unsigned long, vma->vm_end - vma->vm_start, size); - - vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); - ret = remap_pfn_range(vma, vma->vm_start, paddr >> PAGE_SHIFT, - size, vma->vm_page_prot); - if (ret) { - printk(KERN_ERR "Remapping memory failed, error: %d\n", ret); - return ret; - } - - vma->vm_flags |= VM_DONTEXPAND | VM_RESERVED; - vma->vm_private_data = priv; - vma->vm_ops = vm_ops; - - vma->vm_ops->open(vma); - - pr_debug("%s: mapped paddr 0x%08lx at 0x%08lx, size %ld\n", - __func__, paddr, vma->vm_start, size); - - return 0; -} -EXPORT_SYMBOL_GPL(vb2_mmap_pfn_range); - -/** * vb2_common_vm_open() - increase refcount of the vma * @vma: virtual memory region for the mapping * diff --git a/include/media/videobuf2-memops.h b/include/media/videobuf2-memops.h index 84e1f6c..f05444c 100644 --- 
a/include/media/videobuf2-memops.h +++ b/include/media/videobuf2-memops.h @@ -33,11 +33,6 @@ extern const struct vm_operations_struct vb2_common_vm_ops; int vb2_get_contig_userptr(unsigned long vaddr, unsigned long size, struct vm_area_struct **res_vma, dma_addr_t *res_pa); -int vb2_mmap_pfn_range(struct vm_area_struct *vma, unsigned long paddr, - unsigned long size, - const struct vm_operations_struct *vm_ops, - void *priv); - struct vm_area_struct *vb2_get_vma(struct vm_area_struct *vma); void vb2_put_vma(struct vm_area_struct *vma); -- 1.7.9.5 From t.stanislaws at samsung.com Wed May 23 13:07:35 2012 From: t.stanislaws at samsung.com (Tomasz Stanislawski) Date: Wed, 23 May 2012 15:07:35 +0200 Subject: [Linaro-mm-sig] [PATCH 12/12] v4l: vb2-dma-contig: Move allocation of dbuf attachment to attach cb In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <1337778455-27912-13-git-send-email-t.stanislaws@samsung.com> The allocation of dma_buf_attachment is moved to attach callback. The initialization is left in map callback. 
Signed-off-by: Tomasz Stanislawski Signed-off-by: Kyungmin Park --- drivers/media/video/videobuf2-dma-contig.c | 39 ++++++++++++++++++---------- 1 file changed, 26 insertions(+), 13 deletions(-) diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index b5caf1d..3bf7c45 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -285,7 +285,15 @@ struct vb2_dc_attachment { static int vb2_dc_dmabuf_ops_attach(struct dma_buf *dbuf, struct device *dev, struct dma_buf_attachment *dbuf_attach) { - /* nothing to be done */ + struct vb2_dc_attachment *attach; + + attach = kzalloc(sizeof *attach, GFP_KERNEL); + if (!attach) + return -ENOMEM; + + attach->dir = DMA_NONE; + dbuf_attach->priv = attach; + return 0; } @@ -300,7 +308,9 @@ static void vb2_dc_dmabuf_ops_detach(struct dma_buf *dbuf, sgt = &attach->sgt; - dma_unmap_sg(db_attach->dev, sgt->sgl, sgt->nents, attach->dir); + /* checking if scaterlist was ever mapped */ + if (attach->dir != DMA_NONE) + dma_unmap_sg(db_attach->dev, sgt->sgl, sgt->nents, attach->dir); sg_free_table(sgt); kfree(attach); db_attach->priv = NULL; @@ -314,25 +324,28 @@ static struct sg_table *vb2_dc_dmabuf_ops_map( struct vb2_dc_attachment *attach = db_attach->priv; struct sg_table *sgt; struct scatterlist *rd, *wr; - int i, ret; + int ret; + unsigned int i; + + if (WARN_ON(dir == DMA_NONE)) + return ERR_PTR(-EINVAL); /* return previously mapped sg table */ - if (attach) + if (attach->dir == dir) return &attach->sgt; - attach = kzalloc(sizeof *attach, GFP_KERNEL); - if (!attach) - return ERR_PTR(-ENOMEM); + /* reattaching is not allowed */ + if (WARN_ON(attach->dir != DMA_NONE)) + return ERR_PTR(-EBUSY); sgt = &attach->sgt; - attach->dir = dir; - /* copying the buf->base_sgt to attachment */ + /* Copy the buf->base_sgt scatter list to the attachment, as we can't + * map the same scatter list to multiple attachments at the same time. 
+ */ ret = sg_alloc_table(sgt, buf->sgt_base.orig_nents, GFP_KERNEL); - if (ret) { - kfree(attach); + if (ret) return ERR_PTR(-ENOMEM); - } rd = buf->sgt_base.sgl; wr = sgt->sgl; @@ -347,10 +360,10 @@ static struct sg_table *vb2_dc_dmabuf_ops_map( if (ret <= 0) { printk(KERN_ERR "failed to map scatterlist\n"); sg_free_table(sgt); - kfree(attach); return ERR_PTR(-EIO); } + attach->dir = dir; db_attach->priv = attach; return sgt; -- 1.7.9.5 From sangwook.lee at linaro.org Wed May 23 14:01:10 2012 From: sangwook.lee at linaro.org (Sangwook Lee) Date: Wed, 23 May 2012 15:01:10 +0100 Subject: [Linaro-mm-sig] [PATCH 00/12] Support for dmabuf exporting for videobuf2 In-Reply-To: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> References: <1337778455-27912-1-git-send-email-t.stanislaws@samsung.com> Message-ID: Hi Tomasz On 23 May 2012 14:07, Tomasz Stanislawski wrote: > Hello everyone, > The patches add support for DMABUF exporting to the V4L2 stack. The latest > support for DMABUF importing was posted in [1]. The exporter part is dependent > on the DMA mapping redesign [2], which is not merged into the mainline. Therefore it > is posted as a separate patchset. Moreover, some patches depend on the vmap > extension for DMABUF by Dave Airlie [3] and the sg_alloc_table_from_pages function > [4]. > Do you have your own git?
Thanks Sangwook > Changelog: > v0: (RFC) > - updated setup of VIDIOC_EXPBUF ioctl > - doc updates > - introduced workaround to avoid using dma_get_pages, > - removed caching of exported dmabuf to avoid existence of circular reference > between dmabuf and vb2_dc_buf or resource leakage > - removed all 'change behaviour' patches > - initial support for exporting in s5p-mfc driver > - removal of vb2_mmap_pfn_range that is no longer used > - use sg_alloc_table_from_pages instead of creating sglist in vb2_dc code > - move attachment allocation to exporter's attach callback > > [1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/48730 > [2] http://thread.gmane.org/gmane.linux.kernel.cross-arch/14098 > [3] http://permalink.gmane.org/gmane.comp.video.dri.devel/69302 > [4] This patchset is rebased on 3.4-rc1 plus the following patchsets: > > Marek Szyprowski (1): > v4l: vb2-dma-contig: let mmap method to use dma_mmap_coherent call > > Tomasz Stanislawski (11): > v4l: add buffer exporting via dmabuf > v4l: vb2: add buffer exporting via dmabuf > v4l: vb2-dma-contig: add setup of sglist for MMAP buffers > v4l: vb2-dma-contig: add support for DMABUF exporting > v4l: vb2-dma-contig: add vmap/kmap for dmabuf exporting > v4l: s5p-fimc: support for dmabuf exporting > v4l: s5p-tv: mixer: support for dmabuf exporting > v4l: s5p-mfc: support for dmabuf exporting > v4l: vb2: remove vb2_mmap_pfn_range function > v4l: vb2-dma-contig: use sg_alloc_table_from_pages function > v4l: vb2-dma-contig: Move allocation of dbuf attachment to attach cb > > drivers/media/video/s5p-fimc/fimc-capture.c | 9 + > drivers/media/video/s5p-mfc/s5p_mfc_dec.c | 13 ++ > drivers/media/video/s5p-mfc/s5p_mfc_enc.c | 13 ++ > drivers/media/video/s5p-tv/mixer_video.c | 10 + > drivers/media/video/v4l2-compat-ioctl32.c | 1 + > drivers/media/video/v4l2-dev.c | 1 + > drivers/media/video/v4l2-ioctl.c | 6 + > drivers/media/video/videobuf2-core.c | 67 ++++++ >
drivers/media/video/videobuf2-dma-contig.c | 323 > ++++++++++++++++++++++----- > drivers/media/video/videobuf2-memops.c | 40 ---- > include/linux/videodev2.h | 26 +++ > include/media/v4l2-ioctl.h | 2 + > include/media/videobuf2-core.h | 2 + > include/media/videobuf2-memops.h | 5 - > 14 files changed, 411 insertions(+), 107 deletions(-) > > -- > 1.7.9.5 > > > _______________________________________________ > Linaro-mm-sig mailing list > Linaro-mm-sig at lists.linaro.org > http://lists.linaro.org/mailman/listinfo/linaro-mm-sig > From m.szyprowski at samsung.com Thu May 24 12:26:12 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Thu, 24 May 2012 14:26:12 +0200 Subject: [Linaro-mm-sig] [PATCHv2 3/4] mm: vmalloc: add VM_DMA flag to indicate areas used by dma-mapping framework In-Reply-To: <4FBB3B41.8010102@kernel.org> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-4-git-send-email-m.szyprowski@samsung.com> <4FBB3B41.8010102@kernel.org> Message-ID: <01e501cd39a8$67f34ea0$37d9ebe0$%szyprowski@samsung.com> Hi Minchan, On Tuesday, May 22, 2012 9:08 AM Minchan Kim wrote: > On 05/17/2012 07:54 PM, Marek Szyprowski wrote: > > > Add new type of vm_area intended to be used for consistent mappings > > created by dma-mapping framework.
> > > > Signed-off-by: Marek Szyprowski > > Reviewed-by: Kyungmin Park > > --- > > include/linux/vmalloc.h | 1 + > > mm/vmalloc.c | 3 +++ > > 2 files changed, 4 insertions(+) > > > > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h > > index 6071e91..8a9555a 100644 > > --- a/include/linux/vmalloc.h > > +++ b/include/linux/vmalloc.h > > @@ -14,6 +14,7 @@ struct vm_area_struct; /* vma defining user mapping in mm_types.h */ > > #define VM_USERMAP 0x00000008 /* suitable for remap_vmalloc_range */ > > #define VM_VPAGES 0x00000010 /* buffer for pages was vmalloc'ed */ > > #define VM_UNLIST 0x00000020 /* vm_struct is not listed in vmlist */ > > +#define VM_DMA 0x00000040 /* used by dma-mapping framework */ > > /* bits [20..32] reserved for arch specific ioremap internals */ > > > > > > /* > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index 8cb7f22..9c13bab 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2582,6 +2582,9 @@ static int s_show(struct seq_file *m, void *p) > > if (v->flags & VM_IOREMAP) > > seq_printf(m, " ioremap"); > > > > + if (v->flags & VM_DMA) > > + seq_printf(m, " dma"); > > + > > Hmm, VM_DMA would become generic flag? > AFAIU, maybe VM_DMA would be used only on ARM arch. Right now yes, it will be used only on the ARM architecture, but other architectures may start using it once it is available. > Of course, it isn't a performance sensitive part, but there is no reason to check it, either, > on architectures other than ARM. > > I suggest the following: > > #ifdef CONFIG_ARM > #define VM_DMA 0x00000040 > #else > #define VM_DMA 0x0 > #endif > > Maybe it could remove the check code at compile time. I've been told to avoid such #ifdef constructions if there is no really good reason for them. The only justification would be a significant impact on performance; otherwise it would be just a good example of typical over-engineering.
> > if (v->flags & VM_ALLOC) > > seq_printf(m, " vmalloc"); Best regards -- Marek Szyprowski Samsung Poland R&D Center From lethal at linux-sh.org Thu May 24 12:28:54 2012 From: lethal at linux-sh.org (Paul Mundt) Date: Thu, 24 May 2012 21:28:54 +0900 Subject: [Linaro-mm-sig] [PATCHv2 3/4] mm: vmalloc: add VM_DMA flag to indicate areas used by dma-mapping framework In-Reply-To: <01e501cd39a8$67f34ea0$37d9ebe0$%szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-4-git-send-email-m.szyprowski@samsung.com> <4FBB3B41.8010102@kernel.org> <01e501cd39a8$67f34ea0$37d9ebe0$%szyprowski@samsung.com> Message-ID: <20120524122854.GD11860@linux-sh.org> On Thu, May 24, 2012 at 02:26:12PM +0200, Marek Szyprowski wrote: > On Tuesday, May 22, 2012 9:08 AM Minchan Kim wrote: > > Hmm, VM_DMA would become generic flag? > > AFAIU, maybe VM_DMA would be used only on ARM arch. > > Right now yes, it will be used only on ARM architecture, but maybe other architecture will > start using it once it is available. > There's very little about the code in question that is ARM-specific to begin with. I plan to adopt similar changes on SH once the work has settled one way or the other, so we'll probably use the VMA flag there, too. 
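[Archive note] The compile-time argument in the thread above can be illustrated with a small userspace sketch. It is not kernel code: DEMO_ARCH_HAS_VM_DMA is a hypothetical stand-in for CONFIG_ARM, and describe_flags() only imitates the flag printing in s_show(). The point is that when VM_DMA is defined as 0, the test "flags & VM_DMA" is a constant false, so a compiler is free to drop the branch.

```c
#include <stdio.h>
#include <string.h>

/* Value as shown in the quoted include/linux/vmalloc.h hunk. */
#define VM_USERMAP 0x00000008

/* DEMO_ARCH_HAS_VM_DMA is a hypothetical stand-in for CONFIG_ARM. */
#ifdef DEMO_ARCH_HAS_VM_DMA
#define VM_DMA 0x00000040
#else
#define VM_DMA 0x0 /* "flags & 0" is constant false */
#endif

/*
 * Userspace imitation of the flag printing in s_show(): appends one
 * word per recognised flag to buf. len must be at least 1.
 */
static void describe_flags(unsigned long flags, char *buf, size_t len)
{
	buf[0] = '\0';
	if (flags & VM_USERMAP)
		strncat(buf, " user", len - strlen(buf) - 1);
	if (flags & VM_DMA) /* never taken when VM_DMA is 0 */
		strncat(buf, " dma", len - strlen(buf) - 1);
}
```

Built without DEMO_ARCH_HAS_VM_DMA, a flags word with bit 0x40 set is reported without " dma". Built with it, the check behaves as in the posted patch. Whether the saved branch justifies the #ifdef is exactly the judgment call debated above.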
From tom.cooksey at arm.com Thu May 24 16:21:48 2012 From: tom.cooksey at arm.com (Tom Cooksey) Date: Thu, 24 May 2012 17:21:48 +0100 Subject: [Linaro-mm-sig] New "xf86-video-armsoc" DDX driver In-Reply-To: <20120521090328.GA4970@phenom.ffwll.local> References: <4fba0034.e1d9440a.7f33.0bfeSMTPIN_ADDED@mx.google.com> <20120521090328.GA4970@phenom.ffwll.local> Message-ID: <000101cd39c9$51cf6b10$f56e4130$@cooksey@arm.com> > -----Original Message----- > From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel Vetter > Sent: 21 May 2012 10:04 > To: Dave Airlie > Cc: Tom Cooksey; linaro-mm-sig at lists.linaro.org; xorg-devel at lists.x.org; dri-devel at lists.freedesktop.org > Subject: Re: [Linaro-mm-sig] New "xf86-video-armsoc" DDX driver > > On Mon, May 21, 2012 at 09:55:06AM +0100, Dave Airlie wrote: > > > * Define a new x-server sub-module interface to allow a separate > > > > .so 2D driver to be loaded (this is the approach the current > > > > OMAP DDX uses). > > > > This seems the sanest. > > Or go the intel glamour route and stitch together a somewhat generic 2d > accel code on top of GL. That should give you reasonable (albeit likely > not stellar) X render performance. > -Daniel I'm not sure that would perform well on a tile-based deferred renderer like Mali. To perform well, we need to gather an entire frame's worth of rendering/draw-calls before passing them to the GPU to render. I believe this is not the typical use-case of EXA? How much of the framebuffer is re-drawn between flushes? Cheers, Tom From sumit.semwal at linaro.org Fri May 25 07:55:40 2012 From: sumit.semwal at linaro.org (Sumit Semwal) Date: Fri, 25 May 2012 13:25:40 +0530 Subject: [Linaro-mm-sig] [GIT PULL]: dma-buf updates for 3.5 Message-ID: Hi Linus, Here's the first signed-tag pull request for dma-buf framework. Could you please pull the dma-buf updates for 3.5?
This includes the following key items: - mmap support - vmap support - related documentation updates These are needed by various drivers to allow mmap/vmap of dma-buf shared buffers. Dave Airlie has some prime patches dependent on the vmap pull as well. Thanks and best regards, ~Sumit. The following changes since commit 76e10d158efb6d4516018846f60c2ab5501900bc: Linux 3.4 (2012-05-20 15:29:13 -0700) are available in the git repository at: ssh://sumitsemwal at git.linaro.org/~/public_git/linux-dma-buf.git tags/tag-for-linus-3.5 for you to fetch changes up to b25b086d23eb852bf3cfdeb60409b4967ebb3c0c: dma-buf: add initial vmap documentation (2012-05-25 12:51:11 +0530) ---------------------------------------------------------------- dma-buf updates for 3.5 ---------------------------------------------------------------- Daniel Vetter (1): dma-buf: mmap support Dave Airlie (2): dma-buf: add vmap interface dma-buf: add initial vmap documentation Sumit Semwal (1): dma-buf: minor documentation fixes. 
Documentation/dma-buf-sharing.txt | 109 ++++++++++++++++++++++++++++++++++--- drivers/base/dma-buf.c | 99 ++++++++++++++++++++++++++++++++- include/linux/dma-buf.h | 33 +++++++++++ 3 files changed, 233 insertions(+), 8 deletions(-) From airlied at gmail.com Fri May 25 09:04:17 2012 From: airlied at gmail.com (Dave Airlie) Date: Fri, 25 May 2012 10:04:17 +0100 Subject: [Linaro-mm-sig] [PATCH] dma-buf: fix disabled vmap function Message-ID: <1337936657-24837-1-git-send-email-airlied@gmail.com> From: Dave Airlie include/linux/dma-buf.h: In function 'dma_buf_vmap': include/linux/dma-buf.h:260:1: warning: no return statement in function returning non-void [-Wreturn-type] Reported-by: wfg at linux.intel.com Signed-off-by: Dave Airlie --- include/linux/dma-buf.h | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index d8c2865..506bb7b 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -257,6 +257,7 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, static inline void *dma_buf_vmap(struct dma_buf *dmabuf) { + return NULL; } static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr) -- 1.7.6 From sumit.semwal at linaro.org Fri May 25 09:17:10 2012 From: sumit.semwal at linaro.org (Sumit Semwal) Date: Fri, 25 May 2012 14:47:10 +0530 Subject: [Linaro-mm-sig] [GIT PULL]: dma-buf updates for 3.5 In-Reply-To: References: Message-ID: Hi Linus, On 25 May 2012 13:25, Sumit Semwal wrote: > Hi Linus, > > Here's the first signed-tag pull request for dma-buf framework. > > Could you please pull the dma-buf updates for 3.5? This includes the > following key items: > - mmap support > - vmap support > - related documentation updates > > These are needed by various drivers to allow mmap/vmap of dma-buf > shared buffers. Dave Airlie has some prime patches dependent on the > vmap pull as well. > > Thanks and best regards, > ~Sumit.
> > > The following changes since commit 76e10d158efb6d4516018846f60c2ab5501900bc: > > Linux 3.4 (2012-05-20 15:29:13 -0700) > > are available in the git repository at: > > ssh://sumitsemwal at git.linaro.org/~/public_git/linux-dma-buf.git > tags/tag-for-linus-3.5 I am really sorry - I goofed up in the git URL (sent the ssh URL instead). Could you please use git://git.linaro.org/people/sumitsemwal/linux-dma-buf.git tags/tag-for-linus-3.5 instead, or should I send a new pull request with the corrected URL? Thanks, and best regards, ~Sumit. > > for you to fetch changes up to b25b086d23eb852bf3cfdeb60409b4967ebb3c0c: > > dma-buf: add initial vmap documentation (2012-05-25 12:51:11 +0530) > > ---------------------------------------------------------------- > dma-buf updates for 3.5 > > ---------------------------------------------------------------- > Daniel Vetter (1): > dma-buf: mmap support > > Dave Airlie (2): > dma-buf: add vmap interface > dma-buf: add initial vmap documentation > > Sumit Semwal (1): > dma-buf: minor documentation fixes. > > Documentation/dma-buf-sharing.txt | 109 ++++++++++++++++++++++++++++++++++--- > drivers/base/dma-buf.c | 99 ++++++++++++++++++++++++++++++++- > include/linux/dma-buf.h |
33 +++++++++++ > 3 files changed, 233 insertions(+), 8 deletions(-) -- Thanks and regards, Sumit Semwal Linaro Kernel Engineer - Graphics working group Linaro.org | Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog From sumit.semwal at linaro.org Fri May 25 09:18:55 2012 From: sumit.semwal at linaro.org (Sumit Semwal) Date: Fri, 25 May 2012 14:48:55 +0530 Subject: [Linaro-mm-sig] [PATCH] dma-buf: fix disabled vmap function In-Reply-To: <1337936657-24837-1-git-send-email-airlied@gmail.com> References: <1337936657-24837-1-git-send-email-airlied@gmail.com> Message-ID: Hi Dave, On 25 May 2012 14:34, Dave Airlie wrote: > From: Dave Airlie > > include/linux/dma-buf.h: In function 'dma_buf_vmap': > include/linux/dma-buf.h:260:1: warning: no return statement in function returning non-void [-Wreturn-type] > > Reported-by: wfg at linux.intel.com > Signed-off-by: Dave Airlie > --- > include/linux/dma-buf.h | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index d8c2865..506bb7b 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -257,6 +257,7 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf, > > static inline void *dma_buf_vmap(struct dma_buf *dmabuf) > { > + return NULL; > } I fixed this as part of rebasing while applying your vmap patch to my for-next - so it is already in my pull request. Best regards, ~Sumit.
>
>  static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
> --
> 1.7.6
>
>
> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig at lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

From airlied at gmail.com Fri May 25 10:01:16 2012
From: airlied at gmail.com (Dave Airlie)
Date: Fri, 25 May 2012 11:01:16 +0100
Subject: [Linaro-mm-sig] [PATCH] dma-buf: fix disabled vmap function
In-Reply-To:
References: <1337936657-24837-1-git-send-email-airlied@gmail.com>
Message-ID:

On Fri, May 25, 2012 at 10:18 AM, Sumit Semwal wrote:
> Hi Dave,
>
> On 25 May 2012 14:34, Dave Airlie wrote:
>> From: Dave Airlie
>>
>> include/linux/dma-buf.h: In function ‘dma_buf_vmap’:
>> include/linux/dma-buf.h:260:1: warning: no return statement in function returning non-void [-Wreturn-type]
>>
>> Reported-by: wfg at linux.intel.com
>> Signed-off-by: Dave Airlie
>> ---
>>  include/linux/dma-buf.h |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>> index d8c2865..506bb7b 100644
>> --- a/include/linux/dma-buf.h
>> +++ b/include/linux/dma-buf.h
>> @@ -257,6 +257,7 @@ static inline void dma_buf_kunmap(struct dma_buf *dmabuf,
>>
>>  static inline void *dma_buf_vmap(struct dma_buf *dmabuf)
>>  {
>> +       return NULL;
>>  }
> I fixed this as part of rebasing while applying your vmap patch to my
> for-next - so it is already in my pull request.
>

Ah I didn't read the full warning mail, it was giving out about Tomasz
tree, he must have merged an older version.

Thanks,
Dave.
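For readers following this thread, the warning and the one-line fix can be approximated outside the kernel. The following is only a stand-alone sketch, not the kernel header: `struct dma_buf` is left opaque, the stubs are written as they might look when dma-buf support is compiled out, and `example_can_vmap()` is a hypothetical caller invented for illustration.

```c
#include <stddef.h>

struct dma_buf;	/* opaque in this sketch; the real type is defined by the kernel */

/*
 * Stub for a build without dma-buf support. The function is declared to
 * return void *, so it needs an explicit return statement; leaving it out
 * is what triggers -Wreturn-type.
 */
static inline void *dma_buf_vmap(struct dma_buf *dmabuf)
{
	(void)dmabuf;	/* unused in the stub */
	return NULL;	/* no kernel mapping is available */
}

/* A stub returning void needs no return statement at all. */
static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
{
	(void)dmabuf;
	(void)vaddr;
}

/* Hypothetical caller: treats a NULL mapping as "vmap not available". */
static int example_can_vmap(struct dma_buf *dmabuf)
{
	void *vaddr = dma_buf_vmap(dmabuf);

	if (!vaddr)
		return 0;
	dma_buf_vunmap(dmabuf, vaddr);
	return 1;
}
```

With the stub returning NULL, a caller such as the hypothetical `example_can_vmap()` can fall back gracefully instead of using an indeterminate pointer.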
From tom.cooksey at arm.com Fri May 25 11:08:29 2012
From: tom.cooksey at arm.com (Tom Cooksey)
Date: Fri, 25 May 2012 12:08:29 +0100
Subject: [Linaro-mm-sig] [RFC] Synchronizing access to buffers shared with dma-buf between drivers/devices
Message-ID: <000201cd3a66$b7182320$25486960$@cooksey@arm.com>

Hi All,

I realise it's been a while since this was last discussed, however I'd like to bring up kernel-side synchronization again. By kernel-side synchronization, I mean allowing multiple drivers/devices wanting to access the same buffer to do so without bouncing up to userspace to resolve dependencies such as "the display controller can't start scanning out a buffer until the GPU has finished rendering into it". As such, this is really just an optimization which reduces latency between, e.g., the GPU finishing a rendering job and that buffer being scanned out. I appreciate this particular example is already solved on desktop graphics cards as the display controller and 3D core are both controlled by the same driver, so no "generic" mechanism is needed. However on ARM SoCs, the 3D core (like an ARM Mali) and display controller tend to be driven by separate drivers, so some mechanism is needed to allow both drivers to synchronize their access to buffers.

There are multiple ways synchronization can be achieved; fences/sync objects are one common approach, however we're presenting a different approach. Personally, I quite like fence sync objects, however we believe it requires a lot of userspace interfaces to be changed to pass around sync object handles. Our hope is that the kds approach will require less effort to make use of as no existing userspace interfaces need to be changed. E.g. to use explicit fences, the struct drm_mode_crtc_page_flip would need new members to pass in the handle(s) of sync object(s) which the flip depends on (i.e. don't flip until these fences fire).
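To make the cost of the explicit-fence route concrete, here is a rough sketch of what such an extended flip request might look like. It is purely illustrative: the first five fields mirror the existing page-flip request as I understand it, while `num_fences`, `pad`, `fence_handles_ptr`, the struct name and `example_flip_ready()` are all hypothetical and not part of any real interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page-flip request extended with explicit fence handles. */
struct example_page_flip_with_fences {
	/* fields mirroring the existing flip request */
	uint32_t crtc_id;
	uint32_t fb_id;
	uint32_t flags;
	uint32_t reserved;
	uint64_t user_data;
	/* hypothetical additions needed for explicit fences */
	uint32_t num_fences;		/* number of handles in the array */
	uint32_t pad;			/* keeps the 64-bit field aligned */
	uint64_t fence_handles_ptr;	/* user pointer to an array of handles */
};

/*
 * "Don't flip until these fences fire": the flip may proceed only when
 * every fence it depends on has signalled. fence_signalled[i] is non-zero
 * once fence i has fired.
 */
static int example_flip_ready(const int *fence_signalled, size_t num_fences)
{
	size_t i;

	for (i = 0; i < num_fences; i++) {
		if (!fence_signalled[i])
			return 0;	/* at least one dependency is pending */
	}
	return 1;	/* all dependencies satisfied (or none given) */
}
```

Every userspace interface that queues work against a shared buffer would need a similar extension, which is the effort the kds approach hopes to avoid.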
The additional benefit of our approach is that it prevents userspace specifying dependency loops which can cause a deadlock (see kds.txt for an explanation of what I mean here).

I have waited until now to bring this up again because I am now able to share the code I was trying (and failing I think) to explain previously. The code has now been released under the GPLv2 from ARM Mali's developer portal, however I've attempted to turn that into a patch to allow it to be discussed on this list. Please find the patch inline below.

While KDS defines a very generic mechanism, I am proposing that this code or at least the concepts be merged with the existing dma_buf code, so the struct kds_resource members get moved to struct dma_buf, kds_* functions get renamed to dma_buf_* functions, etc. So I guess what I'm saying is please don't review the actual code just yet, only the concepts the code describes, where kds_resource == dma_buf.

Cheers,

Tom

Author: Tom Cooksey
Date: Fri May 25 10:45:27 2012 +0100

    Add new system to allow synchronizing access to resources

    See Documentation/kds.txt for details, however the general idea is that this
    kds framework synchronizes multiple drivers ("clients") wanting to access the
    same resources, where a resource is typically a 2D image buffer being shared
    around using dma-buf.

    Note: This patch is created by extracting the sources from the tarball on
    and putting them in roughly the right places.

diff --git a/Documentation/kds.txt b/Documentation/kds.txt
new file mode 100644
index 0000000..a96db21
--- /dev/null
+++ b/Documentation/kds.txt
@@ -0,0 +1,113 @@
+#
+# (C) COPYRIGHT 2012 ARM Limited. All rights reserved.
+#
+# This program is free software and is provided to you under the terms of the GNU General Public License version 2
+# as published by the Free Software Foundation, and any use by you of this program is subject to the terms of such GNU licence.
+#
+# A copy of the licence is included with the program, and can also be obtained from Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+#
+#
+
+
+==============================
+kds - Kernel Dependency System
+==============================
+
+Introduction
+------------
+kds provides a mechanism for clients to atomically lock down multiple abstract resources.
+This can be done either synchronously or asynchronously.
+Abstract resources is used to allow a set of clients to use kds to control access to any
+resource, an example is structured memory buffers.
+
+kds supports that buffer is locked for exclusive access and sharing of buffers.
+
+kds can be built as either a integrated feature of the kernel or as a module.
+It supports being compiled as a module both in-tree and out-of-tree.
+
+
+Concepts
+--------
+A core concept in kds is abstract resources.
+A kds resource is just an abstraction for some client object, kds doesn't care what it is.
+Typically EGL will consider UMP buffers as being a resource, thus each UMP buffer has
+a kds resource for synchronization to the buffer.
+
+kds allows a client to create and destroy the abstract resource objects.
+A new resource object is made available asap (it is just a simple malloc with some initializations),
+while destroy it requires some external synchronization.
+
+The other core concept in kds is consumer of resources.
+kds is requested to allow a client to consume a set of resources and the client will be notified when it can consume the resources.
+
+Exclusive access allows only one client to consume a resource.
+Shared access permits multiple consumers to acceess a resource concurrently.
+
+
+APIs
+----
+kds provides simple resource allocate and destroy functions.
+Clients use this to instantiate and control the lifetime of the resources kds manages.
+
+kds provides two ways to wait for resources:
+- Asynchronous wait: the client specifies a function pointer to be called when wait is over
+- Synchronous wait: Function blocks until access is gained.
+
+The synchronous API has a timeout for the wait.
+The call can early out if a signal is delivered.
+
+After a client is done consuming the resource kds must be notified to release the resources and let some other client take ownership.
+This is done via resource set release call.
+
+A Windows comparison:
+kds implements WaitForMultipleObjectsEx(..., bWaitAll = TRUE, ...) but also has an asynchronous version in addition.
+kds resources can be seen as being the same as NT object manager resources.
+
+Internals
+---------
+kds guarantees atomicity when a set of resources is operated on.
+This is implemented via a global resource lock which is taken by kds when it updates resource objects.
+
+Internally a resource in kds is a linked list head with some flags.
+
+When a consumer requests access to a set of resources it is queued on each of the resources.
+The link from the consumer to the resources can be triggered. Once all links are triggered
+the registered callback is called or the blocking function returns.
+A link is considered triggered if it is the first on the list of consumers of a resource,
+or if all the links ahead of it is marked as shared and itself is of the type shared.
+
+When the client is done consuming the consumer object is removed from the linked lists of
+the resources and a potential new consumer becomes the head of the resources.
+As we add and remove consumers atomically across all resources we can guarantee that
+we never introduces a A->B + B->A type of loops/deadlocks.
+
+
+kbase/base implementation
+-------------------------
+A HW job needs access to a set of shared resources.
+EGL tracks this and encodes the set along with the atom in the ringbuffer.
+EGL allocates a (k)base dep object to represent the dependency to the set of resources and encodes that along with the list of resources.
+This dep object is use to create a dependency from a job chain(atom) to the resources it needs to run.
+When kbase decodes the atom in the ringbuffer it finds the set of resources and calls kds to request all the needed resources.
+As EGL needs to know when the kds request is delivered a new base event object is needed: atom enqueued. This event is only delivered for atoms which uses kds.
+The callback kbase registers trigger the dependency object described which would trigger the existing JD system to release the job chain.
+When the atom is done kds resource set release is call to release the resources.
+
+EGL will typically use exclusive access to the render target, while all buffers used as input can be marked as shared.
+
+
+Buffer publish/vsync
+--------------------
+EGL will use a separate ioctl or DRM flip to request the flip.
+If the LCD driver is integrated with kds EGL can do these operations early.
+The LCD driver must then implement the ioctl or DRM flip to be asynchronous with kds async call.
+The LCD driver binds a kds resource to each virtual buffer (2 buffers in case of double-buffering).
+EGL will make a dependency to the target kds resource in the kbase atom.
+After EGL receives a atom enqueued event it can ask the LCD driver to pan to the target kds resource.
+When the atom is completed it'll release the resource and the LCD driver will get its callback.
+In the callback it'll load the target buffer into the DMA unit of the LCD hardware.
+The LCD driver will be the consumer of both buffers for a short period.
+The LCD driver will call kds resource set release on the previous on-screen buffer when the next vsync/dma read end is handled.
+
+
diff --git a/drivers/misc/kds.c b/drivers/misc/kds.c
new file mode 100644
index 0000000..8d7d55e
--- /dev/null
+++ b/drivers/misc/kds.c
@@ -0,0 +1,461 @@
+/*
+ *
+ * (C) COPYRIGHT 2012 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation, and any use by you of this program is subject to the terms of such GNU licence.
+ *
+ * A copy of the licence is included with the program, and can also be obtained from Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ */
+
+
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+
+#define KDS_LINK_TRIGGERED (1u << 0)
+#define KDS_LINK_EXCLUSIVE (1u << 1)
+
+#define KDS_IGNORED NULL
+#define KDS_INVALID (void*)-2
+#define KDS_RESOURCE (void*)-1
+
+struct kds_resource_set
+{
+	unsigned long num_resources;
+	unsigned long pending;
+	unsigned long locked_resources;
+	struct kds_callback * cb;
+	void * callback_parameter;
+	void * callback_extra_parameter;
+	struct list_head callback_link;
+	struct work_struct callback_work;
+	struct kds_link resources[0];
+};
+
+static DEFINE_MUTEX(kds_lock);
+
+int kds_callback_init(struct kds_callback * cb, int direct, kds_callback_fn user_cb)
+{
+	int ret = 0;
+
+	cb->direct = direct;
+	cb->user_cb = user_cb;
+
+	if (!direct)
+	{
+		cb->wq = alloc_workqueue("kds", WQ_UNBOUND | WQ_HIGHPRI, WQ_UNBOUND_MAX_ACTIVE);
+		if (!cb->wq)
+			ret = -ENOMEM;
+	}
+	else
+	{
+		cb->wq = NULL;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(kds_callback_init);
+
+void kds_callback_term(struct kds_callback * cb)
+{
+	if (!cb->direct)
+	{
+		BUG_ON(!cb->wq);
+		destroy_workqueue(cb->wq);
+	}
+	else
+	{
+		BUG_ON(cb->wq);
+	}
+}
+
+EXPORT_SYMBOL(kds_callback_term);
+
+static void kds_do_user_callback(struct kds_resource_set * rset)
+{
+	rset->cb->user_cb(rset->callback_parameter, rset->callback_extra_parameter);
+}
+
+static void kds_queued_callback(struct work_struct * work)
+{
+	struct kds_resource_set * rset;
+	rset = container_of( work, struct kds_resource_set, callback_work);
+
+	kds_do_user_callback(rset);
+}
+
+static void kds_callback_perform(struct kds_resource_set * rset)
+{
+	if (rset->cb->direct)
+		kds_do_user_callback(rset);
+	else
+	{
+		int result;
+		result = queue_work(rset->cb->wq, &rset->callback_work);
+		/* if we got a 0 return it means we've triggered the same rset twice! */
+		BUG_ON(!result);
+	}
+}
+
+void kds_resource_init(struct kds_resource * res)
+{
+	BUG_ON(!res);
+	INIT_LIST_HEAD(&res->waiters.link);
+	res->waiters.parent = KDS_RESOURCE;
+}
+EXPORT_SYMBOL(kds_resource_init);
+
+void kds_resource_term(struct kds_resource * res)
+{
+	BUG_ON(!res);
+	BUG_ON(!list_empty(&res->waiters.link));
+	res->waiters.parent = KDS_INVALID;
+}
+EXPORT_SYMBOL(kds_resource_term);
+
+int kds_async_waitall(
+		struct kds_resource_set ** pprset,
+		unsigned long flags,
+		struct kds_callback * cb,
+		void * callback_parameter,
+		void * callback_extra_parameter,
+		int number_resources,
+		unsigned long * exclusive_access_bitmap,
+		struct kds_resource ** resource_list)
+{
+	struct kds_resource_set * rset = NULL;
+	int i;
+	int triggered;
+	int err = -EFAULT;
+
+	BUG_ON(!pprset);
+	BUG_ON(!resource_list);
+	BUG_ON(!cb);
+
+	mutex_lock(&kds_lock);
+
+	if ((flags & KDS_FLAG_LOCKED_ACTION) == KDS_FLAG_LOCKED_FAIL)
+	{
+		for (i = 0; i < number_resources; i++)
+		{
+			if (resource_list[i]->lock_count)
+			{
+				err = -EBUSY;
+				goto errout;
+			}
+		}
+	}
+
+	rset = kmalloc(sizeof(*rset) + number_resources * sizeof(struct kds_link), GFP_KERNEL);
+	if (!rset)
+	{
+		err = -ENOMEM;
+		goto errout;
+	}
+
+	rset->num_resources = number_resources;
+	rset->pending = number_resources;
+	rset->locked_resources = 0;
+	rset->cb = cb;
+	rset->callback_parameter = callback_parameter;
+	rset->callback_extra_parameter = callback_extra_parameter;
+	INIT_LIST_HEAD(&rset->callback_link);
+	INIT_WORK(&rset->callback_work, kds_queued_callback);
+
+	for (i = 0; i < number_resources; i++)
+	{
+		unsigned long link_state = 0;
+
+		INIT_LIST_HEAD(&rset->resources[i].link);
+		rset->resources[i].parent = rset;
+
+		if (test_bit(i, exclusive_access_bitmap))
+		{
+			link_state |= KDS_LINK_EXCLUSIVE;
+		}
+
+		/* no-one else waiting? */
+		if (list_empty(&resource_list[i]->waiters.link))
+		{
+			link_state |= KDS_LINK_TRIGGERED;
+			rset->pending--;
+		}
+		/* Adding a non-exclusive and the current tail is a triggered non-exclusive? */
+		else if (((link_state & KDS_LINK_EXCLUSIVE) == 0) &&
+			(((list_entry(resource_list[i]->waiters.link.prev, struct kds_link, link)->state & (KDS_LINK_EXCLUSIVE | KDS_LINK_TRIGGERED)) == KDS_LINK_TRIGGERED)))
+		{
+			link_state |= KDS_LINK_TRIGGERED;
+			rset->pending--;
+		}
+		/* locked & ignore locked? */
+		else if ((resource_list[i]->lock_count) && ((flags & KDS_FLAG_LOCKED_ACTION) == KDS_FLAG_LOCKED_IGNORE) )
+		{
+			link_state |= KDS_LINK_TRIGGERED;
+			rset->pending--;
+			rset->resources[i].parent = KDS_IGNORED; /* to disable decrementing the pending count when we get the ignored resource */
+		}
+		rset->resources[i].state = link_state;
+		list_add_tail(&rset->resources[i].link, &resource_list[i]->waiters.link);
+	}
+
+	triggered = (rset->pending == 0);
+
+	mutex_unlock(&kds_lock);
+
+	/* set the pointer before the callback is called so it sees it */
+	*pprset = rset;
+
+	if (triggered)
+	{
+		/* all resources obtained, trigger callback */
+		kds_callback_perform(rset);
+	}
+
+	return 0;
+
+errout:
+	mutex_unlock(&kds_lock);
+	return err;
+}
+EXPORT_SYMBOL(kds_async_waitall);
+
+static void wake_up_sync_call(void * callback_parameter, void * callback_extra_parameter)
+{
+	wait_queue_head_t * wait = (wait_queue_head_t*)callback_parameter;
+	wake_up(wait);
+}
+
+static struct kds_callback sync_cb =
+{
+	wake_up_sync_call,
+	1,
+	NULL,
+};
+
+struct kds_resource_set * kds_waitall(
+		int number_resources,
+		unsigned long * exclusive_access_bitmap,
+		struct kds_resource ** resource_list,
+		unsigned long jiffies_timeout)
+{
+	struct kds_resource_set * rset;
+	int i;
+	int triggered = 0;
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
+
+	rset = kmalloc(sizeof(*rset) + number_resources * sizeof(struct kds_link), GFP_KERNEL);
+	if (!rset)
+		return rset;
+
+	rset->num_resources = number_resources;
+	rset->pending = number_resources;
+	rset->locked_resources = 1;
+	INIT_LIST_HEAD(&rset->callback_link);
+	INIT_WORK(&rset->callback_work, kds_queued_callback);
+
+	mutex_lock(&kds_lock);
+
+	for (i = 0; i < number_resources; i++)
+	{
+		unsigned long link_state = 0;
+
+		if (likely(resource_list[i]->lock_count < ULONG_MAX))
+			resource_list[i]->lock_count++;
+		else
+			break;
+
+		if (test_bit(i, exclusive_access_bitmap))
+		{
+			link_state |= KDS_LINK_EXCLUSIVE;
+		}
+
+		if (list_empty(&resource_list[i]->waiters.link))
+		{
+			link_state |= KDS_LINK_TRIGGERED;
+			rset->pending--;
+		}
+		/* Adding a non-exclusive and the current tail is a triggered non-exclusive? */
+		else if (((link_state & KDS_LINK_EXCLUSIVE) == 0) &&
+			(((list_entry(resource_list[i]->waiters.link.prev, struct kds_link, link)->state & (KDS_LINK_EXCLUSIVE | KDS_LINK_TRIGGERED)) == KDS_LINK_TRIGGERED)))
+		{
+			link_state |= KDS_LINK_TRIGGERED;
+			rset->pending--;
+		}
+
+		INIT_LIST_HEAD(&rset->resources[i].link);
+		rset->resources[i].parent = rset;
+		rset->resources[i].state = link_state;
+		list_add_tail(&rset->resources[i].link, &resource_list[i]->waiters.link);
+	}
+
+	if (i < number_resources)
+	{
+		/* an overflow was detected, roll back */
+		while (i--)
+		{
+			list_del(&rset->resources[i].link);
+			resource_list[i]->lock_count--;
+		}
+		mutex_unlock(&kds_lock);
+		kfree(rset);
+		return ERR_PTR(-EFAULT);
+	}
+
+	if (rset->pending == 0)
+		triggered = 1;
+	else
+	{
+		rset->cb = &sync_cb;
+		rset->callback_parameter = &wake;
+		rset->callback_extra_parameter = NULL;
+	}
+
+	mutex_unlock(&kds_lock);
+
+	if (!triggered)
+	{
+		long wait_res;
+		if ( KDS_WAIT_BLOCKING == jiffies_timeout )
+		{
+			wait_res = wait_event_interruptible(wake, rset->pending == 0);
+		}
+		else
+		{
+			wait_res = wait_event_interruptible_timeout(wake, rset->pending == 0, jiffies_timeout);
+		}
+		if ((wait_res == -ERESTARTSYS) || (wait_res == 0))
+		{
+			/* use \a kds_resource_set_release to roll back */
+			kds_resource_set_release(&rset);
+			return ERR_PTR(wait_res);
+		}
+	}
+	return rset;
+}
+EXPORT_SYMBOL(kds_waitall);
+
+void kds_resource_set_release(struct kds_resource_set ** pprset)
+{
+	struct list_head triggered = LIST_HEAD_INIT(triggered);
+	struct kds_resource_set * rset;
+	struct kds_resource_set * it;
+	int i;
+
+	BUG_ON(!pprset);
+
+	mutex_lock(&kds_lock);
+
+	rset = *pprset;
+	if (!rset)
+	{
+		/* caught a race between a cancelation
+		 * and a completion, nothing to do */
+		mutex_unlock(&kds_lock);
+		return;
+	}
+
+	/* clear user pointer so we'll be the only
+	 * thread handling the release */
+	*pprset = NULL;
+
+	for (i = 0; i < rset->num_resources; i++)
+	{
+		struct kds_resource * resource;
+		struct kds_link * it = NULL;
+
+		/* fetch the previous entry on the linked list */
+		it = list_entry(rset->resources[i].link.prev, struct kds_link, link);
+		/* unlink ourself */
+		list_del(&rset->resources[i].link);
+
+		/* any waiters? */
+		if (list_empty(&it->link))
+			continue;
+
+		/* were we the head of the list? (head if prev is a resource) */
+		if (it->parent != KDS_RESOURCE)
+			continue;
+
+		/* we were the head, find the kds_resource */
+		resource = container_of(it, struct kds_resource, waiters);
+
+		if (rset->locked_resources)
+		{
+			resource->lock_count--;
+		}
+
+		/* we know there is someone waiting from the any-waiters test above */
+
+		/* find the head of the waiting list */
+		it = list_first_entry(&resource->waiters.link, struct kds_link, link);
+
+		/* new exclusive owner? */
+		if (it->state & KDS_LINK_EXCLUSIVE)
+		{
+			/* link now triggered */
+			it->state |= KDS_LINK_TRIGGERED;
+			/* a parent to update? */
+			if (it->parent != KDS_IGNORED)
+			{
+				if (0 == --it->parent->pending)
+				{
+					/* new owner now triggered, track for callback later */
+					list_add(&it->parent->callback_link, &triggered);
+				}
+			}
+		}
+		/* exclusive releasing ? */
+		else if (rset->resources[i].state & KDS_LINK_EXCLUSIVE)
+		{
+			/* trigger non-exclusive until end-of-list or first exclusive */
+			list_for_each_entry(it, &resource->waiters.link, link)
+			{
+				/* exclusive found, stop triggering */
+				if (it->state & KDS_LINK_EXCLUSIVE)
+					break;
+
+				it->state |= KDS_LINK_TRIGGERED;
+				/* a parent to update? */
+				if (it->parent != KDS_IGNORED)
+				{
+					if (0 == --it->parent->pending)
+					{
+						/* new owner now triggered, track for callback later */
+						list_add(&it->parent->callback_link, &triggered);
+					}
+				}
+			}
+		}
+
+	}
+
+	mutex_unlock(&kds_lock);
+
+	while (!list_empty(&triggered))
+	{
+		it = list_first_entry(&triggered, struct kds_resource_set, callback_link);
+		list_del(&it->callback_link);
+		kds_callback_perform(it);
+	}
+
+	cancel_work_sync(&rset->callback_work);
+
+	/* free the resource set */
+	kfree(rset);
+}
+EXPORT_SYMBOL(kds_resource_set_release);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("ARM Ltd.");
+MODULE_VERSION("1.0");
diff --git a/include/linux/kds.h b/include/linux/kds.h
new file mode 100644
index 0000000..65e5706
--- /dev/null
+++ b/include/linux/kds.h
@@ -0,0 +1,154 @@
+/*
+ *
+ * (C) COPYRIGHT 2012 ARM Limited. All rights reserved.
+ *
+ * This program is free software and is provided to you under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation, and any use by you of this program is subject to the terms of such GNU licence.
+ *
+ * A copy of the licence is included with the program, and can also be obtained from Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ */
+
+
+
+#ifndef _KDS_H_
+#define _KDS_H_
+
+#include
+#include
+
+#define KDS_WAIT_BLOCKING (ULONG_MAX)
+
+/* what to do when waitall must wait for a synchronous locked resource: */
+#define KDS_FLAG_LOCKED_FAIL (0u << 0) /* fail waitall */
+#define KDS_FLAG_LOCKED_IGNORE (1u << 0) /* don't wait, but block other that waits */
+#define KDS_FLAG_LOCKED_WAIT (2u << 0) /* wait (normal */
+#define KDS_FLAG_LOCKED_ACTION (3u << 0) /* mask to extract the action to do on locked resources */
+
+struct kds_resource_set;
+
+typedef void (*kds_callback_fn) (void * callback_parameter, void * callback_extra_parameter);
+
+struct kds_callback
+{
+	kds_callback_fn user_cb; /* real cb */
+	int direct; /* do direct or queued call? */
+	struct workqueue_struct * wq;
+};
+
+struct kds_link
+{
+	struct kds_resource_set * parent;
+	struct list_head link;
+	unsigned long state;
+};
+
+struct kds_resource
+{
+	struct kds_link waiters;
+	unsigned long lock_count;
+};
+
+/* callback API */
+
+/* Initialize a callback object.
+ *
+ * Typically created per context or per hw resource.
+ *
+ * Callbacks can be performed directly if no nested locking can
+ * happen in the client.
+ *
+ * Nested locking can occur when a lock is held during the kds_async_waitall or
+ * kds_resource_set_release call. If the callback needs to take the same lock
+ * nested locking will happen.
+ *
+ * If nested locking could happen non-direct callbacks can be requested.
+ * Callbacks will then be called asynchronous to the triggering call.
+ */
+int kds_callback_init(struct kds_callback * cb, int direct, kds_callback_fn user_cb);
+
+/* Terminate the use of a callback object.
+ *
+ * If the callback object was set up as non-direct
+ * any pending callbacks will be flushed first.
+ * Note that to avoid a deadlock the lock callbacks needs
+ * can't be held when a callback object is terminated.
+ */
+void kds_callback_term(struct kds_callback * cb);
+
+
+/* resource object API */
+
+/* initialize a resource handle for a shared resource */
+void kds_resource_init(struct kds_resource * resource);
+
+/*
+ * Will assert if the resource is being used or waited on.
+ * The caller should NOT try to terminate a resource that could still have clients.
+ * After the function returns the resource is no longer known by kds.
+ */
+void kds_resource_term(struct kds_resource * resource);
+
+/* Asynchronous wait for a set of resources.
+ * Callback will be called when all resources are available.
+ * If all the resources was available the callback will be called before kds_async_waitall returns.
+ * So one must not hold any locks the callback code-flow can take when calling kds_async_waitall.
+ * Caller considered to own/use the resources until \a kds_rset_release is called.
+ * flags is one or more of the KDS_FLAG_* set.
+ * exclusive_access_bitmap is a bitmap where a high bit means exclusive access while a low bit means shared access.
+ * Use the Linux __set_bit API, where the index of the buffer to control is used as the bit index.
+ *
+ * Standard Linux error return value.
+ */
+int kds_async_waitall(
+		struct kds_resource_set ** pprset,
+		unsigned long flags,
+		struct kds_callback * cb,
+		void * callback_parameter,
+		void * callback_extra_parameter,
+		int number_resources,
+		unsigned long * exclusive_access_bitmap,
+		struct kds_resource ** resource_list);
+
+/* Synchronous wait for a set of resources.
+ * Function will return when one of these have happened:
+ * - all resources have been obtained
+ * - timeout lapsed while waiting
+ * - a signal was received while waiting
+ *
+ * Caller considered to own/use the resources when the function returns.
+ * Caller must release the resources using \a kds_rset_release.
+ *
+ * Calling this function while holding already locked resources or other locking primitives is dangerous.
+ * One must if this is needed decide on a lock order of the resources and/or the other locking primitives
+ * and always take the resources/locking primitives in the specific order.
+ *
+ * Use the ERR_PTR framework to decode the return value.
+ * NULL = time out
+ * If IS_ERR then PTR_ERR gives:
+ * ERESTARTSYS = signal received, retry call after signal
+ * all other values = internal error, lock failed
+ * Other values = successful wait, now the owner, must call kds_resource_set_release
+ */
+struct kds_resource_set * kds_waitall(
+		int number_resources,
+		unsigned long * exclusive_access_bitmap,
+		struct kds_resource ** resource_list,
+		unsigned long jifies_timeout);
+
+/* Release resources after use.
+ * Caller must handle that other async callbacks will trigger,
+ * so must avoid holding any locks a callback will take.
+ *
+ * The function takes a pointer to your poiner to handle a race
+ * between a cancelation and a completion.
+ *
+ * If the caller can't guarantee that a race can't occur then
+ * the passed in pointer must be the same in both call paths
+ * to allow kds to manage the potential race.
+ */
+void kds_resource_set_release(struct kds_resource_set ** pprset);
+
+#endif /* _KDS_H_ */
+

From arnd at arndb.de Sat May 26 07:48:29 2012
From: arnd at arndb.de (Arnd Bergmann)
Date: Sat, 26 May 2012 07:48:29 +0000
Subject: [Linaro-mm-sig] [GIT PULL]: dma-buf updates for 3.5
In-Reply-To:
References:
Message-ID: <201205260748.29800.arnd@arndb.de>

On Friday 25 May 2012, Linus Torvalds wrote:
> Please do something like
>
> gpg --keyserver pgp.mit.edu --send-keys 7126925D
>
> to actually make your public key public.
>
> Of course, if it isn't public, I assume it hasn't actually been signed
> by anybody, which makes it largely useless. But any future signing
> action will validate the pre-signing uses of the key, so that's
> fixable.

We have a Linaro meeting this week, so we will make sure to get this key
and other relevant ones signed by everyone who is here and in the kernel
keyring.

	Arnd

From sumit.semwal at linaro.org Sat May 26 14:17:54 2012
From: sumit.semwal at linaro.org (Sumit Semwal)
Date: Sat, 26 May 2012 19:47:54 +0530
Subject: [Linaro-mm-sig] [GIT PULL]: dma-buf updates for 3.5
In-Reply-To:
References:
Message-ID:

Hi Linus,

On 25 May 2012 22:14, Linus Torvalds wrote:
> On Fri, May 25, 2012 at 2:17 AM, Sumit Semwal wrote:
>>
>> I am really sorry - I goofed up in the git URL (sent the ssh URL
>> instead).
>
> I was going to send you an acerbic email asking for your private ssh
> key, but then noticed that you had sent another email with the public
> version of the git tree..
Well, it was stupid indeed - learning for me; won't happen again.
>
>> Could you please use
>>
>> git://git.linaro.org/people/sumitsemwal/linux-dma-buf.git tags/tag-for-linus-3.5
>>
>> instead, or should I send a new pull request with the corrected URL?
>
> Done. However, while your tag seems to be signed, your key is not
> available publicly:
>
>   [torvalds at i5 ~]$ gpg --recv-key 7126925D
>   gpg: requesting key 7126925D from hkp server pgp.mit.edu
>   gpgkeys: key 7126925D not found on keyserver
>
> so I can't check if it is signed by anybody.
>
> Please do something like
>
>   gpg --keyserver pgp.mit.edu --send-keys 7126925D
>
> to actually make your public key public.
Thanks; it is done.
>
> Of course, if it isn't public, I assume it hasn't actually been signed
> by anybody, which makes it largely useless. But any future signing
> action will validate the pre-signing uses of the key, so that's
> fixable.
Like Arnd has mentioned, we would do a key signing party here at the
Linaro meeting, and make sure that relevant ones are signed.
>
>                     Linus

--
Thanks and best regards,
~Sumit

From torvalds at linux-foundation.org Fri May 25 16:44:11 2012
From: torvalds at linux-foundation.org (Linus Torvalds)
Date: Fri, 25 May 2012 09:44:11 -0700
Subject: [Linaro-mm-sig] [GIT PULL]: dma-buf updates for 3.5
In-Reply-To:
References:
Message-ID:

On Fri, May 25, 2012 at 2:17 AM, Sumit Semwal wrote:
>
> I am really sorry - I goofed up in the git URL (sent the ssh URL
> instead).

I was going to send you an acerbic email asking for your private ssh
key, but then noticed that you had sent another email with the public
version of the git tree..

> Could you please use
>
> git://git.linaro.org/people/sumitsemwal/linux-dma-buf.git tags/tag-for-linus-3.5
>
> instead, or should I send a new pull request with the corrected URL?

Done. However, while your tag seems to be signed, your key is not
available publicly:

  [torvalds at i5 ~]$ gpg --recv-key 7126925D
  gpg: requesting key 7126925D from hkp server pgp.mit.edu
  gpgkeys: key 7126925D not found on keyserver

so I can't check if it is signed by anybody.

Please do something like

  gpg --keyserver pgp.mit.edu --send-keys 7126925D

to actually make your public key public.

Of course, if it isn't public, I assume it hasn't actually been signed
by anybody, which makes it largely useless. But any future signing
action will validate the pre-signing uses of the key, so that's
fixable.

                    Linus

From kosaki.motohiro at gmail.com Sun May 27 12:35:23 2012
From: kosaki.motohiro at gmail.com (KOSAKI Motohiro)
Date: Sun, 27 May 2012 08:35:23 -0400
Subject: [Linaro-mm-sig] [PATCHv2 3/4] mm: vmalloc: add VM_DMA flag to indicate areas used by dma-mapping framework
In-Reply-To: <20120524122854.GD11860@linux-sh.org>
References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-4-git-send-email-m.szyprowski@samsung.com> <4FBB3B41.8010102@kernel.org> <01e501cd39a8$67f34ea0$37d9ebe0$%szyprowski@samsung.com> <20120524122854.GD11860@linux-sh.org>
Message-ID:

On Thu, May 24, 2012 at 8:28 AM, Paul Mundt wrote:
> On Thu, May 24, 2012 at 02:26:12PM +0200, Marek Szyprowski wrote:
>> On Tuesday, May 22, 2012 9:08 AM Minchan Kim wrote:
>> > Hmm, VM_DMA would become generic flag?
>> > AFAIU, maybe VM_DMA would be used only on ARM arch.
>>
>> Right now yes, it will be used only on ARM architecture, but maybe other architecture will
>> start using it once it is available.
>>
> There's very little about the code in question that is ARM-specific to
> begin with. I plan to adopt similar changes on SH once the work has
> settled one way or the other, so we'll probably use the VMA flag there,
> too.

I don't think VM_DMA is good idea because x86_64 has two dma zones. x86 unaware
patches make no sense.
From m.szyprowski at samsung.com Mon May 28 08:19:39 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Mon, 28 May 2012 10:19:39 +0200 Subject: [Linaro-mm-sig] [PATCHv2 3/4] mm: vmalloc: add VM_DMA flag to indicate areas used by dma-mapping framework In-Reply-To: References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-4-git-send-email-m.szyprowski@samsung.com> <4FBB3B41.8010102@kernel.org> <01e501cd39a8$67f34ea0$37d9ebe0$%szyprowski@samsung.com> <20120524122854.GD11860@linux-sh.org> Message-ID: <001d01cd3caa$a05d0510$e1170f30$%szyprowski@samsung.com> Hello, On Sunday, May 27, 2012 2:35 PM KOSAKI Motohiro wrote: > On Thu, May 24, 2012 at 8:28 AM, Paul Mundt wrote: > > On Thu, May 24, 2012 at 02:26:12PM +0200, Marek Szyprowski wrote: > >> On Tuesday, May 22, 2012 9:08 AM Minchan Kim wrote: > >> > Hmm, VM_DMA would become generic flag? > >> > AFAIU, maybe VM_DMA would be used only on ARM arch. > >> > >> Right now yes, it will be used only on ARM architecture, but maybe other architecture will > >> start using it once it is available. > >> > > There's very little about the code in question that is ARM-specific to > > begin with. I plan to adopt similar changes on SH once the work has > > settled one way or the other, so we'll probably use the VMA flag there, > > too. > > I don't think VM_DMA is good idea because x86_64 has two dma zones. x86 unaware > patches make no sense. I see no problems to add VM_DMA64 later if x86_64 starts using vmalloc areas for creating kernel mappings for the dma buffers (I assume that there are 2 dma zones: one 32bit and one 64bit). Right now x86 and x86_64 don't use vmalloc areas for dma buffers, so I hardly see how this patch can be considered as 'x86 unaware'. 
Best regards -- Marek Szyprowski Samsung Poland R&D Center From laurent.pinchart at ideasonboard.com Mon May 28 21:12:06 2012 From: laurent.pinchart at ideasonboard.com (Laurent Pinchart) Date: Mon, 28 May 2012 23:12:06 +0200 Subject: [Linaro-mm-sig] [PATCHv6 01/13] v4l: Add DMABUF as a memory type In-Reply-To: <1337775027-9489-2-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> <1337775027-9489-2-git-send-email-t.stanislaws@samsung.com> Message-ID: <39362499.rezgOX57D6@avalon> Hi Tomasz, Thanks for the patch. On Wednesday 23 May 2012 14:10:15 Tomasz Stanislawski wrote: > From: Sumit Semwal > > Adds DMABUF memory type to v4l framework. Also adds the related file > descriptor in v4l2_plane and v4l2_buffer. Sorry not to have caught this earlier, but haven't you forgotten to add support for V4L2_MEMORY_DMABUF to v4l2-compat-ioctl32.c ? > Signed-off-by: Tomasz Stanislawski > [original work in the PoC for buffer sharing] > Signed-off-by: Sumit Semwal > Signed-off-by: Sumit Semwal > Acked-by: Laurent Pinchart > --- > drivers/media/video/v4l2-ioctl.c | 1 + > include/linux/videodev2.h | 7 +++++++ > 2 files changed, 8 insertions(+) > > diff --git a/drivers/media/video/v4l2-ioctl.c > b/drivers/media/video/v4l2-ioctl.c index 91be4e8..31fc2ad 100644 > --- a/drivers/media/video/v4l2-ioctl.c > +++ b/drivers/media/video/v4l2-ioctl.c > @@ -175,6 +175,7 @@ static const char *v4l2_memory_names[] = { > [V4L2_MEMORY_MMAP] = "mmap", > [V4L2_MEMORY_USERPTR] = "userptr", > [V4L2_MEMORY_OVERLAY] = "overlay", > + [V4L2_MEMORY_DMABUF] = "dmabuf", > }; > > #define prt_names(a, arr) ((((a) >= 0) && ((a) < ARRAY_SIZE(arr))) ? 
\ > diff --git a/include/linux/videodev2.h b/include/linux/videodev2.h > index 370d111..51b20f4 100644 > --- a/include/linux/videodev2.h > +++ b/include/linux/videodev2.h > @@ -185,6 +185,7 @@ enum v4l2_memory { > V4L2_MEMORY_MMAP = 1, > V4L2_MEMORY_USERPTR = 2, > V4L2_MEMORY_OVERLAY = 3, > + V4L2_MEMORY_DMABUF = 4, > }; > > /* see also http://vektor.theorem.ca/graphics/ycbcr/ */ > @@ -591,6 +592,8 @@ struct v4l2_requestbuffers { > * should be passed to mmap() called on the video node) > * @userptr: when memory is V4L2_MEMORY_USERPTR, a userspace pointer > * pointing to this plane > + * @fd: when memory is V4L2_MEMORY_DMABUF, a userspace file > + * descriptor associated with this plane > * @data_offset: offset in the plane to the start of data; usually 0, > * unless there is a header in front of the data > * > @@ -605,6 +608,7 @@ struct v4l2_plane { > union { > __u32 mem_offset; > unsigned long userptr; > + int fd; > } m; > __u32 data_offset; > __u32 reserved[11]; > @@ -629,6 +633,8 @@ struct v4l2_plane { > * (or a "cookie" that should be passed to mmap() as offset) > * @userptr: for non-multiplanar buffers with memory == > V4L2_MEMORY_USERPTR; * a userspace pointer pointing to this buffer > + * @fd: for non-multiplanar buffers with memory == V4L2_MEMORY_DMABUF; > + * a userspace file descriptor associated with this buffer > * @planes: for multiplanar buffers; userspace pointer to the array of > plane * info structs for this buffer > * @length: size in bytes of the buffer (NOT its payload) for single- plane > @@ -655,6 +661,7 @@ struct v4l2_buffer { > __u32 offset; > unsigned long userptr; > struct v4l2_plane *planes; > + int fd; > } m; > __u32 length; > __u32 input; -- Regards, Laurent Pinchart From laurent.pinchart at ideasonboard.com Mon May 28 21:30:05 2012 From: laurent.pinchart at ideasonboard.com (Laurent Pinchart) Date: Mon, 28 May 2012 23:30:05 +0200 Subject: [Linaro-mm-sig] [PATCHv6 02/13] Documentation: media: description of DMABUF importing in V4L2 
In-Reply-To: <1337775027-9489-3-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> <1337775027-9489-3-git-send-email-t.stanislaws@samsung.com> Message-ID: <3552222.dzY4fiG81O@avalon> Hi Tomasz, Thanks for the patch. On Wednesday 23 May 2012 14:10:16 Tomasz Stanislawski wrote: > This patch adds description and usage examples for importing > DMABUF file descriptor in V4L2. > > Signed-off-by: Tomasz Stanislawski > Signed-off-by: Kyungmin Park > CC: linux-doc at vger.kernel.org [snip] > @@ -103,6 +105,7 @@ as the &v4l2-format; type > field. See memory > Applications set this field to > V4L2_MEMORY_MMAP or > +V4L2_MEMORY_DMABUF or > V4L2_MEMORY_USERPTR. See />. > If you resubmit to fix the compat-ioctl issue in 01/13, could you please replace this with Applications set this field to V4L2_MEMORY_MMAP, V4L2_MEMORY_DMABUF or V4L2_MEMORY_USERPTR. See . like in v5 ? -- Regards, Laurent Pinchart From laurent.pinchart at ideasonboard.com Mon May 28 22:25:03 2012 From: laurent.pinchart at ideasonboard.com (Laurent Pinchart) Date: Tue, 29 May 2012 00:25:03 +0200 Subject: [Linaro-mm-sig] [PATCHv6 00/13] Integration of videobuf2 with dmabuf In-Reply-To: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com> Message-ID: <5090892.Z3RkLXNQ1U@avalon> Hi Tomasz, On Wednesday 23 May 2012 14:10:14 Tomasz Stanislawski wrote: > Hello everyone, > This patchset adds support for DMABUF [2] importing to V4L2 stack. > The support for DMABUF exporting was moved to separate patchset > due to dependency on patches for DMA mapping redesign by > Marek Szyprowski [4]. Except for the small issue with patches 01/13 and 02/13, the set is ready for upstream as far as I'm concerned. > v6: > - fixed missing entry in v4l2_memory_names > - fixed a bug occuring after get_user_pages failure I've missed that one, what was it ? 
> - fixed a bug caused by using invalid vma for get_user_pages > - prepare/finish no longer call dma_sync for dmabuf buffers > > v5: > - removed change of importer/exporter behaviour > - fixes vb2_dc_pages_to_sgt basing on Laurent's hints > - changed pin/unpin words to lock/unlock in Doc > > v4: > - rebased on mainline 3.4-rc2 > - included missing importing support for s5p-fimc and s5p-tv > - added patch for changing map/unmap for importers > - fixes to Documentation part > - coding style fixes > - pairing {map/unmap}_dmabuf in vb2-core > - fixing variable types and semantic of arguments in videobufb2-dma-contig.c > > v3: > - rebased on mainline 3.4-rc1 > - split 'code refactor' patch to multiple smaller patches > - squashed fixes to Sumit's patches > - patchset is no longer dependant on 'DMA mapping redesign' > - separated path for handling IO and non-IO mappings > - add documentation for DMABUF importing to V4L > - removed all DMABUF exporter related code > - removed usage of dma_get_pages extension > > v2: > - extended VIDIOC_EXPBUF argument from integer memoffset to struct > v4l2_exportbuffer > - added patch that breaks DMABUF spec on (un)map_atachment callcacks but > allows to work with existing implementation of DMABUF prime in DRM > - all dma-contig code refactoring patches were squashed > - bugfixes > > v1: List of changes since [1]. > - support for DMA api extension dma_get_pages, the function is used to > retrieve pages used to create DMA mapping. > - small fixes/code cleanup to videobuf2 > - added prepare and finish callbacks to vb2 allocators, it is used keep > consistency between dma-cpu acess to the memory (by Marek Szyprowski) > - support for exporting of DMABUF buffer in V4L2 and Videobuf2, originated > from [3]. 
> - support for dma-buf exporting in vb2-dma-contig allocator > - support for DMABUF for s5p-tv and s5p-fimc (capture interface) drivers, > originated from [3] > - changed handling for userptr buffers (by Marek Szyprowski, Andrzej > Pietrasiewicz) > - let mmap method to use dma_mmap_writecombine call (by Marek Szyprowski) > > [1] > http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/4296 > 6/focus=42968 [2] https://lkml.org/lkml/2011/12/26/29 > [3] > http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/3635 > 4/focus=36355 [4] > http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819 > > Laurent Pinchart (2): > v4l: vb2-dma-contig: Shorten vb2_dma_contig prefix to vb2_dc > v4l: vb2-dma-contig: Reorder functions > > Marek Szyprowski (2): > v4l: vb2: add prepare/finish callbacks to allocators > v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator > > Sumit Semwal (4): > v4l: Add DMABUF as a memory type > v4l: vb2: add support for shared buffer (dma_buf) > v4l: vb: remove warnings about MEMORY_DMABUF > v4l: vb2-dma-contig: add support for dma_buf importing > > Tomasz Stanislawski (5): > Documentation: media: description of DMABUF importing in V4L2 > v4l: vb2-dma-contig: Remove unneeded allocation context structure > v4l: vb2-dma-contig: add support for scatterlist in userptr mode > v4l: s5p-tv: mixer: support for dmabuf importing > v4l: s5p-fimc: support for dmabuf importing > > Documentation/DocBook/media/v4l/compat.xml | 4 + > Documentation/DocBook/media/v4l/io.xml | 179 +++++++ > .../DocBook/media/v4l/vidioc-create-bufs.xml | 1 + > Documentation/DocBook/media/v4l/vidioc-qbuf.xml | 15 + > Documentation/DocBook/media/v4l/vidioc-reqbufs.xml | 45 +- > drivers/media/video/s5p-fimc/Kconfig | 1 + > drivers/media/video/s5p-fimc/fimc-capture.c | 2 +- > drivers/media/video/s5p-tv/Kconfig | 1 + > drivers/media/video/s5p-tv/mixer_video.c | 2 +- > drivers/media/video/v4l2-ioctl.c | 1 + > drivers/media/video/videobuf-core.c | 4 
+ > drivers/media/video/videobuf2-core.c | 207 +++++++- > drivers/media/video/videobuf2-dma-contig.c | 520 ++++++++++++++--- > include/linux/videodev2.h | 7 + > include/media/videobuf2-core.h | 34 ++ > 15 files changed, 924 insertions(+), 99 deletions(-) -- Regards, Laurent Pinchart From linux at arm.linux.org.uk Tue May 29 12:29:32 2012 From: linux at arm.linux.org.uk (Russell King - ARM Linux) Date: Tue, 29 May 2012 13:29:32 +0100 Subject: [Linaro-mm-sig] [GIT PULL] CMA and ARM DMA-mapping updates for v3.5 In-Reply-To: <1337672417-10065-1-git-send-email-m.szyprowski@samsung.com> References: <1337672417-10065-1-git-send-email-m.szyprowski@samsung.com> Message-ID: <20120529122932.GA24623@n2100.arm.linux.org.uk> I notice we have new warnings as a result of CMA being merged, though thankfully they're just in Kconfig: warning: (ARM) selects CMA which has unmet direct dependencies (HAVE_DMA_CONTIGUOUS && HAVE_MEMBLOCK && EXPERIMENTAL) This seems totally weird: you're mandating that ARM must have CMA selected, but it's an experimental feature? So you're implying that the entire ARM kernel becomes totally experimental for the next release cycle? I think this needs fixing. 
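For readers following the thread, here is a minimal sketch of the two Kconfig entries that would produce a warning like the one above. The `select` line and the dependency expression both appear in this thread; the prompt string and the abbreviated layout are assumptions made for illustration, not a copy of the tree.

```
# arch/arm/Kconfig (abbreviated)
config ARM
	select CMA if (CPU_V6 || CPU_V6K || CPU_V7)

# CMA's own entry (prompt string is assumed)
config CMA
	bool "Contiguous Memory Allocator (EXPERIMENTAL)"
	depends on HAVE_DMA_CONTIGUOUS && HAVE_MEMBLOCK && EXPERIMENTAL
```

`select` forces the selected symbol on without evaluating that symbol's own `depends on` line, so a configuration with EXPERIMENTAL unset ends up with CMA enabled while its dependencies are unmet, and Kconfig can only warn about it.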
From m.szyprowski at samsung.com Tue May 29 14:50:45 2012
From: m.szyprowski at samsung.com (Marek Szyprowski)
Date: Tue, 29 May 2012 16:50:45 +0200
Subject: [Linaro-mm-sig] [GIT PULL] CMA and ARM DMA-mapping updates for v3.5
In-Reply-To: <20120529122932.GA24623@n2100.arm.linux.org.uk>
References: <1337672417-10065-1-git-send-email-m.szyprowski@samsung.com>
 <20120529122932.GA24623@n2100.arm.linux.org.uk>
Message-ID: <015e01cd3daa$6d910e50$48b32af0$%szyprowski@samsung.com>

Hello,

On Tuesday, May 29, 2012 2:30 PM Russell King - ARM Linux wrote:

> I notice we have new warnings as a result of CMA being merged, though
> thankfully they're just in Kconfig:
>
> warning: (ARM) selects CMA which has unmet direct dependencies (HAVE_DMA_CONTIGUOUS &&
> HAVE_MEMBLOCK && EXPERIMENTAL)
>
> This seems totally weird: you're mandating that ARM must have CMA
> selected, but it's an experimental feature? So you're implying that
> the entire ARM kernel becomes totally experimental for the next
> release cycle?
>
> I think this needs fixing.

No, that wasn't my intention. I will provide a patch which removes the
unconditional dependency on CMA - it will let one disable CMA and use the
old allocation method if needed, but this requires a few more changes in
the dma-mapping implementation. I wasn't aware of the consequences, and no
one has complained about this since v15 of the CMA patches (Aug 2011).
Best regards -- Marek Szyprowski Samsung Poland R&D Center From kosaki.motohiro at gmail.com Wed May 30 00:11:09 2012 From: kosaki.motohiro at gmail.com (KOSAKI Motohiro) Date: Tue, 29 May 2012 20:11:09 -0400 Subject: [Linaro-mm-sig] [PATCHv2 4/4] ARM: dma-mapping: remove custom consistent dma region In-Reply-To: <1337252085-22039-5-git-send-email-m.szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-5-git-send-email-m.szyprowski@samsung.com> Message-ID: <4FC5659D.6040805@gmail.com> > static void * > __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, > const void *caller) > { > - struct arm_vmregion *c; > - size_t align; > - int bit; > + struct vm_struct *area; > + unsigned long addr; > > - if (!consistent_pte) { > - printk(KERN_ERR "%s: not initialised\n", __func__); > + area = get_vm_area_caller(size, VM_DMA | VM_USERMAP, caller); In this patch, VM_DMA is only used here. So, is this no effect? From konrad.wilk at oracle.com Tue May 29 15:07:14 2012 From: konrad.wilk at oracle.com (Konrad Rzeszutek Wilk) Date: Tue, 29 May 2012 11:07:14 -0400 Subject: [Linaro-mm-sig] [PATCHv2 3/4] mm: vmalloc: add VM_DMA flag to indicate areas used by dma-mapping framework In-Reply-To: <001d01cd3caa$a05d0510$e1170f30$%szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-4-git-send-email-m.szyprowski@samsung.com> <4FBB3B41.8010102@kernel.org> <01e501cd39a8$67f34ea0$37d9ebe0$%szyprowski@samsung.com> <20120524122854.GD11860@linux-sh.org> <001d01cd3caa$a05d0510$e1170f30$%szyprowski@samsung.com> Message-ID: <20120529150714.GA8293@phenom.dumpdata.com> On Mon, May 28, 2012 at 10:19:39AM +0200, Marek Szyprowski wrote: > Hello, > > On Sunday, May 27, 2012 2:35 PM KOSAKI Motohiro wrote: > > > On Thu, May 24, 2012 at 8:28 AM, Paul Mundt wrote: > > > On Thu, May 24, 2012 at 02:26:12PM +0200, Marek Szyprowski wrote: > > >> On Tuesday, May 22, 2012 
9:08 AM Minchan Kim wrote: > > >> > Hmm, VM_DMA would become generic flag? > > >> > AFAIU, maybe VM_DMA would be used only on ARM arch. > > >> > > >> Right now yes, it will be used only on ARM architecture, but maybe other architecture will > > >> start using it once it is available. > > >> > > > There's very little about the code in question that is ARM-specific to > > > begin with. I plan to adopt similar changes on SH once the work has > > > settled one way or the other, so we'll probably use the VMA flag there, > > > too. > > > > I don't think VM_DMA is good idea because x86_64 has two dma zones. x86 unaware > > patches make no sense. > > I see no problems to add VM_DMA64 later if x86_64 starts using vmalloc areas for creating > kernel mappings for the dma buffers (I assume that there are 2 dma zones: one 32bit and one > 64bit). Right now x86 and x86_64 don't use vmalloc areas for dma buffers, so I hardly see > how this patch can be considered as 'x86 unaware'. Well they do - kind off. It is usually done by calling vmalloc_32 and then using the DMA API on top of those pages (or sometimes the non-portable virt_to_phys macro). Introducing this and replacing the vmalloc_32 with this seems like a nice step in making those device drivers APIs more portable? 
From m.szyprowski at samsung.com Wed May 30 07:15:53 2012 From: m.szyprowski at samsung.com (Marek Szyprowski) Date: Wed, 30 May 2012 09:15:53 +0200 Subject: [Linaro-mm-sig] [PATCHv2 4/4] ARM: dma-mapping: remove custom consistent dma region In-Reply-To: <4FC5659D.6040805@gmail.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-5-git-send-email-m.szyprowski@samsung.com> <4FC5659D.6040805@gmail.com> Message-ID: <019401cd3e34$0c6af4d0$2540de70$%szyprowski@samsung.com> Hello, On Wednesday, May 30, 2012 2:11 AM KOSAKI Motohiro wrote: > > static void * > > __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, > > const void *caller) > > { > > - struct arm_vmregion *c; > > - size_t align; > > - int bit; > > + struct vm_struct *area; > > + unsigned long addr; > > > > - if (!consistent_pte) { > > - printk(KERN_ERR "%s: not initialised\n", __func__); > > + area = get_vm_area_caller(size, VM_DMA | VM_USERMAP, caller); > > In this patch, VM_DMA is only used here. So, is this no effect? I introduced it mainly to let user know which areas have been allocated by the dma-mapping api. I also plan to add a check suggested by Minchan Kim in __dma_free_remap() if the vmalloc area have been in fact allocated with VM_DMA set. 
Best regards -- Marek Szyprowski Samsung Poland R&D Center From kosaki.motohiro at gmail.com Wed May 30 07:22:53 2012 From: kosaki.motohiro at gmail.com (KOSAKI Motohiro) Date: Wed, 30 May 2012 03:22:53 -0400 Subject: [Linaro-mm-sig] [PATCHv2 4/4] ARM: dma-mapping: remove custom consistent dma region In-Reply-To: <019401cd3e34$0c6af4d0$2540de70$%szyprowski@samsung.com> References: <1337252085-22039-1-git-send-email-m.szyprowski@samsung.com> <1337252085-22039-5-git-send-email-m.szyprowski@samsung.com> <4FC5659D.6040805@gmail.com> <019401cd3e34$0c6af4d0$2540de70$%szyprowski@samsung.com> Message-ID: <4FC5CACD.6000105@gmail.com> (5/30/12 3:15 AM), Marek Szyprowski wrote: > Hello, > > On Wednesday, May 30, 2012 2:11 AM KOSAKI Motohiro wrote: > >>> static void * >>> __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot, >>> const void *caller) >>> { >>> - struct arm_vmregion *c; >>> - size_t align; >>> - int bit; >>> + struct vm_struct *area; >>> + unsigned long addr; >>> >>> - if (!consistent_pte) { >>> - printk(KERN_ERR "%s: not initialised\n", __func__); >>> + area = get_vm_area_caller(size, VM_DMA | VM_USERMAP, caller); >> >> In this patch, VM_DMA is only used here. So, is this no effect? > > I introduced it mainly to let user know which areas have been allocated by the dma-mapping api. vma->flags are limited resource, it has only 32 (or 64) bits. Please don't use it for such unimportant thing. > I also plan to add a check suggested by Minchan Kim in __dma_free_remap() if the vmalloc area > have been in fact allocated with VM_DMA set. 
From m.szyprowski at samsung.com Wed May 30 08:59:51 2012
From: m.szyprowski at samsung.com (Marek Szyprowski)
Date: Wed, 30 May 2012 10:59:51 +0200
Subject: [Linaro-mm-sig] [PATCH] ARM: dma-mapping: remove unconditional dependency on CMA
In-Reply-To: <20120529122932.GA24623@n2100.arm.linux.org.uk>
References: <20120529122932.GA24623@n2100.arm.linux.org.uk>
Message-ID: <1338368391-25057-1-git-send-email-m.szyprowski@samsung.com>

CMA has been enabled unconditionally on all ARMv6+ systems to solve the
long-standing issue of double kernel mappings for all dma coherent buffers.
This however created a dependency on CONFIG_EXPERIMENTAL for the whole ARM
architecture, which should really be avoided. This patch removes this
dependency and lets one use the old, well-tested dma-mapping implementation
on ARMv6+ systems as well, without the need to use EXPERIMENTAL stuff.

Reported-by: Russell King
Signed-off-by: Marek Szyprowski
---
 arch/arm/Kconfig          |    1 -
 arch/arm/mm/dma-mapping.c |   10 ++++------
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 5e76013..ca7354f 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -7,7 +7,6 @@ config ARM
 	select HAVE_IDE if PCI || ISA || PCMCIA
 	select HAVE_DMA_ATTRS
 	select HAVE_DMA_CONTIGUOUS if (CPU_V6 || CPU_V6K || CPU_V7)
-	select CMA if (CPU_V6 || CPU_V6K || CPU_V7)
 	select HAVE_MEMBLOCK
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ea6b431..106c4c0 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -268,10 +268,8 @@ static int __init consistent_init(void)
 	unsigned long base = consistent_base;
 	unsigned long num_ptes = (CONSISTENT_END - base) >> PMD_SHIFT;
 
-#ifndef CONFIG_ARM_DMA_USE_IOMMU
-	if (cpu_architecture() >= CPU_ARCH_ARMv6)
+	if (IS_ENABLED(CONFIG_CMA) && !IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU))
 		return 0;
-#endif
 
 	consistent_pte = kmalloc(num_ptes * sizeof(pte_t), GFP_KERNEL);
 	if (!consistent_pte) {
@@ -342,7 +340,7 @@ static int __init coherent_init(void)
 	struct page *page;
 	void *ptr;
 
-	if (cpu_architecture() < CPU_ARCH_ARMv6)
+	if (!IS_ENABLED(CONFIG_CMA))
 		return 0;
 
 	ptr = __alloc_from_contiguous(NULL, size, prot, &page);
@@ -704,7 +702,7 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
 
 	if (arch_is_coherent() || nommu())
 		addr = __alloc_simple_buffer(dev, size, gfp, &page);
-	else if (cpu_architecture() < CPU_ARCH_ARMv6)
+	else if (!IS_ENABLED(CONFIG_CMA))
 		addr = __alloc_remap_buffer(dev, size, gfp, prot, &page, caller);
 	else if (gfp & GFP_ATOMIC)
 		addr = __alloc_from_pool(dev, size, &page, caller);
@@ -773,7 +771,7 @@ void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
 
 	if (arch_is_coherent() || nommu()) {
 		__dma_free_buffer(page, size);
-	} else if (cpu_architecture() < CPU_ARCH_ARMv6) {
+	} else if (!IS_ENABLED(CONFIG_CMA)) {
 		__dma_free_remap(cpu_addr, size);
 		__dma_free_buffer(page, size);
 	} else {
-- 
1.7.1.569.g6f426

From sumit.semwal at ti.com Wed May 30 15:26:34 2012
From: sumit.semwal at ti.com (Semwal, Sumit)
Date: Wed, 30 May 2012 23:26:34 +0800
Subject: [Linaro-mm-sig] [PATCHv6 00/13] Integration of videobuf2 with dmabuf
In-Reply-To: <5090892.Z3RkLXNQ1U@avalon>
References: <1337775027-9489-1-git-send-email-t.stanislaws@samsung.com>
 <5090892.Z3RkLXNQ1U@avalon>
Message-ID:

On Tue, May 29, 2012 at 6:25 AM, Laurent Pinchart wrote:
> Hi Tomasz,
Hi Tomasz, Laurent, Mauro,
>
> On Wednesday 23 May 2012 14:10:14 Tomasz Stanislawski wrote:
>> Hello everyone,
>> This patchset adds support for DMABUF [2] importing to V4L2 stack.
>> The support for DMABUF exporting was moved to separate patchset
>> due to dependency on patches for DMA mapping redesign by
>> Marek Szyprowski [4].
>
> Except for the small issue with patches 01/13 and 02/13, the set is ready for
> upstream as far as I'm concerned.
+1; Mauro: what do you think about this series?
Getting it landed into 3.5 would make life a lot easier :)
>
>> v6:
>> - fixed missing entry in v4l2_memory_names
>> - fixed a bug occuring after get_user_pages failure
>
> I've missed that one, what was it ?
>
>> - fixed a bug caused by using invalid vma for get_user_pages
>> - prepare/finish no longer call dma_sync for dmabuf buffers
>>
>> v5:
>> - removed change of importer/exporter behaviour
>> - fixes vb2_dc_pages_to_sgt basing on Laurent's hints
>> - changed pin/unpin words to lock/unlock in Doc
>>
>> v4:
>> - rebased on mainline 3.4-rc2
>> - included missing importing support for s5p-fimc and s5p-tv
>> - added patch for changing map/unmap for importers
>> - fixes to Documentation part
>> - coding style fixes
>> - pairing {map/unmap}_dmabuf in vb2-core
>> - fixing variable types and semantic of arguments in videobufb2-dma-contig.c
>>
>> v3:
>> - rebased on mainline 3.4-rc1
>> - split 'code refactor' patch to multiple smaller patches
>> - squashed fixes to Sumit's patches
>> - patchset is no longer dependant on 'DMA mapping redesign'
>> - separated path for handling IO and non-IO mappings
>> - add documentation for DMABUF importing to V4L
>> - removed all DMABUF exporter related code
>> - removed usage of dma_get_pages extension
>>
>> v2:
>> - extended VIDIOC_EXPBUF argument from integer memoffset to struct
>>   v4l2_exportbuffer
>> - added patch that breaks DMABUF spec on (un)map_atachment callcacks but
>>   allows to work with existing implementation of DMABUF prime in DRM
>> - all dma-contig code refactoring patches were squashed
>> - bugfixes
>>
>> v1: List of changes since [1].
>> - support for DMA api extension dma_get_pages, the function is used to
>>   retrieve pages used to create DMA mapping.
>> - small fixes/code cleanup to videobuf2
>> - added prepare and finish callbacks to vb2 allocators, it is used keep
>>   consistency between dma-cpu acess to the memory (by Marek Szyprowski)
>> - support for exporting of DMABUF buffer in V4L2 and Videobuf2, originated
>>   from [3].
>> - support for dma-buf exporting in vb2-dma-contig allocator
>> - support for DMABUF for s5p-tv and s5p-fimc (capture interface) drivers,
>>   originated from [3]
>> - changed handling for userptr buffers (by Marek Szyprowski, Andrzej
>>   Pietrasiewicz)
>> - let mmap method to use dma_mmap_writecombine call (by Marek Szyprowski)
>>
>> [1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/42966/focus=42968
>> [2] https://lkml.org/lkml/2011/12/26/29
>> [3] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/36354/focus=36355
>> [4] http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819
>>
>> Laurent Pinchart (2):
>>   v4l: vb2-dma-contig: Shorten vb2_dma_contig prefix to vb2_dc
>>   v4l: vb2-dma-contig: Reorder functions
>>
>> Marek Szyprowski (2):
>>   v4l: vb2: add prepare/finish callbacks to allocators
>>   v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator
>>
>> Sumit Semwal (4):
>>   v4l: Add DMABUF as a memory type
>>   v4l: vb2: add support for shared buffer (dma_buf)
>>   v4l: vb: remove warnings about MEMORY_DMABUF
>>   v4l: vb2-dma-contig: add support for dma_buf importing
>>
>> Tomasz Stanislawski (5):
>>   Documentation: media: description of DMABUF importing in V4L2
>>   v4l: vb2-dma-contig: Remove unneeded allocation context structure
>>   v4l: vb2-dma-contig: add support for scatterlist in userptr mode
>>   v4l: s5p-tv: mixer: support for dmabuf importing
>>   v4l: s5p-fimc: support for dmabuf importing
>>
>>  Documentation/DocBook/media/v4l/compat.xml         |    4 +
>>  Documentation/DocBook/media/v4l/io.xml             |  179 +++++++
>>  .../DocBook/media/v4l/vidioc-create-bufs.xml       |    1 +
>>  Documentation/DocBook/media/v4l/vidioc-qbuf.xml    |   15 +
>>  Documentation/DocBook/media/v4l/vidioc-reqbufs.xml |   45 +-
>>  drivers/media/video/s5p-fimc/Kconfig               |    1 +
>>  drivers/media/video/s5p-fimc/fimc-capture.c        |    2 +-
>>  drivers/media/video/s5p-tv/Kconfig                 |    1 +
>>  drivers/media/video/s5p-tv/mixer_video.c           |    2 +-
>>  drivers/media/video/v4l2-ioctl.c                   |    1 +
>>  drivers/media/video/videobuf-core.c                |    4 +
>>  drivers/media/video/videobuf2-core.c               |  207 +++++++-
>>  drivers/media/video/videobuf2-dma-contig.c         |  520 ++++++++++++++---
>>  include/linux/videodev2.h                          |    7 +
>>  include/media/videobuf2-core.h                     |   34 ++
>>  15 files changed, 924 insertions(+), 99 deletions(-)
> --
> Regards,
>
> Laurent Pinchart
>
Best regards,
~Sumit.