Hello,
This short patch series is a snapshot of my proof-of-concept integration of the generic IOMMU interface with the DMA-mapping framework for the ARM architecture, together with the Samsung IOMMU driver.
In this version I rebased the code onto the updated DMA-mapping framework posted a few minutes ago. Management of the IO address space has been moved from genalloc to a pure bitmap-based allocator. I've also added support for mapping a scatterlist with dma_map_sg/dma_unmap_sg. The DMA scatterlist interface turned out to be a bit tricky. A scatterlist may describe a set of disjoint buffers that cannot be easily merged together if they don't start and end on a page boundary. In such a case we need to allocate more than one range in the IO address space and map the respective pages. This results in code that might be a bit hard to follow on first reading.
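Just to illustrate that rule, here is a simplified sketch (untested, not part of the patch, and the helper name is made up) of the check that decides whether a scatterlist entry can be merged into the IO address space chunk that is currently being built:

static bool can_merge_into_chunk(struct scatterlist *prev, struct scatterlist *s)
{
	/* the previous entry must end exactly on a page boundary ... */
	if ((prev->offset + prev->length) & ~PAGE_MASK)
		return false;
	/* ... and the new entry must start on a page boundary */
	if (s->offset)
		return false;
	return true;
}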
Right now the code supports only 4KiB pages.
The patches have been tested on the Samsung Exynos4 platform with the FIMC device. The Samsung IOMMU driver has been provided for reference. It is still work-in-progress code, but because of my holidays I wanted to avoid delaying it further.
Here is the link to the initial version of my ARM & DMA-mapping integration patches: http://www.spinics.net/lists/linux-mm/msg19856.html
All the patches will be available on the following GIT tree: git://git.infradead.org/users/kmpark/linux-2.6-samsung dma-mapping-v3
Git web interface: http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/...
Future:
1. Add all missing operations for IOMMU mappings (map_single/page/sync_*)
2. Move sync_* operations into a separate function for better code sharing between the iommu and non-iommu dma-mapping code
3. Rebase onto CMA patches and solve the issue with double mapping and page attributes
4. Add support for pages larger than 4KiB.
Please note that this is a very early version of the patches, definitely NOT intended for merging. I just wanted to make sure that the direction is right and to share the code with others who might want to cooperate on dma-mapping improvements.
Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
Patch summary:
Andrzej Pietrasiewicz (1):
  ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

Marek Szyprowski (1):
  ARM: initial proof-of-concept IOMMU mapper for DMA-mapping
 arch/arm/Kconfig                               |    7 +
 arch/arm/include/asm/device.h                  |    4 +
 arch/arm/include/asm/dma-iommu.h               |   29 +
 arch/arm/mach-exynos4/Kconfig                  |    5 -
 arch/arm/mach-exynos4/Makefile                 |    2 +-
 arch/arm/mach-exynos4/clock.c                  |   47 +-
 arch/arm/mach-exynos4/dev-sysmmu.c             |  609 +++++++++++------
 arch/arm/mach-exynos4/include/mach/irqs.h      |   34 +-
 arch/arm/mach-exynos4/include/mach/sysmmu.h    |   46 --
 arch/arm/mm/dma-mapping.c                      |  504 ++++++++++++++-
 arch/arm/mm/vmregion.h                         |    2 +-
 arch/arm/plat-s5p/Kconfig                      |   21 +-
 arch/arm/plat-s5p/include/plat/sysmmu.h        |  119 ++--
 arch/arm/plat-s5p/sysmmu.c                     |  855 ++++++++++++++++++------
 arch/arm/plat-samsung/include/plat/devs.h      |    1 -
 arch/arm/plat-samsung/include/plat/fimc-core.h |   25 +
 16 files changed, 1724 insertions(+), 586 deletions(-)
 create mode 100644 arch/arm/include/asm/dma-iommu.h
 delete mode 100644 arch/arm/mach-exynos4/include/mach/sysmmu.h
Add initial proof of concept implementation of DMA-mapping API for devices that have IOMMU support. Only dma_alloc_coherent, dma_free_coherent, and dma_mmap_coherent as well as dma_(un)map_sg and dma_sync_sg_for_cpu/device functions are supported.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/Kconfig                 |    7 +
 arch/arm/include/asm/device.h    |    4 +
 arch/arm/include/asm/dma-iommu.h |   29 +++
 arch/arm/mm/dma-mapping.c        |  504 ++++++++++++++++++++++++++++++++++++--
 arch/arm/mm/vmregion.h           |    2 +-
 5 files changed, 527 insertions(+), 19 deletions(-)
 create mode 100644 arch/arm/include/asm/dma-iommu.h
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 10b0e0e..3fcc183 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -41,6 +41,13 @@ config ARM
 config ARM_HAS_SG_CHAIN
 	bool
 
+config NEED_SG_DMA_LENGTH
+	bool
+
+config ARM_DMA_USE_IOMMU
+	select NEED_SG_DMA_LENGTH
+	bool
+
 config HAVE_PWM
 	bool
diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index d3b35d8..bd34378 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -11,6 +11,10 @@ struct dev_archdata {
 #ifdef CONFIG_DMABOUNCE
 	struct dmabounce_device_info *dmabounce;
 #endif
+#ifdef CONFIG_ARM_DMA_USE_IOMMU
+	void *iommu_priv;
+	struct dma_iommu_mapping *mapping;
+#endif
 };
struct pdev_archdata { diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h new file mode 100644 index 0000000..0b2677e --- /dev/null +++ b/arch/arm/include/asm/dma-iommu.h @@ -0,0 +1,29 @@ +#ifndef ASMARM_DMA_IOMMU_H +#define ASMARM_DMA_IOMMU_H + +#ifdef __KERNEL__ + +#include <linux/mm_types.h> +#include <linux/scatterlist.h> +#include <linux/dma-debug.h> +#include <linux/kmemcheck.h> + +#include <asm/memory.h> + +struct dma_iommu_mapping { + /* iommu specific data */ + struct iommu_domain *domain; + + void *bitmap; + size_t bits; + unsigned int order; + dma_addr_t base; + + struct mutex lock; +}; + +int arm_iommu_attach_device(struct device *dev, dma_addr_t base, + dma_addr_t size, int order); + +#endif /* __KERNEL__ */ +#endif diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 0421a2e..020bde1 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -18,6 +18,7 @@ #include <linux/device.h> #include <linux/dma-mapping.h> #include <linux/highmem.h> +#include <linux/slab.h>
#include <asm/memory.h> #include <asm/highmem.h> @@ -25,6 +26,9 @@ #include <asm/tlbflush.h> #include <asm/sizes.h>
+#include <linux/iommu.h> +#include <asm/dma-iommu.h> + #include "mm.h"
/* @@ -154,6 +158,20 @@ static u64 get_coherent_dma_mask(struct device *dev) return mask; }
+static inline void __clear_pages(struct page *page, size_t size) +{ + void *ptr; + /* + * Ensure that the allocated pages are zeroed, and that any data + * lurking in the kernel direct-mapped region is invalidated. + */ + ptr = page_address(page); + memset(ptr, 0, size); + dmac_flush_range(ptr, ptr + size); + outer_flush_range(__pa(ptr), __pa(ptr) + size); +} + + /* * Allocate a DMA buffer for 'dev' of size 'size' using the * specified gfp mask. Note that 'size' must be page aligned. @@ -162,7 +180,6 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf { unsigned long order = get_order(size); struct page *page, *p, *e; - void *ptr; u64 mask = get_coherent_dma_mask(dev);
#ifdef CONFIG_DMA_API_DEBUG @@ -191,14 +208,7 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf for (p = page + (size >> PAGE_SHIFT), e = page + (1 << order); p < e; p++) __free_page(p);
- /* - * Ensure that the allocated pages are zeroed, and that any data - * lurking in the kernel direct-mapped region is invalidated. - */ - ptr = page_address(page); - memset(ptr, 0, size); - dmac_flush_range(ptr, ptr + size); - outer_flush_range(__pa(ptr), __pa(ptr) + size); + __clear_pages(page, size);
return page; } @@ -326,7 +336,7 @@ __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot) u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1);
pte = consistent_pte[idx] + off; - c->vm_pages = page; + c->priv = page;
do { BUG_ON(!pte_none(*pte)); @@ -428,6 +438,14 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, return addr; }
+static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) +{ + prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? + pgprot_writecombine(prot) : + pgprot_dmacoherent(prot); + return prot; +} + /* * Allocate DMA-coherent memory space and return both the kernel remapped * virtual and bus address for that space. @@ -435,9 +453,7 @@ __dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) { - pgprot_t prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? - pgprot_writecombine(pgprot_kernel) : - pgprot_dmacoherent(pgprot_kernel); + pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); void *memory;
if (dma_alloc_from_coherent(dev, size, handle, &memory)) @@ -458,10 +474,7 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma, unsigned long user_size, kern_size; struct arm_vmregion *c;
- vma->vm_page_prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? - pgprot_writecombine(vma->vm_page_prot) : - pgprot_dmacoherent(vma->vm_page_prot); - + vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); @@ -472,8 +485,9 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
if (off < kern_size && user_size <= (kern_size - off)) { + struct page *pages = c->priv; ret = remap_pfn_range(vma, vma->vm_start, - page_to_pfn(c->vm_pages) + off, + page_to_pfn(pages) + off, user_size << PAGE_SHIFT, vma->vm_page_prot); } @@ -612,6 +626,9 @@ int arm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, int i, j;
for_each_sg(sg, s, nents, i) { +#ifdef CONFIG_NEED_SG_DMA_LENGTH + s->dma_length = s->length; +#endif s->dma_address = ops->map_page(dev, sg_page(s), s->offset, s->length, dir, attrs); if (dma_mapping_error(dev, s->dma_address)) @@ -717,3 +734,454 @@ static int __init dma_debug_do_init(void) return 0; } fs_initcall(dma_debug_do_init); + +#ifdef CONFIG_ARM_DMA_USE_IOMMU + +/* IOMMU */ + +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, size_t size) +{ + unsigned int order = get_order(size); + unsigned int align = 0; + unsigned int count, start; + + if (order > mapping->order) + align = (1 << (order - mapping->order)) - 1; + + count = ((size >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping->order; + + start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0, count, align); + if (start > mapping->bits) + return ~0; + + bitmap_set(mapping->bitmap, start, count); + + return mapping->base + (start << (mapping->order + PAGE_SHIFT)); +} + +static inline void __free_iova(struct dma_iommu_mapping *mapping, dma_addr_t addr, size_t size) +{ + unsigned int start = (addr - mapping->base) >> (mapping->order + PAGE_SHIFT); + unsigned int count = ((size >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping->order; + + bitmap_clear(mapping->bitmap, start, count); +} + +static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp) +{ + struct page **pages; + int count = size >> PAGE_SHIFT; + int i; + + pages = kzalloc(count * sizeof(struct page*), gfp); + if (!pages) + return NULL; + + for (i=0; i<count; i++) { + pages[i] = alloc_page(gfp); + if (!pages[i]) + goto error; + + __clear_pages(pages[i], PAGE_SIZE); + } + + return pages; +error: + while (--i) + if (pages[i]) + __free_pages(pages[i], 0); + kfree(pages); + return NULL; +} + +static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size) +{ + int count = size >> PAGE_SHIFT; + int i; + for (i=0; i< count; i++) + if (pages[i]) + __free_pages(pages[i], 0); + kfree(pages); + return 0; +} + +static void * +__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) +{ + struct arm_vmregion *c; + size_t align; + size_t count = size >> PAGE_SHIFT; + int bit; + + if (!consistent_pte[0]) { + printk(KERN_ERR "%s: not initialised\n", __func__); + dump_stack(); + return NULL; + } + + /* + * Align the virtual region allocation - maximum alignment is + * a section size, minimum is a page size. This helps reduce + * fragmentation of the DMA space, and also prevents allocations + * smaller than a section from crossing a section boundary. + */ + bit = fls(size - 1); + if (bit > SECTION_SHIFT) + bit = SECTION_SHIFT; + align = 1 << bit; + + /* + * Allocate a virtual address in the consistent mapping region. 
+ */ + c = arm_vmregion_alloc(&consistent_head, align, size, + gfp & ~(__GFP_DMA | __GFP_HIGHMEM)); + if (c) { + pte_t *pte; + int idx = CONSISTENT_PTE_INDEX(c->vm_start); + int i = 0; + u32 off = CONSISTENT_OFFSET(c->vm_start) & (PTRS_PER_PTE-1); + + pte = consistent_pte[idx] + off; + c->priv = pages; + + do { + BUG_ON(!pte_none(*pte)); + + set_pte_ext(pte, mk_pte(pages[i], prot), 0); + pte++; + off++; + i++; + if (off >= PTRS_PER_PTE) { + off = 0; + pte = consistent_pte[++idx]; + } + } while (i < count); + + dsb(); + + return (void *)c->vm_start; + } + return NULL; +} + +static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t size) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + unsigned int count = size >> PAGE_SHIFT; + dma_addr_t dma_addr, iova; + int i, ret = ~0; + + dma_addr = __alloc_iova(mapping, size); + if (dma_addr == 0) + goto fail; + + iova = dma_addr; + for (i=0; i<count; i++) { + unsigned int phys = page_to_phys(pages[i]); + ret = iommu_map(mapping->domain, iova, phys, 0, 0); + if (ret < 0) + goto fail; + iova += PAGE_SIZE; + } + + return dma_addr; +fail: + return ~0; +} + +static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + unsigned int count = size >> PAGE_SHIFT; + int i; + + for (i=0; i<count; i++) { + iommu_unmap(mapping->domain, iova, 0); + iova += PAGE_SIZE; + } + __free_iova(mapping, iova, size); + return 0; +} + +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, + dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); + struct page **pages; + void *addr = NULL; + + *handle = ~0; + size = PAGE_ALIGN(size); + + mutex_lock(&mapping->lock); + + pages = __iommu_alloc_buffer(dev, size, gfp); + if (!pages) + goto err_unlock; + + *handle = __iommu_create_mapping(dev, pages, size); + if (*handle == ~0) + goto err_buffer; + + addr = __iommu_alloc_remap(pages, size, gfp, prot); + if (!addr) + goto err_mapping; + + mutex_unlock(&mapping->lock); + return addr; + +err_mapping: + __iommu_remove_mapping(dev, *handle, size); +err_buffer: + __iommu_free_buffer(dev, pages, size); +err_unlock: + mutex_unlock(&mapping->lock); + return NULL; +} + +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs) +{ + unsigned long user_size; + struct arm_vmregion *c; + + vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); + user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; + + c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); + if (c) { + struct page **pages = c->priv; + + unsigned long uaddr = vma->vm_start; + unsigned long usize = vma->vm_end - vma->vm_start; + int i = 0; + + do { + int ret; + + ret = vm_insert_page(vma, uaddr, pages[i++]); + if (ret) { + printk(KERN_ERR "Remapping memory, error: %d\n", ret); + return ret; + } + + uaddr += PAGE_SIZE; + usize -= PAGE_SIZE; + } while (usize > 0); + } + return 0; +} + +/* + * free a page as defined by the above mapping. + * Must not be called with IRQs disabled. 
+ */ +void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, + dma_addr_t handle, struct dma_attrs *attrs) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + struct arm_vmregion *c; + size = PAGE_ALIGN(size); + + mutex_lock(&mapping->lock); + c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); + if (c) { + struct page **pages = c->priv; + __dma_free_remap(cpu_addr, size); + __iommu_remove_mapping(dev, handle, size); + __iommu_free_buffer(dev, pages, size); + } + mutex_unlock(&mapping->lock); +} + +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, + size_t size, dma_addr_t *handle, + enum dma_data_direction dir) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + dma_addr_t dma_addr, iova; + int ret = 0; + + *handle = ~0; + mutex_lock(&mapping->lock); + + iova = dma_addr = __alloc_iova(mapping, size); + if (dma_addr == 0) + goto fail; + + while (size) { + unsigned int phys = page_to_phys(sg_page(sg)); + unsigned int len = sg->offset + sg->length; + + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(sg_page(sg), sg->offset, sg->length, dir); + + while (len) { + ret = iommu_map(mapping->domain, iova, phys, 0, 0); + if (ret < 0) + goto fail; + iova += PAGE_SIZE; + len -= PAGE_SIZE; + size -= PAGE_SIZE; + } + sg = sg_next(sg); + } + + *handle = dma_addr; + mutex_unlock(&mapping->lock); + + return 0; +fail: + __iommu_remove_mapping(dev, iova, size); + mutex_unlock(&mapping->lock); + return ret; +} + +int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + struct scatterlist *s = sg, *dma = sg, *start = sg; + int i, count = 1; + unsigned int offset = s->offset; + unsigned int size = s->offset + s->length; + + for (i = 1; i < nents; i++) { + s->dma_address = ~0; + s->dma_length = 0; + + s = sg_next(s); + + if (s->offset || (size & (PAGE_SIZE - 1))) { + if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0) + goto bad_mapping; + + dma->dma_address += offset; + dma->dma_length = size; + + size = offset = s->offset; + start = s; + dma = sg_next(dma); + count += 1; + } + size += sg->length; + } + __map_sg_chunk(dev, start, size, &dma->dma_address, dir); + d->dma_address += offset; + + return count; + +bad_mapping: + for_each_sg(sg, s, count-1, i) + __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s)); + return 0; +} + +void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + struct scatterlist *s; + int i; + + for_each_sg(sg, s, nents, i) { + if (sg_dma_len(s)) + __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s)); + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(sg_page(sg), sg->offset, sg->length, dir); + } +} + + +/** + * dma_sync_sg_for_cpu + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices + * @sg: list of buffers + * @nents: number of buffers to map (returned from dma_map_sg) + * @dir: DMA transfer direction (same as was passed to dma_map_sg) + */ +void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction dir) +{ + struct scatterlist *s; + int i; + + for_each_sg(sg, s, nents, i) + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(sg_page(sg), sg->offset, sg->length, dir); +} + +/** + * dma_sync_sg_for_device + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices + * @sg: list of buffers + * @nents: number of buffers to map 
(returned from dma_map_sg) + * @dir: DMA transfer direction (same as was passed to dma_map_sg) + */ +void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction dir) +{ + struct scatterlist *s; + int i; + + for_each_sg(sg, s, nents, i) + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(sg_page(sg), sg->offset, sg->length, dir); +} + +struct dma_map_ops iommu_ops = { + .alloc = arm_iommu_alloc_attrs, + .free = arm_iommu_free_attrs, + .mmap = arm_iommu_mmap_attrs, + .map_sg = arm_iommu_map_sg, + .unmap_sg = arm_iommu_unmap_sg, + .sync_sg_for_cpu = arm_iommu_sync_sg_for_cpu, + .sync_sg_for_device = arm_iommu_sync_sg_for_device, +}; + +int arm_iommu_attach_device(struct device *dev, dma_addr_t base, size_t size, int order) +{ + unsigned int count = (size >> PAGE_SHIFT) - order; + unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long); + struct dma_iommu_mapping *mapping; + int err = -ENOMEM; + + mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL); + if (!mapping) + goto err; + + mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL); + if (!mapping->bitmap) + goto err2; + + mapping->base = base; + mapping->bits = bitmap_size; + mapping->order = order; + mutex_init(&mapping->lock); + + mapping->domain = iommu_domain_alloc(); + if (!mapping->domain) + goto err3; + + err = iommu_attach_device(mapping->domain, dev); + if (err != 0) + goto err4; + + dev->archdata.mapping = mapping; + set_dma_ops(dev, &iommu_ops); + + printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev)); + return 0; + +err4: + iommu_domain_free(mapping->domain); +err3: + kfree(mapping->bitmap); +err2: + kfree(mapping); +err: + return -ENOMEM; +} +EXPORT_SYMBOL(arm_iommu_attach_device); + +#endif diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h index 15e9f04..6bbc402 100644 --- a/arch/arm/mm/vmregion.h +++ b/arch/arm/mm/vmregion.h @@ -17,7 +17,7 @@ struct arm_vmregion { struct list_head vm_list; unsigned long vm_start; unsigned long vm_end; - struct page *vm_pages; + void *priv; int vm_active; };
Hi, a few comments.

On Fri, September 2, 2011 6:56 am, Marek Szyprowski wrote:
...
> +struct dma_iommu_mapping {
> +	/* iommu specific data */
> +	struct iommu_domain *domain;
> +
> +	void *bitmap;
In the earlier version of this patch you had this as a genpool instead of just doing the bitmaps manually. Is there a reason genpool can't be used to get the iova addresses?
> +	size_t bits;
> +	unsigned int order;
> +	dma_addr_t base;
> +
> +	struct mutex lock;
> +};
<snip>
> +int arm_iommu_attach_device(struct device *dev, dma_addr_t base, size_t size, int order)
> +{
> +	unsigned int count = (size >> PAGE_SHIFT) - order;
> +	unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
> +	struct dma_iommu_mapping *mapping;
> +	int err = -ENOMEM;
> +
> +	mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
> +	if (!mapping)
> +		goto err;
> +
> +	mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> +	if (!mapping->bitmap)
> +		goto err2;
> +
> +	mapping->base = base;
> +	mapping->bits = bitmap_size;
> +	mapping->order = order;
> +	mutex_init(&mapping->lock);
> +
> +	mapping->domain = iommu_domain_alloc();
> +	if (!mapping->domain)
> +		goto err3;
> +
> +	err = iommu_attach_device(mapping->domain, dev);
> +	if (err != 0)
> +		goto err4;
> +
> +	dev->archdata.mapping = mapping;
> +	set_dma_ops(dev, &iommu_ops);
> +
> +	printk(KERN_INFO "Attached IOMMU controller to %s device.\n",
> +		dev_name(dev));
> +	return 0;
> +
> +err4:
> +	iommu_domain_free(mapping->domain);
> +err3:
> +	kfree(mapping->bitmap);
> +err2:
> +	kfree(mapping);
> +err:
> +	return -ENOMEM;
> +}
> +EXPORT_SYMBOL(arm_iommu_attach_device);
> +
> +#endif
Attach makes the assumption that each iommu device will exist in a separate domain. What if multiple devices want to use the same iommu domain? The msm iommu implementation has many different iommu devices but many of these will need the same buffer to be mapped in each context so currently many devices share the same domain. Without this, the same map call would need to happen for each device, which creates extra map calls and overhead.
Laura
Hello,
On Thursday, September 08, 2011 6:42 PM Laura Abbott wrote:
Hi, a few comments On Fri, September 2, 2011 6:56 am, Marek Szyprowski wrote: ...
> +struct dma_iommu_mapping {
> +	/* iommu specific data */
> +	struct iommu_domain *domain;
> +
> +	void *bitmap;
In the earlier version of this patch you had this as a genpool instead of just doing the bitmaps manually. Is there a reason genpool can't be used to get the iova addresses?
IMHO genpool was a bit overkill in this case and required some additional patches for aligned allocations. In the next version I also want to extend this bitmap-based allocator to dynamically resize the bitmap to more than one page if the IO address space gets exhausted.
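For reference, with the plain bitmap the aligned allocation boils down to the align mask argument of bitmap_find_next_zero_area(); simplified from what __alloc_iova() does in the patch:

	unsigned long align_mask = 0;
	unsigned int start;

	/* align buffers larger than the allocation granularity to their own size */
	if (order > mapping->order)
		align_mask = (1 << (order - mapping->order)) - 1;

	start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits,
					   0, count, align_mask);
	if (start > mapping->bits)
		return ~0;	/* io address space exhausted */

	bitmap_set(mapping->bitmap, start, count);
	return mapping->base + (start << (mapping->order + PAGE_SHIFT));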
> +	size_t bits;
> +	unsigned int order;
> +	dma_addr_t base;
> +
> +	struct mutex lock;
> +};
<snip>

> +int arm_iommu_attach_device(struct device *dev, dma_addr_t base, size_t size, int order)
> +{
> +	unsigned int count = (size >> PAGE_SHIFT) - order;
> +	unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long);
> +	struct dma_iommu_mapping *mapping;
> +	int err = -ENOMEM;
> +
> +	mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
> +	if (!mapping)
> +		goto err;
> +
> +	mapping->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> +	if (!mapping->bitmap)
> +		goto err2;
> +
> +	mapping->base = base;
> +	mapping->bits = bitmap_size;
> +	mapping->order = order;
> +	mutex_init(&mapping->lock);
> +
> +	mapping->domain = iommu_domain_alloc();
> +	if (!mapping->domain)
> +		goto err3;
> +
> +	err = iommu_attach_device(mapping->domain, dev);
> +	if (err != 0)
> +		goto err4;
> +
> +	dev->archdata.mapping = mapping;
> +	set_dma_ops(dev, &iommu_ops);
> +
> +	printk(KERN_INFO "Attached IOMMU controller to %s device.\n",
> +		dev_name(dev));
> +	return 0;
> +
> +err4:
> +	iommu_domain_free(mapping->domain);
> +err3:
> +	kfree(mapping->bitmap);
> +err2:
> +	kfree(mapping);
> +err:
> +	return -ENOMEM;
> +}
> +EXPORT_SYMBOL(arm_iommu_attach_device);
> +
> +#endif
Attach makes the assumption that each iommu device will exist in a separate domain. What if multiple devices want to use the same iommu domain? The msm iommu implementation has many different iommu devices but many of these will need the same buffer to be mapped in each context so currently many devices share the same domain. Without this, the same map call would need to happen for each device, which creates extra map calls and overhead.
Ah, right. I forgot about the case when devices need to share one domain. Moving iommu_domain_alloc out of arm_iommu_attach_device and giving that function just a pointer to the iommu domain should solve this issue. I will change this in the next version of the patches.
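From the client driver's point of view it would then look more or less like this (the function names and the example base/size below are just a proposal at this point, not a final API):

	struct dma_iommu_mapping *mapping;
	int err;

	/* one mapping (and thus one iommu domain) shared by two devices */
	mapping = arm_iommu_create_mapping(0x80000000, SZ_128M, 0);
	if (IS_ERR(mapping))
		return PTR_ERR(mapping);

	err = arm_iommu_attach_device(dev_a, mapping);
	if (!err)
		err = arm_iommu_attach_device(dev_b, mapping);
	if (err) {
		arm_iommu_release_mapping(mapping);
		return err;
	}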
Best regards
Marek,

Here is a patch that has fixes to get the SDHC driver working as a DMA IOMMU client. Here is the overview of changes.
1. Converted the mutex to a spinlock to handle atomic context calls and used the spinlock in the necessary places.
2. Implemented arm_iommu_map_page and arm_iommu_unmap_page, which are used by the MMC host stack.
3. Fixed the bugs identified during testing with the SDHC driver.
From: Krishna Reddy <vdumpa@nvidia.com>
Date: Fri, 7 Oct 2011 17:25:59 -0700
Subject: [PATCH] ARM: dma-mapping: Implement arm_iommu_map_page/unmap_page and fix issues.
Change-Id: I47a1a0065538fa0a161dd6d551b38079bd8f84fd
---
 arch/arm/include/asm/dma-iommu.h |    3 +-
 arch/arm/mm/dma-mapping.c        |  182 +++++++++++++++++++++-----------------
 2 files changed, 102 insertions(+), 83 deletions(-)
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h index 0b2677e..ad1a4d9 100644 --- a/arch/arm/include/asm/dma-iommu.h +++ b/arch/arm/include/asm/dma-iommu.h @@ -7,6 +7,7 @@ #include <linux/scatterlist.h> #include <linux/dma-debug.h> #include <linux/kmemcheck.h> +#include <linux/spinlock_types.h>
#include <asm/memory.h>
@@ -19,7 +20,7 @@ struct dma_iommu_mapping { unsigned int order; dma_addr_t base;
- struct mutex lock; + spinlock_t lock; };
int arm_iommu_attach_device(struct device *dev, dma_addr_t base, diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 020bde1..0befd88 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -739,32 +739,42 @@ fs_initcall(dma_debug_do_init);
/* IOMMU */
-static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, size_t size) +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, + size_t size) { - unsigned int order = get_order(size); unsigned int align = 0; unsigned int count, start; + unsigned long flags;
- if (order > mapping->order) - align = (1 << (order - mapping->order)) - 1; + count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) + + (1 << mapping->order) - 1) >> mapping->order;
- count = ((size >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping->order; - - start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0, count, align); - if (start > mapping->bits) + spin_lock_irqsave(&mapping->lock, flags); + start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, + 0, count, align); + if (start > mapping->bits) { + spin_unlock_irqrestore(&mapping->lock, flags); return ~0; + }
bitmap_set(mapping->bitmap, start, count); + spin_unlock_irqrestore(&mapping->lock, flags);
return mapping->base + (start << (mapping->order + PAGE_SHIFT)); }
-static inline void __free_iova(struct dma_iommu_mapping *mapping, dma_addr_t addr, size_t size) +static inline void __free_iova(struct dma_iommu_mapping *mapping, + dma_addr_t addr, size_t size) { - unsigned int start = (addr - mapping->base) >> (mapping->order + PAGE_SHIFT); - unsigned int count = ((size >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping->order; + unsigned int start = (addr - mapping->base) >> + (mapping->order + PAGE_SHIFT); + unsigned int count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) + + (1 << mapping->order) - 1) >> mapping->order; + unsigned long flags;
+ spin_lock_irqsave(&mapping->lock, flags); bitmap_clear(mapping->bitmap, start, count); + spin_unlock_irqrestore(&mapping->lock, flags); }
static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp) @@ -867,7 +877,7 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t size) { struct dma_iommu_mapping *mapping = dev->archdata.mapping; - unsigned int count = size >> PAGE_SHIFT; + unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; dma_addr_t dma_addr, iova; int i, ret = ~0;
@@ -892,13 +902,12 @@ fail: static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size) { struct dma_iommu_mapping *mapping = dev->archdata.mapping; - unsigned int count = size >> PAGE_SHIFT; + unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; int i;
- for (i=0; i<count; i++) { - iommu_unmap(mapping->domain, iova, 0); - iova += PAGE_SIZE; - } + iova = iova & PAGE_MASK; + for (i=0; i<count; i++) + iommu_unmap(mapping->domain, iova + i * PAGE_SIZE, 0); __free_iova(mapping, iova, size); return 0; } @@ -906,7 +915,6 @@ static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t si static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) { - struct dma_iommu_mapping *mapping = dev->archdata.mapping; pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); struct page **pages; void *addr = NULL; @@ -914,11 +922,9 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, *handle = ~0; size = PAGE_ALIGN(size);
- mutex_lock(&mapping->lock); - pages = __iommu_alloc_buffer(dev, size, gfp); if (!pages) - goto err_unlock; + goto exit;
*handle = __iommu_create_mapping(dev, pages, size); if (*handle == ~0) @@ -928,15 +934,13 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, if (!addr) goto err_mapping;
- mutex_unlock(&mapping->lock); return addr;
err_mapping: __iommu_remove_mapping(dev, *handle, size); err_buffer: __iommu_free_buffer(dev, pages, size); -err_unlock: - mutex_unlock(&mapping->lock); +exit: return NULL; }
@@ -944,11 +948,9 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void *cpu_addr, dma_addr_t dma_addr, size_t size, struct dma_attrs *attrs) { - unsigned long user_size; struct arm_vmregion *c;
vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); - user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); if (c) { @@ -981,11 +983,9 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, struct dma_attrs *attrs) { - struct dma_iommu_mapping *mapping = dev->archdata.mapping; struct arm_vmregion *c; size = PAGE_ALIGN(size);
- mutex_lock(&mapping->lock); c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); if (c) { struct page **pages = c->priv; @@ -993,7 +993,6 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, __iommu_remove_mapping(dev, handle, size); __iommu_free_buffer(dev, pages, size); } - mutex_unlock(&mapping->lock); }
static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, @@ -1001,80 +1000,93 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, enum dma_data_direction dir) { struct dma_iommu_mapping *mapping = dev->archdata.mapping; - dma_addr_t dma_addr, iova; + dma_addr_t iova; int ret = 0; + unsigned long i; + phys_addr_t phys = page_to_phys(sg_page(sg));
+ size = PAGE_ALIGN(size); *handle = ~0; - mutex_lock(&mapping->lock);
- iova = dma_addr = __alloc_iova(mapping, size); - if (dma_addr == 0) - goto fail; - - while (size) { - unsigned int phys = page_to_phys(sg_page(sg)); - unsigned int len = sg->offset + sg->length; + iova = __alloc_iova(mapping, size); + if (iova == 0) + return -ENOMEM;
- if (!arch_is_coherent()) - __dma_page_cpu_to_dev(sg_page(sg), sg->offset, sg->length, dir); - - while (len) { - ret = iommu_map(mapping->domain, iova, phys, 0, 0); - if (ret < 0) - goto fail; - iova += PAGE_SIZE; - len -= PAGE_SIZE; - size -= PAGE_SIZE; - } - sg = sg_next(sg); + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(sg_page(sg), sg->offset, + sg->length, dir); + for (i = 0; i < (size >> PAGE_SHIFT); i++) { + ret = iommu_map(mapping->domain, iova + i * PAGE_SIZE, + phys + i * PAGE_SIZE, 0, 0); + if (ret < 0) + goto fail; } - - *handle = dma_addr; - mutex_unlock(&mapping->lock); + *handle = iova;
return 0; fail: + while (i--) + iommu_unmap(mapping->domain, iova + i * PAGE_SIZE, 0); + __iommu_remove_mapping(dev, iova, size); - mutex_unlock(&mapping->lock); return ret; }
+static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + dma_addr_t dma_addr; + + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(page, offset, size, dir); + + BUG_ON((offset+size) > PAGE_SIZE); + dma_addr = __iommu_create_mapping(dev, &page, PAGE_SIZE); + return dma_addr + offset; +} + +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + phys_addr_t phys; + + phys = iommu_iova_to_phys(mapping->domain, handle); + __iommu_remove_mapping(dev, handle, size); + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(pfn_to_page(__phys_to_pfn(phys)), + phys & ~PAGE_MASK, size, dir); +} + int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir, struct dma_attrs *attrs) { - struct scatterlist *s = sg, *dma = sg, *start = sg; - int i, count = 1; - unsigned int offset = s->offset; - unsigned int size = s->offset + s->length; + struct scatterlist *s; + unsigned int size; + int i, count = 0;
- for (i = 1; i < nents; i++) { + for_each_sg(sg, s, nents, i) { s->dma_address = ~0; s->dma_length = 0; + size = s->offset + s->length;
- s = sg_next(s); - - if (s->offset || (size & (PAGE_SIZE - 1))) { - if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0) - goto bad_mapping; - - dma->dma_address += offset; - dma->dma_length = size; + if (__map_sg_chunk(dev, s, size, &s->dma_address, dir) < 0) + goto bad_mapping;
- size = offset = s->offset; - start = s; - dma = sg_next(dma); - count += 1; - } - size += sg->length; + s->dma_address += s->offset; + s->dma_length = s->length; + count++; } - __map_sg_chunk(dev, start, size, &dma->dma_address, dir); - d->dma_address += offset;
return count;
bad_mapping: - for_each_sg(sg, s, count-1, i) - __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s)); + for_each_sg(sg, s, count, i) { + __iommu_remove_mapping(dev, sg_dma_address(s), + PAGE_ALIGN(sg_dma_len(s))); + } return 0; }
@@ -1086,9 +1098,11 @@ void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
for_each_sg(sg, s, nents, i) { if (sg_dma_len(s)) - __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s)); + __iommu_remove_mapping(dev, sg_dma_address(s), + sg_dma_len(s)); if (!arch_is_coherent()) - __dma_page_dev_to_cpu(sg_page(sg), sg->offset, sg->length, dir); + __dma_page_dev_to_cpu(sg_page(s), s->offset, + s->length, dir); } }
@@ -1108,7 +1122,8 @@ void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
for_each_sg(sg, s, nents, i) if (!arch_is_coherent()) - __dma_page_dev_to_cpu(sg_page(sg), sg->offset, sg->length, dir); + __dma_page_dev_to_cpu(sg_page(s), s->offset, + s->length, dir); }
/** @@ -1126,13 +1141,16 @@ void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
for_each_sg(sg, s, nents, i) if (!arch_is_coherent()) - __dma_page_cpu_to_dev(sg_page(sg), sg->offset, sg->length, dir); + __dma_page_cpu_to_dev(sg_page(s), s->offset, + s->length, dir); }
struct dma_map_ops iommu_ops = { .alloc = arm_iommu_alloc_attrs, .free = arm_iommu_free_attrs, .mmap = arm_iommu_mmap_attrs, + .map_page = arm_iommu_map_page, + .unmap_page = arm_iommu_unmap_page, .map_sg = arm_iommu_map_sg, .unmap_sg = arm_iommu_unmap_sg, .sync_sg_for_cpu = arm_iommu_sync_sg_for_cpu, @@ -1157,7 +1175,7 @@ int arm_iommu_attach_device(struct device *dev, dma_addr_t base, size_t size, in mapping->base = base; mapping->bits = bitmap_size; mapping->order = order; - mutex_init(&mapping->lock); + spin_lock_init(&mapping->lock);
mapping->domain = iommu_domain_alloc(); if (!mapping->domain)
Hello,
On Monday, October 10, 2011 11:57 PM Krishna Reddy wrote:
> Marek,
>
> Here is a patch that has fixes to get the SDHC driver working as a DMA IOMMU client. Here is the overview of changes.
>
> 1. Converted the mutex to a spinlock to handle atomic context calls and used the spinlock in necessary places.
> 2. Implemented arm_iommu_map_page and arm_iommu_unmap_page, which are used by the MMC host stack.
> 3. Fixed the bugs identified during testing with the SDHC driver.
Thanks for your work! I agree that spinlock protection is the correct approach. However I have some comments on your changes. Please see the code.
From: Krishna Reddy vdumpa@nvidia.com Date: Fri, 7 Oct 2011 17:25:59 -0700 Subject: [PATCH] ARM: dma-mapping: Implement arm_iommu_map_page/unmap_page and fix issues.
Change-Id: I47a1a0065538fa0a161dd6d551b38079bd8f84fd
arch/arm/include/asm/dma-iommu.h | 3 +- arch/arm/mm/dma-mapping.c | 182 +++++++++++++++++++++----------------- 2 files changed, 102 insertions(+), 83 deletions(-)
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h index 0b2677e..ad1a4d9 100644 --- a/arch/arm/include/asm/dma-iommu.h +++ b/arch/arm/include/asm/dma-iommu.h @@ -7,6 +7,7 @@ #include <linux/scatterlist.h> #include <linux/dma-debug.h> #include <linux/kmemcheck.h> +#include <linux/spinlock_types.h>
#include <asm/memory.h>
@@ -19,7 +20,7 @@ struct dma_iommu_mapping { unsigned int order; dma_addr_t base;
- struct mutex lock;
- spinlock_t lock;
};
int arm_iommu_attach_device(struct device *dev, dma_addr_t base, diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 020bde1..0befd88 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -739,32 +739,42 @@ fs_initcall(dma_debug_do_init);
/* IOMMU */
-static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, size_t size) +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping,
size_t size)
{
- unsigned int order = get_order(size); unsigned int align = 0; unsigned int count, start;
- unsigned long flags;
- if (order > mapping->order)
align = (1 << (order - mapping->order)) - 1;
- count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
(1 << mapping->order) - 1) >> mapping->order;
- count = ((size >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping->order;
- start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0, count, align);
- if (start > mapping->bits)
spin_lock_irqsave(&mapping->lock, flags);
start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits,
0, count, align);
if (start > mapping->bits) {
spin_unlock_irqrestore(&mapping->lock, flags);
return ~0;
}
bitmap_set(mapping->bitmap, start, count);
spin_unlock_irqrestore(&mapping->lock, flags);
return mapping->base + (start << (mapping->order + PAGE_SHIFT));
}
-static inline void __free_iova(struct dma_iommu_mapping *mapping, dma_addr_t addr, size_t size) +static inline void __free_iova(struct dma_iommu_mapping *mapping,
dma_addr_t addr, size_t size)
{
- unsigned int start = (addr - mapping->base) >> (mapping->order + PAGE_SHIFT);
- unsigned int count = ((size >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping-
order;
unsigned int start = (addr - mapping->base) >>
(mapping->order + PAGE_SHIFT);
unsigned int count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) +
(1 << mapping->order) - 1) >> mapping->order;
unsigned long flags;
spin_lock_irqsave(&mapping->lock, flags); bitmap_clear(mapping->bitmap, start, count);
spin_unlock_irqrestore(&mapping->lock, flags);
}
static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp) @@ -867,7 +877,7 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t size) { struct dma_iommu_mapping *mapping = dev->archdata.mapping;
- unsigned int count = size >> PAGE_SHIFT;
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; dma_addr_t dma_addr, iova; int i, ret = ~0;
@@ -892,13 +902,12 @@ fail: static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size) { struct dma_iommu_mapping *mapping = dev->archdata.mapping;
- unsigned int count = size >> PAGE_SHIFT;
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; int i;
- for (i=0; i<count; i++) {
iommu_unmap(mapping->domain, iova, 0);
iova += PAGE_SIZE;
- }
- iova = iova & PAGE_MASK;
- for (i=0; i<count; i++)
__free_iova(mapping, iova, size); return 0;iommu_unmap(mapping->domain, iova + i * PAGE_SIZE, 0);
} @@ -906,7 +915,6 @@ static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t si static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) {
- struct dma_iommu_mapping *mapping = dev->archdata.mapping; pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); struct page **pages; void *addr = NULL;
@@ -914,11 +922,9 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, *handle = ~0; size = PAGE_ALIGN(size);
- mutex_lock(&mapping->lock);
- pages = __iommu_alloc_buffer(dev, size, gfp); if (!pages)
goto err_unlock;
goto exit;
*handle = __iommu_create_mapping(dev, pages, size); if (*handle == ~0)
@@ -928,15 +934,13 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, if (!addr) goto err_mapping;
- mutex_unlock(&mapping->lock); return addr;
err_mapping: __iommu_remove_mapping(dev, *handle, size); err_buffer: __iommu_free_buffer(dev, pages, size); -err_unlock:
- mutex_unlock(&mapping->lock);
+exit: return NULL; }
@@ -944,11 +948,9 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void *cpu_addr, dma_addr_t dma_addr, size_t size, struct dma_attrs *attrs) {
unsigned long user_size; struct arm_vmregion *c;
vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); if (c) {
@@ -981,11 +983,9 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, struct dma_attrs *attrs) {
struct dma_iommu_mapping *mapping = dev->archdata.mapping; struct arm_vmregion *c; size = PAGE_ALIGN(size);
mutex_lock(&mapping->lock); c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); if (c) { struct page **pages = c->priv;
@@ -993,7 +993,6 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, __iommu_remove_mapping(dev, handle, size); __iommu_free_buffer(dev, pages, size); }
- mutex_unlock(&mapping->lock);
}
static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, @@ -1001,80 +1000,93 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, enum dma_data_direction dir) { struct dma_iommu_mapping *mapping = dev->archdata.mapping;
- dma_addr_t dma_addr, iova;
dma_addr_t iova; int ret = 0;
unsigned long i;
phys_addr_t phys = page_to_phys(sg_page(sg));
size = PAGE_ALIGN(size); *handle = ~0;
mutex_lock(&mapping->lock);
iova = dma_addr = __alloc_iova(mapping, size);
if (dma_addr == 0)
goto fail;
while (size) {
unsigned int phys = page_to_phys(sg_page(sg));
unsigned int len = sg->offset + sg->length;
- iova = __alloc_iova(mapping, size);
- if (iova == 0)
return -ENOMEM;
if (!arch_is_coherent())
__dma_page_cpu_to_dev(sg_page(sg), sg->offset, sg->length, dir);
while (len) {
ret = iommu_map(mapping->domain, iova, phys, 0, 0);
if (ret < 0)
goto fail;
iova += PAGE_SIZE;
len -= PAGE_SIZE;
size -= PAGE_SIZE;
}
sg = sg_next(sg);
- if (!arch_is_coherent())
__dma_page_cpu_to_dev(sg_page(sg), sg->offset,
sg->length, dir);
- for (i = 0; i < (size >> PAGE_SHIFT); i++) {
ret = iommu_map(mapping->domain, iova + i * PAGE_SIZE,
phys + i * PAGE_SIZE, 0, 0);
if (ret < 0)
}goto fail;
- *handle = dma_addr;
- mutex_unlock(&mapping->lock);
*handle = iova;
return 0;
fail:
- while (i--)
iommu_unmap(mapping->domain, iova + i * PAGE_SIZE, 0);
- __iommu_remove_mapping(dev, iova, size);
- mutex_unlock(&mapping->lock); return ret;
}
+static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size, enum dma_data_direction dir,
struct dma_attrs *attrs)
+{
- dma_addr_t dma_addr;
- if (!arch_is_coherent())
__dma_page_cpu_to_dev(page, offset, size, dir);
- BUG_ON((offset+size) > PAGE_SIZE);
- dma_addr = __iommu_create_mapping(dev, &page, PAGE_SIZE);
- return dma_addr + offset;
+}
+static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir,
struct dma_attrs *attrs)
+{
- struct dma_iommu_mapping *mapping = dev->archdata.mapping;
- phys_addr_t phys;
- phys = iommu_iova_to_phys(mapping->domain, handle);
- __iommu_remove_mapping(dev, handle, size);
- if (!arch_is_coherent())
__dma_page_dev_to_cpu(pfn_to_page(__phys_to_pfn(phys)),
phys & ~PAGE_MASK, size, dir);
+}
int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir, struct dma_attrs *attrs) {
- struct scatterlist *s = sg, *dma = sg, *start = sg;
- int i, count = 1;
- unsigned int offset = s->offset;
- unsigned int size = s->offset + s->length;
- struct scatterlist *s;
- unsigned int size;
- int i, count = 0;
- for (i = 1; i < nents; i++) {
- for_each_sg(sg, s, nents, i) { s->dma_address = ~0; s->dma_length = 0;
size = s->offset + s->length;
s = sg_next(s);
if (s->offset || (size & (PAGE_SIZE - 1))) {
if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0)
goto bad_mapping;
dma->dma_address += offset;
dma->dma_length = size;
if (__map_sg_chunk(dev, s, size, &s->dma_address, dir) < 0)
goto bad_mapping;
size = offset = s->offset;
start = s;
dma = sg_next(dma);
count += 1;
}
size += sg->length;
s->dma_address += s->offset;
s->dma_length = s->length;
}count++;
__map_sg_chunk(dev, start, size, &dma->dma_address, dir);
d->dma_address += offset;
return count;
bad_mapping:
- for_each_sg(sg, s, count-1, i)
__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
- for_each_sg(sg, s, count, i) {
__iommu_remove_mapping(dev, sg_dma_address(s),
PAGE_ALIGN(sg_dma_len(s)));
- } return 0;
}
It looks like you have simplified the arm_iommu_map_sg() function too much. The main advantage of the IOMMU is to map scattered memory pages into a contiguous DMA address space. DMA-mapping is allowed to merge consecutive entries in the scatterlist if the hardware supports that. With an IOMMU, a call to map_sg() might create only one DMA element if the memory described by the scatterlist can be seen as contiguous (all chunks start and end on a page boundary). This means that arm_iommu_map_sg() should map all pages into the DMA address range returned in the sg_dma_address(sg[0])/sg_dma_len(sg[0]) pair. I'm also not convinced that this is the best approach, but that's how I was told to implement it:
http://article.gmane.org/gmane.linux.kernel/1128416
I agree that the API would be a bit cleaner if there were a separate function to map a scatterlist into the DMA address space chunk-by-chunk (like dma_map_sg() does in most cases) and a function to map a scatterlist into one contiguous DMA address range. This would also simplify the buffer management in the device drivers.
I'm not sure if the MMC drivers are aware of the coalescing of SG entries. If not, the code must be updated to use sg_dma_len() and the number of DMA entries returned from the dma_map_sg() call.
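In other words, after a successful dma_map_sg() a driver should only look at the first 'count' entries and use the dma_address/dma_length pair of each; something like the following (illustrative only, setup_hw_descriptor() is made up):

	struct scatterlist *s;
	int count, i;

	count = dma_map_sg(dev, sgl, nents, DMA_TO_DEVICE);
	if (count == 0)
		return -ENOMEM;

	/* program the hardware with the merged segments only */
	for_each_sg(sgl, s, count, i)
		setup_hw_descriptor(sg_dma_address(s), sg_dma_len(s));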
(snipped)
Best regards
Marek,
It looks that You have simplified arm_iommu_map_sg() function too much. The main advantage of the iommu is to map scattered memory pages into contiguous dma address space. DMA-mapping is allowed to merge consecutive entries in the scatter list if hardware supports that. http://article.gmane.org/gmane.linux.kernel/1128416
I would update arm_iommu_map_sg() back to coalesce the sg list.
I'm not sure if mmc drivers are aware of coalescing the SG entries together. If not the code must be updated to use dma_sg_len() and the dma entries number returned from dma_map_sg() call.
MMC drivers seem to be aware of the coalescing of SG entries, as they are already using sg_dma_len().
Let me test and update the patch.
-- nvpublic
It looks that You have simplified arm_iommu_map_sg() function too much. The main advantage of the iommu is to map scattered memory pages into contiguous dma address space. DMA-mapping is allowed to merge consecutive entries in the scatter list if hardware supports that. http://article.gmane.org/gmane.linux.kernel/1128416
I would update arm_iommu_map_sg() back to coalesce the sg list.
MMC drivers seem to be aware of coalescing the SG entries together as they are using dma_sg_len().
I have updated arm_iommu_map_sg() back to coalesce the sg list and fixed the issues with it. During testing, I found out that the MMC host driver doesn't support buffers bigger than 64K. To get the device working, I had to break the coalescing of sg entries when dma_length was about to go beyond 64KB. It looks like the MMC host driver (sdhci.c) needs to be fixed to handle buffers bigger than 64KB. Should the clients be forced to handle bigger buffers, or is there a better way to handle this kind of issue?
-- nvpublic
Hello,
On Wednesday, October 12, 2011 3:35 AM Krishna Reddy wrote:
It looks that You have simplified arm_iommu_map_sg() function too much. The main advantage of the iommu is to map scattered memory pages into contiguous dma address space. DMA-mapping is allowed to merge consecutive entries in the scatter list if hardware supports that. http://article.gmane.org/gmane.linux.kernel/1128416
I would update arm_iommu_map_sg() back to coalesce the sg list.
MMC drivers seem to be aware of coalescing the SG entries together as they are using
dma_sg_len().
I have updated the arm_iommu_map_sg() back to coalesce and fixed the issues with it. During testing, I found out that mmc host driver doesn't support buffers bigger than 64K. To get the device working, I had to break the sg entries coalesce when dma_length is about to go beyond 64KB. Looks like Mmc host driver(sdhci.c) need to be fixed to handle buffers bigger than 64KB. Should the clients be forced to handle bigger buffers or is there any better way to handle these kind of issues?
There is a struct device_dma_parameters *dma_parms member in struct device. You can use it to specify the maximum segment size for the dma_map_sg() function. This will of course complicate this function even more...
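The driver (or the bus setup code) would then set the limit once, e.g. like this (assuming dev->dma_parms has already been allocated by the platform/bus code):

	/* do not let dma_map_sg() build segments above 64 KiB */
	err = dma_set_max_seg_size(dev, SZ_64K);
	if (err)
		return err;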
Best regards
It looks that You have simplified arm_iommu_map_sg() function too much. The main advantage of the iommu is to map scattered memory pages into contiguous dma address space. DMA-mapping is allowed to merge consecutive entries in the scatter list if hardware supports that. http://article.gmane.org/gmane.linux.kernel/1128416
I would update arm_iommu_map_sg() back to coalesce the sg list.
MMC drivers seem to be aware of coalescing the SG entries together as they are using
dma_sg_len().
I have updated the arm_iommu_map_sg() back to coalesce and fixed the issues with it. During testing, I found out that mmc host driver doesn't support buffers bigger than 64K. To get the device working, I had to break the sg entries coalesce when dma_length is about to go beyond 64KB. Looks like Mmc host driver(sdhci.c) need to be fixed to handle buffers bigger than 64KB. Should the clients be forced to handle bigger buffers or is there any better way to handle these kind of issues?
There is struct device_dma_parameters *dma_parms member of struct device. You can specify maximum segment size for the dma_map_sg function. This will of course complicate this function even more...
dma_get_max_seg_size() seems to take care of this issue already. It returns the default max_seg_size of 64K unless the device has defined its own size.
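The merge loop in arm_iommu_map_sg() could then simply refuse to coalesce past that limit; an illustrative helper (not the final code, the name is made up):

	/* can scatterlist entry 's' be merged into the chunk built so far? */
	static bool arm_iommu_can_merge(struct device *dev, size_t chunk_size,
					struct scatterlist *s)
	{
		if (s->offset || (chunk_size & (PAGE_SIZE - 1)))
			return false;
		if (chunk_size + s->length > dma_get_max_seg_size(dev))
			return false;
		return true;
	}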
Best regards
Here is patch v2, which has updates/fixes to the DMA IOMMU code. With these changes, the NVIDIA device is able to boot with all its platform drivers as DMA IOMMU clients.
Here is the overview of changes.
1. Converted the mutex to a spinlock to handle atomic context calls and used the spinlock in the necessary places.
2. Implemented arm_iommu_map_page and arm_iommu_unmap_page, which are used by the MMC host stack.
3. Separated the creation of dma_iommu_mapping from arm_iommu_attach_device in order to allow sharing a mapping between devices.
4. Fixed various bugs identified in the DMA IOMMU code during testing.
[PATCH] ARM: dma-mapping: Add iommu map_page/unmap_page and fix issues.
Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 arch/arm/include/asm/dma-iommu.h |   14 ++-
 arch/arm/mm/dma-mapping.c        |  229 +++++++++++++++++++++++++-------------
 2 files changed, 161 insertions(+), 82 deletions(-)
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h index 0b2677e..5f4e37f 100644 --- a/arch/arm/include/asm/dma-iommu.h +++ b/arch/arm/include/asm/dma-iommu.h @@ -7,6 +7,8 @@ #include <linux/scatterlist.h> #include <linux/dma-debug.h> #include <linux/kmemcheck.h> +#include <linux/spinlock_types.h> +#include <linux/kref.h>
#include <asm/memory.h>
@@ -19,11 +21,17 @@ struct dma_iommu_mapping { unsigned int order; dma_addr_t base;
- struct mutex lock; + spinlock_t lock; + struct kref kref; };
-int arm_iommu_attach_device(struct device *dev, dma_addr_t base, - dma_addr_t size, int order); +struct dma_iommu_mapping *arm_iommu_create_mapping(dma_addr_t base, + size_t size, int order); + +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping); + +int arm_iommu_attach_device(struct device *dev, + struct dma_iommu_mapping *mapping);
#endif /* __KERNEL__ */ #endif diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 020bde1..721b7c0 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -739,32 +739,42 @@ fs_initcall(dma_debug_do_init);
/* IOMMU */
-static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, size_t size) +static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping, + size_t size) { - unsigned int order = get_order(size); unsigned int align = 0; unsigned int count, start; + unsigned long flags;
- if (order > mapping->order) - align = (1 << (order - mapping->order)) - 1; + count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) + + (1 << mapping->order) - 1) >> mapping->order;
- count = ((size >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping->order; - - start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, 0, count, align); - if (start > mapping->bits) + spin_lock_irqsave(&mapping->lock, flags); + start = bitmap_find_next_zero_area(mapping->bitmap, mapping->bits, + 0, count, align); + if (start > mapping->bits) { + spin_unlock_irqrestore(&mapping->lock, flags); return ~0; + }
bitmap_set(mapping->bitmap, start, count); + spin_unlock_irqrestore(&mapping->lock, flags);
return mapping->base + (start << (mapping->order + PAGE_SHIFT)); }
-static inline void __free_iova(struct dma_iommu_mapping *mapping, dma_addr_t addr, size_t size) +static inline void __free_iova(struct dma_iommu_mapping *mapping, + dma_addr_t addr, size_t size) { - unsigned int start = (addr - mapping->base) >> (mapping->order + PAGE_SHIFT); - unsigned int count = ((size >> PAGE_SHIFT) + (1 << mapping->order) - 1) >> mapping->order; + unsigned int start = (addr - mapping->base) >> + (mapping->order + PAGE_SHIFT); + unsigned int count = ((PAGE_ALIGN(size) >> PAGE_SHIFT) + + (1 << mapping->order) - 1) >> mapping->order; + unsigned long flags;
+ spin_lock_irqsave(&mapping->lock, flags); bitmap_clear(mapping->bitmap, start, count); + spin_unlock_irqrestore(&mapping->lock, flags); }
static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp) @@ -867,7 +877,7 @@ __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot) static dma_addr_t __iommu_create_mapping(struct device *dev, struct page **pages, size_t size) { struct dma_iommu_mapping *mapping = dev->archdata.mapping; - unsigned int count = size >> PAGE_SHIFT; + unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; dma_addr_t dma_addr, iova; int i, ret = ~0;
@@ -892,13 +902,12 @@ fail: static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size) { struct dma_iommu_mapping *mapping = dev->archdata.mapping; - unsigned int count = size >> PAGE_SHIFT; + unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; int i;
- for (i=0; i<count; i++) { - iommu_unmap(mapping->domain, iova, 0); - iova += PAGE_SIZE; - } + iova = iova & PAGE_MASK; + for (i = 0; i < count; i++) + iommu_unmap(mapping->domain, iova + (i << PAGE_SHIFT), 0); __free_iova(mapping, iova, size); return 0; } @@ -906,7 +915,6 @@ static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t si static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs) { - struct dma_iommu_mapping *mapping = dev->archdata.mapping; pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel); struct page **pages; void *addr = NULL; @@ -914,11 +922,9 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, *handle = ~0; size = PAGE_ALIGN(size);
- mutex_lock(&mapping->lock); - pages = __iommu_alloc_buffer(dev, size, gfp); if (!pages) - goto err_unlock; + goto exit;
*handle = __iommu_create_mapping(dev, pages, size); if (*handle == ~0) @@ -928,15 +934,13 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, if (!addr) goto err_mapping;
- mutex_unlock(&mapping->lock); return addr;
err_mapping: __iommu_remove_mapping(dev, *handle, size); err_buffer: __iommu_free_buffer(dev, pages, size); -err_unlock: - mutex_unlock(&mapping->lock); +exit: return NULL; }
@@ -944,11 +948,9 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void *cpu_addr, dma_addr_t dma_addr, size_t size, struct dma_attrs *attrs) { - unsigned long user_size; struct arm_vmregion *c;
vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); - user_size = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); if (c) { @@ -981,11 +983,9 @@ static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, struct dma_attrs *attrs) { - struct dma_iommu_mapping *mapping = dev->archdata.mapping; struct arm_vmregion *c; size = PAGE_ALIGN(size);
- mutex_lock(&mapping->lock); c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr); if (c) { struct page **pages = c->priv; @@ -993,7 +993,6 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, __iommu_remove_mapping(dev, handle, size); __iommu_free_buffer(dev, pages, size); } - mutex_unlock(&mapping->lock); }
static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, @@ -1001,80 +1000,118 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, enum dma_data_direction dir) { struct dma_iommu_mapping *mapping = dev->archdata.mapping; - dma_addr_t dma_addr, iova; + dma_addr_t iova; int ret = 0; + unsigned int count, i; + struct scatterlist *s;
+ size = PAGE_ALIGN(size); *handle = ~0; - mutex_lock(&mapping->lock);
- iova = dma_addr = __alloc_iova(mapping, size); - if (dma_addr == 0) - goto fail; + iova = __alloc_iova(mapping, size); + if (iova == 0) + return -ENOMEM;
- while (size) { - unsigned int phys = page_to_phys(sg_page(sg)); - unsigned int len = sg->offset + sg->length; + for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) { + phys_addr_t phys = page_to_phys(sg_page(s)); + unsigned int len = PAGE_ALIGN(s->offset + s->length);
if (!arch_is_coherent()) - __dma_page_cpu_to_dev(sg_page(sg), sg->offset, sg->length, dir); + __dma_page_cpu_to_dev(sg_page(s), s->offset, + s->length, dir);
- while (len) { - ret = iommu_map(mapping->domain, iova, phys, 0, 0); + for (i = 0; i < (len >> PAGE_SHIFT); i++) { + ret = iommu_map(mapping->domain, + iova + (count << PAGE_SHIFT), + phys + (i << PAGE_SHIFT), 0, 0); if (ret < 0) goto fail; - iova += PAGE_SIZE; - len -= PAGE_SIZE; - size -= PAGE_SIZE; + count++; } - sg = sg_next(sg); } - - *handle = dma_addr; - mutex_unlock(&mapping->lock); + *handle = iova;
return 0; fail: + while (count--) + iommu_unmap(mapping->domain, iova + count * PAGE_SIZE, 0); __iommu_remove_mapping(dev, iova, size); - mutex_unlock(&mapping->lock); return ret; }
+static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + dma_addr_t dma_addr; + + if (!arch_is_coherent()) + __dma_page_cpu_to_dev(page, offset, size, dir); + + BUG_ON((offset+size) > PAGE_SIZE); + dma_addr = __iommu_create_mapping(dev, &page, PAGE_SIZE); + return dma_addr + offset; +} + +static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + struct dma_iommu_mapping *mapping = dev->archdata.mapping; + phys_addr_t phys; + + phys = iommu_iova_to_phys(mapping->domain, handle); + __iommu_remove_mapping(dev, handle, size); + if (!arch_is_coherent()) + __dma_page_dev_to_cpu(pfn_to_page(__phys_to_pfn(phys)), + phys & ~PAGE_MASK, size, dir); +} + int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir, struct dma_attrs *attrs) { struct scatterlist *s = sg, *dma = sg, *start = sg; - int i, count = 1; + int i, count = 0; unsigned int offset = s->offset; unsigned int size = s->offset + s->length;
+ s->dma_address = ~0; + s->dma_length = 0; + for (i = 1; i < nents; i++) { + s = sg_next(s); s->dma_address = ~0; s->dma_length = 0;
- s = sg_next(s); - - if (s->offset || (size & (PAGE_SIZE - 1))) { - if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0) + if (s->offset || size & ~PAGE_MASK || + size + s->length > dma_get_max_seg_size(dev)) { + if (__map_sg_chunk(dev, start, size, + &dma->dma_address, dir) < 0) goto bad_mapping;
dma->dma_address += offset; - dma->dma_length = size; + dma->dma_length = size - offset;
size = offset = s->offset; start = s; dma = sg_next(dma); - count += 1; + count++; } - size += sg->length; + size += s->length; } - __map_sg_chunk(dev, start, size, &dma->dma_address, dir); - d->dma_address += offset;
- return count; + if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir) < 0) + goto bad_mapping; + dma->dma_address += offset; + dma->dma_length = size - offset; + + return ++count;
bad_mapping: - for_each_sg(sg, s, count-1, i) - __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s)); + for_each_sg(sg, s, count, i) { + __iommu_remove_mapping(dev, sg_dma_address(s), + PAGE_ALIGN(sg_dma_len(s))); + } return 0; }
@@ -1086,9 +1123,11 @@ void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
for_each_sg(sg, s, nents, i) { if (sg_dma_len(s)) - __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s)); + __iommu_remove_mapping(dev, sg_dma_address(s), + sg_dma_len(s)); if (!arch_is_coherent()) - __dma_page_dev_to_cpu(sg_page(sg), sg->offset, sg->length, dir); + __dma_page_dev_to_cpu(sg_page(s), s->offset, + s->length, dir); } }
@@ -1108,7 +1147,8 @@ void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
for_each_sg(sg, s, nents, i) if (!arch_is_coherent()) - __dma_page_dev_to_cpu(sg_page(sg), sg->offset, sg->length, dir); + __dma_page_dev_to_cpu(sg_page(s), s->offset, + s->length, dir); }
/** @@ -1126,20 +1166,24 @@ void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
for_each_sg(sg, s, nents, i) if (!arch_is_coherent()) - __dma_page_cpu_to_dev(sg_page(sg), sg->offset, sg->length, dir); + __dma_page_cpu_to_dev(sg_page(s), s->offset, + s->length, dir); }
struct dma_map_ops iommu_ops = { .alloc = arm_iommu_alloc_attrs, .free = arm_iommu_free_attrs, .mmap = arm_iommu_mmap_attrs, + .map_page = arm_iommu_map_page, + .unmap_page = arm_iommu_unmap_page, .map_sg = arm_iommu_map_sg, .unmap_sg = arm_iommu_unmap_sg, .sync_sg_for_cpu = arm_iommu_sync_sg_for_cpu, .sync_sg_for_device = arm_iommu_sync_sg_for_device, };
-int arm_iommu_attach_device(struct device *dev, dma_addr_t base, size_t size, int order) +struct dma_iommu_mapping *arm_iommu_create_mapping(dma_addr_t base, + size_t size, int order) { unsigned int count = (size >> PAGE_SHIFT) - order; unsigned int bitmap_size = BITS_TO_LONGS(count) * sizeof(long); @@ -1157,30 +1201,57 @@ int arm_iommu_attach_device(struct device *dev, dma_addr_t base, size_t size, in mapping->base = base; mapping->bits = bitmap_size; mapping->order = order; - mutex_init(&mapping->lock); + spin_lock_init(&mapping->lock);
mapping->domain = iommu_domain_alloc(); if (!mapping->domain) goto err3;
- err = iommu_attach_device(mapping->domain, dev); - if (err != 0) - goto err4; - - dev->archdata.mapping = mapping; - set_dma_ops(dev, &iommu_ops); - - printk(KERN_INFO "Attached IOMMU controller to %s device.\n", dev_name(dev)); - return 0; + kref_init(&mapping->kref); + return mapping;
-err4: - iommu_domain_free(mapping->domain); err3: kfree(mapping->bitmap); err2: kfree(mapping); err: - return -ENOMEM; + return ERR_PTR(err); +} +EXPORT_SYMBOL(arm_iommu_create_mapping); + +static void release_iommu_mapping(struct kref *kref) +{ + struct dma_iommu_mapping *mapping = + container_of(kref, struct dma_iommu_mapping, kref); + + iommu_domain_free(mapping->domain); + kfree(mapping->bitmap); + kfree(mapping); +} + +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping) +{ + if (mapping) + kref_put(&mapping->kref, release_iommu_mapping); +} +EXPORT_SYMBOL(arm_iommu_release_mapping); + +int arm_iommu_attach_device(struct device *dev, + struct dma_iommu_mapping *mapping) +{ + int err; + + err = iommu_attach_device(mapping->domain, dev); + if (err) + return err; + + kref_get(&mapping->kref); + dev->archdata.mapping = mapping; + set_dma_ops(dev, &iommu_ops); + + printk(KERN_INFO "*****Attached IOMMU controller to %s device.\n", + dev_name(dev)); + return 0; } EXPORT_SYMBOL(arm_iommu_attach_device);
-- 1.7.0.4
Hello Krishna,
On Thursday, October 13, 2011 2:18 AM, you wrote:
Here is patch v2, which has updates/fixes to the DMA IOMMU code. With these changes, the NVIDIA device is able to boot with all of its platform drivers as DMA IOMMU clients.
Here is an overview of the changes:
1. Converted the mutex to a spinlock to handle atomic-context calls and used the spinlock in the necessary places.
2. Implemented arm_iommu_map_page and arm_iommu_unmap_page, which are used by the MMC host stack.
3. Separated the creation of dma_iommu_mapping from arm_iommu_attach_device in order to share the mapping.
4. Fixed various bugs identified in the DMA IOMMU code during testing.
The code looks much better now. The only problem is that your mailer has changed all tabs into spaces, making the patch really hard to apply. Could you resend it correctly?
Best regards
From: Andrzej Pietrasiewicz andrzej.p@samsung.com
This patch performs a complete rewrite of sysmmu driver for Samsung platform: - simplified the resource management: no more single platform device with 32 resources is needed, better fits into linux driver model, each sysmmu instance has it's own resource definition - the new version uses kernel wide common iommu api defined in include/iommu.h - cleaned support for sysmmu clocks - added support for automatic registration together with client device - added support for newly introduced dma-mapping interface
Signed-off-by: Andrzej Pietrasiewicz andrzej.p@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com [m.szyprowski: rebased onto v3.1-rc4, added automatic IOMMU device registration and support for proof-of-concept ARM DMA-mapping IOMMU mapper, added support for runtime_pm] Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com --- arch/arm/mach-exynos4/Kconfig | 5 - arch/arm/mach-exynos4/Makefile | 2 +- arch/arm/mach-exynos4/clock.c | 47 +- arch/arm/mach-exynos4/dev-sysmmu.c | 609 +++++++++++------ arch/arm/mach-exynos4/include/mach/irqs.h | 34 +- arch/arm/mach-exynos4/include/mach/sysmmu.h | 46 -- arch/arm/plat-s5p/Kconfig | 21 +- arch/arm/plat-s5p/include/plat/sysmmu.h | 119 ++-- arch/arm/plat-s5p/sysmmu.c | 855 ++++++++++++++++++------ arch/arm/plat-samsung/include/plat/devs.h | 1 - arch/arm/plat-samsung/include/plat/fimc-core.h | 25 + 11 files changed, 1197 insertions(+), 567 deletions(-) delete mode 100644 arch/arm/mach-exynos4/include/mach/sysmmu.h
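For orientation, here is a minimal sketch (not part of the patch) of how a driver could exercise the rewritten SYSMMU through the generic IOMMU API it now implements via s5p_sysmmu_ops. It assumes the client device has already had its SYSMMU controller assigned with s5p_sysmmu_assign_dev() at registration time; the IO virtual address and page order are illustrative only.

/* Sketch only: map one page through the generic IOMMU API of this era. */
#include <linux/iommu.h>

static int example_map_one_page(struct device *client, phys_addr_t phys)
{
	struct iommu_domain *domain;
	int err;

	domain = iommu_domain_alloc();		/* ends up in s5p_sysmmu_domain_init() */
	if (!domain)
		return -ENOMEM;

	err = iommu_attach_device(domain, client);	/* s5p_sysmmu_attach_dev() */
	if (err)
		goto free_domain;

	/* map a single 4KiB page (gfp_order 0) at IO virtual address 0x20000000 */
	err = iommu_map(domain, 0x20000000, phys, 0, 0);
	if (err)
		goto detach;

	/* ... the device can now DMA to/from IO virtual address 0x20000000 ... */

	iommu_unmap(domain, 0x20000000, 0);
detach:
	iommu_detach_device(domain, client);	/* s5p_sysmmu_detach_dev() */
free_domain:
	iommu_domain_free(domain);
	return err;
}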
diff --git a/arch/arm/mach-exynos4/Kconfig b/arch/arm/mach-exynos4/Kconfig index 0c77ab9..3b3029b 100644 --- a/arch/arm/mach-exynos4/Kconfig +++ b/arch/arm/mach-exynos4/Kconfig @@ -36,11 +36,6 @@ config EXYNOS4_DEV_PD help Compile in platform device definitions for Power Domain
-config EXYNOS4_DEV_SYSMMU - bool - help - Common setup code for SYSTEM MMU in EXYNOS4 - config EXYNOS4_DEV_DWMCI bool help diff --git a/arch/arm/mach-exynos4/Makefile b/arch/arm/mach-exynos4/Makefile index b7fe1d7..0ab09da 100644 --- a/arch/arm/mach-exynos4/Makefile +++ b/arch/arm/mach-exynos4/Makefile @@ -36,7 +36,7 @@ obj-$(CONFIG_MACH_NURI) += mach-nuri.o obj-y += dev-audio.o obj-$(CONFIG_EXYNOS4_DEV_AHCI) += dev-ahci.o obj-$(CONFIG_EXYNOS4_DEV_PD) += dev-pd.o -obj-$(CONFIG_EXYNOS4_DEV_SYSMMU) += dev-sysmmu.o +obj-$(CONFIG_S5P_SYSTEM_MMU) += dev-sysmmu.o obj-$(CONFIG_EXYNOS4_DEV_DWMCI) += dev-dwmci.o
obj-$(CONFIG_EXYNOS4_SETUP_FIMC) += setup-fimc.o diff --git a/arch/arm/mach-exynos4/clock.c b/arch/arm/mach-exynos4/clock.c index 851dea0..2ee1143 100644 --- a/arch/arm/mach-exynos4/clock.c +++ b/arch/arm/mach-exynos4/clock.c @@ -23,7 +23,6 @@
#include <mach/map.h> #include <mach/regs-clock.h> -#include <mach/sysmmu.h>
static struct clk clk_sclk_hdmi27m = { .name = "sclk_hdmi27m", @@ -581,59 +580,77 @@ static struct clk init_clocks_off[] = { .enable = exynos4_clk_ip_peril_ctrl, .ctrlbit = (1 << 13), }, { - .name = "SYSMMU_MDMA", + .name = "sysmmu", + .devname = "s5p-sysmmu.0", .enable = exynos4_clk_ip_image_ctrl, .ctrlbit = (1 << 5), }, { - .name = "SYSMMU_FIMC0", + .name = "sysmmu", + .devname = "s5p-sysmmu.2", .enable = exynos4_clk_ip_cam_ctrl, + .parent = &init_clocks_off[3], .ctrlbit = (1 << 7), }, { - .name = "SYSMMU_FIMC1", + .name = "sysmmu", + .devname = "s5p-sysmmu.3", .enable = exynos4_clk_ip_cam_ctrl, + .parent = &init_clocks_off[4], .ctrlbit = (1 << 8), }, { - .name = "SYSMMU_FIMC2", + .name = "sysmmu", + .devname = "s5p-sysmmu.4", .enable = exynos4_clk_ip_cam_ctrl, + .parent = &init_clocks_off[5], .ctrlbit = (1 << 9), }, { - .name = "SYSMMU_FIMC3", + .name = "sysmmu", + .devname = "s5p-sysmmu.5", .enable = exynos4_clk_ip_cam_ctrl, + .parent = &init_clocks_off[6], .ctrlbit = (1 << 10), }, { - .name = "SYSMMU_JPEG", + .name = "sysmmu", + .devname = "s5p-sysmmu.6", .enable = exynos4_clk_ip_cam_ctrl, .ctrlbit = (1 << 11), }, { - .name = "SYSMMU_FIMD0", + .name = "sysmmu", + .devname = "s5p-sysmmu.7", .enable = exynos4_clk_ip_lcd0_ctrl, .ctrlbit = (1 << 4), }, { - .name = "SYSMMU_FIMD1", + .name = "sysmmu", + .devname = "s5p-sysmmu.8", .enable = exynos4_clk_ip_lcd1_ctrl, .ctrlbit = (1 << 4), }, { - .name = "SYSMMU_PCIe", + .name = "sysmmu", + .devname = "s5p-sysmmu.9", .enable = exynos4_clk_ip_fsys_ctrl, .ctrlbit = (1 << 18), }, { - .name = "SYSMMU_G2D", + .name = "sysmmu", + .devname = "s5p-sysmmu.10", .enable = exynos4_clk_ip_image_ctrl, .ctrlbit = (1 << 3), }, { - .name = "SYSMMU_ROTATOR", + .name = "sysmmu", + .devname = "s5p-sysmmu.11", .enable = exynos4_clk_ip_image_ctrl, .ctrlbit = (1 << 4), }, { - .name = "SYSMMU_TV", + .name = "sysmmu", + .devname = "s5p-sysmmu.13", .enable = exynos4_clk_ip_tv_ctrl, .ctrlbit = (1 << 4), }, { - .name = "SYSMMU_MFC_L", + .name = "sysmmu", + .devname = "s5p-sysmmu.14", .enable = exynos4_clk_ip_mfc_ctrl, .ctrlbit = (1 << 1), }, { - .name = "SYSMMU_MFC_R", + .name = "sysmmu", + .devname = "s5p-sysmmu.15", .enable = exynos4_clk_ip_mfc_ctrl, .ctrlbit = (1 << 2), } diff --git a/arch/arm/mach-exynos4/dev-sysmmu.c b/arch/arm/mach-exynos4/dev-sysmmu.c index 3b7cae0..b49c922 100644 --- a/arch/arm/mach-exynos4/dev-sysmmu.c +++ b/arch/arm/mach-exynos4/dev-sysmmu.c @@ -13,220 +13,427 @@ #include <linux/platform_device.h> #include <linux/dma-mapping.h>
+#include <asm/dma-iommu.h> + #include <mach/map.h> #include <mach/irqs.h> -#include <mach/sysmmu.h> -#include <plat/s5p-clock.h> - -/* These names must be equal to the clock names in mach-exynos4/clock.c */ -const char *sysmmu_ips_name[EXYNOS4_SYSMMU_TOTAL_IPNUM] = { - "SYSMMU_MDMA" , - "SYSMMU_SSS" , - "SYSMMU_FIMC0" , - "SYSMMU_FIMC1" , - "SYSMMU_FIMC2" , - "SYSMMU_FIMC3" , - "SYSMMU_JPEG" , - "SYSMMU_FIMD0" , - "SYSMMU_FIMD1" , - "SYSMMU_PCIe" , - "SYSMMU_G2D" , - "SYSMMU_ROTATOR", - "SYSMMU_MDMA2" , - "SYSMMU_TV" , - "SYSMMU_MFC_L" , - "SYSMMU_MFC_R" , -};
-static struct resource exynos4_sysmmu_resource[] = { - [0] = { - .start = EXYNOS4_PA_SYSMMU_MDMA, - .end = EXYNOS4_PA_SYSMMU_MDMA + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [1] = { - .start = IRQ_SYSMMU_MDMA0_0, - .end = IRQ_SYSMMU_MDMA0_0, - .flags = IORESOURCE_IRQ, - }, - [2] = { - .start = EXYNOS4_PA_SYSMMU_SSS, - .end = EXYNOS4_PA_SYSMMU_SSS + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [3] = { - .start = IRQ_SYSMMU_SSS_0, - .end = IRQ_SYSMMU_SSS_0, - .flags = IORESOURCE_IRQ, - }, - [4] = { - .start = EXYNOS4_PA_SYSMMU_FIMC0, - .end = EXYNOS4_PA_SYSMMU_FIMC0 + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [5] = { - .start = IRQ_SYSMMU_FIMC0_0, - .end = IRQ_SYSMMU_FIMC0_0, - .flags = IORESOURCE_IRQ, - }, - [6] = { - .start = EXYNOS4_PA_SYSMMU_FIMC1, - .end = EXYNOS4_PA_SYSMMU_FIMC1 + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [7] = { - .start = IRQ_SYSMMU_FIMC1_0, - .end = IRQ_SYSMMU_FIMC1_0, - .flags = IORESOURCE_IRQ, - }, - [8] = { - .start = EXYNOS4_PA_SYSMMU_FIMC2, - .end = EXYNOS4_PA_SYSMMU_FIMC2 + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [9] = { - .start = IRQ_SYSMMU_FIMC2_0, - .end = IRQ_SYSMMU_FIMC2_0, - .flags = IORESOURCE_IRQ, - }, - [10] = { - .start = EXYNOS4_PA_SYSMMU_FIMC3, - .end = EXYNOS4_PA_SYSMMU_FIMC3 + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [11] = { - .start = IRQ_SYSMMU_FIMC3_0, - .end = IRQ_SYSMMU_FIMC3_0, - .flags = IORESOURCE_IRQ, - }, - [12] = { - .start = EXYNOS4_PA_SYSMMU_JPEG, - .end = EXYNOS4_PA_SYSMMU_JPEG + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [13] = { - .start = IRQ_SYSMMU_JPEG_0, - .end = IRQ_SYSMMU_JPEG_0, - .flags = IORESOURCE_IRQ, - }, - [14] = { - .start = EXYNOS4_PA_SYSMMU_FIMD0, - .end = EXYNOS4_PA_SYSMMU_FIMD0 + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [15] = { - .start = IRQ_SYSMMU_LCD0_M0_0, - .end = IRQ_SYSMMU_LCD0_M0_0, - .flags = IORESOURCE_IRQ, - }, - [16] = { - .start = EXYNOS4_PA_SYSMMU_FIMD1, - .end = EXYNOS4_PA_SYSMMU_FIMD1 + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [17] = { - .start = IRQ_SYSMMU_LCD1_M1_0, - .end = IRQ_SYSMMU_LCD1_M1_0, - .flags = IORESOURCE_IRQ, - }, - [18] = { - .start = EXYNOS4_PA_SYSMMU_PCIe, - .end = EXYNOS4_PA_SYSMMU_PCIe + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [19] = { - .start = IRQ_SYSMMU_PCIE_0, - .end = IRQ_SYSMMU_PCIE_0, - .flags = IORESOURCE_IRQ, - }, - [20] = { - .start = EXYNOS4_PA_SYSMMU_G2D, - .end = EXYNOS4_PA_SYSMMU_G2D + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [21] = { - .start = IRQ_SYSMMU_2D_0, - .end = IRQ_SYSMMU_2D_0, - .flags = IORESOURCE_IRQ, - }, - [22] = { - .start = EXYNOS4_PA_SYSMMU_ROTATOR, - .end = EXYNOS4_PA_SYSMMU_ROTATOR + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [23] = { - .start = IRQ_SYSMMU_ROTATOR_0, - .end = IRQ_SYSMMU_ROTATOR_0, - .flags = IORESOURCE_IRQ, - }, - [24] = { - .start = EXYNOS4_PA_SYSMMU_MDMA2, - .end = EXYNOS4_PA_SYSMMU_MDMA2 + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [25] = { - .start = IRQ_SYSMMU_MDMA1_0, - .end = IRQ_SYSMMU_MDMA1_0, - .flags = IORESOURCE_IRQ, - }, - [26] = { - .start = EXYNOS4_PA_SYSMMU_TV, - .end = EXYNOS4_PA_SYSMMU_TV + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [27] = { - .start = IRQ_SYSMMU_TV_M0_0, - .end = IRQ_SYSMMU_TV_M0_0, - .flags = IORESOURCE_IRQ, - }, - [28] = { - .start = EXYNOS4_PA_SYSMMU_MFC_L, - .end = EXYNOS4_PA_SYSMMU_MFC_L + SZ_64K - 1, - .flags = IORESOURCE_MEM, - }, - [29] = { - .start = IRQ_SYSMMU_MFC_M0_0, - .end = IRQ_SYSMMU_MFC_M0_0, - .flags = IORESOURCE_IRQ, - }, - [30] = { - .start = EXYNOS4_PA_SYSMMU_MFC_R, - .end = EXYNOS4_PA_SYSMMU_MFC_R + SZ_64K - 
1, - .flags = IORESOURCE_MEM, - }, - [31] = { - .start = IRQ_SYSMMU_MFC_M1_0, - .end = IRQ_SYSMMU_MFC_M1_0, - .flags = IORESOURCE_IRQ, +#include <plat/devs.h> +#include <plat/cpu.h> +#include <plat/sysmmu.h> + +#include <plat/fimc-core.h> + +#define EXYNOS4_NUM_RESOURCES (2) + +static struct resource exynos4_sysmmu_resource[][EXYNOS4_NUM_RESOURCES] = { + [S5P_SYSMMU_MDMA] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_MDMA, + .end = EXYNOS4_PA_SYSMMU_MDMA + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_MDMA0, + .end = IRQ_SYSMMU_MDMA0, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_SSS] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_SSS, + .end = EXYNOS4_PA_SYSMMU_SSS + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_SSS, + .end = IRQ_SYSMMU_SSS, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_FIMC0] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_FIMC0, + .end = EXYNOS4_PA_SYSMMU_FIMC0 + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_FIMC0, + .end = IRQ_SYSMMU_FIMC0, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_FIMC1] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_FIMC1, + .end = EXYNOS4_PA_SYSMMU_FIMC1 + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_FIMC1, + .end = IRQ_SYSMMU_FIMC1, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_FIMC2] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_FIMC2, + .end = EXYNOS4_PA_SYSMMU_FIMC2 + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_FIMC2, + .end = IRQ_SYSMMU_FIMC2, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_FIMC3] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_FIMC3, + .end = EXYNOS4_PA_SYSMMU_FIMC3 + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_FIMC3, + .end = IRQ_SYSMMU_FIMC3, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_JPEG] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_JPEG, + .end = EXYNOS4_PA_SYSMMU_JPEG + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_JPEG, + .end = IRQ_SYSMMU_JPEG, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_FIMD0] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_FIMD0, + .end = EXYNOS4_PA_SYSMMU_FIMD0 + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_FIMD0, + .end = IRQ_SYSMMU_FIMD0, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_FIMD1] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_FIMD1, + .end = EXYNOS4_PA_SYSMMU_FIMD1 + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_FIMD1, + .end = IRQ_SYSMMU_FIMD1, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_PCIe] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_PCIe, + .end = EXYNOS4_PA_SYSMMU_PCIe + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_PCIE, + .end = IRQ_SYSMMU_PCIE, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_G2D] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_G2D, + .end = EXYNOS4_PA_SYSMMU_G2D + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_G2D, + .end = IRQ_SYSMMU_G2D, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_ROTATOR] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_ROTATOR, + .end = EXYNOS4_PA_SYSMMU_ROTATOR + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_ROTATOR, + .end = IRQ_SYSMMU_ROTATOR, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_MDMA2] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_MDMA2, + .end = EXYNOS4_PA_SYSMMU_MDMA2 + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_MDMA1, + .end = 
IRQ_SYSMMU_MDMA1, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_TV] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_TV, + .end = EXYNOS4_PA_SYSMMU_TV + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_TV, + .end = IRQ_SYSMMU_TV, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_MFC_L] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_MFC_L, + .end = EXYNOS4_PA_SYSMMU_MFC_L + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_MFC_L, + .end = IRQ_SYSMMU_MFC_L, + .flags = IORESOURCE_IRQ, + }, + }, + [S5P_SYSMMU_MFC_R] = { + [0] = { + .start = EXYNOS4_PA_SYSMMU_MFC_R, + .end = EXYNOS4_PA_SYSMMU_MFC_R + SZ_4K - 1, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = IRQ_SYSMMU_MFC_R, + .end = IRQ_SYSMMU_MFC_R, + .flags = IORESOURCE_IRQ, + }, }, };
-struct platform_device exynos4_device_sysmmu = { - .name = "s5p-sysmmu", - .id = 32, - .num_resources = ARRAY_SIZE(exynos4_sysmmu_resource), - .resource = exynos4_sysmmu_resource, +static u64 exynos4_sysmmu_dma_mask = DMA_BIT_MASK(32); + +struct platform_device exynos4_device_sysmmu[] = { + [S5P_SYSMMU_MDMA] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_MDMA, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_MDMA], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_SSS] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_SSS, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_SSS], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_FIMC0] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_FIMC0, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_FIMC0], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_FIMC1] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_FIMC1, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_FIMC1], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_FIMC2] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_FIMC2, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_FIMC2], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_FIMC3] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_FIMC3, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_FIMC3], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_JPEG] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_JPEG, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_JPEG], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_FIMD0] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_FIMD0, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_FIMD0], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_FIMD1] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_FIMD1, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_FIMD1], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_PCIe] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_PCIe, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_PCIe], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_G2D] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_G2D, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_G2D], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_ROTATOR] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_ROTATOR, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_ROTATOR], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + 
.coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_MDMA2] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_MDMA2, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_MDMA2], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_TV] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_TV, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_TV], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_MFC_L] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_MFC_L, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_MFC_L], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, + [S5P_SYSMMU_MFC_R] = { + .name = "s5p-sysmmu", + .id = S5P_SYSMMU_MFC_R, + .num_resources = EXYNOS4_NUM_RESOURCES, + .resource = exynos4_sysmmu_resource[S5P_SYSMMU_MFC_R], + .dev = { + .dma_mask = &exynos4_sysmmu_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }, }; -EXPORT_SYMBOL(exynos4_device_sysmmu);
-static struct clk *sysmmu_clk[S5P_SYSMMU_TOTAL_IPNUM]; -void sysmmu_clk_init(struct device *dev, sysmmu_ips ips) +static void __init s5p_register_sysmmu(struct platform_device *pdev, + struct device *client) { - sysmmu_clk[ips] = clk_get(dev, sysmmu_ips_name[ips]); - if (IS_ERR(sysmmu_clk[ips])) - sysmmu_clk[ips] = NULL; - else - clk_put(sysmmu_clk[ips]); + if (!client) + return; + if (client->parent) + pdev->dev.parent = client->parent; + client->parent = &pdev->dev; + platform_device_register(pdev); + s5p_sysmmu_assign_dev(client, pdev); }
-void sysmmu_clk_enable(sysmmu_ips ips) +/** + * s5p_sysmmu_core_init + * Register respective SYSMMU controller platform device and assign it to + * client device. + * Must be called before client device is registered by the board code. + */ +static int __init s5p_sysmmu_core_init(void) { - if (sysmmu_clk[ips]) - clk_enable(sysmmu_clk[ips]); + struct platform_device *pdev; + int i; + + for (i=0; i < S5P_MAX_FIMC_NUM; i++) { + pdev = &exynos4_device_sysmmu[S5P_SYSMMU_FIMC0 + i]; + s5p_register_sysmmu(pdev, s3c_fimc_getdevice(i)); + } + + return 0; } +core_initcall(s5p_sysmmu_core_init);
-void sysmmu_clk_disable(sysmmu_ips ips) +/** + * s5p_sysmmu_late_init + * Register all client devices to IOMMU aware DMA-mapping subsystem. + * Must be called after SYSMMU driver is registered in the system. + */ +static int __init s5p_sysmmu_late_init(void) { - if (sysmmu_clk[ips]) - clk_disable(sysmmu_clk[ips]); + int i; + + for (i=0; i < S5P_MAX_FIMC_NUM; i++) { + struct device *client = s3c_fimc_getdevice(i); + if (!client) + continue; + arm_iommu_attach_device(client, 0x20000000, SZ_128M, 4); + } + + return 0; } +arch_initcall(s5p_sysmmu_late_init); diff --git a/arch/arm/mach-exynos4/include/mach/irqs.h b/arch/arm/mach-exynos4/include/mach/irqs.h index 934d2a4..89889a1 100644 --- a/arch/arm/mach-exynos4/include/mach/irqs.h +++ b/arch/arm/mach-exynos4/include/mach/irqs.h @@ -120,23 +120,23 @@ #define COMBINER_GROUP(x) ((x) * MAX_IRQ_IN_COMBINER + IRQ_SPI(128)) #define COMBINER_IRQ(x, y) (COMBINER_GROUP(x) + y)
-#define IRQ_SYSMMU_MDMA0_0 COMBINER_IRQ(4, 0) -#define IRQ_SYSMMU_SSS_0 COMBINER_IRQ(4, 1) -#define IRQ_SYSMMU_FIMC0_0 COMBINER_IRQ(4, 2) -#define IRQ_SYSMMU_FIMC1_0 COMBINER_IRQ(4, 3) -#define IRQ_SYSMMU_FIMC2_0 COMBINER_IRQ(4, 4) -#define IRQ_SYSMMU_FIMC3_0 COMBINER_IRQ(4, 5) -#define IRQ_SYSMMU_JPEG_0 COMBINER_IRQ(4, 6) -#define IRQ_SYSMMU_2D_0 COMBINER_IRQ(4, 7) - -#define IRQ_SYSMMU_ROTATOR_0 COMBINER_IRQ(5, 0) -#define IRQ_SYSMMU_MDMA1_0 COMBINER_IRQ(5, 1) -#define IRQ_SYSMMU_LCD0_M0_0 COMBINER_IRQ(5, 2) -#define IRQ_SYSMMU_LCD1_M1_0 COMBINER_IRQ(5, 3) -#define IRQ_SYSMMU_TV_M0_0 COMBINER_IRQ(5, 4) -#define IRQ_SYSMMU_MFC_M0_0 COMBINER_IRQ(5, 5) -#define IRQ_SYSMMU_MFC_M1_0 COMBINER_IRQ(5, 6) -#define IRQ_SYSMMU_PCIE_0 COMBINER_IRQ(5, 7) +#define IRQ_SYSMMU_MDMA0 COMBINER_IRQ(4, 0) +#define IRQ_SYSMMU_SSS COMBINER_IRQ(4, 1) +#define IRQ_SYSMMU_FIMC0 COMBINER_IRQ(4, 2) +#define IRQ_SYSMMU_FIMC1 COMBINER_IRQ(4, 3) +#define IRQ_SYSMMU_FIMC2 COMBINER_IRQ(4, 4) +#define IRQ_SYSMMU_FIMC3 COMBINER_IRQ(4, 5) +#define IRQ_SYSMMU_JPEG COMBINER_IRQ(4, 6) +#define IRQ_SYSMMU_G2D COMBINER_IRQ(4, 7) + +#define IRQ_SYSMMU_ROTATOR COMBINER_IRQ(5, 0) +#define IRQ_SYSMMU_MDMA1 COMBINER_IRQ(5, 1) +#define IRQ_SYSMMU_FIMD0 COMBINER_IRQ(5, 2) +#define IRQ_SYSMMU_FIMD1 COMBINER_IRQ(5, 3) +#define IRQ_SYSMMU_TV COMBINER_IRQ(5, 4) +#define IRQ_SYSMMU_MFC_L COMBINER_IRQ(5, 5) +#define IRQ_SYSMMU_MFC_R COMBINER_IRQ(5, 6) +#define IRQ_SYSMMU_PCIE COMBINER_IRQ(5, 7)
#define IRQ_FIMD0_FIFO COMBINER_IRQ(11, 0) #define IRQ_FIMD0_VSYNC COMBINER_IRQ(11, 1) diff --git a/arch/arm/mach-exynos4/include/mach/sysmmu.h b/arch/arm/mach-exynos4/include/mach/sysmmu.h deleted file mode 100644 index 6a5fbb5..0000000 --- a/arch/arm/mach-exynos4/include/mach/sysmmu.h +++ /dev/null @@ -1,46 +0,0 @@ -/* linux/arch/arm/mach-exynos4/include/mach/sysmmu.h - * - * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd. - * http://www.samsung.com - * - * Samsung sysmmu driver for EXYNOS4 - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. -*/ - -#ifndef __ASM_ARM_ARCH_SYSMMU_H -#define __ASM_ARM_ARCH_SYSMMU_H __FILE__ - -enum exynos4_sysmmu_ips { - SYSMMU_MDMA, - SYSMMU_SSS, - SYSMMU_FIMC0, - SYSMMU_FIMC1, - SYSMMU_FIMC2, - SYSMMU_FIMC3, - SYSMMU_JPEG, - SYSMMU_FIMD0, - SYSMMU_FIMD1, - SYSMMU_PCIe, - SYSMMU_G2D, - SYSMMU_ROTATOR, - SYSMMU_MDMA2, - SYSMMU_TV, - SYSMMU_MFC_L, - SYSMMU_MFC_R, - EXYNOS4_SYSMMU_TOTAL_IPNUM, -}; - -#define S5P_SYSMMU_TOTAL_IPNUM EXYNOS4_SYSMMU_TOTAL_IPNUM - -extern const char *sysmmu_ips_name[EXYNOS4_SYSMMU_TOTAL_IPNUM]; - -typedef enum exynos4_sysmmu_ips sysmmu_ips; - -void sysmmu_clk_init(struct device *dev, sysmmu_ips ips); -void sysmmu_clk_enable(sysmmu_ips ips); -void sysmmu_clk_disable(sysmmu_ips ips); - -#endif /* __ASM_ARM_ARCH_SYSMMU_H */ diff --git a/arch/arm/plat-s5p/Kconfig b/arch/arm/plat-s5p/Kconfig index 9843c95..9013cb3 100644 --- a/arch/arm/plat-s5p/Kconfig +++ b/arch/arm/plat-s5p/Kconfig @@ -43,14 +43,6 @@ config S5P_HRT help Use the High Resolution timer support
-comment "System MMU" - -config S5P_SYSTEM_MMU - bool "S5P SYSTEM MMU" - depends on ARCH_EXYNOS4 - help - Say Y here if you want to enable System MMU - config S5P_DEV_FIMC0 bool help @@ -105,3 +97,16 @@ config S5P_SETUP_MIPIPHY bool help Compile in common setup code for MIPI-CSIS and MIPI-DSIM devices + +comment "System MMU" + +config IOMMU_API + bool + +config S5P_SYSTEM_MMU + bool "S5P SYSTEM MMU" + depends on ARCH_EXYNOS4 + select IOMMU_API + select ARM_DMA_USE_IOMMU + help + Say Y here if you want to enable System MMU diff --git a/arch/arm/plat-s5p/include/plat/sysmmu.h b/arch/arm/plat-s5p/include/plat/sysmmu.h index bf5283c..91e9293 100644 --- a/arch/arm/plat-s5p/include/plat/sysmmu.h +++ b/arch/arm/plat-s5p/include/plat/sysmmu.h @@ -2,6 +2,7 @@ * * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd. * http://www.samsung.com + * Author: Andrzej Pietrasiewicz andrzej.p@samsung.com * * Samsung System MMU driver for S5P platform * @@ -13,83 +14,57 @@ #ifndef __ASM__PLAT_SYSMMU_H #define __ASM__PLAT_SYSMMU_H __FILE__
-enum S5P_SYSMMU_INTERRUPT_TYPE { - SYSMMU_PAGEFAULT, - SYSMMU_AR_MULTIHIT, - SYSMMU_AW_MULTIHIT, - SYSMMU_BUSERROR, - SYSMMU_AR_SECURITY, - SYSMMU_AR_ACCESS, - SYSMMU_AW_SECURITY, - SYSMMU_AW_PROTECTION, /* 7 */ - SYSMMU_FAULTS_NUM -}; - -#ifdef CONFIG_S5P_SYSTEM_MMU - -#include <mach/sysmmu.h> - -/** - * s5p_sysmmu_enable() - enable system mmu of ip - * @ips: The ip connected system mmu. - * #pgd: Base physical address of the 1st level page table - * - * This function enable system mmu to transfer address - * from virtual address to physical address - */ -void s5p_sysmmu_enable(sysmmu_ips ips, unsigned long pgd); - -/** - * s5p_sysmmu_disable() - disable sysmmu mmu of ip - * @ips: The ip connected system mmu. - * - * This function disable system mmu to transfer address - * from virtual address to physical address - */ -void s5p_sysmmu_disable(sysmmu_ips ips); +struct device;
/** - * s5p_sysmmu_set_tablebase_pgd() - set page table base address to refer page table - * @ips: The ip connected system mmu. - * @pgd: The page table base address. - * - * This function set page table base address - * When system mmu transfer address from virtaul address to physical address, - * system mmu refer address information from page table + * enum s5p_sysmmu_ip - integrated peripherals identifiers + * @S5P_SYSMMU_MDMA: MDMA + * @S5P_SYSMMU_SSS: SSS + * @S5P_SYSMMU_FIMC0: FIMC0 + * @S5P_SYSMMU_FIMC1: FIMC1 + * @S5P_SYSMMU_FIMC2: FIMC2 + * @S5P_SYSMMU_FIMC3: FIMC3 + * @S5P_SYSMMU_JPEG: JPEG + * @S5P_SYSMMU_FIMD0: FIMD0 + * @S5P_SYSMMU_FIMD1: FIMD1 + * @S5P_SYSMMU_PCIe: PCIe + * @S5P_SYSMMU_G2D: G2D + * @S5P_SYSMMU_ROTATOR: ROTATOR + * @S5P_SYSMMU_MDMA2: MDMA2 + * @S5P_SYSMMU_TV: TV + * @S5P_SYSMMU_MFC_L: MFC_L + * @S5P_SYSMMU_MFC_R: MFC_R */ -void s5p_sysmmu_set_tablebase_pgd(sysmmu_ips ips, unsigned long pgd); +enum s5p_sysmmu_ip { + S5P_SYSMMU_MDMA, + S5P_SYSMMU_SSS, + S5P_SYSMMU_FIMC0, + S5P_SYSMMU_FIMC1, + S5P_SYSMMU_FIMC2, + S5P_SYSMMU_FIMC3, + S5P_SYSMMU_JPEG, + S5P_SYSMMU_FIMD0, + S5P_SYSMMU_FIMD1, + S5P_SYSMMU_PCIe, + S5P_SYSMMU_G2D, + S5P_SYSMMU_ROTATOR, + S5P_SYSMMU_MDMA2, + S5P_SYSMMU_TV, + S5P_SYSMMU_MFC_L, + S5P_SYSMMU_MFC_R, + S5P_SYSMMU_TOTAL_IP_NUM, +};
/** - * s5p_sysmmu_tlb_invalidate() - flush all TLB entry in system mmu - * @ips: The ip connected system mmu. - * - * This function flush all TLB entry in system mmu + * s5p_sysmmu_assign_dev() - assign sysmmu controller to client device + * @dev: client device + * @iommu_pdev: platform device of sysmmu controller */ -void s5p_sysmmu_tlb_invalidate(sysmmu_ips ips); +static inline void s5p_sysmmu_assign_dev(struct device *dev, + struct platform_device *iommu_pdev) +{ + BUG_ON(dev->archdata.iommu_priv); + dev->archdata.iommu_priv = iommu_pdev; +}
-/** s5p_sysmmu_set_fault_handler() - Fault handler for System MMUs - * @itype: type of fault. - * @pgtable_base: the physical address of page table base. This is 0 if @ips is - * SYSMMU_BUSERROR. - * @fault_addr: the device (virtual) address that the System MMU tried to - * translated. This is 0 if @ips is SYSMMU_BUSERROR. - * Called when interrupt occurred by the System MMUs - * The device drivers of peripheral devices that has a System MMU can implement - * a fault handler to resolve address translation fault by System MMU. - * The meanings of return value and parameters are described below. - - * return value: non-zero if the fault is correctly resolved. - * zero if the fault is not handled. - */ -void s5p_sysmmu_set_fault_handler(sysmmu_ips ips, - int (*handler)(enum S5P_SYSMMU_INTERRUPT_TYPE itype, - unsigned long pgtable_base, - unsigned long fault_addr)); -#else -#define s5p_sysmmu_enable(ips, pgd) do { } while (0) -#define s5p_sysmmu_disable(ips) do { } while (0) -#define s5p_sysmmu_set_tablebase_pgd(ips, pgd) do { } while (0) -#define s5p_sysmmu_tlb_invalidate(ips) do { } while (0) -#define s5p_sysmmu_set_fault_handler(ips, handler) do { } while (0) -#endif #endif /* __ASM_PLAT_SYSMMU_H */ diff --git a/arch/arm/plat-s5p/sysmmu.c b/arch/arm/plat-s5p/sysmmu.c index e1cbc72..b537e1c 100644 --- a/arch/arm/plat-s5p/sysmmu.c +++ b/arch/arm/plat-s5p/sysmmu.c @@ -1,312 +1,765 @@ /* linux/arch/arm/plat-s5p/sysmmu.c * - * Copyright (c) 2010 Samsung Electronics Co., Ltd. + * Copyright (c) 2010-2011 Samsung Electronics Co., Ltd. * http://www.samsung.com * + * Author: Andrzej Pietrasiewicz andrzej.p@samsung.com + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as * published by the Free Software Foundation. */
-#include <linux/io.h> -#include <linux/interrupt.h> +#include <linux/gfp.h> +#include <linux/kernel.h> +#include <linux/string.h> #include <linux/platform_device.h> - -#include <asm/pgtable.h> +#include <linux/slab.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/spinlock.h> +#include <linux/mm.h> +#include <linux/pagemap.h> +#include <linux/module.h> +#include <linux/clk.h> +#include <linux/pm_runtime.h> +#include <linux/iommu.h> + +#include <asm/memory.h> + +#include <plat/irqs.h> +#include <plat/devs.h> +#include <plat/cpu.h> +#include <plat/sysmmu.h>
#include <mach/map.h> #include <mach/regs-sysmmu.h> -#include <plat/sysmmu.h>
-#define CTRL_ENABLE 0x5 -#define CTRL_BLOCK 0x7 -#define CTRL_DISABLE 0x0 - -static struct device *dev; - -static unsigned short fault_reg_offset[SYSMMU_FAULTS_NUM] = { - S5P_PAGE_FAULT_ADDR, - S5P_AR_FAULT_ADDR, - S5P_AW_FAULT_ADDR, - S5P_DEFAULT_SLAVE_ADDR, - S5P_AR_FAULT_ADDR, - S5P_AR_FAULT_ADDR, - S5P_AW_FAULT_ADDR, - S5P_AW_FAULT_ADDR +static int debug; +module_param(debug, int, 0644); + +#define sysmmu_debug(level, fmt, arg...) \ + do { \ + if (debug >= level) \ + printk(KERN_DEBUG "[%s] " fmt, __func__, ## arg);\ + } while (0) + +#define FLPT_ENTRIES 4096 +#define FLPT_4K_64K_MASK (~0x3FF) +#define FLPT_1M_MASK (~0xFFFFF) +#define FLPT_16M_MASK (~0xFFFFFF) +#define SLPT_4K_MASK (~0xFFF) +#define SLPT_64K_MASK (~0xFFFF) +#define PAGE_4K_64K 0x1 +#define PAGE_1M 0x2 +#define PAGE_16M 0x40002 +#define PAGE_4K 0x2 +#define PAGE_64K 0x1 +#define FLPT_IDX_SHIFT 20 +#define FLPT_IDX_MASK 0xFFF +#define FLPT_OFFS_SHIFT (FLPT_IDX_SHIFT - 2) +#define FLPT_OFFS_MASK (FLPT_IDX_MASK << 2) +#define SLPT_IDX_SHIFT 12 +#define SLPT_IDX_MASK 0xFF +#define SLPT_OFFS_SHIFT (SLPT_IDX_SHIFT - 2) +#define SLPT_OFFS_MASK (SLPT_IDX_MASK << 2) + +#define deref_va(va) (*((unsigned long *)(va))) + +#define generic_extract(l, s, entry) \ + ((entry) & l##LPT_##s##_MASK) +#define flpt_get_1m(entry) generic_extract(F, 1M, deref_va(entry)) +#define flpt_get_16m(entry) generic_extract(F, 16M, deref_va(entry)) +#define slpt_get_4k(entry) generic_extract(S, 4K, deref_va(entry)) +#define slpt_get_64k(entry) generic_extract(S, 64K, deref_va(entry)) + +#define generic_entry(l, s, entry) \ + (generic_extract(l, s, entry) | PAGE_##s) +#define flpt_ent_4k_64k(entry) generic_entry(F, 4K_64K, entry) +#define flpt_ent_1m(entry) generic_entry(F, 1M, entry) +#define flpt_ent_16m(entry) generic_entry(F, 16M, entry) +#define slpt_ent_4k(entry) generic_entry(S, 4K, entry) +#define slpt_ent_64k(entry) generic_entry(S, 64K, entry) + +#define page_4k_64k(entry) (deref_va(entry) & PAGE_4K_64K) +#define page_1m(entry) (deref_va(entry) & PAGE_1M) +#define page_16m(entry) ((deref_va(entry) & PAGE_16M) == PAGE_16M) +#define page_4k(entry) (deref_va(entry) & PAGE_4K) +#define page_64k(entry) (deref_va(entry) & PAGE_64K) + +#define generic_pg_offs(l, s, va) \ + (va & ~l##LPT_##s##_MASK) +#define pg_offs_1m(va) generic_pg_offs(F, 1M, va) +#define pg_offs_16m(va) generic_pg_offs(F, 16M, va) +#define pg_offs_4k(va) generic_pg_offs(S, 4K, va) +#define pg_offs_64k(va) generic_pg_offs(S, 64K, va) + +#define flpt_index(va) (((va) >> FLPT_IDX_SHIFT) & FLPT_IDX_MASK) + +#define generic_offset(l, va) (((va) >> l##LPT_OFFS_SHIFT) & l##LPT_OFFS_MASK) +#define flpt_offs(va) generic_offset(F, va) +#define slpt_offs(va) generic_offset(S, va) + +#define invalidate_slpt_ent(slpt_va) (deref_va(slpt_va) = 0UL) + +struct s5p_sysmmu_info { + struct resource *ioarea; + void __iomem *regs; + unsigned int irq; + struct clk *clk; + bool enabled; + enum s5p_sysmmu_ip ip; + + struct device *dev; + struct s5p_sysmmu_domain *domain; };
-static char *sysmmu_fault_name[SYSMMU_FAULTS_NUM] = { - "PAGE FAULT", - "AR MULTI-HIT FAULT", - "AW MULTI-HIT FAULT", - "BUS ERROR", - "AR SECURITY PROTECTION FAULT", - "AR ACCESS PROTECTION FAULT", - "AW SECURITY PROTECTION FAULT", - "AW ACCESS PROTECTION FAULT" -}; - -static int (*fault_handlers[S5P_SYSMMU_TOTAL_IPNUM])( - enum S5P_SYSMMU_INTERRUPT_TYPE itype, - unsigned long pgtable_base, - unsigned long fault_addr); +static struct s5p_sysmmu_info *sysmmu_table[S5P_SYSMMU_TOTAL_IP_NUM]; +static DEFINE_SPINLOCK(sysmmu_slock);
/* - * If adjacent 2 bits are true, the system MMU is enabled. - * The system MMU is disabled, otherwise. + * iommu domain is a virtual address space of an I/O device driver. + * It contains kernel virtual and physical addresses of the first level + * page table and owns the memory in which the page tables are stored. + * It contains a table of kernel virtual addresses of second level + * page tables. + * + * In order to be used the iommu domain must be bound to an iommu device. + * This is accomplished with s5p_sysmmu_attach_dev, which is called through + * s5p_sysmmu_ops by drivers/base/iommu.c. */ -static unsigned long sysmmu_states; +struct s5p_sysmmu_domain { + unsigned long flpt; + void *flpt_va; + void **slpt_va; + unsigned short *refcount; + struct s5p_sysmmu_info *sysmmu; +};
-static inline void set_sysmmu_active(sysmmu_ips ips) -{ - sysmmu_states |= 3 << (ips * 2); -} +static struct kmem_cache *slpt_cache;
-static inline void set_sysmmu_inactive(sysmmu_ips ips) +static void flush_cache(const void *start, unsigned long size) { - sysmmu_states &= ~(3 << (ips * 2)); + dmac_flush_range(start, start + size); + outer_flush_range(virt_to_phys(start), virt_to_phys(start + size)); }
-static inline int is_sysmmu_active(sysmmu_ips ips) +static int s5p_sysmmu_domain_init(struct iommu_domain *domain) { - return sysmmu_states & (3 << (ips * 2)); -} + struct s5p_sysmmu_domain *s5p_domain;
-static void __iomem *sysmmusfrs[S5P_SYSMMU_TOTAL_IPNUM]; + s5p_domain = kzalloc(sizeof(struct s5p_sysmmu_domain), GFP_KERNEL); + if (!s5p_domain) { + sysmmu_debug(3, "no memory for state\n"); + return -ENOMEM; + } + domain->priv = s5p_domain; + + /* + * first-level page table holds + * 4k second-level descriptors == 16kB == 4 pages + */ + s5p_domain->flpt_va = kzalloc(FLPT_ENTRIES * sizeof(unsigned long), + GFP_KERNEL); + if (!s5p_domain->flpt_va) + return -ENOMEM; + s5p_domain->flpt = virt_to_phys(s5p_domain->flpt_va); + + s5p_domain->refcount = kzalloc(FLPT_ENTRIES * sizeof(u16), GFP_KERNEL); + if (!s5p_domain->refcount) { + kfree(s5p_domain->flpt_va); + return -ENOMEM; + }
-static inline void sysmmu_block(sysmmu_ips ips) -{ - __raw_writel(CTRL_BLOCK, sysmmusfrs[ips] + S5P_MMU_CTRL); - dev_dbg(dev, "%s is blocked.\n", sysmmu_ips_name[ips]); + s5p_domain->slpt_va = kzalloc(FLPT_ENTRIES * sizeof(void *), + GFP_KERNEL); + if (!s5p_domain->slpt_va) { + kfree(s5p_domain->refcount); + kfree(s5p_domain->flpt_va); + return -ENOMEM; + } + flush_cache(s5p_domain->flpt_va, 4 * PAGE_SIZE); + return 0; }
-static inline void sysmmu_unblock(sysmmu_ips ips) +static void s5p_sysmmu_domain_destroy(struct iommu_domain *domain) { - __raw_writel(CTRL_ENABLE, sysmmusfrs[ips] + S5P_MMU_CTRL); - dev_dbg(dev, "%s is unblocked.\n", sysmmu_ips_name[ips]); + struct s5p_sysmmu_domain *s5p_domain = domain->priv; + int i; + for (i = FLPT_ENTRIES - 1; i >= 0; --i) + if (s5p_domain->refcount[i]) + kmem_cache_free(slpt_cache, s5p_domain->slpt_va[i]); + + kfree(s5p_domain->slpt_va); + kfree(s5p_domain->refcount); + kfree(s5p_domain->flpt_va); + kfree(domain->priv); + domain->priv = NULL; }
-static inline void __sysmmu_tlb_invalidate(sysmmu_ips ips) +static void s5p_enable_iommu(struct s5p_sysmmu_info *sysmmu) { - __raw_writel(0x1, sysmmusfrs[ips] + S5P_MMU_FLUSH); - dev_dbg(dev, "TLB of %s is invalidated.\n", sysmmu_ips_name[ips]); + struct s5p_sysmmu_domain *s5p_domain = sysmmu->domain; + u32 reg; + WARN_ON(sysmmu->enabled); + + clk_enable(sysmmu->clk); + + /* configure first level page table base address */ + writel(s5p_domain->flpt, sysmmu->regs + S5P_PT_BASE_ADDR); + + reg = readl(sysmmu->regs + S5P_MMU_CFG); + reg |= (0x1<<0); /* replacement policy : LRU */ + writel(reg, sysmmu->regs + S5P_MMU_CFG); + + reg = readl(sysmmu->regs + S5P_MMU_CTRL); + reg |= ((0x1<<2)|(0x1<<0)); /* Enable interrupt, Enable MMU */ + writel(reg, sysmmu->regs + S5P_MMU_CTRL); + + sysmmu->enabled = true; }
-static inline void __sysmmu_set_ptbase(sysmmu_ips ips, unsigned long pgd) +static void s5p_disable_iommu(struct s5p_sysmmu_info *sysmmu) { - if (unlikely(pgd == 0)) { - pgd = (unsigned long)ZERO_PAGE(0); - __raw_writel(0x20, sysmmusfrs[ips] + S5P_MMU_CFG); /* 4KB LV1 */ - } else { - __raw_writel(0x0, sysmmusfrs[ips] + S5P_MMU_CFG); /* 16KB LV1 */ - } + u32 reg; + WARN_ON(!sysmmu->domain);
- __raw_writel(pgd, sysmmusfrs[ips] + S5P_PT_BASE_ADDR); + reg = readl(sysmmu->regs + S5P_MMU_CTRL); + reg &= ~(0x1); /* Disable MMU */ + writel(reg, sysmmu->regs + S5P_MMU_CTRL);
- dev_dbg(dev, "Page table base of %s is initialized with 0x%08lX.\n", - sysmmu_ips_name[ips], pgd); - __sysmmu_tlb_invalidate(ips); + clk_disable(sysmmu->clk); + sysmmu->enabled = false; }
-void sysmmu_set_fault_handler(sysmmu_ips ips, - int (*handler)(enum S5P_SYSMMU_INTERRUPT_TYPE itype, - unsigned long pgtable_base, - unsigned long fault_addr)) +static int s5p_sysmmu_attach_dev(struct iommu_domain *domain, + struct device *dev) { - BUG_ON(!((ips >= SYSMMU_MDMA) && (ips < S5P_SYSMMU_TOTAL_IPNUM))); - fault_handlers[ips] = handler; + struct s5p_sysmmu_domain *s5p_domain = domain->priv; + struct platform_device *iommu_dev; + struct s5p_sysmmu_info *sysmmu; + + iommu_dev = dev->archdata.iommu_priv; + BUG_ON(!iommu_dev); + + sysmmu = platform_get_drvdata(iommu_dev); + BUG_ON(!sysmmu); + + s5p_domain->sysmmu = sysmmu; + sysmmu->domain = s5p_domain; + + return 0; }
-static irqreturn_t s5p_sysmmu_irq(int irq, void *dev_id) +static void s5p_sysmmu_detach_dev(struct iommu_domain *domain, + struct device *dev) { - /* SYSMMU is in blocked when interrupt occurred. */ - unsigned long base = 0; - sysmmu_ips ips = (sysmmu_ips)dev_id; - enum S5P_SYSMMU_INTERRUPT_TYPE itype; + struct platform_device *pdev = + container_of(dev, struct platform_device, dev); + struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev); + struct s5p_sysmmu_domain *s5p_domain = domain->priv; + + s5p_disable_iommu(sysmmu); + s5p_domain->sysmmu = NULL; + sysmmu->domain = NULL; +}
- itype = (enum S5P_SYSMMU_INTERRUPT_TYPE) - __ffs(__raw_readl(sysmmusfrs[ips] + S5P_INT_STATUS)); +#define bug_mapping_prohibited(iova, len) \ + s5p_mapping_prohibited_impl(iova, len, __FILE__, __LINE__)
- BUG_ON(!((itype >= 0) && (itype < 8))); +static void s5p_mapping_prohibited_impl(unsigned long iova, size_t len, + const char *file, int line) +{ + sysmmu_debug(3, "%s:%d Attempting to map %d@0x%lx over existing\ +mapping\n", file, line, len, iova); + BUG(); +}
- dev_alert(dev, "%s occurred by %s.\n", sysmmu_fault_name[itype], - sysmmu_ips_name[ips]); +/* + * Map an area of length corresponding to gfp_order, starting at iova. + * gfp_order is an order of units of 4kB: 0 -> 1 unit, 1 -> 2 units, + * 2 -> 4 units, 3 -> 8 units and so on. + * + * The act of mapping is all about deciding how to interpret in the MMU the + * virtual addresses belonging to the mapped range. Mapping can be done with + * 4kB, 64kB, 1MB and 16MB pages, so only orders of 0, 4, 8, 12 are valid. + * + * iova must be aligned on a 4kB, 64kB, 1MB and 16MB boundaries, respectively. + */ +static int s5p_sysmmu_map(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, int gfp_order, int prot) +{ + struct s5p_sysmmu_domain *s5p_domain = domain->priv; + int flpt_idx = flpt_index(iova); + size_t len = 0x1000UL << gfp_order; + void *flpt_va, *slpt_va; + + if (len != SZ_16M && len != SZ_1M && len != SZ_64K && len != SZ_4K) { + sysmmu_debug(3, "bad order: %d\n", gfp_order); + return -EINVAL; + }
- if (fault_handlers[ips]) { - unsigned long addr; + flpt_va = s5p_domain->flpt_va + flpt_offs(iova); + + if (SZ_1M == len) { + if (deref_va(flpt_va)) + bug_mapping_prohibited(iova, len); + deref_va(flpt_va) = flpt_ent_1m(paddr); + flush_cache(flpt_va, 4); /* one 4-byte entry */ + + return 0; + } else if (SZ_16M == len) { + int i = 0; + /* first loop to verify mapping allowed */ + for (i = 0; i < 16; ++i) + if (deref_va(flpt_va + 4 * i)) + bug_mapping_prohibited(iova, len); + /* actually map only if allowed */ + for (i = 0; i < 16; ++i) + deref_va(flpt_va + 4 * i) = flpt_ent_16m(paddr); + flush_cache(flpt_va, 4 * 16); /* 16 4-byte entries */ + + return 0; + }
- base = __raw_readl(sysmmusfrs[ips] + S5P_PT_BASE_ADDR); - addr = __raw_readl(sysmmusfrs[ips] + fault_reg_offset[itype]); + /* for 4K and 64K pages only */ + if (page_1m(flpt_va) || page_16m(flpt_va)) + bug_mapping_prohibited(iova, len);
- if (fault_handlers[ips](itype, base, addr)) { - __raw_writel(1 << itype, - sysmmusfrs[ips] + S5P_INT_CLEAR); - dev_notice(dev, "%s from %s is resolved." - " Retrying translation.\n", - sysmmu_fault_name[itype], sysmmu_ips_name[ips]); - } else { - base = 0; + /* need to allocate a new second level page table */ + if (0 == deref_va(flpt_va)) { + void *slpt = kmem_cache_zalloc(slpt_cache, GFP_KERNEL); + if (!slpt) { + sysmmu_debug(3, "cannot allocate slpt\n"); + return -ENOMEM; + } + + s5p_domain->slpt_va[flpt_idx] = slpt; + deref_va(flpt_va) = flpt_ent_4k_64k(virt_to_phys(slpt)); + flush_cache(flpt_va, 4); + } + slpt_va = s5p_domain->slpt_va[flpt_idx] + slpt_offs(iova); + + if (SZ_4K == len) { + if (deref_va(slpt_va)) + bug_mapping_prohibited(iova, len); + deref_va(slpt_va) = slpt_ent_4k(paddr); + flush_cache(slpt_va, 4); /* one 4-byte entry */ + s5p_domain->refcount[flpt_idx]++; + } else { + int i; + /* first loop to verify mapping allowed */ + for (i = 0; i < 16; ++i) + if (deref_va(slpt_va + 4 * i)) + bug_mapping_prohibited(iova, len); + /* actually map only if allowed */ + for (i = 0; i < 16; ++i) { + deref_va(slpt_va + 4 * i) = slpt_ent_64k(paddr); + s5p_domain->refcount[flpt_idx]++; } + flush_cache(slpt_va, 4 * 16); /* 16 4-byte entries */ }
- sysmmu_unblock(ips); + return 0; +}
- if (!base) - dev_notice(dev, "%s from %s is not handled.\n", - sysmmu_fault_name[itype], sysmmu_ips_name[ips]); +static void s5p_tlb_invalidate(struct s5p_sysmmu_domain *domain) +{ + unsigned int reg; + void __iomem *regs;
- return IRQ_HANDLED; + if (!domain->sysmmu) + return; + + if (!domain->sysmmu->enabled) + return; + + regs = domain->sysmmu->regs; + + /* TLB invalidate */ + reg = readl(regs + S5P_MMU_CTRL); + reg |= (0x1<<1); /* Block MMU */ + writel(reg, regs + S5P_MMU_CTRL); + + writel(0x1, regs + S5P_MMU_FLUSH); + /* Flush_entry */ + + reg = readl(regs + S5P_MMU_CTRL); + reg &= ~(0x1<<1); /* Un-block MMU */ + writel(reg, regs + S5P_MMU_CTRL); }
-void s5p_sysmmu_set_tablebase_pgd(sysmmu_ips ips, unsigned long pgd) +#define bug_unmapping_prohibited(iova, len) \ + s5p_unmapping_prohibited_impl(iova, len, __FILE__, __LINE__) + +static void s5p_unmapping_prohibited_impl(unsigned long iova, size_t len, + const char *file, int line) { - if (is_sysmmu_active(ips)) { - sysmmu_block(ips); - __sysmmu_set_ptbase(ips, pgd); - sysmmu_unblock(ips); - } else { - dev_dbg(dev, "%s is disabled. " - "Skipping initializing page table base.\n", - sysmmu_ips_name[ips]); - } + sysmmu_debug(3, "%s:%d Attempting to unmap different size or \ +non-existing mapping %d@0x%lx\n", file, line, len, iova); + BUG(); }
-void s5p_sysmmu_enable(sysmmu_ips ips, unsigned long pgd) +static int s5p_sysmmu_unmap(struct iommu_domain *domain, unsigned long iova, + int gfp_order) { - if (!is_sysmmu_active(ips)) { - sysmmu_clk_enable(ips); + struct s5p_sysmmu_domain *s5p_domain = domain->priv; + int flpt_idx = flpt_index(iova); + size_t len = 0x1000UL << gfp_order; + void *flpt_va, *slpt_va; + + if (len != SZ_16M && len != SZ_1M && len != SZ_64K && len != SZ_4K) { + sysmmu_debug(3, "bad order: %d\n", gfp_order); + return -EINVAL; + } + + flpt_va = s5p_domain->flpt_va + flpt_offs(iova); + + /* check if there is any mapping at all */ + if (!deref_va(flpt_va)) + bug_unmapping_prohibited(iova, len); + + if (SZ_1M == len) { + if (!page_1m(flpt_va)) + bug_unmapping_prohibited(iova, len); + deref_va(flpt_va) = 0; + flush_cache(flpt_va, 4); /* one 4-byte entry */ + s5p_tlb_invalidate(s5p_domain); + + return 0; + } else if (SZ_16M == len) { + int i; + /* first loop to verify it actually is 16M mapping */ + for (i = 0; i < 16; ++i) + if (!page_16m(flpt_va + 4 * i)) + bug_unmapping_prohibited(iova, len); + /* actually unmap */ + for (i = 0; i < 16; ++i) + deref_va(flpt_va + 4 * i) = 0; + flush_cache(flpt_va, 4 * 16); /* 16 4-byte entries */ + s5p_tlb_invalidate(s5p_domain); + + return 0; + }
- __sysmmu_set_ptbase(ips, pgd); + if (!page_4k_64k(flpt_va)) + bug_unmapping_prohibited(iova, len);
- __raw_writel(CTRL_ENABLE, sysmmusfrs[ips] + S5P_MMU_CTRL); + slpt_va = s5p_domain->slpt_va[flpt_idx] + slpt_offs(iova);
- set_sysmmu_active(ips); - dev_dbg(dev, "%s is enabled.\n", sysmmu_ips_name[ips]); + /* verify that we attempt to unmap a matching mapping */ + if (SZ_4K == len) { + if (!page_4k(slpt_va)) + bug_unmapping_prohibited(iova, len); + } else if (SZ_64K == len) { + int i; + for (i = 0; i < 16; ++i) + if (!page_64k(slpt_va + 4 * i)) + bug_unmapping_prohibited(iova, len); + } + + if (SZ_64K == len) + s5p_domain->refcount[flpt_idx] -= 15; + + if (--s5p_domain->refcount[flpt_idx]) { + if (SZ_4K == len) { + invalidate_slpt_ent(slpt_va); + flush_cache(slpt_va, 4); + } else { + int i; + for (i = 0; i < 16; ++i) + invalidate_slpt_ent(slpt_va + 4 * i); + flush_cache(slpt_va, 4 * 16); + } } else { - dev_dbg(dev, "%s is already enabled.\n", sysmmu_ips_name[ips]); + kmem_cache_free(slpt_cache, s5p_domain->slpt_va[flpt_idx]); + s5p_domain->slpt_va[flpt_idx] = 0; + memset(flpt_va, 0, 4); + flush_cache(flpt_va, 4); } + + s5p_tlb_invalidate(s5p_domain); + + return 0; }
-void s5p_sysmmu_disable(sysmmu_ips ips) +phys_addr_t s5p_iova_to_phys(struct iommu_domain *domain, unsigned long iova) { - if (is_sysmmu_active(ips)) { - __raw_writel(CTRL_DISABLE, sysmmusfrs[ips] + S5P_MMU_CTRL); - set_sysmmu_inactive(ips); - sysmmu_clk_disable(ips); - dev_dbg(dev, "%s is disabled.\n", sysmmu_ips_name[ips]); - } else { - dev_dbg(dev, "%s is already disabled.\n", sysmmu_ips_name[ips]); - } + struct s5p_sysmmu_domain *s5p_domain = domain->priv; + int flpt_idx = flpt_index(iova); + unsigned long flpt_va, slpt_va; + + flpt_va = (unsigned long)s5p_domain->flpt_va + flpt_offs(iova); + + if (!deref_va(flpt_va)) + return 0; + + if (page_16m(flpt_va)) + return flpt_get_16m(flpt_va) | pg_offs_16m(iova); + else if (page_1m(flpt_va)) + return flpt_get_1m(flpt_va) | pg_offs_1m(iova); + + if (!page_4k_64k(flpt_va)) + return 0; + + slpt_va = (unsigned long)s5p_domain->slpt_va[flpt_idx] + + slpt_offs(iova); + + if (!deref_va(slpt_va)) + return 0; + + if (page_4k(slpt_va)) + return slpt_get_4k(slpt_va) | pg_offs_4k(iova); + else if (page_64k(slpt_va)) + return slpt_get_64k(slpt_va) | pg_offs_64k(iova); + + return 0; }
-void s5p_sysmmu_tlb_invalidate(sysmmu_ips ips) +static struct iommu_ops s5p_sysmmu_ops = { + .domain_init = s5p_sysmmu_domain_init, + .domain_destroy = s5p_sysmmu_domain_destroy, + .attach_dev = s5p_sysmmu_attach_dev, + .detach_dev = s5p_sysmmu_detach_dev, + .map = s5p_sysmmu_map, + .unmap = s5p_sysmmu_unmap, + .iova_to_phys = s5p_iova_to_phys, +}; + +static irqreturn_t s5p_sysmmu_irq(int irq, void *dev_id) { - if (is_sysmmu_active(ips)) { - sysmmu_block(ips); - __sysmmu_tlb_invalidate(ips); - sysmmu_unblock(ips); - } else { - dev_dbg(dev, "%s is disabled. " - "Skipping invalidating TLB.\n", sysmmu_ips_name[ips]); + struct s5p_sysmmu_info *sysmmu = dev_id; + unsigned int reg_INT_STATUS; + unsigned long fault; + + if (false == sysmmu->enabled) + return IRQ_HANDLED; + + reg_INT_STATUS = readl(sysmmu->regs + S5P_INT_STATUS); + if (reg_INT_STATUS & 0xFF) { + switch (reg_INT_STATUS & 0xFF) { + case 0x1: + /* page fault */ + fault = readl(sysmmu->regs + S5P_PAGE_FAULT_ADDR); + sysmmu_debug(3, "Faulting virtual address: 0x%08lx\n", + fault); + break; + case 0x2: + /* AR multi-hit fault */ + sysmmu_debug(3, "irq:ar multi hit\n"); + break; + case 0x4: + /* AW multi-hit fault */ + sysmmu_debug(3, "irq:aw multi hit\n"); + break; + case 0x8: + /* bus error */ + sysmmu_debug(3, "irq:bus error\n"); + break; + case 0x10: + /* AR security protection fault */ + sysmmu_debug(3, "irq:ar security protection fault\n"); + break; + case 0x20: + /* AR access protection fault */ + sysmmu_debug(3, "irq:ar access protection fault\n"); + break; + case 0x40: + /* AW security protection fault */ + sysmmu_debug(3, "irq:aw security protection fault\n"); + break; + case 0x80: + /* AW access protection fault */ + sysmmu_debug(3, "irq:aw access protection fault\n"); + break; + } + writel(reg_INT_STATUS, sysmmu->regs + S5P_INT_CLEAR); } + return IRQ_HANDLED; }
static int s5p_sysmmu_probe(struct platform_device *pdev) { - int i, ret; - struct resource *res, *mem; + struct s5p_sysmmu_info *sysmmu; + struct resource *res; + int ret; + unsigned long flags; + + sysmmu = kzalloc(sizeof(struct s5p_sysmmu_info), GFP_KERNEL); + if (!sysmmu) { + dev_err(&pdev->dev, "no memory for state\n"); + return -ENOMEM; + }
- dev = &pdev->dev; + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + if (NULL == res) { + dev_err(&pdev->dev, "cannot find IO resource\n"); + ret = -ENOENT; + goto err_s5p_sysmmu_info_allocated; + }
- for (i = 0; i < S5P_SYSMMU_TOTAL_IPNUM; i++) { - int irq; + sysmmu->ioarea = request_mem_region(res->start, resource_size(res), + pdev->name);
- sysmmu_clk_init(dev, i); - sysmmu_clk_disable(i); + if (NULL == sysmmu->ioarea) { + dev_err(&pdev->dev, "cannot request IO\n"); + ret = -ENXIO; + goto err_s5p_sysmmu_info_allocated; + }
- res = platform_get_resource(pdev, IORESOURCE_MEM, i); - if (!res) { - dev_err(dev, "Failed to get the resource of %s.\n", - sysmmu_ips_name[i]); - ret = -ENODEV; - goto err_res; - } + sysmmu->regs = ioremap(res->start, resource_size(res));
- mem = request_mem_region(res->start, resource_size(res), - pdev->name); - if (!mem) { - dev_err(dev, "Failed to request the memory region of %s.\n", - sysmmu_ips_name[i]); - ret = -EBUSY; - goto err_res; - } + if (NULL == sysmmu->regs) { + dev_err(&pdev->dev, "cannot map IO\n"); + ret = -ENXIO; + goto err_ioarea_requested; + }
- sysmmusfrs[i] = ioremap(res->start, resource_size(res)); - if (!sysmmusfrs[i]) { - dev_err(dev, "Failed to ioremap() for %s.\n", - sysmmu_ips_name[i]); - ret = -ENXIO; - goto err_reg; - } + dev_dbg(&pdev->dev, "registers %p (%p, %p)\n", + sysmmu->regs, sysmmu->ioarea, res);
- irq = platform_get_irq(pdev, i); - if (irq <= 0) { - dev_err(dev, "Failed to get the IRQ resource of %s.\n", - sysmmu_ips_name[i]); - ret = -ENOENT; - goto err_map; - } + sysmmu->irq = ret = platform_get_irq(pdev, 0); + if (ret <= 0) { + dev_err(&pdev->dev, "cannot find IRQ\n"); + goto err_iomap_done; + }
- if (request_irq(irq, s5p_sysmmu_irq, IRQF_DISABLED, - pdev->name, (void *)i)) { - dev_err(dev, "Failed to request IRQ for %s.\n", - sysmmu_ips_name[i]); - ret = -ENOENT; - goto err_map; - } + ret = request_irq(sysmmu->irq, s5p_sysmmu_irq, 0, + dev_name(&pdev->dev), sysmmu); + + if (ret != 0) { + dev_err(&pdev->dev, "cannot claim IRQ %d\n", sysmmu->irq); + goto err_iomap_done; }
+ sysmmu->clk = clk_get(&pdev->dev, "sysmmu"); + if (IS_ERR_OR_NULL(sysmmu->clk)) { + dev_err(&pdev->dev, "cannot get clock\n"); + ret = -ENOENT; + goto err_request_irq_done; + } + dev_dbg(&pdev->dev, "clock source %p\n", sysmmu->clk); + sysmmu->ip = pdev->id; + + spin_lock_irqsave(&sysmmu_slock, flags); + sysmmu_table[pdev->id] = sysmmu; + spin_unlock_irqrestore(&sysmmu_slock, flags); + + sysmmu->dev = &pdev->dev; + + platform_set_drvdata(pdev, sysmmu); + + pm_runtime_set_active(&pdev->dev); + pm_runtime_enable(&pdev->dev); + + dev_info(&pdev->dev, "Samsung S5P SYSMMU (IOMMU)\n"); return 0;
-err_map: - iounmap(sysmmusfrs[i]); -err_reg: - release_mem_region(mem->start, resource_size(mem)); -err_res: +err_request_irq_done: + free_irq(sysmmu->irq, sysmmu); + +err_iomap_done: + iounmap(sysmmu->regs); + +err_ioarea_requested: + release_resource(sysmmu->ioarea); + kfree(sysmmu->ioarea); + +err_s5p_sysmmu_info_allocated: + kfree(sysmmu); return ret; }
static int s5p_sysmmu_remove(struct platform_device *pdev) { + struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev); + unsigned long flags; + + pm_runtime_disable(sysmmu->dev); + + spin_lock_irqsave(&sysmmu_slock, flags); + sysmmu_table[pdev->id] = NULL; + spin_unlock_irqrestore(&sysmmu_slock, flags); + + clk_put(sysmmu->clk); + + free_irq(sysmmu->irq, sysmmu); + + iounmap(sysmmu->regs); + + release_resource(sysmmu->ioarea); + kfree(sysmmu->ioarea); + + kfree(sysmmu); + return 0; } -int s5p_sysmmu_runtime_suspend(struct device *dev) + +static int s5p_sysmmu_runtime_suspend(struct device *dev) { + struct platform_device *pdev = to_platform_device(dev); + struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev); + + if (sysmmu->domain) + s5p_disable_iommu(sysmmu); + return 0; }
-int s5p_sysmmu_runtime_resume(struct device *dev) +static int s5p_sysmmu_runtime_resume(struct device *dev) { + struct platform_device *pdev = to_platform_device(dev); + struct s5p_sysmmu_info *sysmmu = platform_get_drvdata(pdev); + + if (sysmmu->domain) + s5p_enable_iommu(sysmmu); + return 0; }
-const struct dev_pm_ops s5p_sysmmu_pm_ops = { - .runtime_suspend = s5p_sysmmu_runtime_suspend, - .runtime_resume = s5p_sysmmu_runtime_resume, +static const struct dev_pm_ops s5p_sysmmu_pm_ops = { + .runtime_suspend = s5p_sysmmu_runtime_suspend, + .runtime_resume = s5p_sysmmu_runtime_resume, };
static struct platform_driver s5p_sysmmu_driver = { - .probe = s5p_sysmmu_probe, - .remove = s5p_sysmmu_remove, - .driver = { - .owner = THIS_MODULE, - .name = "s5p-sysmmu", - .pm = &s5p_sysmmu_pm_ops, - } + .probe = s5p_sysmmu_probe, + .remove = s5p_sysmmu_remove, + .driver = { + .owner = THIS_MODULE, + .name = "s5p-sysmmu", + .pm = &s5p_sysmmu_pm_ops, + }, };
-static int __init s5p_sysmmu_init(void) +static int __init s5p_sysmmu_register(void) { - return platform_driver_register(&s5p_sysmmu_driver); + int ret; + + sysmmu_debug(3, "Registering sysmmu driver...\n"); + + slpt_cache = kmem_cache_create("slpt_cache", 1024, 1024, + SLAB_HWCACHE_ALIGN, NULL); + if (!slpt_cache) { + printk(KERN_ERR + "%s: failed to allocated slpt cache\n", __func__); + return -ENOMEM; + } + + ret = platform_driver_register(&s5p_sysmmu_driver); + + if (ret) { + printk(KERN_ERR + "%s: failed to register sysmmu driver\n", __func__); + return -EINVAL; + } + + register_iommu(&s5p_sysmmu_ops); + + return ret; } -arch_initcall(s5p_sysmmu_init); +postcore_initcall(s5p_sysmmu_register); + +MODULE_AUTHOR("Andrzej Pietrasiewicz andrzej.p@samsung.com"); +MODULE_DESCRIPTION("Samsung System MMU (IOMMU) driver"); +MODULE_LICENSE("GPL"); diff --git a/arch/arm/plat-samsung/include/plat/devs.h b/arch/arm/plat-samsung/include/plat/devs.h index 24ebb1e..4506902 100644 --- a/arch/arm/plat-samsung/include/plat/devs.h +++ b/arch/arm/plat-samsung/include/plat/devs.h @@ -147,7 +147,6 @@ extern struct platform_device s5p_device_mipi_csis1;
extern struct platform_device s5p_device_ehci;
-extern struct platform_device exynos4_device_sysmmu;
/* s3c2440 specific devices */
diff --git a/arch/arm/plat-samsung/include/plat/fimc-core.h b/arch/arm/plat-samsung/include/plat/fimc-core.h index 945a99d..a5dfb82 100644 --- a/arch/arm/plat-samsung/include/plat/fimc-core.h +++ b/arch/arm/plat-samsung/include/plat/fimc-core.h @@ -46,4 +46,29 @@ static inline void s3c_fimc_setname(int id, char *name) } }
+static inline struct device *s3c_fimc_getdevice(int id) +{ + switch (id) { +#ifdef CONFIG_S5P_DEV_FIMC0 + case 0: + return &s5p_device_fimc0.dev; +#endif +#ifdef CONFIG_S5P_DEV_FIMC1 + case 1: + return &s5p_device_fimc1.dev; +#endif +#ifdef CONFIG_S5P_DEV_FIMC2 + case 2: + return &s5p_device_fimc2.dev; +#endif +#ifdef CONFIG_S5P_DEV_FIMC3 + case 3: + return &s5p_device_fimc3.dev; +#endif + } + return NULL; +} + +#define S5P_MAX_FIMC_NUM (4) + #endif /* __ASM_PLAT_FIMC_CORE_H */
Hi Marek,
On Fri, Sep 2, 2011 at 4:56 PM, Marek Szyprowski m.szyprowski@samsung.com wrote: ...
arch/arm/plat-s5p/Kconfig | 21 +- arch/arm/plat-s5p/include/plat/sysmmu.h | 119 ++-- arch/arm/plat-s5p/sysmmu.c | 855 ++++++++++++++++++------
Please move the driver to drivers/iommu/, where all other IOMMU API users sit.
...
diff --git a/arch/arm/plat-s5p/Kconfig b/arch/arm/plat-s5p/Kconfig
...
+config IOMMU_API
+	bool
You don't need this anymore: this is already part of drivers/iommu/Kconfig.
+static int s5p_sysmmu_unmap(struct iommu_domain *domain, unsigned long iova,
+                            int gfp_order)
+{
...
+        if (SZ_1M == len) {
+                if (!page_1m(flpt_va))
+                        bug_unmapping_prohibited(iova, len);
..
+        } else if (SZ_16M == len) {
+                int i;
+                /* first loop to verify it actually is 16M mapping */
+                for (i = 0; i < 16; ++i)
+                        if (!page_16m(flpt_va + 4 * i))
+                                bug_unmapping_prohibited(iova, len);
Actually these are not bugs; IOMMU drivers need to unmap whatever page they find at iova and return the page size that was actually unmapped: you may well be asked to unmap a size that differs from the size of the page that currently maps iova.
...
+                return 0;
On success, you need to return the size (as a page order) of the page that was actually unmapped.
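For illustration, something along these lines would satisfy the API. This is a rough sketch, not code from the patch: find_order_at_iova() and clear_entries() are made-up helpers standing in for the driver's page-table walk; the point is the return value.

/* Hedged sketch, not the driver's actual code: unmap whatever page is
 * found at @iova and report its order, instead of BUG()ing when the
 * caller's size disagrees with the existing mapping.
 */
static int s5p_sysmmu_unmap(struct iommu_domain *domain, unsigned long iova,
                            int gfp_order)
{
        struct s5p_sysmmu_domain *s5p_domain = domain->priv;
        int order;

        /* find_order_at_iova() is a hypothetical helper that walks the
         * page tables and returns the order of the page mapped at iova
         * (0 for 4KiB, 4 for 64KiB, 8 for 1MiB, 12 for 16MiB), or -1 if
         * nothing is mapped there. */
        order = find_order_at_iova(s5p_domain, iova);
        if (order < 0)
                return 0;               /* nothing to unmap */

        /* clear_entries() is a hypothetical helper that clears the
         * first- or second-level entries covering that page and flushes
         * the cached table entries. */
        clear_entries(s5p_domain, iova, order);
        s5p_tlb_invalidate(s5p_domain);

        return order;   /* order of the page that was actually unmapped */
}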
Regards, Ohad.
Hi.
On Fri, Sep 2, 2011 at 10:56 PM, Marek Szyprowski m.szyprowski@samsung.com wrote:
+ * iova must be aligned on a 4kB, 64kB, 1MB and 16MB boundaries, respectively.
+ */
Actually, iova only needs to be aligned to 4KiB, since that is the minimum requirement. I think an IOMMU driver is capable of mapping a group of page frames that is 1MiB-aligned at an iova that is only 4KiB-aligned, as long as the region at that iova is large enough to hold the given page frames.
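For example, something like the following rough sketch (not from the patch; it just uses the current gfp_order-based iommu_map() interface) could place a 1MiB physically contiguous buffer at an iova that is only 4KiB-aligned:

#include <linux/iommu.h>
#include <asm/sizes.h>

/* Hedged illustration of the point above, not code from the patch: map a
 * 1MiB physically contiguous buffer at a 4KiB-aligned iova by issuing 256
 * individual 4KiB mappings.
 */
static int map_1m_at_4k_aligned_iova(struct iommu_domain *domain,
                                     unsigned long iova, phys_addr_t paddr)
{
        unsigned int i;
        int ret;

        for (i = 0; i < SZ_1M / SZ_4K; i++) {
                /* gfp_order 0 == a single 4KiB page */
                ret = iommu_map(domain, iova + i * SZ_4K, paddr + i * SZ_4K,
                                0, IOMMU_READ | IOMMU_WRITE);
                if (ret)
                        return ret;     /* caller unmaps what already succeeded */
        }
        return 0;
}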
+static int s5p_sysmmu_map(struct iommu_domain *domain, unsigned long iova,
+                           phys_addr_t paddr, int gfp_order, int prot)
+{
+        struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+        int flpt_idx = flpt_index(iova);
+        size_t len = 0x1000UL << gfp_order;
+        void *flpt_va, *slpt_va;
+        if (len != SZ_16M && len != SZ_1M && len != SZ_64K && len != SZ_4K) {
+                sysmmu_debug(3, "bad order: %d\n", gfp_order);
+                return -EINVAL;
+        }
Likewise, I think this driver needs to support mapping, for example, a 128KiB-aligned, 128KiB chunk of physical memory.
Otherwise, it is more restrictive than we would expect.
Thank you.
Cho KyongHo.
On Tue, Sep 6, 2011 at 1:27 PM, KyongHo Cho pullip.cho@samsung.com wrote:
On Fri, Sep 2, 2011 at 10:56 PM, Marek Szyprowski m.szyprowski@samsung.com wrote:
+static int s5p_sysmmu_map(struct iommu_domain *domain, unsigned long iova,
+                           phys_addr_t paddr, int gfp_order, int prot)
+{
+        struct s5p_sysmmu_domain *s5p_domain = domain->priv;
+        int flpt_idx = flpt_index(iova);
+        size_t len = 0x1000UL << gfp_order;
+        void *flpt_va, *slpt_va;
+        if (len != SZ_16M && len != SZ_1M && len != SZ_64K && len != SZ_4K) {
+                sysmmu_debug(3, "bad order: %d\n", gfp_order);
+                return -EINVAL;
+        }
Likewise, I think this driver needs to support mapping, for example, a 128KiB-aligned, 128KiB chunk of physical memory.
Otherwise, it is more restrictive than we would expect.
That's actually OK, because the IOMMU core will split physically contiguous memory regions into pages on behalf of its drivers (drivers will just have to advertise the page sizes their hardware supports); this way the splitting logic is not duplicated in every IOMMU driver.
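Roughly, the idea looks like the sketch below. Note this is against the proposed IOMMU core changes, not against this patch: the pgsize_bitmap field does not exist in the mainline struct iommu_ops yet, and the other callbacks are the ones already defined in the posted driver.

/* Hedged sketch of the approach described above: the driver only declares
 * the page sizes the SYSMMU hardware understands, and the core splits
 * anything else -- e.g. a 128KiB request becomes two 64KiB mappings --
 * before calling ->map()/->unmap().
 */
#define S5P_SYSMMU_PGSIZES      (SZ_4K | SZ_64K | SZ_1M | SZ_16M)

static struct iommu_ops s5p_sysmmu_ops = {
        .domain_init    = s5p_sysmmu_domain_init,
        .domain_destroy = s5p_sysmmu_domain_destroy,
        .attach_dev     = s5p_sysmmu_attach_dev,
        .detach_dev     = s5p_sysmmu_detach_dev,
        .map            = s5p_sysmmu_map,
        .unmap          = s5p_sysmmu_unmap,
        .iova_to_phys   = s5p_iova_to_phys,
        .pgsize_bitmap  = S5P_SYSMMU_PGSIZES,   /* from the proposed core change */
};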
Take a look:
linaro-mm-sig@lists.linaro.org