On Thursday 19 January 2012, Hiroshi Doyu wrote:
The main advantage of the IOMMU API is that it is able to separate multiple contexts, per user, so one application that is able to program a DMA engine cannot access memory that another application has mapped. Is that what you do with ion_iommu_heap_create(), i.e. that it gets called for each GPU context?
Yes. The concept of this part came from the dma-mapping API code.
I don't understand. The dma-mapping code on top of IOMMU does *not* use one iommu context per GPU context, it uses one IOMMU context for everything together.
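For illustration only, a minimal sketch of what "one IOMMU context per GPU context" looks like through the raw IOMMU API; the gpu_ctx/gpu_dev names are invented for this example and are not from the patch:

#include <linux/device.h>
#include <linux/errno.h>
#include <linux/iommu.h>

struct gpu_ctx {
        struct iommu_domain *domain;
};

static int gpu_ctx_init(struct gpu_ctx *ctx, struct device *gpu_dev)
{
        /* Separate translation table / address space for this context. */
        ctx->domain = iommu_domain_alloc(gpu_dev->bus);
        if (!ctx->domain)
                return -ENOMEM;

        /* Point the GPU's IOMMU at this context's page table. */
        return iommu_attach_device(ctx->domain, gpu_dev);
}

static int gpu_ctx_map(struct gpu_ctx *ctx, unsigned long iova,
                       phys_addr_t pa, size_t size)
{
        /* Mappings made here are invisible to every other context. */
        return iommu_map(ctx->domain, iova, pa, size,
                         IOMMU_READ | IOMMU_WRITE);
}

Each context gets its own page table, so memory mapped into one domain cannot be reached through another, in contrast to the dma-mapping layer, which shares a single domain.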
+static int ion_buffer_allocate(struct ion_buffer *buf)
+{
+	int i, npages = NUM_PAGES(buf);
+
+	buf->pages = kmalloc(npages * sizeof(*buf->pages), GFP_KERNEL);
+	if (!buf->pages)
+		goto err_pages;
+
+	buf->sglist = vzalloc(npages * sizeof(*buf->sglist));
+	if (!buf->sglist)
+		goto err_sgl;
Using vzalloc here probably goes a bit too far; it's fairly expensive compared with kzalloc: not only do you have to maintain the page table entries for the new mapping, you also use up precious space in the vmalloc area and have to use small pages for accessing the data in it.
Should I use "kzalloc()" instead here?
Yes, that would be better, at least if you can prove that there is a reasonable upper bound on the size of the allocation. kmalloc/kzalloc usually work ok up to 16kb or at most 128kb. If you think the total allocation might be larger than that, you have to use vzalloc anyway but that should come with a long comment explaining what is going on.
More likely, if you end up needing vzalloc because of the size of the allocation, you should take that as a sign that you are doing something wrong, and that you should allocate in larger chunks than single pages in order to cut down the number of sglist entries.
Another option would be to just use sg_alloc_table(), which was made for this exact purpose ;-)
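Roughly, assuming struct ion_buffer grows an sg_table member (called "table" here, an assumed addition), the open-coded sglist allocation above could turn into something like:

#include <linux/mm.h>
#include <linux/scatterlist.h>

static int ion_buffer_alloc_sgt(struct ion_buffer *buf)
{
        int i, ret, npages = NUM_PAGES(buf);
        struct scatterlist *sg;

        /* sg_alloc_table() sizes and chains the entries itself. */
        ret = sg_alloc_table(&buf->table, npages, GFP_KERNEL);
        if (ret)
                return ret;

        /* One entry per page; larger chunks would shrink the table. */
        for_each_sg(buf->table.sgl, sg, npages, i)
                sg_set_page(sg, buf->pages[i], PAGE_SIZE, 0);

        return 0;
}

The error path then becomes a single sg_free_table() instead of separate kfree/vfree calls. This is only a sketch of the idea, not code from the patch.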
+static void *iommu_heap_map_kernel(struct ion_heap *heap,
+				   struct ion_buffer *buf)
+{
+	int npages = NUM_PAGES(buf);
+
+	BUG_ON(!buf->pages);
+	buf->vaddr = vm_map_ram(buf->pages, npages, -1,
+				pgprot_noncached(pgprot_kernel));
+	pr_debug("va:%p\n", buf->vaddr);
+	WARN_ON(!buf->vaddr);
+	return buf->vaddr;
+}
You can't do this on ARMv6 or ARMv7: There is already a cacheable mapping of all entries in buf->pages, so you must not install a second noncached mapping. Either make sure you have only noncached pages, or only cached ones.
Ok, is there any example of how to maintain both mappings?
The dma_mapping implementation on ARM takes care of this by unmapping any allocation that goes through dma_alloc_coherent() from the linear mapping, at least in the CMA version that Marek did. In order to take advantage of this, you either need to call dma_alloc_coherent on a kernel that provides CMA, or call the CMA allocator directly (not generally recommended). Without CMA support, your only option is to use memory that was never part of the linear kernel mapping, but that would remove the main advantage of your patch, which is aimed at removing the need to take out memory from the linear mapping.
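To illustrate the first option, the heap could simply take its backing memory from dma_alloc_coherent() and let the ARM/CMA code deal with the linear mapping. A rough sketch, where the heap_* wrappers and the "dev" argument are made up for this example:

#include <linux/device.h>
#include <linux/dma-mapping.h>

static void *heap_alloc_coherent(struct device *dev, size_t size,
                                 dma_addr_t *dma)
{
        /*
         * Returns a kernel virtual address with uncached attributes;
         * with CMA, the cacheable linear alias is handled for us.
         */
        return dma_alloc_coherent(dev, size, dma, GFP_KERNEL);
}

static void heap_free_coherent(struct device *dev, size_t size,
                               void *vaddr, dma_addr_t dma)
{
        dma_free_coherent(dev, size, vaddr, dma);
}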
Using only cacheable memory would be valid here, but it is probably not desired for performance reasons, and may not even be allowed in ION, because it requires explicit cache management functions for flushing the caches while handing data back and forth between software and the device.
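For reference, the kind of explicit cache management a cacheable buffer would need looks roughly like the streaming DMA API below; this only illustrates the maintenance cost and is not code from the patch:

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/string.h>

static void cacheable_buffer_io(struct device *dev, void *vaddr,
                                size_t size)
{
        dma_addr_t dma;

        /* CPU fills the (cacheable) buffer first ... */
        memset(vaddr, 0, size);

        /* ... then the caches are cleaned so the device sees the data. */
        dma = dma_map_single(dev, vaddr, size, DMA_BIDIRECTIONAL);
        if (dma_mapping_error(dev, dma))
                return;

        /* ... device reads and writes the buffer via "dma" ... */

        /* Invalidate before the CPU looks at what the device wrote. */
        dma_sync_single_for_cpu(dev, dma, size, DMA_BIDIRECTIONAL);

        /* ... CPU inspects the result ... */

        dma_unmap_single(dev, dma, size, DMA_BIDIRECTIONAL);
}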
Arnd