Hi,
Recently we've implemented an IOMMU heap (attached) as one of the ION memory manager(*1) heaps/backends. This implementation is completely independent of any SoC, so it can be used for other SoCs as well. If our implementation is not totally wrong, it would be nice to share some experience/code here, since Ion is still not entirely clear to me yet.
I found that Linaro also seems to have started some ION work(*2). I think that some of Ion's features could be supported/replaced with Linaro UMM. For example, "ion_iommu_heap" is presently implemented with the standard IOMMU API, but could it also be implemented with the coming DMA API? Also, DMABUF could be used in the Ion core as well, I guess.
Currently there's no Ion memmgr code in the upstream "drivers/staging/android"(*3). Is there any plan to support this? Or is this something considered as a completely _temporary_ solution, and never going to be added?
It would be nice if we could share some of our effort here, since quite a few Android users need Ion, even if only temporarily.
Any comment would be really appreciated.
Hiroshi DOYU
*1: https://android.googlesource.com/kernel/common.git
$ git clone https://android.googlesource.com/kernel/common.git
$ cd common
$ git checkout -b android origin/android-3.0
$ git grep -e "<linux/ion.h>" drivers/
drivers/gpu/ion/ion.c:#include <linux/ion.h>
drivers/gpu/ion/ion_carveout_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_priv.h:#include <linux/ion.h>
drivers/gpu/ion/ion_system_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_system_mapper.c:#include <linux/ion.h>
drivers/gpu/ion/tegra/tegra_ion.c:#include <linux/ion.h>
*2: https://blueprints.launchpad.net/linaro-mm-sig/+spec/linaro-mmwg-cma-ion
*3: http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=tree;f=d...
Hi,
I'm just ramping back up after being on maternity leave so I'm pretty buried under on the lists. Please do continue to CC me directly on ION related work.
On Tue, Jan 17, 2012 at 11:40 PM, Hiroshi Doyu hdoyu@nvidia.com wrote:
Hi,
Recently we've implemented an IOMMU heap (attached) as one of the ION memory manager(*1) heaps/backends. This implementation is completely independent of any SoC, so it can be used for other SoCs as well. If our implementation is not totally wrong, it would be nice to share some experience/code here, since Ion is still not entirely clear to me yet.
I will take a look at and review this code as soon as I can. The intention is to take contributions like these, so keep them coming.
I found that Linaro also seems to have started some ION work(*2). I think that some of Ion's features could be supported/replaced with Linaro UMM. For example, "ion_iommu_heap" is presently implemented with the standard IOMMU API, but could it also be implemented with the coming DMA API? Also, DMABUF could be used in the Ion core as well, I guess.
It is my intention to leverage the DMABUF api as much as I can. I'll be going back over the work on the DMA api over the next few weeks and thinking about how to integrate the two solutions.
Currently there's no Ion memmgr code in the upstream "drivers/staging/android"(*3). Is there any plan to support this? Or is this something considered as a completely _temporary_ solution, and never going to be added?
I expect this is an oversight that occurred when the android drivers were added back to staging. It's not intended to be a temporary solution. That being said, I'm not sure I want to push for it in staging. I'd rather give it a little more time to bake and then at some point post it as a patch set.
It would be nice if we could share some of our effort here, since quite a few Android users need Ion, even if only temporarily.
Agreed! Keep sending things my way and I'll get feedback to you as quickly as I can.
Any comment would be really appreciated.
Hiroshi DOYU
*1: https://android.googlesource.com/kernel/common.git
$ git clone https://android.googlesource.com/kernel/common.git
$ cd common
$ git checkout -b android origin/android-3.0
$ git grep -e "<linux/ion.h>" drivers/
drivers/gpu/ion/ion.c:#include <linux/ion.h>
drivers/gpu/ion/ion_carveout_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_priv.h:#include <linux/ion.h>
drivers/gpu/ion/ion_system_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_system_mapper.c:#include <linux/ion.h>
drivers/gpu/ion/tegra/tegra_ion.c:#include <linux/ion.h>
*2: https://blueprints.launchpad.net/linaro-mm-sig/+spec/linaro-mmwg-cma-ion
*3: http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=tree;f=d...
Hi Rebecca,
From: Rebecca Schultz Zavin rebecca@android.com
Subject: Re: [RFC][PATCH 1/1] gpu: ion: Add IOMMU heap allocator with IOMMU API
Date: Wed, 18 Jan 2012 22:58:47 +0100
Message-ID: CALJcvx58x60HvHMCLkR48rHuZWTYL-aYc43hmW7Kk8=qYC1Jgw@mail.gmail.com
Hi,
I'm just ramping back up after being on maternity leave so I'm pretty buried under on the lists. Please do continue to CC me directly on ION related work.
Congrats! And I'll put you on Cc:.
On Tue, Jan 17, 2012 at 11:40 PM, Hiroshi Doyu <hdoyu@nvidia.com> wrote:
Hi,
Recently we've implemented an IOMMU heap (attached) as one of the ION memory manager(*1) heaps/backends. This implementation is completely independent of any SoC, so it can be used for other SoCs as well. If our implementation is not totally wrong, it would be nice to share some experience/code here, since Ion is still not entirely clear to me yet.
I will take a look at and review this code as soon as I can. The intention is to take contributions like these, so keep them coming.
Thanks.
I found that Linaro also seems to have started some ION work(*2). I think that some of Ion's features could be supported/replaced with Linaro UMM. For example, "ion_iommu_heap" is presently implemented with the standard IOMMU API, but could it also be implemented with the coming DMA API? Also, DMABUF could be used in the Ion core as well, I guess.
It is my intention to leverage the DMABUF api as much as I can. I'll be going back over the work on the DMA api over the next few weeks and thinking about how to integrate the two solutions.
Good.
Currently there's no Ion memmgr code in the upstream "drivers/staging/android"(*3). Is there any plan to support this? Or is this something considered as a completely _temporary_ solution, and never going to be added?
I expect this is an oversight that occurred when the android drivers were added back to staging. It's not intended to be a temporary solution. That being said, I'm not sure I want to push for it in staging. I'd rather give it a little more time to bake and then at some point post it as a patch set.
It would be nice if we could share some of our effort here, since quite a few Android users need Ion, even if only temporarily.
Agreed! Keep sending things my way and I'll get feedback to you as quickly as I can.
The following patch may be helpful for this review. This Ion IOMMU heap patch depends on the following iommu_ops patches, "GART" and "SMMU", which are the IOMMU API backends for Tegra20/30.
https://lkml.org/lkml/2012/1/5/25
So currently, as long as the standard IOMMU API is usable on a SoC, this Ion IOMMU heap could work there. I think that the DMA API could replace this eventually.
Any comment would be really appreciated.
Hiroshi DOYU
*1: https://android.googlesource.com/kernel/common.git
$ git clone https://android.googlesource.com/kernel/common.git
$ cd common
$ git checkout -b android origin/android-3.0
$ git grep -e "<linux/ion.h>" drivers/
drivers/gpu/ion/ion.c:#include <linux/ion.h>
drivers/gpu/ion/ion_carveout_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_priv.h:#include <linux/ion.h>
drivers/gpu/ion/ion_system_heap.c:#include <linux/ion.h>
drivers/gpu/ion/ion_system_mapper.c:#include <linux/ion.h>
drivers/gpu/ion/tegra/tegra_ion.c:#include <linux/ion.h>
*2: https://blueprints.launchpad.net/linaro-mm-sig/+spec/linaro-mmwg-cma-ion
*3: http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=tree;f=d...
On Wednesday 18 January 2012, Hiroshi Doyu wrote:
Hi,
Recently we've implemented an IOMMU heap (attached) as one of the ION memory manager(*1) heaps/backends. This implementation is completely independent of any SoC, so it can be used for other SoCs as well. If our implementation is not totally wrong, it would be nice to share some experience/code here, since Ion is still not entirely clear to me yet.
I found that Linaro also seems to have started some ION work(*2). I think that some of Ion's features could be supported/replaced with Linaro UMM. For example, "ion_iommu_heap" is presently implemented with the standard IOMMU API, but could it also be implemented with the coming DMA API? Also, DMABUF could be used in the Ion core as well, I guess.
Currently there's no Ion memmgr code in the upstream "drivers/staging/android"(*3). Is there any plan to support this? Or is this something considered as a completely _temporary_ solution, and never going to be added?
It would be nice if we could share some of our effort here, since quite a few Android users need Ion, even if only temporarily.
Any comment would be really appreciated.
Hi,
Your implementation looks quite nice overall, but on the high level, I don't see what the benefit of using the IOMMU API is here instead of the dma-mapping API. I believe that all device drivers should by default use the dma-mapping interfaces, which may in turn be based on linear mapping (possibly using CMA) or an IOMMU.
The main advantage of the IOMMU API is that it is able to separate multiple contexts, per user, so one application that is able to program a DMA engine cannot access memory that another application has mapped. Is that what you do with ion_iommu_heap_create(), i.e. that it gets called for each GPU context?
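For illustration, a minimal sketch of what one IOMMU domain per GPU context could look like with the standard IOMMU API; struct gpu_ctx and gpu_ctx_init_iommu are hypothetical names, not taken from the patch:

#include <linux/iommu.h>
#include <linux/platform_device.h>

struct gpu_ctx {
        struct iommu_domain *domain;    /* one domain per GPU context */
};

static int gpu_ctx_init_iommu(struct gpu_ctx *ctx, struct device *dev)
{
        /* A separate domain per context means a client attached to one
         * domain cannot reach mappings created in another. */
        ctx->domain = iommu_domain_alloc(&platform_bus_type);
        if (!ctx->domain)
                return -ENOMEM;
        return iommu_attach_device(ctx->domain, dev);
}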
+#define NUM_PAGES(buf) (PAGE_ALIGN((buf)->size) >> PAGE_SHIFT)
+#define GFP_ION (GFP_KERNEL | __GFP_HIGHMEM | __GFP_NOWARN)
I find macros like these more confusing than helpful because to the casual reader of one driver they don't mean anything. It's better to just open-code them.
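For reference, the open-coded forms of those two macros would simply be:

        int npages = PAGE_ALIGN(buf->size) >> PAGE_SHIFT;
        struct page *page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NOWARN);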
+static int ion_buffer_allocate(struct ion_buffer *buf)
+{
int i, npages = NUM_PAGES(buf);
buf->pages = kmalloc(npages * sizeof(*buf->pages), GFP_KERNEL);
if (!buf->pages)
goto err_pages;
buf->sglist = vzalloc(npages * sizeof(*buf->sglist));
if (!buf->sglist)
goto err_sgl;
Using vzalloc here probably goes a bit too far, it's fairly expensive to use compared with kzalloc, not only do you have to maintain the page table entries for the new mapping, it also means you use up precious space in the vmalloc area and have to use small pages for accessing the data in it.
Why do you need two arrays here? The sglist already contains pointers to each page, right?
sg_init_table(buf->sglist, npages);
for (i = 0; i < npages; i++) {
struct page *page;
phys_addr_t pa;
page = alloc_page(GFP_ION);
if (!page)
goto err_pgalloc;
pa = page_to_phys(page);
sg_set_page(&buf->sglist[i], page, PAGE_SIZE, 0);
flush_dcache_page(page);
outer_flush_range(pa, pa + PAGE_SIZE);
This allocates pages that are contiguous only by page. Note that a lot of IOMMUs work only (or at least much better) with larger allocations, such as 64KB at a time.
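As a rough sketch of the larger-chunk idea (order 4 with 4KB pages gives 64KB per chunk; the loop indexing changes and the fallback to single pages are omitted):

        /* one chunk = 2^4 contiguous pages = 64KB when PAGE_SIZE is 4KB */
        struct page *chunk = alloc_pages(GFP_KERNEL | __GFP_HIGHMEM | __GFP_NOWARN, 4);
        if (!chunk)
                goto err_pgalloc;
        sg_set_page(&buf->sglist[i], chunk, PAGE_SIZE << 4, 0);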
Further, the code you have here is completely arm specific: outer_flush_range() is only available on ARM, and most architectures don't require flushing caches here. In fact even on ARM I see no reason to flush a newly allocated page -- invalidating should be enough.
+static void *iommu_heap_map_kernel(struct ion_heap *heap,
struct ion_buffer *buf)
+{
int npages = NUM_PAGES(buf);
BUG_ON(!buf->pages);
buf->vaddr = vm_map_ram(buf->pages, npages, -1,
pgprot_noncached(pgprot_kernel));
pr_debug("va:%p\n", buf->vaddr);
WARN_ON(!buf->vaddr);
return buf->vaddr;
+}
You can't do this on ARMv6 or ARMv7: There is already a cacheable mapping of all entries in buf->pages, so you must not install a second noncached mapping. Either make sure you have only noncached pages, or only cached ones.
+static int iommu_heap_map_user(struct ion_heap *mapper,
struct ion_buffer *buf,
struct vm_area_struct *vma)
+{
int i = vma->vm_pgoff >> PAGE_SHIFT;
unsigned long uaddr = vma->vm_start;
unsigned long usize = vma->vm_end - vma->vm_start;
pr_debug("vma:%08lx-%08lx\n", vma->vm_start, vma->vm_end);
BUG_ON(!buf->pages);
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
do {
int ret;
struct page *page = buf->pages[i++];
ret = vm_insert_page(vma, uaddr, page);
if (ret)
return ret;
Same here: If you want to have an uncached mapping in user space, the pages must not be in a cacheable kernel mapping.
Arnd
Hi Arnd,
Thank you for your review. Some questions are inlined.
From: Arnd Bergmann arnd@arndb.de
Subject: Re: [Ce-android-mainline] [RFC][PATCH 1/1] gpu: ion: Add IOMMU heap allocator with IOMMU API
Date: Thu, 19 Jan 2012 13:04:34 +0100
Message-ID: 201201191204.35067.arnd@arndb.de
On Wednesday 18 January 2012, Hiroshi Doyu wrote:
Hi,
Recently we've implemented an IOMMU heap (attached) as one of the ION memory manager(*1) heaps/backends. This implementation is completely independent of any SoC, so it can be used for other SoCs as well. If our implementation is not totally wrong, it would be nice to share some experience/code here, since Ion is still not entirely clear to me yet.
I found that Linaro also seems to have started some ION work(*2). I think that some of Ion's features could be supported/replaced with Linaro UMM. For example, "ion_iommu_heap" is presently implemented with the standard IOMMU API, but could it also be implemented with the coming DMA API? Also, DMABUF could be used in the Ion core as well, I guess.
Currently there's no Ion memmgr code in the upstream "drivers/staging/android"(*3). Is there any plan to support this? Or is this something considered as a completely _temporary_ solution, and never going to be added?
It would be nice if we could share some of our effort here, since quite a few Android users need Ion, even if only temporarily.
Any comment would be really appreciated.
Hi,
Your implementation looks quite nice overall, but on the high level, I don't see what the benefit of using the IOMMU API is here instead of the dma-mapping API. I believe that all device drivers should by default use the dma-mapping interfaces, which may in turn be based on linear mapping (possibly using CMA) or an IOMMU.
Using dma-mapping API could be left for v2 of this patch.
The main advantage of the IOMMU API is that it is able to separate multiple contexts, per user, so one application that is able to program a DMA engine cannot access memory that another application has mapped. Is that what you do with ion_iommu_heap_create(), i.e. that it gets called for each GPU context?
Yes. The concept of this part comes from the dma-mapping API code.
+#define NUM_PAGES(buf) (PAGE_ALIGN((buf)->size) >> PAGE_SHIFT)
+#define GFP_ION (GFP_KERNEL | __GFP_HIGHMEM | __GFP_NOWARN)
I find macros like these more confusing than helpful because to the casual reader of one driver they don't mean anything. It's better to just open-code them.
Ok.
+static int ion_buffer_allocate(struct ion_buffer *buf)
+{
int i, npages = NUM_PAGES(buf);
buf->pages = kmalloc(npages * sizeof(*buf->pages), GFP_KERNEL);
if (!buf->pages)
goto err_pages;
buf->sglist = vzalloc(npages * sizeof(*buf->sglist));
if (!buf->sglist)
goto err_sgl;
Using vzalloc here probably goes a bit too far, it's fairly expensive to use compared with kzalloc, not only do you have to maintain the page table entries for the new mapping, it also means you use up precious space in the vmalloc area and have to use small pages for accessing the data in it.
Should I use "kzalloc()" instead here?
Why do you need two arrays here? The sglist already contains pointers to each page, right?
This is one of the "FIXME"s. We wanted to have **pages so that we could use "vm_map_ram(**page,..)" in "iommu_heap_map_kernel()" later. This "**pages" array is residual.
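If the extra array were dropped, one possible sketch (not part of the patch; the helper name is made up) is to rebuild a temporary pages[] from the sglist only while mapping into the kernel (the noncached-alias concern raised below still applies):

static void *iommu_heap_map_kernel_from_sgl(struct ion_buffer *buf, int npages)
{
        struct scatterlist *sg;
        struct page **pages;
        void *vaddr;
        int i;

        pages = kmalloc(npages * sizeof(*pages), GFP_KERNEL);
        if (!pages)
                return NULL;
        for_each_sg(buf->sglist, sg, npages, i)
                pages[i] = sg_page(sg);
        vaddr = vm_map_ram(pages, npages, -1, pgprot_noncached(pgprot_kernel));
        kfree(pages);   /* vm_map_ram() does not keep a reference to the array */
        return vaddr;
}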
sg_init_table(buf->sglist, npages);
for (i = 0; i < npages; i++) {
struct page *page;
phys_addr_t pa;
page = alloc_page(GFP_ION);
if (!page)
goto err_pgalloc;
pa = page_to_phys(page);
sg_set_page(&buf->sglist[i], page, PAGE_SIZE, 0);
flush_dcache_page(page);
outer_flush_range(pa, pa + PAGE_SIZE);
This allocates pages that are contiguous only by page. Note that a lot of IOMMUs work only (or at least much better) with larger allocations, such as 64KB at a time.
Ok, we need to add support for multiple page sizes (4KB & 4MB).
Further, the code you have here is completely arm specific: outer_flush_range() is only available on ARM, and most architectures don't require flushing caches here. In fact even on ARM I see no reason to flush a newly allocated page -- invalidating should be enough.
Right.
Presently the Ion API doesn't provide a cache maintenance API for performance optimization. Such an API could be added later.
+static void *iommu_heap_map_kernel(struct ion_heap *heap,
struct ion_buffer *buf)
+{
int npages = NUM_PAGES(buf);
BUG_ON(!buf->pages);
buf->vaddr = vm_map_ram(buf->pages, npages, -1,
pgprot_noncached(pgprot_kernel));
pr_debug("va:%p\n", buf->vaddr);
WARN_ON(!buf->vaddr);
return buf->vaddr;
+}
You can't do this on ARMv6 or ARMv7: There is already a cacheable mapping of all entries in buf->pages, so you must not install a second noncached mapping. Either make sure you have only noncached pages, or only cached ones.
Ok, is there any example of how to maintain both mappings?
+static int iommu_heap_map_user(struct ion_heap *mapper,
struct ion_buffer *buf,
struct vm_area_struct *vma)
+{
int i = vma->vm_pgoff >> PAGE_SHIFT;
unsigned long uaddr = vma->vm_start;
unsigned long usize = vma->vm_end - vma->vm_start;
pr_debug("vma:%08lx-%08lx\n", vma->vm_start, vma->vm_end);
BUG_ON(!buf->pages);
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
do {
int ret;
struct page *page = buf->pages[i++];
ret = vm_insert_page(vma, uaddr, page);
if (ret)
return ret;
Same here: If you want to have an uncached mapping in user space, the pages must not be in a cacheable kernel mapping.
Arnd
On Thursday 19 January 2012, Hiroshi Doyu wrote:
The main advantage of the IOMMU API is that it is able to separate multiple contexts, per user, so one application that is able to program a DMA engine cannot access memory that another application has mapped. Is that what you do with ion_iommu_heap_create(), i.e. that it gets called for each GPU context?
Yes. The concept of this part comes from the dma-mapping API code.
I don't understand. The dma-mapping code on top of IOMMU does *not* use one iommu context per GPU context, it uses one IOMMU context for everything together.
+static int ion_buffer_allocate(struct ion_buffer *buf)
+{
int i, npages = NUM_PAGES(buf);
buf->pages = kmalloc(npages * sizeof(*buf->pages), GFP_KERNEL);
if (!buf->pages)
goto err_pages;
buf->sglist = vzalloc(npages * sizeof(*buf->sglist));
if (!buf->sglist)
goto err_sgl;
Using vzalloc here probably goes a bit too far, it's fairly expensive to use compared with kzalloc, not only do you have to maintain the page table entries for the new mapping, it also means you use up precious space in the vmalloc area and have to use small pages for accessing the data in it.
Should I use "kzalloc()" instead here?
Yes, that would be better, at least if you can prove that there is a reasonable upper bound on the size of the allocation. kmalloc/kzalloc usually work ok up to 16kb or at most 128kb. If you think the total allocation might be larger than that, you have to use vzalloc anyway but that should come with a long comment explaining what is going on.
More likely if you end up needing vzalloc because of the size of the allocation, you should see that as a sign that you are doing something wrong and you should use larger chunks than single allocations in order to cut down the number of sglist entries.
Another option would be to just use sg_alloc_table(), which was made for this exact purpose ;-)
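A minimal sketch of that route, assuming a hypothetical "struct sg_table table" member in struct ion_buffer instead of the open-coded sglist array (unwinding on failure omitted):

static int ion_buffer_allocate(struct ion_buffer *buf)
{
        int i, ret, npages = PAGE_ALIGN(buf->size) >> PAGE_SHIFT;
        struct scatterlist *sg;

        ret = sg_alloc_table(&buf->table, npages, GFP_KERNEL);
        if (ret)
                return ret;

        for_each_sg(buf->table.sgl, sg, npages, i) {
                struct page *page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM |
                                               __GFP_NOWARN);
                if (!page)
                        return -ENOMEM; /* real code must free the table and pages */
                sg_set_page(sg, page, PAGE_SIZE, 0);
        }
        return 0;
}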
+static void *iommu_heap_map_kernel(struct ion_heap *heap,
struct ion_buffer *buf)
+{
int npages = NUM_PAGES(buf);
BUG_ON(!buf->pages);
buf->vaddr = vm_map_ram(buf->pages, npages, -1,
pgprot_noncached(pgprot_kernel));
pr_debug("va:%p\n", buf->vaddr);
WARN_ON(!buf->vaddr);
return buf->vaddr;
+}
You can't do this on ARMv6 or ARMv7: There is already a cacheable mapping of all entries in buf->pages, so you must not install a second noncached mapping. Either make sure you have only noncached pages, or only cached ones.
Ok, any example to maintain both mappings?
The dma_mapping implementation on ARM takes care of this by unmapping any allocation that goes through dma_alloc_coherent() from the linear mapping, at least in the CMA version that Marek did. In order to take advantage of this, you either need to call dma_alloc_coherent on a kernel that provides CMA, or call the CMA allocator directly (not generally recommended). Without CMA support, your only option is to use memory that was never part of the linear kernel mapping, but that would remove the main advantage of your patch, which is aimed at removing the need to take out memory from the linear mapping.
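A minimal sketch of that first option, with "dev" and "size" standing in for whatever device and length the heap allocates on behalf of:

        dma_addr_t dma;
        void *vaddr;

        /* On a CMA-enabled kernel the buffer is removed from the cacheable
         * linear mapping, so no conflicting alias is created. */
        vaddr = dma_alloc_coherent(dev, size, &dma, GFP_KERNEL);
        if (!vaddr)
                return -ENOMEM;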
Using only cacheable memory would be valid here, but is probably not desired for performance reasons, or even allowed in ION because it requires explicit cache management functions for flushing the caches while handing data back and forth between software and the device.
Arnd
The main advantage of the IOMMU API is that it is able to separate multiple contexts, per user, so one application that is able to program a DMA engine cannot access memory that another application has mapped. Is that what you do with ion_iommu_heap_create(), i.e. that it gets called for each GPU context?
Yes. The concept of this part comes from the dma-mapping API code.
I don't understand. The dma-mapping code on top of IOMMU does *not* use one iommu context per GPU context, it uses one IOMMU context for everything together.
Let me add more details on how the IOMMU heap allocator is intended to be used. The IOMMU heap allocator would be a part of the ION memory manager. Clients from user space call the Ion memory manager ioctl to allocate memory from the Ion IOMMU heap. The IOMMU heap allocator would allocate the non-contiguous pages based on the alloc size requested. The user space clients can pass the Ion memory handle to the kernel stack/drivers. Kernel drivers would get the sg list from the Ion memory manager API based on the memory handle received. The sg list would be used to sync memory and to map the physical pages into the desired IOMMU space using the dma-mapping APIs before passing memory to the H/W module.
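A minimal sketch of that driver-side step, with "dev", "sgl" and "nents" standing in for what the driver obtains from Ion:

        int count = dma_map_sg(dev, sgl, nents, DMA_TO_DEVICE);
        if (!count)
                return -ENOMEM;
        /* program the hardware using sg_dma_address()/sg_dma_len() of each
         * mapped entry, then unmap when the transfer is done */
        dma_unmap_sg(dev, sgl, nents, DMA_TO_DEVICE);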
The allocation logic is: the IOMMU heap allocator can use the dma-mapping API to allocate, if possible. As of now, that doesn't seem possible. The dma alloc APIs try to allocate virtual space in the kernel along with physical memory during allocation. The user can allocate arbitrarily large amounts of memory, which would not fit into the kernel's virtual space. So the IOMMU heap should allocate pages itself, and the IOMMU heap allocator should take care of the conflicting-mapping issue.
The Ion IOMMU heap allocator shouldn't be doing any IOMMU mapping using direct IOMMU APIs. Moreover, the IOMMU heap allocator has no info on which device is going to access the mapping, so it can't set up a proper mapping.
Using vzalloc here probably goes a bit too far, it's fairly expensive to use compared with kzalloc, not only do you have to maintain the page table entries for the new mapping, it also means you use up precious space in the vmalloc area and have to use small pages for accessing the data in it.
Should I use "kzalloc()" instead here?
Yes, that would be better, at least if you can prove that there is a reasonable upper bound on the size of the allocation. kmalloc/kzalloc usually work ok up to 16kb or at most 128kb. If you think the total allocation might be larger than that, you have to use vzalloc anyway but that should come with a long comment explaining what is going on.
kzalloc() doesn't guarantee any size bigger than PAGE_SIZE when system memory is fully fragmented. It should only be used for sizes less than or equal to PAGE_SIZE. Otherwise vzalloc() should be used to avoid failures due to fragmentation.
-KR
--nvpublic
Hi Arnd,
From: Arnd Bergmann arnd@arndb.de
Subject: Re: [Ce-android-mainline] [RFC][PATCH 1/1] gpu: ion: Add IOMMU heap allocator with IOMMU API
Date: Thu, 19 Jan 2012 17:27:19 +0100
Message-ID: 201201191627.19732.arnd@arndb.de
On Thursday 19 January 2012, Hiroshi Doyu wrote:
The main advantage of the IOMMU API is that it is able to separate multiple contexts, per user, so one application that is able to program a DMA engine cannot access memory that another application has mapped. Is that what you do with ion_iommu_heap_create(), i.e. that it gets called for each GPU context?
Yes. The concept of this part comes from the dma-mapping API code.
I don't understand. The dma-mapping code on top of IOMMU does *not* use one iommu context per GPU context, it uses one IOMMU context for everything together.
Tegra30 has an IOMMU ("SMMU") which has 4 ASIDs (domains)(*1), and currently each "smmu_iommu_domain_init()" call allocates an ASID. "smmu_iommu_attach_dev()" assigns a client device to an allocated domain. For example, I think that the following "dev[i]" could be per GPU/device context(?), and our SMMU "iommu_ops" supports multiple devices being attached to a domain. This is configurable in the SMMU.
for (i = 0; i < 4; i++) {
        map[i] = arm_iommu_create_mapping();
        arm_iommu_attach_device(&dev[i], map[i]);
}
Here:
  map[i] == asid == domain
  dev[i] == client
So I thought that theoretically the DMA mapping API could support one IOMMU context (domain) per GPU/device context, 1-to-1 and 1-to-n too. Also, there's a simple version of a DMA mapping test(*2).
Anyway, my previous answer was wrong. ion_iommu_heap does NOT support multiple contexts, only a single one right now. It needs something more to support multiple contexts.
*1 https://lkml.org/lkml/2012/1/5/27 *2 https://lkml.org/lkml/2012/1/5/28
+static int ion_buffer_allocate(struct ion_buffer *buf)
+{
int i, npages = NUM_PAGES(buf);
buf->pages = kmalloc(npages * sizeof(*buf->pages), GFP_KERNEL);
if (!buf->pages)
goto err_pages;
buf->sglist = vzalloc(npages * sizeof(*buf->sglist));
if (!buf->sglist)
goto err_sgl;
Using vzalloc here probably goes a bit too far, it's fairly expensive to use compared with kzalloc, not only do you have to maintain the page table entries for the new mapping, it also means you use up precious space in the vmalloc area and have to use small pages for accessing the data in it.
Should I use "kzalloc()" instead here?
Yes, that would be better, at least if you can prove that there is a reasonable upper bound on the size of the allocation. kmalloc/kzalloc usually work ok up to 16kb or at most 128kb. If you think the total allocation might be larger than that, you have to use vzalloc anyway but that should come with a long comment explaining what is going on.
More likely if you end up needing vzalloc because of the size of the allocation, you should see that as a sign that you are doing something wrong and you should use larger chunks than single allocations in order to cut down the number of sglist entries.
Another option would be to just use sg_alloc_table(), which was made for this exact purpose ;-)
Good to know;)
+static void *iommu_heap_map_kernel(struct ion_heap *heap,
struct ion_buffer *buf)
+{
int npages = NUM_PAGES(buf);
BUG_ON(!buf->pages);
buf->vaddr = vm_map_ram(buf->pages, npages, -1,
pgprot_noncached(pgprot_kernel));
pr_debug("va:%p\n", buf->vaddr);
WARN_ON(!buf->vaddr);
return buf->vaddr;
+}
You can't do this on ARMv6 or ARMv7: There is already a cacheable mapping of all entries in buf->pages, so you must not install a second noncached mapping. Either make sure you have only noncached pages, or only cached ones.
Ok, is there any example of how to maintain both mappings?
The dma_mapping implementation on ARM takes care of this by unmapping any allocation that goes through dma_alloc_coherent() from the linear mapping, at least in the CMA version that Marek did. In order to take advantage of this, you either need to call dma_alloc_coherent on a kernel that provides CMA, or call the CMA allocator directly (not generally recommended). Without CMA support, your only option is to use memory that was never part of the linear kernel mapping, but that would remove the main advantage of your patch, which is aimed at removing the need to take out memory from the linear mapping.
Using only cacheable memory would be valid here, but is probably not desired for performance reasons, or even allowed in ION because it requires explicit cache management functions for flushing the caches while handing data back and forth between software and the device.
Arnd
From: Hiroshi Doyu hdoyu@nvidia.com
Subject: Re: [Ce-android-mainline] [RFC][PATCH 1/1] gpu: ion: Add IOMMU heap allocator with IOMMU API
Date: Thu, 19 Jan 2012 16:15:38 +0200 (EET)
Message-ID: 20120119.161538.1686847422348460522.hdoyu@nvidia.com
Hi Arnd,
Thank you for your review. Some questions are inlined.
From: Arnd Bergmann arnd@arndb.de
Subject: Re: [Ce-android-mainline] [RFC][PATCH 1/1] gpu: ion: Add IOMMU heap allocator with IOMMU API
Date: Thu, 19 Jan 2012 13:04:34 +0100
Message-ID: 201201191204.35067.arnd@arndb.de
On Wednesday 18 January 2012, Hiroshi Doyu wrote:
Hi,
Recently we've implemented an IOMMU heap (attached) as one of the ION memory manager(*1) heaps/backends. This implementation is completely independent of any SoC, so it can be used for other SoCs as well. If our implementation is not totally wrong, it would be nice to share some experience/code here, since Ion is still not entirely clear to me yet.
I found that Linaro also seems to have started some ION work(*2). I think that some of Ion's features could be supported/replaced with Linaro UMM. For example, "ion_iommu_heap" is presently implemented with the standard IOMMU API, but could it also be implemented with the coming DMA API? Also, DMABUF could be used in the Ion core as well, I guess.
Currently there's no Ion memmgr code in the upstream "drivers/staging/android"(*3). Is there any plan to support this? Or is this something considered as a completely _temporary_ solution, and never going to be added?
It would be nice if we could share some of our effort here, since quite a few Android users need Ion, even if only temporarily.
Any comment would be really appreciated.
Hi,
Your implementation looks quite nice overall, but on the high level, I don't see what the benefit of using the IOMMU API is here instead of the dma-mapping API. I believe that all device drivers should by default use the dma-mapping interfaces, which may in turn be based on linear mapping (possibly using CMA) or an IOMMU.
Using dma-mapping API could be left for v2 of this patch.
FYI:
I've updated the patch with the DMA API (v5). This is just a kind of proof-of-concept that Ion can do mapping with the DMA API. This patch may change a lot if DMABUF is introduced into Ion.
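Not the actual v5 patch, but as a sketch, mapping a single page through the DMA API (with "dev" and "page" as placeholders) would look roughly like:

        dma_addr_t da = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
        if (dma_mapping_error(dev, da))
                return -ENOMEM;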