Hi all,
The following are Samsung S.LSI's requirements for a unified memory manager.
1. User space API

1.1. New memory management (MM) features should include the following in user space: UMP
  A. user space API for memory allocation from system memory: UMP
     Any user process can allocate memory from kernel space through the new MM model.
  B. user space API for cache operations: flush, clean, invalidate
     Any user process can perform cache operations on the allocated memory.
  C. user space API for mapping memory with the cacheable attribute
     When system memory is mapped into user space, the user process can mark it cacheable.
  D. user space API for mapping memory with the non-cacheable attribute
     When system memory is mapped into user space, the user process can mark it non-cacheable.

1.2. Inter-process memory sharing: UMP
New MM features should provide memory sharing between user processes.
  A. Memory allocated in user space can be shared between user processes.
  B. Memory allocated in kernel space can be shared between user processes.
2. Kernel space API

New MM features should include the following in kernel space: CMA, VCMM

2-1. Physical memory allocator
  A. kernel space API for contiguous memory allocation: CMA (*)
  B. kernel space API for non-contiguous memory allocation: VCMM (*)
  C. start address alignment: CMA, VCMM
  D. selectable allocation region: CMA
  (*) see the extensions at the bottom.

2-2. Device virtual address management: VCMM
New MM features should provide a way of managing device virtual addresses, as follows:
  A. IOMMU (System MMU) support
     An IOMMU is a kind of MMU, but one dedicated to a single device.
  B. device virtual address mapping for each device
  C. virtual memory allocation
  D. mapping / remapping between physical and device virtual addresses
  E. dedicated device virtual address space for each device
  F. address translation between address spaces
     U.V
    /   \
  K.V --- P.A
    \   /
     D.V
U.V: user space address
K.V: kernel space address
P.A: physical address
D.V: device virtual address
3. Extensions
  A. extension for custom physical memory allocator
  B. extension for custom MMU controller
-------------------------------------------------------------------------
You can find the implementation in the following git repository:
http://git.kernel.org/?p=linux/kernel/git/kki_ap/linux-2.6-samsung.git;a=tree;hb=refs/heads/2.6.36-samsung
1. UMP (Unified Memory Provider)
 - UMP is an auxiliary component which enables memory to be shared across different applications, drivers and hardware components.
 - http://blogs.arm.com/multimedia/249-making-the-mali-gpu-device-driver-open-source/page__cid__133__show__newcomment/
 - Suggested by ARM, not submitted yet.
 - implementation: drivers/media/video/samsung/ump/*

2. VCMM (Virtual Contiguous Memory Manager)
 - VCMM is a framework for dealing with multiple IOMMUs in a system through intuitive, abstract objects.
 - Submitted by Michal Nazarewicz @Samsung-SPRC
 - Also submitted by KyongHo Cho @Samsung-SYS.LSI
 - http://article.gmane.org/gmane.linux.kernel.mm/56912/match=vcm
 - implementation: include/linux/vcm.h, include/linux/vcm-drv.h, mm/vcm.c, arch/arm/plat-s5p/s5p-vcm.c, arch/arm/plat-s5p/include/plat/s5p-vcm.h

3. CMA (Contiguous Memory Allocator)
 - The Contiguous Memory Allocator (CMA) is a framework which allows setting up a machine-specific configuration for physically contiguous memory management. Memory for devices is then allocated according to that configuration.
 - http://lwn.net/Articles/396702/
 - http://www.spinics.net/lists/linux-media/msg26486.html
 - Submitted by Michal Nazarewicz @Samsung-SPRC
 - implementation: mm/cma.c, include/linux/cma.h

4. SYS.MMU
 - The System MMU supports address translation from VA to PA.
 - http://thread.gmane.org/gmane.linux.kernel.samsung-soc/3909
 - Submitted by Sangbeom Kim
 - Merged by Kukjin Kim, ARM/S5P ARM ARCHITECTURES maintainer
 - implementation: arch/arm/plat-s5p/sysmmu.c, arch/arm/plat-s5p/include/plat/sysmmu.h
On Monday 25 April 2011, 이상현 wrote:
Hi all,
The following are Samsung S.LSI's requirements for a unified memory manager.
- User space API
1.1. New memory management (MM) features should include the following in user space: UMP
  A. user space API for memory allocation from system memory: UMP
     Any user process can allocate memory from kernel space through the new MM model.

I don't think it can be that simple. You need to at least enforce some limits on the amount of nonswappable memory that each unprivileged process may allocate, or you need to limit which processes are allowed to allocate these buffers.
The alternative would be that regular user memory (e.g. from tmpfs) can be used for this and is only pinned inside of a DRM ioctl while passed to the hardware and released when returning from that, but that would not allow any long-running mappings to be mapped into user space.
B. user space API for cache operations: flush, clean, invalidate
   Any user process can perform cache operations on the allocated memory.
IMHO these should be on a much higher level. For an architecture independent API, you can not assume that cache management operations are available or necessary. I suggest building these on top of the dma mapping operations we have: dma_sync_to_device and dma_sync_to_cpu.
On fully coherent architectures, or when you have uncached mappings, these will simply be ignored, while for noncoherent cached mappings, they would turn into flush or invalidate cache operations.
Do you see any scenarios that cannot be built on top of these interfaces?
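To illustrate, a minimal driver-side sketch of that pattern, using the dma_sync_single_for_cpu()/dma_sync_single_for_device() calls that implement these operations; the buffer is assumed to have been set up earlier with dma_map_single():

/* Illustrative only: bracket CPU access to a streaming DMA buffer
 * with the existing dma_sync_single_* calls. On coherent systems
 * these are no-ops; on noncoherent ARM they clean/invalidate caches. */
#include <linux/dma-mapping.h>
#include <linux/string.h>

static void example_fill_and_sync(struct device *dev, void *cpu_buf,
				  dma_addr_t dma_handle, size_t size)
{
	/* Take the buffer back from the device before the CPU touches it. */
	dma_sync_single_for_cpu(dev, dma_handle, size, DMA_TO_DEVICE);

	memset(cpu_buf, 0, size);	/* CPU fills in the payload */

	/* Hand ownership back to the device; flushes caches if needed. */
	dma_sync_single_for_device(dev, dma_handle, size, DMA_TO_DEVICE);

	/* ... now start the DMA transfer ... */
}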
C. user space API for mapping memory with the cacheable attribute
   When system memory is mapped into user space, the user process can mark it cacheable.
D. user space API for mapping memory with the non-cacheable attribute
   When system memory is mapped into user space, the user process can mark it non-cacheable.
Again, I would define these on a higher level. In the kernel, we use dma_alloc_coherent and dma_alloc_noncoherent, where the first one implies that you don't need to issue the dma_sync_* commands while the second one requires you to use them.
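A sketch of the two flavours; the signatures below are the ones these helpers had in the 2.6.x kernels under discussion:

/* Illustrative only: the coherent buffer never needs dma_sync_*;
 * the noncoherent one is cacheable and must be synced around DMA. */
#include <linux/dma-mapping.h>

static int example_alloc_both(struct device *dev)
{
	dma_addr_t dma_c, dma_nc;
	void *coherent, *noncoherent;

	coherent = dma_alloc_coherent(dev, PAGE_SIZE, &dma_c, GFP_KERNEL);
	if (!coherent)
		return -ENOMEM;

	noncoherent = dma_alloc_noncoherent(dev, PAGE_SIZE, &dma_nc,
					    GFP_KERNEL);
	if (!noncoherent) {
		dma_free_coherent(dev, PAGE_SIZE, coherent, dma_c);
		return -ENOMEM;
	}

	/* ... use the buffers; only the noncoherent one needs syncs ... */

	dma_free_noncoherent(dev, PAGE_SIZE, noncoherent, dma_nc);
	dma_free_coherent(dev, PAGE_SIZE, coherent, dma_c);
	return 0;
}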
1.2. Inter-process memory sharing: UMP
New MM features should provide memory sharing between user processes.
A. Memory allocated by user space can be shared between user processes.
How is this different from regular posix shared memory?
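For reference, regular POSIX shared memory looks like this; any process that knows the segment name can map the same pages (the name "/mybuf" is just an example):

/* Plain POSIX shared memory: create/open a named segment and map it.
 * Link with -lrt on older glibc. */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static void *map_shared(const char *name, size_t size)
{
	void *p;
	int fd = shm_open(name, O_CREAT | O_RDWR, 0600);

	if (fd < 0)
		return NULL;
	if (ftruncate(fd, size) < 0) {
		close(fd);
		return NULL;
	}
	p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	return p == MAP_FAILED ? NULL : p;
}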
B. Memory allocated by kernel space can be shared between user processes.
- Kernel space API
New MM features should include the following in kernel space: CMA, VCMM
What does that mean? These are just TLAs, not specific features.
2-1. Physical memory allocator
  A. kernel space API for contiguous memory allocation: CMA (*)
  B. kernel space API for non-contiguous memory allocation: VCMM (*)
  C. start address alignment: CMA, VCMM
  D. selectable allocation region: CMA
  (*) see the extensions at the bottom.
You are confusing requirements with the implementation. You cannot require something that has been rejected as a patch. Let us discuss first what the requirement is, then how we can implement that.
2-2. Device virtual address management: VCMM
New MM features should provide a way of managing device virtual addresses, as follows:

A. IOMMU (System MMU) support
   An IOMMU is a kind of MMU, but one dedicated to a single device.

We have the iommu API for this. Do you need anything beyond that?
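For illustration, a minimal sketch of that iommu API as it looks in kernels of this era (note that iommu_map() took a page order here; later kernels take a byte size, and iommu_domain_alloc() later grew a bus argument):

/* Illustrative only: attach a device to its own IOMMU address space
 * and map one page of physical memory at a device virtual address. */
#include <linux/iommu.h>

static int example_iommu_map(struct device *dev, phys_addr_t paddr,
			     unsigned long iova)
{
	struct iommu_domain *domain;
	int ret;

	domain = iommu_domain_alloc();		/* one device address space */
	if (!domain)
		return -ENOMEM;

	ret = iommu_attach_device(domain, dev);	/* bind the device to it */
	if (ret) {
		iommu_domain_free(domain);
		return ret;
	}

	/* Map paddr at device virtual address iova (order 0 = one page). */
	ret = iommu_map(domain, iova, paddr, 0, IOMMU_READ | IOMMU_WRITE);
	if (ret) {
		iommu_detach_device(domain, dev);
		iommu_domain_free(domain);
	}
	return ret;
}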
B. device virtual address mapping for each device
This seems to refer to the dma-mapping API.
C. virtual memory allocation
vmalloc?
D. mapping / remapping between phys and device virtual address
What is the difference from B?
E. dedicated device virtual address space for each device
This is a hardware feature that some systems have, but others have not. I guess what you mean is that we need to be able to deal with both kinds of hardware, right?
You can find the implementation in the following git repository:
http://git.kernel.org/?p=linux/kernel/git/kki_ap/linux-2.6-samsung.git;a=tree;hb=refs/heads/2.6.36-samsung
- UMP (Unified Memory Provider)
- The UMP is an auxiliary component which enables memory to be shared across different applications, drivers and hardware components.
- http://blogs.arm.com/multimedia/249-making-the-mali-gpu-device-driver-open-source/page__cid__133__show__newcomment/
- Suggested by ARM, not submitted yet.
- implementation drivers/media/video/samsung/ump/*
Last time I looked at UMP, it was in a rather bad shape, and I would not consider this worth using or building on top of. Has there been progress on this front?
- VCMM (Virtual Contiguous Memory Manager)
- The VCMM is a framework to deal with multiple IOMMUs in a system with intuitive and abstract objects
- Submitted by Michal Nazarewicz @Samsung-SPRC
- Also submitted by KyongHo Cho @Samsung-SYS.LSI
- http://article.gmane.org/gmane.linux.kernel.mm/56912/match=vcm
- implementation: include/linux/vcm.h, include/linux/vcm-drv.h, mm/vcm.c, arch/arm/plat-s5p/s5p-vcm.c, arch/arm/plat-s5p/include/plat/s5p-vcm.h
This really needs to be integrated into the existing APIs. The code looks clean and well documented, but it duplicates functionality that is already available elsewhere, which is something that Marek is already working on.
- CMA (Contiguous Memory Allocator)
- The Contiguous Memory Allocator (CMA) is a framework, which allows setting up a machine-specific configuration for physically-contiguous memory management. Memory for devices is then allocated according to that configuration.
- http://lwn.net/Articles/396702/
- http://www.spinics.net/lists/linux-media/msg26486.html
- Submitted by Michal Nazarewicz @Samsung-SPRC
- implementation mm/cma.c include/linux/cma.h
This should probably be hidden behind the dma-mapping API as well, instead of having an interface visible to the device drivers.
- SYS.MMU
- The System MMU supports address translation from VA to PA.
- http://thread.gmane.org/gmane.linux.kernel.samsung-soc/3909
- Submitted by Sangbeom Kim
- Merged by Kukjin Kim, ARM/S5P ARM ARCHITECTURES maintainer
- implementation arch/arm/plat-s5p/sysmmu.c arch/arm/plat-s5p/include/plat/sysmmu.h
About to go away once replaced with standard iommu code, hopefully.
Arnd
-----Original Message-----
From: Arnd Bergmann [mailto:arnd@arndb.de]
Sent: Wednesday, April 27, 2011 12:43 AM
To: linaro-mm-sig@lists.linaro.org; sanghyun75.lee@samsung.com; Marek Szyprowski
Subject: Re: [Linaro-mm-sig] Requirements for Memory Management
On Monday 25 April 2011, 이상현 wrote:
Hi all,
The following are Samsung S.LSI's requirements for a unified memory manager.
- User space API
1.1. New memory management (MM) features should include the following in user space: UMP
  A. user space API for memory allocation from system memory: UMP
     Any user process can allocate memory from kernel space through the new MM model.

I don't think it can be that simple. You need to at least enforce some limits on the amount of nonswappable memory that each unprivileged process may allocate, or you need to limit which processes are allowed to allocate these buffers.
The alternative would be that regular user memory (e.g. from tmpfs) can be used for this and is only pinned inside of a DRM ioctl while passed to the hardware and released when returning from that, but that would not allow any long-running mappings to be mapped into user space.
Hi, my name is Sanghyun Lee. Thanks for your comments.
As you wrote above, it can't be simple, but we need it: we need to map user-allocated memory (malloc) to a device that has a System MMU (= IOMMU). For now we are thinking of a new user space API, so that memory is allocated in kernel space and mapped into user space through that new API, because some projects are not based on DRM/DRI and the X window system. But I guess the IOMMU APIs will be helpful for this.

B. user space API for cache operations: flush, clean, invalidate
   Any user process can perform cache operations on the allocated memory.
IMHO these should be on a much higher level. For an architecture independent API, you can not assume that cache management operations are available or necessary. I suggest building these on top of the dma mapping operations we have: dma_sync_to_device and dma_sync_to_cpu.
On fully coherent architectures, or when you have uncached mappings, these will simply be ignored, while for noncoherent cached mappings, they would turn into flush or invalidate cache operations.
Do you see any scenarios that cannot be built on top of these interfaces?
In most use cases there is no scenario that calls cache APIs from user space, except 3D. Actually, I don't know why 3D requires this API for memory management on the user side; I'll check on that use case.

C. user space API for mapping memory with the cacheable attribute
   When system memory is mapped into user space, the user process can mark it cacheable.
D. user space API for mapping memory with the non-cacheable attribute
   When system memory is mapped into user space, the user process can mark it non-cacheable.
Again, I would define these on a higher level. In the kernel, we use dma_alloc_coherent and dma_alloc_noncoherent, where the first one implies that you don't need to issue the dma_sync_* commands while the second one requires you to use them.
1.2. Inter-process memory sharing: UMP
New MM features should provide memory sharing between user processes.
A. Memory allocated by user space can be shared between user processes.
How is this different from regular posix shared memory?
It means that user-allocated memory should be mappable into other user processes.
B. Memory allocated by kernel space can be shared between user processes.
- Kernel space API
New MM features should include the following in kernel space: CMA, VCMM
What does that mean? These are just TLAs, not specific features.
We need two memory allocators: one for physically contiguous memory and another for physically discontiguous memory.

2-1. Physical memory allocator
  A. kernel space API for contiguous memory allocation: CMA (*)
  B. kernel space API for non-contiguous memory allocation: VCMM (*)
  C. start address alignment: CMA, VCMM
  D. selectable allocation region: CMA
  (*) see the extensions at the bottom.
You are confusing requirements with the implementation. You cannot require something that has been rejected as a patch. Let us discuss first what the requirement is, then how we can implement that.
These implementations are our reference for memory allocation.
2-2. Device virtual address management: VCMM
New MM features should provide a way of managing device virtual addresses, as follows:

A. IOMMU (System MMU) support
   An IOMMU is a kind of MMU, but one dedicated to a single device.

We have the iommu API for this. Do you need anything beyond that?
B. device virtual address mapping for each device
This seems to refer to the dma-mapping API.
C. virtual memory allocation
vmalloc?
D. mapping / remapping between phys and device virtual address
What is the difference from B?
E. dedicated device virtual address space for each device
This is a hardware feature that some systems have, but others have not. I guess what you mean is that we need to be able to deal with both kinds of hardware, right?
All your comments on the IOMMU are clear and will be useful for us. Though the IOMMU is a hardware feature that only some systems have, it needs to be considered at the Linaro UDS.

---
You can find the implementation in the following git repository:
http://git.kernel.org/?p=linux/kernel/git/kki_ap/linux-2.6-samsung.git;a=tree;hb=refs/heads/2.6.36-samsung
- UMP (Unified Memory Provider)
- The UMP is an auxiliary component which enables memory to be shared across different applications, drivers and hardware components.
http://blogs.arm.com/multimedia/249-making-the-mali-gpu-device-driver-open-source/page__cid__133__show__newcomment/
- Suggested by ARM, Not submitted yet.
- implementation drivers/media/video/samsung/ump/*
Last time I looked at UMP, it was in a rather bad shape, and I would not consider this worth using or building on top of. Has there been progress on this front?
IMHO it needs some modifications. We just used UMP as-is.
- VCMM (Virtual Contiguous Memory Manager)
- The VCMM is a framework to deal with multiple IOMMUs in a system with intuitive and abstract objects
- Submitted by Michal Nazarewicz @Samsung-SPRC
- Also submitted by KyongHo Cho @Samsung-SYS.LSI
- http://article.gmane.org/gmane.linux.kernel.mm/56912/match=vcm
- implementation: include/linux/vcm.h, include/linux/vcm-drv.h, mm/vcm.c, arch/arm/plat-s5p/s5p-vcm.c, arch/arm/plat-s5p/include/plat/s5p-vcm.h
This really needs to be integrated into the existing APIs. The code looks clean and well documented, but it duplicates functionality that is already available elsewhere, which is something that Marek is already working on.
I heard that Zach will present these issues at UDS.
- CMA (Contiguous Memory Allocator)
- The Contiguous Memory Allocator (CMA) is a framework, which allows setting up a machine-specific configuration for physically-contiguous memory management. Memory for devices is then allocated according to that configuration.
- http://lwn.net/Articles/396702/
- http://www.spinics.net/lists/linux-media/msg26486.html
- Submitted by Michal Nazarewicz @Samsung-SPRC
- implementation mm/cma.c include/linux/cma.h
This should probably be hidden behind the dma-mapping API as well, instead of having an interface visible to the device drivers.

As to this, Marek (from Samsung Poland) will present CMA at UDS.
- SYS.MMU
- The System MMU supports address translation from VA to PA.
- http://thread.gmane.org/gmane.linux.kernel.samsung-soc/3909
- Submitted by Sangbeom Kim
- Merged by Kukjin Kim, ARM/S5P ARM ARCHITECTURES maintainer
- implementation arch/arm/plat-s5p/sysmmu.c arch/arm/plat-s5p/include/plat/sysmmu.h
About to go away once replaced with standard iommu code, hopefully.
Arnd
- User space API
1.1. New memory management (MM) features should include the following in user space: UMP
  A. user space API for memory allocation from system memory: UMP
     Any user process can allocate memory from kernel space through the new MM model.

I don't think it can be that simple. You need to at least enforce some limits on the amount of nonswappable memory that each unprivileged process may allocate, or you need to limit which processes are allowed to allocate these buffers.
The alternative would be that regular user memory (e.g. from tmpfs) can be used for this and is only pinned inside of a DRM ioctl while passed to the hardware and released when returning from that, but that would not allow any long-running mappings to be mapped into user space.
Hi, my name is Sanghyun Lee. Thanks for your comments.
As you wrote above, it can't be simple, but we need it: we need to map user-allocated memory (malloc) to a device that has a System MMU (= IOMMU). For now we are thinking of a new user space API, so that memory is allocated in kernel space and mapped into user space through that new API, because some projects are not based on DRM/DRI and the X window system. But I guess the IOMMU APIs will be helpful for this.

There are a couple of things to keep in mind about IOMMU mappings of malloc'd memory. The first is that the demand paging system that supports malloc doesn't guarantee anything about the number and type of mappings used to satisfy a request. This can cause performance problems. With page-sized mappings the few entries available in the TLBs will be quickly used up. On a TLB miss the IOMMU will stall the devices behind it while it pulls the new mapping into the TLB. If these mappings are done with 1 MB mappings instead of 4 KB (page-sized) mappings, then these misses happen much less frequently. This fine-grained control means you really need a device-aware driver that knows how memory should be mapped for a given device. Everyone's been writing these, but no one's agreed on a common interface.

The other thing about malloc'd memory is that the kernel doesn't actually commit any physical memory to the request until someone writes to it (copy-on-write). On a write the kernel will fault in an anonymous page, whether malloc uses sbrk() or mmap(). What this means is that an IOMMU map can't be set up until after the page has been mapped in. I can certainly see the allure of mapping malloc'd memory through an IOMMU, but unless people are careful it can cause many problems and may limit performance. Plus it doesn't handle permissions - the fd approach might handle that. Maybe we should just collect all the drivers and all agree on a set of IOCTLs?
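To make the fault-in point concrete, here is a user space sketch that forces every page of a malloc'd buffer to be populated before handing it to a driver; note the driver still has to pin the pages itself (e.g. with get_user_pages()) before building an IOMMU mapping:

/* Illustrative only: populate a malloc'd buffer up front so physical
 * backing exists before any IOMMU mapping could be built for it. */
#include <stdlib.h>
#include <sys/mman.h>

static int prepare_buffer(void **bufp, size_t size)
{
	void *buf = malloc(size);

	if (!buf)
		return -1;

	/* mlock() forces the kernel to fault in every page now and
	 * keep it resident, instead of waiting for the first write. */
	if (mlock(buf, size) < 0) {
		free(buf);
		return -1;
	}
	*bufp = buf;
	return 0;	/* now safe to pass to the driver */
}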
On Sunday 01 May 2011 20:11:59 Zach Pfeffer wrote:
- User space API
1.1. New memory management (MM) features should include the following in user space: UMP
  A. user space API for memory allocation from system memory: UMP
     Any user process can allocate memory from kernel space through the new MM model.

I don't think it can be that simple. You need to at least enforce some limits on the amount of nonswappable memory that each unprivileged process may allocate, or you need to limit which processes are allowed to allocate these buffers.
The alternative would be that regular user memory (e.g. from tmpfs) can be used for this and is only pinned inside of a DRM ioctl while passed to the hardware and released when returning from that, but that would not allow any long-running mappings to be mapped into user space.
Hi, my name is Sanghyun Lee. Thanks for your comments.
As you wrote above, it can't be simple, but we need it: we need to map user-allocated memory (malloc) to a device that has a System MMU (= IOMMU). For now we are thinking of a new user space API, so that memory is allocated in kernel space and mapped into user space through that new API, because some projects are not based on DRM/DRI and the X window system. But I guess the IOMMU APIs will be helpful for this.

There are a couple of things to keep in mind about IOMMU mappings of malloc'd memory. The first is that the demand paging system that supports malloc doesn't guarantee anything about the number and type of mappings used to satisfy a request. This can cause performance problems. With page-sized mappings the few entries available in the TLBs will be quickly used up. On a TLB miss the IOMMU will stall the devices behind it while it pulls the new mapping into the TLB. If these mappings are done with 1 MB mappings instead of 4 KB (page-sized) mappings, then these misses happen much less frequently. This fine-grained control means you really need a device-aware driver that knows how memory should be mapped for a given device. Everyone's been writing these, but no one's agreed on a common interface.

The other thing about malloc'd memory is that the kernel doesn't actually commit any physical memory to the request until someone writes to it (copy-on-write). On a write the kernel will fault in an anonymous page, whether malloc uses sbrk() or mmap(). What this means is that an IOMMU map can't be set up until after the page has been mapped in.
The OMAP3 ISP (V4L2) driver supports that. It will call get_user_pages() on the buffer passed by userspace, so pages will be faulted in at that point.
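For illustration, the usual shape of that pattern in drivers of this era, using the eight-argument get_user_pages() of the 2.6.3x kernels:

/* Illustrative only: fault in and pin the pages backing a user buffer
 * so they can then be mapped through an IOMMU or used for DMA. */
#include <linux/mm.h>
#include <linux/sched.h>

static int pin_user_buffer(unsigned long uaddr, int nr_pages,
			   struct page **pages)
{
	int ret;

	down_read(&current->mm->mmap_sem);
	ret = get_user_pages(current, current->mm, uaddr, nr_pages,
			     1 /* write */, 0 /* force */, pages, NULL);
	up_read(&current->mm->mmap_sem);

	return ret;	/* number of pages actually pinned, or -errno */
}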
I can certainly see the allure of mapping malloc'd memory through an IOMMU, but unless people are careful it can cause many problems and may limit performance. Plus it doesn't handle permissions - the fd approach might handle that. Maybe we should just collect all the drivers and all agree on a set of IOCTLs?
On Tuesday 03 May 2011, Laurent Pinchart wrote:
The other thing about malloc'd memory is that the kernel doesn't actually commit any physical memory to the request until someone writes to it (copy-on-write). On a write the kernel will fault in an anonymous page, whether malloc uses sbrk() or mmap(). What this means is that an IOMMU map can't be set up until after the page has been mapped in.
The OMAP3 ISP (V4L2) driver supports that. It will call get_user_pages() on the buffer passed by userspace, so pages will be faulted in at that point.
A lot of drivers do that, e.g. all of infiniband is built around this. It's not black magic, but I would usually prefer allocating memory in kernel to pass it to user space to avoid a lot of the potential issues.
For instance, when memory is already mapped into user space, you should no longer change its page attributes (cached/write-combined/...) and you have to deal with arbitrarily strange user pointers passed into the kernel.
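A sketch of that model: a hypothetical driver mmap() handler that picks the page attributes exactly once, when mapping a buffer it allocated earlier (buf_phys is assumed to have been set at allocation time):

/* Illustrative driver mmap(): page attributes are fixed at map time
 * and never changed afterwards. */
#include <linux/fs.h>
#include <linux/mm.h>

static phys_addr_t buf_phys;	/* assumed: recorded when the driver
				 * allocated the buffer earlier */

static int example_mmap(struct file *file, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;

	/* Decide cacheability here, before any user mapping exists. */
	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

	return remap_pfn_range(vma, vma->vm_start,
			       buf_phys >> PAGE_SHIFT,
			       size, vma->vm_page_prot);
}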
Arnd
On 05/03/2011 09:35 AM, Arnd Bergmann wrote:
On Tuesday 03 May 2011, Laurent Pinchart wrote:
The other thing about malloc'd memory is that the kernel doesn't actually commit any physical memory to the request until someone writes to it (copy-on-write). On a write the kernel will fault in an anonymous page, whether malloc uses sbrk() or mmap(). What this means is that an IOMMU map can't be set up until after the page has been mapped in.
The OMAP3 ISP (V4L2) driver supports that. It will call get_user_pages() on the buffer passed by userspace, so pages will be faulted in at that point.
A lot of drivers do that, e.g. all of infiniband is built around this. It's not black magic, but I would usually prefer allocating memory in kernel to pass it to user space to avoid a lot of the potential issues.
For instance, when memory is already mapped into user space, you should no longer change its page attributes (cached/write-combined/...) and you have to deal with arbitrarily strange user pointers passed into the kernel.
I would venture a guess that most of these graphics solutions that use user space based memory do so because there was no suitable kernel level API available at the time.
I've been trying to look at use cases and outside of the possibility of using shmem backed memory, I'm not really seeing anything that wouldn't be easier and safer with the memory allocated in the kernel.
It will have to be a longer term goal of this group that whatever solution we use comes with the necessary EGL extensions to facilitate creating and using the shared memory (e.g. EGL_MESA_drm_image) so that people will have less need to create their own one-off EGL image extensions to pass about questionable memory.
Jordan
On Sunday 01 May 2011, 이상현 wrote:
From: Arnd Bergmann [mailto:arnd@arndb.de] On Monday 25 April 2011, 이상현 wrote:
- User space API
1.1. New memory management (MM) features should include the following in user space: UMP
  A. user space API for memory allocation from system memory: UMP
     Any user process can allocate memory from kernel space through the new MM model.

I don't think it can be that simple. You need to at least enforce some limits on the amount of nonswappable memory that each unprivileged process may allocate, or you need to limit which processes are allowed to allocate these buffers.
The alternative would be that regular user memory (e.g. from tmpfs) can be used for this and is only pinned inside of a DRM ioctl while passed to the hardware and released when returning from that, but that would not allow any long-running mappings to be mapped into user space.
As you wrote above, it can't be simple, but we need it: we need to map user-allocated memory (malloc) to a device that has a System MMU (= IOMMU). For now we are thinking of a new user space API, so that memory is allocated in kernel space and mapped into user space through that new API, because some projects are not based on DRM/DRI and the X window system. But I guess the IOMMU APIs will be helpful for this.

The iommu API solves some of the low-level problems in the kernel, which we need to solve, but the user space interface is another problem.
For the user space interface, my feeling is that DRM should be the first choice, and we'd need a really good reason to come up with something different. As you said, some projects today use DRM, while others don't. It's probably far less work to convert only the ones that don't use DRM to use it, rather than changing all applications that interact with the graphics subsystem to use something completely new.
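To make that concrete, this is roughly what allocation through DRM already looks like from user space with the recently merged "dumb buffer" ioctls (assuming a driver that implements them; error handling trimmed):

/* Illustrative only: allocate a buffer through the DRM dumb-buffer
 * ioctls and map it into the process. */
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm/drm.h>
#include <drm/drm_mode.h>

static void *alloc_drm_buffer(size_t *out_size)
{
	struct drm_mode_create_dumb create;
	struct drm_mode_map_dumb map;
	int fd = open("/dev/dri/card0", O_RDWR);

	if (fd < 0)
		return NULL;

	memset(&create, 0, sizeof(create));
	create.width = 1280;
	create.height = 720;
	create.bpp = 32;
	ioctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &create);	/* allocate */

	memset(&map, 0, sizeof(map));
	map.handle = create.handle;
	ioctl(fd, DRM_IOCTL_MODE_MAP_DUMB, &map);	/* get mmap offset */

	*out_size = create.size;
	return mmap(NULL, create.size, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, map.offset);
}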
C. user space API for mapping memory with the cacheable attribute
   When system memory is mapped into user space, the user process can mark it cacheable.
D. user space API for mapping memory with the non-cacheable attribute
   When system memory is mapped into user space, the user process can mark it non-cacheable.
Again, I would define these on a higher level. In the kernel, we use dma_alloc_coherent and dma_alloc_noncoherent, where the first one implies that you don't need to issue the dma_sync_* commands while the second one requires you to use them.
1.2. Inter-process memory sharing: UMP
New MM features should provide memory sharing between user processes.
A. Memory allocated by user space can be shared between user processes.
How is this different from regular posix shared memory?
It means that user-allocated memory should be mappable into other user processes.
Oh. Why would you do that? I recommend being extremely careful with what kind of memory you allow to be shared. There are a lot of possibilities for deadlocks and other problems with certain kinds of memory. Things that I can see going really bad include:
* an mmapped file on a remote file system like NFS
* hardware memory mapped into one process using strange attributes (huge pages, noncacheable, ...)
* memory that a process has mapped from another process using the same interface
* raw sockets
Rather than having a blacklist that forbids the really awkward cases, I would make a very short whitelist of stuff that can back memory that you allow to be shared, e.g. only tmpfs mappings.
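A sketch of what the tmpfs-whitelist model implies for sharing: the buffer travels between processes as a file descriptor over a Unix socket, never as a raw pointer:

/* Illustrative only: pass a tmpfs-backed buffer fd to another process
 * over a Unix domain socket with SCM_RIGHTS. */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

static int send_buffer_fd(int sock, int buf_fd)
{
	char dummy = 'x';
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	char ctrl[CMSG_SPACE(sizeof(int))];
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl;
	msg.msg_controllen = sizeof(ctrl);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;	/* the payload is a file descriptor */
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &buf_fd, sizeof(int));

	return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}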
B. Memory allocated by kernel space can be shared between user processes.
- Kernel space API
New MM features should include the following in kernel space: CMA, VCMM
What does that mean? These are just TLAs, not specific features.
We need two memory allocators: one for physically contiguous memory and another for physically discontiguous memory.
No. We already have allocators for these, they are called alloc_pages() and vmalloc() ;-)
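For reference, a minimal sketch of those two existing allocators side by side:

/* alloc_pages() returns physically contiguous pages; vmalloc() returns
 * memory that is only virtually contiguous, built from scattered
 * physical pages. */
#include <linux/gfp.h>
#include <linux/vmalloc.h>

static void example_existing_allocators(void)
{
	struct page *pages = alloc_pages(GFP_KERNEL, 2); /* 4 contiguous pages */
	void *virt = vmalloc(1 << 20);			 /* 1 MB, scattered */

	if (pages)
		__free_pages(pages, 2);
	if (virt)
		vfree(virt);
}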
Doing new infrastructure of course is extremely hard and introducing a completely new in-kernel interface for anything is usually a bad idea. Most importantly, when you design something new, it should fit into the related APIs that are already there, to make it easier for people to use the new infrastructure.
When there are deficiencies with the interfaces that are already there, we should put all the effort into fixing these interfaces, rather than doing something new. It's clear from the discussions that we had so far that the dma-mapping interface is not enough, but I think that is totally fixable, and I definitely encourage anyone to point out flaws and come up with patches to make that better.
In case of VCMM, have a look at how the powerpc implementation integrates virtually contiguous allocations into the dma-mapping API, at arch/powerpc/kernel/iommu.c.
For CMA, I've looked at the patches again, and they clearly do far too much at once. I agree that we need something better to back contiguous allocations, especially when we want them to be outside of the kernel linear mapping. See the discussion on "Memory region attribute bits and multiple mappings" for this. Once we have an agreement on where the memory should come from (reserved at boot time as in CMA, unmapped from the linear mapping, or from highmem), we can start thinking about how to assign the memory to devices, with an API that works both for the IOMMU and linear cases.
The question of how to export buffers to user space is completely unrelated to this, and should not at all be part of the allocation code. This belongs into the subsystems that deal with the memory (videobuf2 and DRM), and it's hard enough to get those to work together.
Arnd
Hello,
On Monday, May 02, 2011 10:23 PM Arnd Bergmann wrote:
(snipped)
For CMA, I've looked at the patches again, and they clearly do far too much at once. I agree that we need something better to back contiguous allocations, especially when we want them to be outside of the kernel linear mapping.
Please note that the main (core) CMA functionality is just the following four functions:
cma_create()/cma_destroy() for creating CMA memory area and cm_alloc()/cm_free() for allocating and freeing a buffer.
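A rough usage sketch of how those four calls fit together; note that only the function names come from the patchset, while the types and argument lists below are assumptions for illustration:

/* Hypothetical usage sketch: argument lists are assumed, not the
 * signatures from the posted patches. */
static int example_cma_usage(phys_addr_t base_phys)
{
	struct cma *region;
	unsigned long addr;

	region = cma_create(base_phys, 16 << 20);	/* claim a 16 MB area */
	if (!region)
		return -ENOMEM;

	addr = cm_alloc(region, 1 << 20, PAGE_SIZE);	/* 1 MB contiguous */
	/* ... hand the (physical) address to the device ... */
	cm_free(addr);

	cma_destroy(region);
	return 0;
}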
The cma-region.h API and the respective vb2 allocator have been provided only for testing purposes, as a drop-in replacement for our older solution. It allowed us to test the latest version with multimedia drivers on real hardware.
Best regards