Hello everyone,
At the Linaro Memory Management meeting in Budapest (May 2011) we
discussed the design of the DMA mapping framework. We tried to
identify its drawbacks and limitations, as well as to provide some
solutions for them. The discussion was mainly about the ARM architecture,
but some of the conclusions need to be applied to cross-architecture code.
The first issue we identified is the fact that on some platforms (again,
mainly ARM) there are several functions for allocating DMA buffers:
dma_alloc_coherent, dma_alloc_writecombine and dma_alloc_noncoherent
(not functional now). For each of them there is a matching dma_free_*
function. This gives us quite a lot of functions in the public API and
complicates things when we need to have several different
implementations for different devices selected at runtime (for example,
if an IOMMU controller is available only for a few devices in the
system). Also, the drivers which use the less common variants are less
portable because of the lack of dma_alloc_writecombine on other
architectures.
The solution we found is to introduce new public DMA mapping functions
with an additional attributes argument: dma_alloc_attrs() and
dma_free_attrs(). This way all the different kinds of architecture-specific
buffer mappings can be hidden behind the attributes, without the need to
create several versions of the dma_alloc_ function.
dma_alloc_coherent() can be wrapped on top of the new dma_alloc_attrs()
with a NULL attrs parameter. dma_alloc_writecombine and
dma_alloc_noncoherent can be implemented as simple wrappers which set
the attributes to DMA_ATTRS_WRITECOMBINE or DMA_ATTRS_NON_CONSISTENT
respectively. These new attributes will be implemented only on the
architectures that really support them; the others will simply ignore
them, defaulting to the dma_alloc_coherent equivalent.
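
The wrapper idea can be sketched in plain userspace C. This is only a
model under stated assumptions, not kernel code: every model_* and MODEL_*
name, the attribute bit values and the use of malloc() are made up for
illustration, and the real dma_alloc_attrs() signature is not reproduced
here. It shows one attribute-taking entry point, thin wrappers on top of
it, and an architecture that silently ignores an attribute it does not
support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical attribute bits; the values are invented for this model. */
enum model_attr {
    MODEL_ATTR_WRITECOMBINE   = 1u << 0,
    MODEL_ATTR_NON_CONSISTENT = 1u << 1
};

/* Attribute set; a NULL pointer means "plain coherent allocation". */
struct model_attrs {
    unsigned int flags;
};

/* Kind of mapping the allocation ended up with. */
enum model_mapping { MAP_COHERENT, MAP_WRITECOMBINE, MAP_NON_CONSISTENT };

struct model_buf {
    void *cpu_addr;
    enum model_mapping kind;
};

/* This pretend architecture supports write-combine only. */
static const unsigned int model_arch_supported = MODEL_ATTR_WRITECOMBINE;

/* Single entry point: every allocation variant funnels through here. */
int model_alloc_attrs(struct model_buf *buf, size_t size,
                      const struct model_attrs *attrs)
{
    /* Unsupported attributes are masked out, i.e. silently ignored. */
    unsigned int flags = attrs ? (attrs->flags & model_arch_supported) : 0;

    assert(buf != NULL && size > 0);
    buf->cpu_addr = malloc(size);
    if (buf->cpu_addr == NULL)
        return -1;

    if (flags & MODEL_ATTR_WRITECOMBINE)
        buf->kind = MAP_WRITECOMBINE;
    else if (flags & MODEL_ATTR_NON_CONSISTENT)
        buf->kind = MAP_NON_CONSISTENT;
    else
        buf->kind = MAP_COHERENT;
    return 0;
}

void model_free_attrs(struct model_buf *buf)
{
    free(buf->cpu_addr);
    buf->cpu_addr = NULL;
}

/* The old entry points become thin wrappers. */
int model_alloc_coherent(struct model_buf *buf, size_t size)
{
    return model_alloc_attrs(buf, size, NULL);
}

int model_alloc_writecombine(struct model_buf *buf, size_t size)
{
    struct model_attrs attrs = { MODEL_ATTR_WRITECOMBINE };
    return model_alloc_attrs(buf, size, &attrs);
}

int model_alloc_noncoherent(struct model_buf *buf, size_t size)
{
    struct model_attrs attrs = { MODEL_ATTR_NON_CONSISTENT };
    return model_alloc_attrs(buf, size, &attrs);
}
```

In this model a driver calling model_alloc_noncoherent() on the pretend
architecture simply gets a coherent buffer, which is the fallback
behaviour described above.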
The next step in the DMA mapping framework update is the introduction of
the dma_mmap/dma_mmap_attrs() functions. There are a number of drivers
(mainly V4L2 and ALSA) that only export the DMA buffers to user space.
Creating a userspace mapping with correct page attributes is not an easy
task for the driver. Also, the DMA-mapping framework is the only place
where the complete information about the allocated pages is available,
especially if the implementation uses an IOMMU controller to provide a
buffer which is contiguous in DMA address space but scattered in
physical memory space.
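
A small userspace model may help to show why the mapping has to be built
by the framework. It is a sketch under assumptions, not the proposed
dma_mmap() implementation: the model_* types, the 4 KiB page size and the
"page table" array are all hypothetical. The point is that the device sees
one contiguous DMA range, while the user mapping must be assembled page by
page from a frame list which only the allocator knows.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u

/* Framework-private record of one allocation (hypothetical layout). */
struct model_dma_buf {
    unsigned long dma_addr;     /* contiguous as seen by the device */
    size_t nr_pages;
    const unsigned long *pfns;  /* physical frames, possibly scattered */
};

/* One entry of the pretend user page table. */
struct model_pte {
    unsigned long user_addr;
    unsigned long pfn;
};

/*
 * Map the buffer at user_base, one page table entry per buffer page, in
 * DMA order. Returns the number of entries written, or -1 if the table
 * is too small.
 */
int model_dma_mmap(const struct model_dma_buf *buf, unsigned long user_base,
                   struct model_pte *ptes, size_t max_ptes)
{
    size_t i;

    assert(buf != NULL && ptes != NULL);
    if (buf->nr_pages > max_ptes)
        return -1;

    for (i = 0; i < buf->nr_pages; i++) {
        ptes[i].user_addr = user_base + i * MODEL_PAGE_SIZE;
        ptes[i].pfn = buf->pfns[i];
    }
    return (int)buf->nr_pages;
}
```

A driver that only knows dma_addr and the size could not produce the pfn
column of this table on its own, which is the argument for a common
dma_mmap helper.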
Usually these drivers don't touch the buffer data at all, so the mapping
in kernel virtual address space is not needed. We can introduce a
DMA_ATTRIB_NO_KERNEL_MAPPING attribute which lets the kernel skip the
creation of the kernel virtual mapping. This way we can save precious
vmalloc area and simplify some mapping operations on a few architectures.
This patch series is a preparation for the above changes in the public
DMA mapping API. The main goal is to modify the dma_map_ops structure and
let all users use it for the implementation of the new public functions.
The proof-of-concept patches for the ARM architecture have already been
posted a few times and now they are working reasonably well. They perform
the conversion to a dma_map_ops based implementation and add support for
a generic IOMMU-based DMA mapping implementation. To get them merged we
first need to get acceptance for the changes in the common,
cross-architecture structures. More information about these patches can
be found in the following threads:
http://www.spinics.net/lists/linux-mm/msg19856.html
http://www.spinics.net/lists/linux-mm/msg21241.html
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000571.html
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000577.html
http://www.spinics.net/lists/linux-mm/msg25490.html
The patches are prepared on top of Linux Kernel v3.2-rc6. I would
appreciate any comments and help with getting this patch series into
linux-next tree.
The ideas applied in this patch set have also been presented during the
Kernel Summit 2011 and ELC-E 2011 in Prague, in the presentation 'ARM
DMA-Mapping Framework Redesign and IOMMU integration'.
I'm really sorry if I missed any of the relevant architecture mailing
lists. I did my best to include everyone. Feel free to forward this
patchset to all interested developers and maintainers. I already feel
like a nasty spammer.
Best regards
Marek Szyprowski
Samsung Poland R&D Center
Patch summary:
Andrzej Pietrasiewicz (9):
X86: adapt for dma_map_ops changes
MIPS: adapt for dma_map_ops changes
PowerPC: adapt for dma_map_ops changes
IA64: adapt for dma_map_ops changes
SPARC: adapt for dma_map_ops changes
Alpha: adapt for dma_map_ops changes
SH: adapt for dma_map_ops changes
Microblaze: adapt for dma_map_ops changes
Unicore32: adapt for dma_map_ops changes
Marek Szyprowski (5):
common: dma-mapping: introduce alloc_attrs and free_attrs methods
common: dma-mapping: remove old alloc_coherent and free_coherent
methods
common: dma-mapping: introduce mmap method
common: DMA-mapping: add WRITE_COMBINE attribute
common: DMA-mapping: add NON-CONSISTENT attribute
Documentation/DMA-attributes.txt | 19 +++++++++++++++++++
arch/alpha/include/asm/dma-mapping.h | 18 ++++++++++++------
arch/alpha/kernel/pci-noop.c | 10 ++++++----
arch/alpha/kernel/pci_iommu.c | 10 ++++++----
arch/ia64/hp/common/sba_iommu.c | 11 ++++++-----
arch/ia64/include/asm/dma-mapping.h | 18 ++++++++++++------
arch/ia64/kernel/pci-swiotlb.c | 9 +++++----
arch/ia64/sn/pci/pci_dma.c | 9 +++++----
arch/microblaze/include/asm/dma-mapping.h | 18 ++++++++++++------
arch/microblaze/kernel/dma.c | 10 ++++++----
arch/mips/include/asm/dma-mapping.h | 18 ++++++++++++------
arch/mips/mm/dma-default.c | 8 ++++----
arch/powerpc/include/asm/dma-mapping.h | 24 ++++++++++++++++--------
arch/powerpc/kernel/dma-iommu.c | 10 ++++++----
arch/powerpc/kernel/dma-swiotlb.c | 4 ++--
arch/powerpc/kernel/dma.c | 10 ++++++----
arch/powerpc/kernel/ibmebus.c | 10 ++++++----
arch/powerpc/platforms/cell/iommu.c | 16 +++++++++-------
arch/powerpc/platforms/ps3/system-bus.c | 13 +++++++------
arch/sh/include/asm/dma-mapping.h | 28 ++++++++++++++++++----------
arch/sh/kernel/dma-nommu.c | 4 ++--
arch/sh/mm/consistent.c | 6 ++++--
arch/sparc/include/asm/dma-mapping.h | 18 ++++++++++++------
arch/sparc/kernel/iommu.c | 10 ++++++----
arch/sparc/kernel/ioport.c | 18 ++++++++++--------
arch/sparc/kernel/pci_sun4v.c | 9 +++++----
arch/unicore32/include/asm/dma-mapping.h | 18 ++++++++++++------
arch/unicore32/mm/dma-swiotlb.c | 4 ++--
arch/x86/include/asm/dma-mapping.h | 26 ++++++++++++++++----------
arch/x86/kernel/amd_gart_64.c | 11 ++++++-----
arch/x86/kernel/pci-calgary_64.c | 9 +++++----
arch/x86/kernel/pci-dma.c | 3 ++-
arch/x86/kernel/pci-nommu.c | 6 +++---
arch/x86/kernel/pci-swiotlb.c | 12 +++++++-----
arch/x86/xen/pci-swiotlb-xen.c | 4 ++--
drivers/iommu/amd_iommu.c | 10 ++++++----
drivers/iommu/intel-iommu.c | 9 +++++----
drivers/xen/swiotlb-xen.c | 5 +++--
include/linux/dma-attrs.h | 2 ++
include/linux/dma-mapping.h | 13 +++++++++----
include/linux/swiotlb.h | 6 ++++--
include/xen/swiotlb-xen.h | 6 ++++--
lib/swiotlb.c | 5 +++--
43 files changed, 305 insertions(+), 182 deletions(-)
--
1.7.1.569.g6f426
Hello Everyone,
After some discussion as an RFC, here is the patch introducing the
DMA buffer sharing mechanism - change history is in the changelog below.
Various subsystems - V4L2, GPU-accessors, DRI to name a few - have felt the
need to have a common mechanism to share memory buffers across different
devices - ARM, video hardware, GPU.
This need comes forth from a variety of use cases including cameras, image
processing, video recorders, sound processing, DMA engines, GPU and display
buffers, amongst others.
This patch attempts to define such a buffer sharing mechanism - it is the
result of discussions from a couple of memory-management mini-summits held by
Linaro to understand and address common needs around memory management. [1]
A new dma_buf buffer object is added, with operations and API to allow easy
sharing of this buffer object across devices.
The framework allows:
- a new buffer object to be created with fixed size, associated with a file
pointer and allocator-defined operations for this buffer object. This
operation is called the 'export' operation.
- different devices to 'attach' themselves to this buffer object, to facilitate
backing storage negotiation, using dma_buf_attach() API.
- this exported buffer object to be shared with the other entity by asking for
its 'file-descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed using
the associated exporter-defined operations.
- the exporter and user to share the buffer object's scatterlist using
map_dma_buf and unmap_dma_buf operations.
Documentation present in the patch-set gives more details.
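
The sequence listed above (export, share the fd, look the buffer up again,
attach, map) can be modelled in a few lines of userspace C. This is only an
illustrative sketch, not the dma-buf API from the patch: all model_* and
simple_* names, the fixed-size fd table and the rule that map fails before
any attach are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_FDS 8

struct model_dma_buf;

/* Exporter-defined operations (hypothetical subset). */
struct model_dma_buf_ops {
    int  (*map)(struct model_dma_buf *buf);
    void (*unmap)(struct model_dma_buf *buf);
};

struct model_dma_buf {
    size_t size;       /* fixed at export time */
    int attachments;   /* devices currently attached */
    int mappings;      /* outstanding map calls */
    const struct model_dma_buf_ops *ops;
};

/* Stand-in for the process file descriptor table. */
static struct model_dma_buf *model_fd_table[MODEL_MAX_FDS];

/* 'export' plus fd creation: publish the buffer, return an fd or -1. */
int model_export_fd(struct model_dma_buf *buf, size_t size,
                    const struct model_dma_buf_ops *ops)
{
    int fd;

    assert(buf != NULL && ops != NULL && size > 0);
    buf->size = size;
    buf->attachments = 0;
    buf->mappings = 0;
    buf->ops = ops;

    for (fd = 0; fd < MODEL_MAX_FDS; fd++) {
        if (model_fd_table[fd] == NULL) {
            model_fd_table[fd] = buf;
            return fd;
        }
    }
    return -1;
}

/* Importer side: turn a received fd back into the buffer object. */
struct model_dma_buf *model_get(int fd)
{
    if (fd < 0 || fd >= MODEL_MAX_FDS)
        return NULL;
    return model_fd_table[fd];
}

void model_attach(struct model_dma_buf *buf) { buf->attachments++; }
void model_detach(struct model_dma_buf *buf) { buf->attachments--; }

/* Mapping goes through the exporter's ops; attach is required first. */
int model_map(struct model_dma_buf *buf)
{
    if (buf->attachments == 0)
        return -1;
    return buf->ops->map(buf);
}

void model_unmap(struct model_dma_buf *buf) { buf->ops->unmap(buf); }

/* A trivial exporter that only counts mappings. */
int simple_map(struct model_dma_buf *buf) { buf->mappings++; return 0; }
void simple_unmap(struct model_dma_buf *buf) { buf->mappings--; }
const struct model_dma_buf_ops simple_ops = { simple_map, simple_unmap };
```

The importer never touches the exporter's internals directly; everything
after model_get() goes through the operations supplied at export time.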
For 1st version, dma-buf is marked as an EXPERIMENTAL driver, which we can
remove for later versions with additional usage and testing.
*IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
For this first version, a buffer shared using the dma_buf sharing API:
- *may* be exported to user space using "mmap" *ONLY* by exporter, outside of
this framework.
- may be used *ONLY* by importers that do not need CPU access to the buffer.
This is based on design suggestions from many people at the mini-summits,
most notably from Arnd Bergmann <arnd(a)arndb.de>, Rob Clark <rob(a)ti.com> and
Daniel Vetter <daniel(a)ffwll.ch>.
The implementation is inspired by the proof-of-concept patch-set from
Tomasz Stanislawski <t.stanislaws(a)samsung.com>, who demonstrated buffer sharing
between two v4l2 devices. [2]
Some sample implementations and WIP for dma-buf users and exporters are
available at [3] and [4]. [These are not being submitted for discussion /
inclusion right now, but are for reference only]
References:
[1]: https://wiki.linaro.org/OfficeofCTO/MemoryManagement
[2]: http://lwn.net/Articles/454389
[3]: Dave Airlie's prime support:
http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-prime-dmabuf
[4]: Rob Clark's sharing between DRM and V4L2:
https://github.com/robclark/kernel-omap4/commits/drmplane-dmabuf
Patchset based on top of 3.2-rc7, the current version can be found at
http://git.linaro.org/gitweb?p=people/sumitsemwal/linux-3.x.git
Branch: dmabuf-patch-v1
Earlier versions:
RFC:
v3 at: https://lkml.org/lkml/2011/12/19/50
v2 at: https://lkml.org/lkml/2011/12/2/53
v1 at: https://lkml.org/lkml/2011/10/11/92
Wish you all happy vacations and a very happy, joyous and prosperous new year
2012 :)
Best regards,
~Sumit Semwal
History:
v4:
- Review comments incorporated:
- from Konrad Rzeszutek Wilk [https://lkml.org/lkml/2011/12/20/209]
- corrected language in some comments
- re-ordered struct definitions for readability
- added might_sleep() call in dma_buf_map_attachment() wrapper
- from Rob Clark [https://lkml.org/lkml/2011/12/23/196]
- Made dma-buf EXPERIMENTAL for 1st version.
v3:
- Review comments incorporated:
- from Konrad Rzeszutek Wilk [https://lkml.org/lkml/2011/12/3/45]
- replaced BUG_ON with WARN_ON - various places
- added some error-checks
- replaced EXPORT_SYMBOL with EXPORT_SYMBOL_GPL
- some cosmetic / documentation comments
- from Arnd Bergmann, Daniel Vetter, Rob Clark
[https://lkml.org/lkml/2011/12/5/321]
- removed mmap() fop and dma_buf_op, also the sg_sync* operations, and
documented that mmap is not allowed for exported buffer
- updated documentation to clearly state when migration is allowed
- changed kconfig
- some error code checks
- from Rob Clark [https://lkml.org/lkml/2011/12/5/572]
- update documentation to allow map_dma_buf to return -EINTR
v2:
- Review comments incorporated:
- from Tomasz Stanislawski [https://lkml.org/lkml/2011/10/14/136]
- kzalloc moved out of critical section
- corrected some in-code comments
- from Dave Airlie [https://lkml.org/lkml/2011/11/25/123]
- from Daniel Vetter and Rob Clark [https://lkml.org/lkml/2011/11/26/53]
- use struct sg_table in place of struct scatterlist
- rename {get,put}_scatterlist to {map,unmap}_dma_buf
- add new wrapper APIs dma_buf_{map,unmap}_attachment for ease of users
- documentation updates as per review comments from Randy Dunlap
[https://lkml.org/lkml/2011/10/12/439]
v1: original
Sumit Semwal (3):
dma-buf: Introduce dma buffer sharing mechanism
dma-buf: Documentation for buffer sharing framework
dma-buf: mark EXPERIMENTAL for 1st release.
Documentation/dma-buf-sharing.txt | 224 ++++++++++++++++++++++++++++
drivers/base/Kconfig | 11 ++
drivers/base/Makefile | 1 +
drivers/base/dma-buf.c | 291 +++++++++++++++++++++++++++++++++++++
include/linux/dma-buf.h | 176 ++++++++++++++++++++++
5 files changed, 703 insertions(+), 0 deletions(-)
create mode 100644 Documentation/dma-buf-sharing.txt
create mode 100644 drivers/base/dma-buf.c
create mode 100644 include/linux/dma-buf.h
--
1.7.5.4
Hello,
This is another update on my attempt at a DMA-mapping framework redesign
for the ARM architecture. It includes a few minor changes since the last
version. We have focused mainly on the IOMMU mapper, keeping the
DMA-mapping redesign patches almost unchanged.
All patches have now been rebased onto the v3.2-rc4 kernel + IOMMU/next
branch to include the latest changes from the IOMMU kernel tree.
This series also contains support for mapping with pages larger than
4KiB using new, extended IOMMU API. This code has been provided by
Andrzej Pietrasiewicz.
All the code has been tested on Samsung Exynos4 'UniversalC210' board
with IOMMU driver posted by KyongHo Cho.
GIT tree with all the patches (including some Samsung Exynos4 stuff):
http://git.infradead.org/users/kmpark/linux-samsung/shortlog/refs/heads/3.2…
git://git.infradead.org/users/kmpark/linux-samsung 3.2-rc4-dma-v5-samsung
History:
Initial version of the DMA-mapping redesign patches:
http://www.spinics.net/lists/linux-mm/msg21241.html
Second version of the patches:
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000571.html
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000577.html
Third version of the patches:
http://www.spinics.net/lists/linux-mm/msg25490.html
TODO:
- start the discussion about changing alloc_coherent into alloc_attrs in
dma_map_ops structure.
- start the discussion about dma_mmap function
- provide documentation for the new dma attributes
Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
Patch summary:
Marek Szyprowski (8):
ARM: dma-mapping: remove offset parameter to prepare for generic
dma_ops
ARM: dma-mapping: use asm-generic/dma-mapping-common.h
ARM: dma-mapping: implement dma sg methods on top of any generic dma
ops
ARM: dma-mapping: move all dma bounce code to separate dma ops
structure
ARM: dma-mapping: remove redundant code and cleanup
common: dma-mapping: change alloc/free_coherent method to more
generic alloc/free_attrs
ARM: dma-mapping: use alloc, mmap, free from dma_ops
ARM: initial proof-of-concept IOMMU mapper for DMA-mapping
arch/arm/Kconfig | 9 +
arch/arm/common/dmabounce.c | 78 +++-
arch/arm/include/asm/device.h | 4 +
arch/arm/include/asm/dma-iommu.h | 36 ++
arch/arm/include/asm/dma-mapping.h | 404 +++++------------
arch/arm/mm/dma-mapping.c | 899 ++++++++++++++++++++++++++++++------
arch/arm/mm/vmregion.h | 2 +-
include/linux/dma-attrs.h | 1 +
include/linux/dma-mapping.h | 13 +-
9 files changed, 994 insertions(+), 452 deletions(-)
create mode 100644 arch/arm/include/asm/dma-iommu.h
--
1.7.1.569.g6f426
Welcome everyone, for the last time in 2011!
We finally managed to finish (yet) another release of the Contiguous
Memory Allocator patches. This version resolves a lot of issues reported
in the previous version and improves the reliability of the memory
allocation.
The most important changes are code cleanup after a review from Mel
Gorman, fixes for the annoying bugs (like the HIGHMEM crash on ARM) and
an allocation retry procedure in case of a temporary migration failure.
This version should finally solve all the issues that are a result of
changing the migration code base from memory hotplug to memory
compaction.
The ARM integration code has not been changed in the last two versions.
It provides an implementation of all the ideas that have been discussed
during the Linaro Sprint meeting. Here are the details:
This version provides a solution for complete integration of CMA with the
DMA mapping subsystem on the ARM architecture. The issue caused by double
mapping of DMA pages and possible aliasing in the coherent memory mapping
has finally been resolved, both for the GFP_ATOMIC case (allocations come
from the coherent memory pool) and the non-GFP_ATOMIC case (allocations
come from CMA managed areas).
For coherent, nommu, ARMv4 and ARMv5 systems the current DMA-mapping
implementation has been kept.
For ARMv6+ systems, CMA has been enabled and a special pool of coherent
memory for atomic allocations has been created. The size of this pool
defaults to DEFAULT_CONSISTENT_DMA_SIZE/8, but can be changed with
coherent_pool kernel parameter (if really required).
All atomic allocations are served from this pool. I made a small
simplification here, because there is no separate pool for writecombine
memory - such requests are also served from the coherent pool. I don't
think that such a simplification is a problem here - I found no driver
that uses dma_alloc_writecombine with GFP_ATOMIC flags.
All non-atomic allocations are served from the CMA area. The kernel
mapping is updated to reflect the required memory attribute changes. This
is possible because during early boot, all CMA areas are remapped with
4KiB pages in kernel low-memory.
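
The split between the two allocation paths can be pictured with a short
userspace sketch. It is a model under assumptions rather than the ARM
dma-mapping code: the model_* names are hypothetical, the "allocators" are
plain byte counters, and the atomic flag stands in for GFP_ATOMIC. Atomic
requests may not sleep, so they can only be served from the pre-mapped
pool; everything else goes to the CMA area.

```c
#include <assert.h>
#include <stddef.h>

/* Where a request was served from. */
enum model_source { SRC_NONE, SRC_ATOMIC_POOL, SRC_CMA };

/* Byte counters standing in for the two real allocators. */
struct model_allocator {
    size_t pool_free;  /* small coherent pool, mapped at boot */
    size_t cma_free;   /* CMA managed area */
};

/* 'atomic' models a GFP_ATOMIC request. */
enum model_source model_dma_alloc(struct model_allocator *a, size_t size,
                                  int atomic)
{
    assert(a != NULL && size > 0);

    if (atomic) {
        /* No sleeping allowed: no page migration, no remapping. */
        if (size > a->pool_free)
            return SRC_NONE;
        a->pool_free -= size;
        return SRC_ATOMIC_POOL;
    }

    /* Non-atomic path: pages can be migrated out of the CMA area. */
    if (size > a->cma_free)
        return SRC_NONE;
    a->cma_free -= size;
    return SRC_CMA;
}
```

Note that in this model an atomic request never falls back to the CMA
area, even when the pool is exhausted; it simply fails.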
This version has been tested on the Samsung S5PC110 based Goni machine and
the Exynos4 UniversalC210 board with various V4L2 multimedia drivers.
Coherent atomic allocations have been tested by manually enabling the dma
bounce for the s3c-sdhci device.
All patches are prepared for Linux Kernel v3.2-rc7.
A few words for those who see CMA for the first time:
The Contiguous Memory Allocator (CMA) makes it possible for device
drivers to allocate big contiguous chunks of memory after the system
has booted.
The main difference from similar frameworks is the fact that CMA
allows transparent reuse of the memory region reserved for big
chunk allocations as system memory, so no memory is wasted when no
big chunk is allocated. Once an alloc request is issued, the
framework will migrate system pages to create the required big chunk of
physically contiguous memory.
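
The idea can be sketched with a toy model in userspace C. This is not the
CMA code from the patches: the model_* names, the 16-page region, the
first-fit search and the simple counter of free pages outside the region
are assumptions made for illustration only. The system may borrow pages in
the reserved region for movable data; a contiguous request then "migrates"
those pages out before taking the range.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_REGION_PAGES 16

enum model_page { PG_FREE, PG_MOVABLE, PG_CMA };

struct model_cma {
    enum model_page region[MODEL_REGION_PAGES];
    size_t spare_free;  /* free pages outside the region (migration targets) */
    size_t migrated;    /* how many pages had to be moved so far */
};

/*
 * Allocate 'count' physically contiguous pages, first fit. Movable system
 * pages found in the range are migrated out of the region. Returns the
 * index of the first page, or -1 on failure.
 */
long model_cma_alloc(struct model_cma *c, size_t count)
{
    size_t start, i, movable;

    assert(c != NULL && count > 0);
    for (start = 0; start + count <= MODEL_REGION_PAGES; start++) {
        movable = 0;
        for (i = start; i < start + count; i++) {
            if (c->region[i] == PG_CMA)
                break;  /* overlaps an earlier big chunk */
            if (c->region[i] == PG_MOVABLE)
                movable++;
        }
        if (i < start + count)
            continue;
        if (movable > c->spare_free)
            continue;   /* nowhere to migrate these pages to */

        c->spare_free -= movable;
        c->migrated += movable;
        for (i = start; i < start + count; i++)
            c->region[i] = PG_CMA;
        return (long)start;
    }
    return -1;
}

/* Give the range back; the system may borrow these pages again. */
void model_cma_release(struct model_cma *c, size_t start, size_t count)
{
    size_t i;

    for (i = start; i < start + count; i++)
        c->region[i] = PG_FREE;
}
```

While no big chunk is allocated, every page of the region is available to
the rest of the system, which is the property that distinguishes this
approach from a static boot-time reservation.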
For more information you can refer to nice LWN articles:
http://lwn.net/Articles/447405/ and http://lwn.net/Articles/450286/
as well as links to previous versions of the CMA framework.
The CMA framework has been initially developed by Michal Nazarewicz
at Samsung Poland R&D Center. Since version 9, I've taken over the
development, because Michal has left the company.
TODO (optional):
- implement support for contiguous memory areas placed in HIGHMEM zone
- resolve issue with movable pages with pending io operations
I would also like to wish everyone a Happy New Year! See you again in
2012!
Best regards
Marek Szyprowski
Samsung Poland R&D Center
Links to previous versions of the patchset:
v17: <http://www.spinics.net/lists/arm-kernel/msg148499.html>
v16: <http://www.spinics.net/lists/linux-mm/msg25066.html>
v15: <http://www.spinics.net/lists/linux-mm/msg23365.html>
v14: <http://www.spinics.net/lists/linux-media/msg36536.html>
v13: (internal, intentionally not released)
v12: <http://www.spinics.net/lists/linux-media/msg35674.html>
v11: <http://www.spinics.net/lists/linux-mm/msg21868.html>
v10: <http://www.spinics.net/lists/linux-mm/msg20761.html>
v9: <http://article.gmane.org/gmane.linux.kernel.mm/60787>
v8: <http://article.gmane.org/gmane.linux.kernel.mm/56855>
v7: <http://article.gmane.org/gmane.linux.kernel.mm/55626>
v6: <http://article.gmane.org/gmane.linux.kernel.mm/55626>
v5: (intentionally left out as CMA v5 was identical to CMA v4)
v4: <http://article.gmane.org/gmane.linux.kernel.mm/52010>
v3: <http://article.gmane.org/gmane.linux.kernel.mm/51573>
v2: <http://article.gmane.org/gmane.linux.kernel.mm/50986>
v1: <http://article.gmane.org/gmane.linux.kernel.mm/50669>
Changelog:
v18:
1. Addressed comments and suggestions from Mel Gorman related to changes
in memory compaction code, most important points:
- removed "mm: page_alloc: handle MIGRATE_ISOLATE in free_pcppages_bulk()"
and moved all the logic to set_migratetype_isolate - see
"mm: page_alloc: set_migratetype_isolate: drain PCP prior to isolating"
patch
- code in "mm: compaction: introduce isolate_{free,migrate}pages_range()"
patch has been simplified and improved
- removed "mm: mmzone: introduce zone_pfn_same_memmap()" patch
2. Fixed crash on initialization if HIGHMEM is available on ARM platforms
3. Fixed problems with allocation of contiguous memory if all free pages
are occupied by page cache and reclaim is required.
4. Added a workaround for temporary migration failures (now CMA tries
to allocate a different memory block in such a case), which greatly
increased the reliability of CMA.
5. Minor cleanup here and there.
6. Rebased onto v3.2-rc7 kernel tree.
v17:
1. Replaced the whole CMA core memory migration code with the new one
kindly provided by Michal Nazarewicz. The new code is based on the memory
compaction framework, not on memory hotplug as it was before. This
change was suggested by Mel Gorman.
2. Addressed most of the comments from Andrew Morton and Mel Gorman in
the rest of the CMA code.
3. Fixed broken initialization on ARM systems with DMA zone enabled.
4. Rebased onto v3.2-rc2 kernel.
v16:
1. merged a fixup from Michal Nazarewicz to address comments from Dave
Hansen about checking if pfns belong to the same memory zone
2. merged a fix from Michal Nazarewicz for incorrect handling of pages
which belong to page block that is in MIGRATE_ISOLATE state, in very
rare cases the migrate type of page block might have been changed
from MIGRATE_CMA to MIGRATE_MOVABLE because of this bug
3. moved some common code to include/asm-generic
4. added support for x86 DMA-mapping framework for pci-dma hardware,
CMA can be now even more widely tested on KVM/QEMU and a lot of common
x86 boxes
5. rebased onto next-20111005 kernel tree, which includes changes in ARM
DMA-mapping subsystem (CONSISTENT_DMA_SIZE removal)
6. removed patch for CMA s5p-fimc device private regions (served only as
example) and provided the one that matches real life case - s5p-mfc
device
v15:
1. fixed calculation of the total memory after activating CMA area (was
broken from v12)
2. more code cleanup in drivers/base/dma-contiguous.c
3. added address limit for default CMA area
4. rewrote ARM DMA integration:
- removed "ARM: DMA: steal memory for DMA coherent mappings" patch
- kept current DMA mapping implementation for coherent, nommu and
ARMv4/ARMv5 systems
- enabled CMA for all ARMv6+ systems
- added separate, small pool for coherent atomic allocations, defaults
to CONSISTENT_DMA_SIZE/8, but can be changed with kernel parameter
coherent_pool=[size]
v14:
1. Merged with "ARM: DMA: steal memory for DMA coherent mappings"
patch, added support for GFP_ATOMIC allocations.
2. Added checks for NULL device pointer
v13: (internal, intentionally not released)
v12:
1. Fixed 2 nasty bugs in dma-contiguous allocator:
- alignment argument was not passed correctly
- range for dma_release_from_contiguous was not checked correctly
2. Added support for architecture specific dma_contiguous_early_fixup()
function
3. CMA and DMA-mapping integration for ARM architecture has been
rewritten to take care of the memory aliasing issue that might
happen for newer ARM CPUs (mapping of the same pages with different
cache attributes is forbidden). TODO: add support for GFP_ATOMIC
allocations basing on the "ARM: DMA: steal memory for DMA coherent
mappings" patch and implement support for contiguous memory areas
that are placed in HIGHMEM zone
v11:
1. Removed genalloc usage and replaced it with direct calls to
bitmap_* functions, dropped patches that are not needed
anymore (genalloc extensions)
2. Moved all contiguous area management code from mm/cma.c
to drivers/base/dma-contiguous.c
3. Renamed cm_alloc/free to dma_alloc/release_from_contiguous
4. Introduced global, system wide (default) contiguous area
configured with kernel config and kernel cmdline parameters
5. Simplified initialization to just one function:
dma_declare_contiguous()
6. Added example of device private memory contiguous area
v10:
1. Rebased onto 3.0-rc2 and resolved all conflicts
2. Simplified CMA to be just a pure memory allocator, for use
with platform/bus specific subsystems, like dma-mapping.
Removed all device specific functions and calls.
3. Integrated with ARM DMA-mapping subsystem.
4. Code cleanup here and there.
5. Removed private context support.
v9: 1. Rebased onto 2.6.39-rc1 and resolved all conflicts
2. Fixed a bunch of nasty bugs that happened when the allocation
failed (mainly kernel oops due to NULL ptr dereference).
3. Introduced testing code: cma-regions compatibility layer and
videobuf2-cma memory allocator module.
v8: 1. The alloc_contig_range() function has now been separated from
CMA and put in page_allocator.c. This function tries to
migrate all LRU pages in specified range and then allocate the
range using alloc_contig_freed_pages().
2. Support for MIGRATE_CMA has been separated from the CMA code.
I have not tested if CMA works with ZONE_MOVABLE but I see no
reasons why it shouldn't.
3. I have added a @private argument when creating CMA contexts so
that one can reserve memory and not share it with the rest of
the system. This way, CMA acts only as allocation algorithm.
v7: 1. A lot of functionality that handled driver->allocator_context
mapping has been removed from the patchset. This is not to say
that this code is not needed, it's just not worth posting
everything in one patchset.
Currently, CMA is "just" an allocator. It uses its own
migratetype (MIGRATE_CMA) for defining ranges of pageblocks
which behave just like ZONE_MOVABLE but, unlike the latter, can
be put in arbitrary places.
2. The migration code that was introduced in the previous version
actually started working.
v6: 1. Most importantly, v6 introduces support for memory migration.
The implementation is not yet complete though.
Migration support means that when CMA is not using memory
reserved for it, page allocator can allocate pages from it.
When CMA wants to use the memory, the pages have to be moved
and/or evicted as to make room for CMA.
To make it possible it must be guaranteed that only movable and
reclaimable pages are allocated in CMA controlled regions.
This is done by introducing a MIGRATE_CMA migrate type that
guarantees exactly that.
Some of the migration code is "borrowed" from Kamezawa
Hiroyuki's alloc_contig_pages() implementation. The main
difference is that thanks to the MIGRATE_CMA migrate type CMA
assumes that memory controlled by CMA is always movable or
reclaimable, so that it makes allocation decisions regardless of
whether some pages are actually allocated and migrates them
if needed.
The most interesting patches from the patchset that implement
the functionality are:
09/13: mm: alloc_contig_free_pages() added
10/13: mm: MIGRATE_CMA migration type added
11/13: mm: MIGRATE_CMA isolation functions added
12/13: mm: cma: Migration support added [wip]
Currently, kernel panics in some situations which I am trying
to investigate.
2. cma_pin() and cma_unpin() functions have been added (after
a conversation with Johan Mossberg). The idea is that whenever
hardware does not use the memory (no transaction is on) the
chunk can be moved around. This would allow defragmentation to
be implemented if desired. No defragmentation algorithm is
provided at this time.
3. Sysfs support has been replaced with debugfs. I always felt
unsure about the sysfs interface and when Greg KH pointed it
out I finally got to rewrite it to debugfs.
v5: (intentionally left out as CMA v5 was identical to CMA v4)
v4: 1. The "asterisk" flag has been removed in favour of requiring
that platform will provide a "*=<regions>" rule in the map
attribute.
2. The terminology has been changed slightly renaming "kind" to
"type" of memory. In the previous revisions, the documentation
indicated that device drivers define memory kinds and now,
v3: 1. The command line parameters have been removed (and moved to
a separate patch, the fourth one). As a consequence, the
cma_set_defaults() function has been changed -- it no longer
accepts a string with list of regions but an array of regions.
2. The "asterisk" attribute has been removed. Now, each region
has an "asterisk" flag which lets one specify whether this
region should be considered "asterisk" region.
3. SysFS support has been moved to a separate patch (the third one
in the series) and now also includes list of regions.
v2: 1. The "cma_map" command line has been removed. In exchange,
a SysFS entry has been created under kernel/mm/contiguous.
The intended way of specifying the attributes is
a cma_set_defaults() function called by platform initialisation
code. "regions" attribute (the string specified by "cma"
command line parameter) can be overwritten with command line
parameter; the other attributes can be changed during run-time
using the SysFS entries.
2. The behaviour of the "map" attribute has been modified
slightly. Currently, if no rule matches given device it is
assigned regions specified by the "asterisk" attribute. It is
by default built from the region names given in "regions"
attribute.
3. Devices can register private regions as well as regions that
can be shared but are not reserved using standard CMA
mechanisms. A private region has no name and can be accessed
only by devices that have the pointer to it.
4. The way allocators are registered has changed. Currently,
a cma_allocator_register() function is used for that purpose.
Moreover, allocators are attached to regions the first time
memory is registered from the region or when allocator is
registered which means that allocators can be dynamic modules
that are loaded after the kernel booted (of course, it won't be
possible to allocate a chunk of memory from a region if
allocator is not loaded).
5. Index of new functions:
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size,
+ dma_addr_t alignment)
+static inline int
+cma_info_about(struct cma_info *info, const char *regions)
+int __must_check cma_region_register(struct cma_region *reg);
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+ size_t size, dma_addr_t alignment);
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions,
+ size_t size, dma_addr_t alignment);
+int cma_allocator_register(struct cma_allocator *alloc);
Patches in this patchset:
Marek Szyprowski (5):
mm: add optional memory reclaim in split_free_page()
drivers: add Contiguous Memory Allocator
X86: integrate CMA with DMA-mapping subsystem
ARM: integrate CMA with DMA-mapping subsystem
ARM: Samsung: use CMA for 2 memory banks for s5p-mfc device
Michal Nazarewicz (6):
mm: page_alloc: set_migratetype_isolate: drain PCP prior to isolating
mm: compaction: introduce isolate_{free,migrate}pages_range().
mm: compaction: export some of the functions
mm: page_alloc: introduce alloc_contig_range()
mm: mmzone: MIGRATE_CMA migration type added
mm: page_isolation: MIGRATE_CMA isolation functions added
Documentation/kernel-parameters.txt | 9 +
arch/Kconfig | 3 +
arch/arm/Kconfig | 2 +
arch/arm/include/asm/dma-contiguous.h | 16 ++
arch/arm/include/asm/mach/map.h | 1 +
arch/arm/kernel/setup.c | 9 +-
arch/arm/mm/dma-mapping.c | 368 +++++++++++++++++++++-----
arch/arm/mm/init.c | 22 ++-
arch/arm/mm/mm.h | 3 +
arch/arm/mm/mmu.c | 29 ++-
arch/arm/plat-s5p/dev-mfc.c | 51 +---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/dma-contiguous.h | 13 +
arch/x86/include/asm/dma-mapping.h | 4 +
arch/x86/kernel/pci-dma.c | 18 ++-
arch/x86/kernel/pci-nommu.c | 8 +-
arch/x86/kernel/setup.c | 2 +
drivers/base/Kconfig | 89 +++++++
drivers/base/Makefile | 1 +
drivers/base/dma-contiguous.c | 404 ++++++++++++++++++++++++++++
include/asm-generic/dma-contiguous.h | 27 ++
include/linux/device.h | 4 +
include/linux/dma-contiguous.h | 110 ++++++++
include/linux/mm.h | 2 +-
include/linux/mmzone.h | 41 +++-
include/linux/page-isolation.h | 27 ++-
mm/Kconfig | 2 +-
mm/Makefile | 3 +-
mm/compaction.c | 467 +++++++++++++++++++--------------
mm/internal.h | 35 +++
mm/memory-failure.c | 2 +-
mm/memory_hotplug.c | 6 +-
mm/page_alloc.c | 403 +++++++++++++++++++++++++----
mm/page_isolation.c | 15 +-
mm/vmstat.c | 1 +
35 files changed, 1773 insertions(+), 425 deletions(-)
create mode 100644 arch/arm/include/asm/dma-contiguous.h
create mode 100644 arch/x86/include/asm/dma-contiguous.h
create mode 100644 drivers/base/dma-contiguous.c
create mode 100644 include/asm-generic/dma-contiguous.h
create mode 100644 include/linux/dma-contiguous.h
--
1.7.1.569.g6f426
Hello Everyone,
This is RFC v2 for DMA buffer sharing mechanism - changes from v1 are in the
changelog below.
Various subsystems - V4L2, GPU-accessors, DRI to name a few - have felt the
need to have a common mechanism to share memory buffers across different
devices - ARM, video hardware, GPU.
This need comes forth from a variety of use cases including cameras, image
processing, video recorders, sound processing, DMA engines, GPU and display
buffers, and others.
This RFC is an attempt to define such a buffer sharing mechanism; it is the
result of discussions from a couple of memory-management mini-summits held by
Linaro to understand and address common needs around memory management. [1]
A new dma_buf buffer object is added, with operations and API to allow easy
sharing of this buffer object across devices.
The framework allows:
- a new buffer-object to be created with fixed size.
- different devices to 'attach' themselves to this buffer, to facilitate
backing storage negotiation, using dma_buf_attach() API.
- association of a file pointer with each user-buffer and associated
allocator-defined operations on that buffer. This operation is called the
'export' operation.
- this exported buffer-object to be shared with the other entity by asking for
its 'file-descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed using
the associated exporter-defined operations.
- the exporter and user to share the scatterlist using map_dma_buf and
unmap_dma_buf operations.
Documentation present in the patch-set gives more details.
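As a rough illustration of the flow listed above (export, pass the fd, attach, map, unmap), here is a small user-space toy model in C. It is not the kernel API from the patch set: every toy_ name, the struct layouts, the integer device ids and the array standing in for the fd table are assumptions made only for this sketch. Only the step names (export, attach, map_dma_buf, unmap_dma_buf) come from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_BUFS   8
#define TOY_MAX_ATTACH 4

struct toy_buf;

/* Exporter-defined operations; names follow the list above, signatures are assumed. */
struct toy_buf_ops {
	unsigned long (*map_dma_buf)(struct toy_buf *buf, int dev_id);
	void (*unmap_dma_buf)(struct toy_buf *buf, int dev_id);
};

struct toy_buf {
	size_t size;                   /* fixed when the buffer is exported */
	const struct toy_buf_ops *ops; /* supplied by the exporter */
	void *priv;                    /* exporter private data */
	int attached[TOY_MAX_ATTACH];  /* ids of attached devices */
	int nr_attached;
	int nr_mapped;                 /* mappings not yet unmapped */
};

/* Stand-in for the fd table: the slot index plays the role of the fd. */
static struct toy_buf *toy_fd_table[TOY_MAX_BUFS];

/* 'export': bind size, ops and an fd-like handle to the buffer; -1 if no slot. */
static int toy_buf_export(struct toy_buf *buf, size_t size,
			  const struct toy_buf_ops *ops, void *priv)
{
	assert(buf && ops && size > 0);
	for (int fd = 0; fd < TOY_MAX_BUFS; fd++) {
		if (toy_fd_table[fd])
			continue;
		*buf = (struct toy_buf){ .size = size, .ops = ops, .priv = priv };
		toy_fd_table[fd] = buf;
		return fd;
	}
	return -1;
}

/* The receiver of the fd turns it back into the buffer object. */
static struct toy_buf *toy_buf_get(int fd)
{
	return (fd >= 0 && fd < TOY_MAX_BUFS) ? toy_fd_table[fd] : NULL;
}

static int toy_is_attached(const struct toy_buf *buf, int dev_id)
{
	for (int i = 0; i < buf->nr_attached; i++)
		if (buf->attached[i] == dev_id)
			return 1;
	return 0;
}

/* 'attach': a device announces itself before any mapping takes place. */
static int toy_buf_attach(struct toy_buf *buf, int dev_id)
{
	if (toy_is_attached(buf, dev_id) || buf->nr_attached == TOY_MAX_ATTACH)
		return -1;
	buf->attached[buf->nr_attached++] = dev_id;
	return 0;
}

/* Map for an attached device by calling the exporter; 0 means failure. */
static unsigned long toy_buf_map(struct toy_buf *buf, int dev_id)
{
	unsigned long addr;

	if (!toy_is_attached(buf, dev_id))
		return 0;
	addr = buf->ops->map_dma_buf(buf, dev_id);
	if (addr)
		buf->nr_mapped++;
	return addr;
}

static void toy_buf_unmap(struct toy_buf *buf, int dev_id)
{
	if (!toy_is_attached(buf, dev_id) || buf->nr_mapped == 0)
		return;
	buf->ops->unmap_dma_buf(buf, dev_id);
	buf->nr_mapped--;
}

/* Trivial exporter: priv points at the buffer's (pretend) bus address. */
static unsigned long toy_simple_map(struct toy_buf *buf, int dev_id)
{
	(void)dev_id;
	return *(unsigned long *)buf->priv;
}

static void toy_simple_unmap(struct toy_buf *buf, int dev_id)
{
	(void)buf;
	(void)dev_id;
}

static const struct toy_buf_ops toy_simple_ops = {
	.map_dma_buf = toy_simple_map,
	.unmap_dma_buf = toy_simple_unmap,
};
```

In this model a device that has not attached cannot map the buffer, which mirrors the idea that attach happens first so that backing storage can be negotiated before any mapping is handed out. The real framework shares a scatterlist rather than a single address.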
This is based on design suggestions from many people at the mini-summits,
most notably from Arnd Bergmann <arnd(a)arndb.de>, Rob Clark <rob(a)ti.com> and
Daniel Vetter <daniel(a)ffwll.ch>.
The implementation is inspired by the proof-of-concept patch-set from
Tomasz Stanislawski <t.stanislaws(a)samsung.com>, who demonstrated buffer sharing
between two v4l2 devices. [2]
References:
[1]: https://wiki.linaro.org/OfficeofCTO/MemoryManagement
[2]: http://lwn.net/Articles/454389
Patchset based on top of 3.2-rc3, the current version can be found at
http://git.linaro.org/gitweb?p=people/sumitsemwal/linux-3.x.git
Branch: dma-buf-upstr-v2
Earlier version at: https://lkml.org/lkml/2011/10/11/92
Best regards,
~Sumit Semwal
History:
v2:
- Review comments incorporated:
- from Tomasz Stanislawski [https://lkml.org/lkml/2011/10/14/136]
- kzalloc moved out of critical section
- corrected some in-code comments
- from Dave Airlie [https://lkml.org/lkml/2011/11/25/123]
- from Daniel Vetter and Rob Clark [https://lkml.org/lkml/2011/11/26/53]
- use struct sg_table in place of struct scatterlist
- rename {get,put}_scatterlist to {map,unmap}_dma_buf
- add new wrapper APIs dma_buf_{map,unmap}_attachment for ease of users
- documentation updates as per review comments from Randy Dunlap
[https://lkml.org/lkml/2011/10/12/439]
v1: original
Sumit Semwal (2):
dma-buf: Introduce dma buffer sharing mechanism
dma-buf: Documentation for buffer sharing framework
Documentation/dma-buf-sharing.txt | 223 ++++++++++++++++++++++++++++
drivers/base/Kconfig | 10 ++
drivers/base/Makefile | 1 +
drivers/base/dma-buf.c | 290 +++++++++++++++++++++++++++++++++++++
include/linux/dma-buf.h | 176 ++++++++++++++++++++++
5 files changed, 700 insertions(+), 0 deletions(-)
create mode 100644 Documentation/dma-buf-sharing.txt
create mode 100644 drivers/base/dma-buf.c
create mode 100644 include/linux/dma-buf.h
--
1.7.4.1
Hello Everyone,
This is RFC v3 for DMA buffer sharing mechanism - changes from v2 are in the
changelog below.
Various subsystems - V4L2, GPU-accessors, DRI to name a few - have felt the
need to have a common mechanism to share memory buffers across different
devices - ARM, video hardware, GPU.
This need comes forth from a variety of use cases including cameras, image
processing, video recorders, sound processing, DMA engines, GPU and display
buffers, and others.
This RFC is an attempt to define such a buffer sharing mechanism; it is the
result of discussions from a couple of memory-management mini-summits held by
Linaro to understand and address common needs around memory management. [1]
A new dma_buf buffer object is added, with operations and API to allow easy
sharing of this buffer object across devices.
The framework allows:
- a new buffer-object to be created with fixed size.
- different devices to 'attach' themselves to this buffer, to facilitate
backing storage negotiation, using dma_buf_attach() API.
- association of a file pointer with each user-buffer and associated
allocator-defined operations on that buffer. This operation is called the
'export' operation.
- this exported buffer-object to be shared with the other entity by asking for
its 'file-descriptor (fd)', and sharing the fd across.
- a received fd to get the buffer object back, where it can be accessed using
the associated exporter-defined operations.
- the exporter and user to share the scatterlist using map_dma_buf and
unmap_dma_buf operations.
Documentation present in the patch-set gives more details.
This is based on design suggestions from many people at the mini-summits,
most notably from Arnd Bergmann <arnd(a)arndb.de>, Rob Clark <rob(a)ti.com> and
Daniel Vetter <daniel(a)ffwll.ch>.
The implementation is inspired by the proof-of-concept patch-set from
Tomasz Stanislawski <t.stanislaws(a)samsung.com>, who demonstrated buffer sharing
between two v4l2 devices. [2]
References:
[1]: https://wiki.linaro.org/OfficeofCTO/MemoryManagement
[2]: http://lwn.net/Articles/454389
Patchset based on top of 3.2-rc3, the current version can be found at
http://git.linaro.org/gitweb?p=people/sumitsemwal/linux-3.x.git
Branch: dma-buf-upstr-v2
Earlier versions:
v2 at: https://lkml.org/lkml/2011/12/2/53
v1 at: https://lkml.org/lkml/2011/10/11/92
Best regards,
~Sumit Semwal
History:
v3:
- Review comments incorporated:
- from Konrad Rzeszutek Wilk [https://lkml.org/lkml/2011/12/3/45]
- replaced BUG_ON with WARN_ON - various places
- added some error-checks
- replaced EXPORT_SYMBOL with EXPORT_SYMBOL_GPL
- some cosmetic / documentation comments
- from Arnd Bergmann, Daniel Vetter, Rob Clark
[https://lkml.org/lkml/2011/12/5/321]
- removed mmap() fop and dma_buf_op, also the sg_sync* operations, and
documented that mmap is not allowed for exported buffer
- updated documentation to clearly state when migration is allowed
- changed kconfig
- some error code checks
- from Rob Clark [https://lkml.org/lkml/2011/12/5/572]
- update documentation to allow map_dma_buf to return -EINTR
v2:
- Review comments incorporated:
- from Tomasz Stanislawski [https://lkml.org/lkml/2011/10/14/136]
- kzalloc moved out of critical section
- corrected some in-code comments
- from Dave Airlie [https://lkml.org/lkml/2011/11/25/123]
- from Daniel Vetter and Rob Clark [https://lkml.org/lkml/2011/11/26/53]
- use struct sg_table in place of struct scatterlist
- rename {get,put}_scatterlist to {map,unmap}_dma_buf
- add new wrapper APIs dma_buf_{map,unmap}_attachment for ease of users
- documentation updates as per review comments from Randy Dunlap
[https://lkml.org/lkml/2011/10/12/439]
v1: original
Sumit Semwal (2):
dma-buf: Introduce dma buffer sharing mechanism
dma-buf: Documentation for buffer sharing framework
Documentation/dma-buf-sharing.txt | 222 ++++++++++++++++++++++++++++
drivers/base/Kconfig | 10 ++
drivers/base/Makefile | 1 +
drivers/base/dma-buf.c | 289 +++++++++++++++++++++++++++++++++++++
include/linux/dma-buf.h | 172 ++++++++++++++++++++++
5 files changed, 694 insertions(+), 0 deletions(-)
create mode 100644 Documentation/dma-buf-sharing.txt
create mode 100644 drivers/base/dma-buf.c
create mode 100644 include/linux/dma-buf.h
--
1.7.4.1
Hi folks,
I went ahead and checked the status of the UMM effort as 2011 is
drawing to a close. What follows is my attempt to summarise the status,
collating the pieces from the latest announcements and changelogs and
having discussed it briefly with Jesse and Rob. Feel free to comment if
something is missing or incorrect:
A. CMA - Contiguous memory allocator. This is in its v17 incarnation at
the moment (a v18 is in preparation but not released yet). v17 shares
code with the memory compaction subsystem rather than with memory hotplug
as before (a change suggested by Mel Gorman). There are also a few
fixes here and there: most of the comments from Andrew Morton and Mel
Gorman on the rest of the CMA code have been addressed, broken
initialization on ARM systems with DMA zone enabled has been fixed, and
the code has been rebased onto the v3.2-rc2 kernel.
An issue has been noted on linaro-dev by the TI landing team: without
highmem the code works great, but with HIGHMEM enabled, including
CMA v17 consistently causes a failure during DMA init. This is
expected to be fixed soon (perhaps in v18?). In the meantime, the
suggested workaround is a 2G/2G memory split (Kernel Features ->
Memory split -> 2G/2G user/kernel split).
B. dma mapping API - DMA-mapping framework redesign for the ARM
architecture: this is now at v4. It includes a few minor changes
since the last version. The changes are mainly in the IOMMU mapper, keeping
the DMA-mapping redesign patches almost unchanged. The code is rebased
onto the v3.2-rc4 kernel + IOMMU/next branch to include the latest changes
from the IOMMU kernel tree. This series also contains support for mapping
with pages larger than 4KiB using the new, extended IOMMU API, and a general
cleanup of the DMA mapping implementation. However, it seems that this
patchset "attempts to fix everyone at once". It has been suggested
that the implementation should instead allow a
sufficient transition period - for example just adding the new methods
now and only removing the old ones in the following merge window when all
the architectures have had a chance to migrate.
C. dmabuf - a DMA-buf object sharing framework: this is now in its 3rd
version. The newest version incorporates changes requested during
review, such as replacing BUG_ON with WARN_ON in various places,
removing the mmap() fop and dma_buf_op, as well as the sg_sync* operations,
documenting that mmap is not allowed for exported buffers, adding error
checks, replacing EXPORT_SYMBOL with EXPORT_SYMBOL_GPL and fixing some
cosmetic/documentation items. There are still some items under
discussion such as userspace mmap support, more advanced (and more
strictly specified) coherency models and shared infrastructure for
implementing exporters. However, there is a suggestion that these items
will become much clearer once we have a few example drivers at hand
and a better understanding of what cases need to be handled better.
D. Finally some repositories - where can you find the code to try it out:
* git://git.linaro.org/people/jessebarker/linaro-mm-sig/linux.git
contains (for the moment) 6 branches:
+ cma-v17 == unadulterated v3.2-rc4 + cma v17 patchset
+ dma-mapping-v4 == unadulterated v3.2-rc4 + dma-mapping v4 patchset
+ android-cma-v17 == john stultz's androidization tree based upon
unadulterated v3.2-rc4 + cma v17 patchset
+ android-dma-mapping-v4 == john stultz's androidization tree based
upon unadulterated v3.2-rc4 + dma-mapping v4 patchset
Also note these repos:
- https://github.com/robclark/kernel-omap4/commits/drmplane-dmabuf
contains patches to enable sharing of buffers between drm and v4l2,
Rob commented that it isn't really identified yet which tree to push
dmabuf through.. airlied has volunteered to push via drm tree, which is ok
- git://git.linaro.org/people/bgaignard/linux-snowball-cma-test.git
contains a first version of the CMA testing scripts for LAVA (snowball
specific at least for now)
Best regards,
--
Ilias Biris ilias.biris(a)linaro.org
Project Manager, Linaro
M: +358504839608, IRC: ibiris Skype: ilias_biris
Linaro.org│ Open source software for ARM SoCs
Hi,
One question is inlined below:
From: Hiroshi Doyu <hdoyu(a)nvidia.com>
Subject: [PATCH v2 1/2] [RFC] ARM: IOMMU: Tegra20: iommu_ops for GART driver
Date: Thu, 15 Dec 2011 14:11:29 +0100
Message-ID: <1323954690-7000-2-git-send-email-hdoyu(a)nvidia.com>
> Tegra 20 IOMMU H/W, GART (Graphics Address Relocation Table). This
> patch implements struct iommu_ops for GART for the upper IOMMU API.
>
> This H/W module supports only single virtual address space(domain),
> and manages a single level 1-to-1 mapping H/W translation page table.
>
> Signed-off-by: Hiroshi DOYU <hdoyu(a)nvidia.com>
> ---
> drivers/iommu/Kconfig | 11 +
> drivers/iommu/Makefile | 1 +
> drivers/iommu/tegra-gart.c | 451 ++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 463 insertions(+), 0 deletions(-)
> create mode 100644 drivers/iommu/tegra-gart.c
>
....
> +
> +struct gart_device {
> + void __iomem *regs;
> + u32 *savedata;
> + u32 page_count; /* total remappable size */
> + dma_addr_t iovmm_base; /* offset to apply to vmm_area */
> + spinlock_t pte_lock; /* for pagetable */
> + struct list_head client;
> + spinlock_t client_lock; /* for client list */
> + struct device *dev;
> +};
> +
> +static struct gart_device *gart_handle; /* unique for a system */
^^^^^^^^^^^
.....
> +
> +static int gart_iommu_domain_init(struct iommu_domain *domain)
> +{
> + domain->priv = gart_handle;
^^^^^^^^^^^^^^^^^^^^^^^^^^^
> + pr_debug("gart@%p\n", gart_handle);
> + return 0;
> +}
In the above, the global pointer is used to pass gart_device so that it
can be set in domain->priv. It works with a single gart_device, but not
with multiple gart_devices. This is too bad, I know ;). I guess that this
can be solved with device tree info where a client device is set as a
child of gart_device at device registration. Is this the right way
from the IOMMU API POV?
for (i = 0; i < ARRAY_SIZE(dmaapi_dummy_device); i++) {
int err;
struct platform_device *pdev = &dmaapi_dummy_device[i];
pdev->dev.platform_data = (void *)dummy_hwgrp_map[i];
pdev->dev.parent = &tegra_gart_device;
^^^^^^^^^^^^^^^^^^^
err = platform_device_register(pdev);
Hi Joerg, Thank you for your quick review.
From: Joerg Roedel <joerg.roedel(a)amd.com>
Subject: Re: [PATCH v2 1/2] [RFC] ARM: IOMMU: Tegra20: iommu_ops for GART driver
Date: Fri, 16 Dec 2011 16:43:31 +0100
Message-ID: <20111216154331.GD29877(a)amd.com>
> On Thu, Dec 15, 2011 at 03:11:29PM +0200, Hiroshi DOYU wrote:
> > Tegra 20 IOMMU H/W, GART (Graphics Address Relocation Table). This
> > patch implements struct iommu_ops for GART for the upper IOMMU API.
> >
> > This H/W module supports only single virtual address space(domain),
> > and manages a single level 1-to-1 mapping H/W translation page table.
> >
> > Signed-off-by: Hiroshi DOYU <hdoyu(a)nvidia.com>
> > ---
> > drivers/iommu/Kconfig | 11 +
> > drivers/iommu/Makefile | 1 +
> > drivers/iommu/tegra-gart.c | 451 ++++++++++++++++++++++++++++++++++++++++++++
>
> The code looks good in general.
Thanks.
> I think we need to extend the IOMMU-API
> a little bit to better support GART IOMMUs. A user of the IOMMU-API
> should be able to get the information about the iova-base and size of
> the aperture for example. These extensions will be beneficial for
> porting other GART-like driver to the IOMMU-API too.
is it similar to how ".pgsize_bitmap" is passed?
>
>
> Joerg
>
> --
> AMD Operating System Research Center
>
> Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
> General Managers: Alberto Bozzo, Andrew Bowd
> Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
>
Welcome everyone once again,
This is yet another release of the Contiguous Memory Allocator patches.
This version is mainly a result of the discussion at the Kernel Summit in
Prague. The main change is a completely different code base for the
migration feature of CMA. It now shares code with the memory
compaction subsystem rather than with memory hotplug as before. This code
has been kindly provided by Michal Nazarewicz. There are also a few fixes
here and there, see the changelog for details.
Please note that this patch series is intended to start further
discussion. There are still a few issues that need to be resolved before
CMA is really ready. The most pressing problem is the issue with movable
pages that causes migration to fail from time to time. Our investigation
suggests that these rare pages cannot be migrated because
there are pending I/O operations on them.
The ARM integration code has not been changed since the last version; it
provides an implementation of all the ideas that have been discussed during
the Linaro Sprint meeting. Here are the details:
This version provides a solution for complete integration of CMA into
the DMA mapping subsystem on the ARM architecture. The issue caused by
double DMA page mappings and possible aliasing in coherent memory mappings
has finally been resolved, both for the GFP_ATOMIC case (allocations come
from the coherent memory pool) and the non-GFP_ATOMIC case (allocations
come from CMA managed areas).
For coherent, nommu, ARMv4 and ARMv5 systems the current DMA-mapping
implementation has been kept.
For ARMv6+ systems, CMA has been enabled and a special pool of coherent
memory for atomic allocations has been created. The size of this pool
defaults to DEFAULT_CONSISTENT_DMA_SIZE/8, but can be changed with the
coherent_pool kernel parameter (if really required).
All atomic allocations are served from this pool. I made a small
simplification here: there is no separate pool for writecombine
memory, so such requests are also served from the coherent pool. I don't
think that this simplification is a problem - I found no driver
that uses dma_alloc_writecombine with GFP_ATOMIC flags.
All non-atomic allocations are served from the CMA area. The kernel mapping
is updated to reflect the required memory attribute changes. This is
possible because during early boot, all CMA areas are remapped with 4KiB
pages in kernel low-memory.
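The allocation policy described above can be sketched as a small user-space C model. This is only an illustration under stated assumptions: toy_dma_alloc(), TOY_GFP_ATOMIC, the simple byte-counting arenas and the 2 MiB value used for the consistent DMA size are hypothetical and are not taken from the patches. Only the rule itself comes from the text: atomic requests are served from the preallocated coherent pool, everything else goes to the CMA area, and the pool defaults to one eighth of the consistent DMA size.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_GFP_ATOMIC          0x1u       /* hypothetical flag bit */
#define TOY_CONSISTENT_DMA_SIZE (2u << 20) /* assumed 2 MiB, illustration only */

enum toy_source { TOY_SRC_NONE, TOY_SRC_POOL, TOY_SRC_CMA };

/* An arena only tracks how many bytes are handed out. */
struct toy_arena {
	size_t size;
	size_t used;
};

struct toy_dma {
	struct toy_arena pool; /* preallocated coherent pool for atomic requests */
	struct toy_arena cma;  /* CMA area for all other requests */
};

/* pool_size == 0 selects the default of one eighth of the consistent size. */
static void toy_dma_init(struct toy_dma *d, size_t cma_size, size_t pool_size)
{
	assert(d);
	d->pool.size = pool_size ? pool_size : TOY_CONSISTENT_DMA_SIZE / 8;
	d->pool.used = 0;
	d->cma.size = cma_size;
	d->cma.used = 0;
}

/* Returns 1 and accounts the bytes if the arena has room, 0 otherwise. */
static int toy_arena_take(struct toy_arena *a, size_t size)
{
	if (size == 0 || size > a->size - a->used)
		return 0;
	a->used += size;
	return 1;
}

/* Atomic requests never fall through to the CMA arena in this model. */
static enum toy_source toy_dma_alloc(struct toy_dma *d, size_t size,
				     unsigned int flags)
{
	if (flags & TOY_GFP_ATOMIC)
		return toy_arena_take(&d->pool, size) ? TOY_SRC_POOL : TOY_SRC_NONE;
	return toy_arena_take(&d->cma, size) ? TOY_SRC_CMA : TOY_SRC_NONE;
}
```

The model makes no distinction between coherent and writecombine requests, matching the simplification mentioned above that both are served from the same pool in the atomic case.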
This version has been tested on the Samsung S5PC110 based Goni machine and
the Exynos4 UniversalC210 board with various V4L2 multimedia drivers.
Coherent atomic allocations have been tested by manually enabling the dma
bounce for the s3c-sdhci device.
All patches are prepared for Linux Kernel v3.2-rc2.
A few words for those who see CMA for the first time:
The Contiguous Memory Allocator (CMA) makes it possible for device
drivers to allocate big contiguous chunks of memory after the system
has booted.
The main difference from similar frameworks is that CMA allows the
memory region reserved for big chunk allocations to be transparently
reused as system memory, so no memory is wasted when no
big chunk is allocated. Once an alloc request is issued, the
framework will migrate system pages to create the required big chunk of
physically contiguous memory.
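To make the idea concrete, here is a toy user-space model in C of a reserved region that is lent out to movable system pages and reclaimed by migration when a contiguous request arrives. All names (toy_mem, toy_alloc_movable(), toy_alloc_contig() and so on), the 16-page memory and the placement choices are assumptions for illustration; the model tracks page states only and does not copy any data, so it is a sketch of the principle rather than of the actual CMA code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGES     16
#define TOY_CMA_START 8 /* pages [8, 16) form the reserved region */

enum toy_state { TOY_FREE, TOY_MOVABLE, TOY_CONTIG };

struct toy_mem {
	enum toy_state page[TOY_PAGES];
};

static void toy_mem_init(struct toy_mem *m)
{
	for (int i = 0; i < TOY_PAGES; i++)
		m->page[i] = TOY_FREE;
}

/*
 * Ordinary movable allocation. It may use any free page, including the
 * reserved region; searching from the top makes that visible in the model.
 */
static int toy_alloc_movable(struct toy_mem *m)
{
	for (int i = TOY_PAGES - 1; i >= 0; i--) {
		if (m->page[i] == TOY_FREE) {
			m->page[i] = TOY_MOVABLE;
			return i;
		}
	}
	return -1;
}

/* First free page that lies outside the run [start, start + n). */
static int toy_find_free_outside(const struct toy_mem *m, int start, int n)
{
	for (int i = 0; i < TOY_PAGES; i++) {
		if (i >= start && i < start + n)
			continue;
		if (m->page[i] == TOY_FREE)
			return i;
	}
	return -1;
}

/*
 * Contiguous allocation of n pages inside the reserved region. Movable
 * pages found in the chosen run are "migrated" to free pages elsewhere.
 * Returns the first page of the run, or -1 if no run can be cleared.
 */
static int toy_alloc_contig(struct toy_mem *m, int n)
{
	if (n <= 0 || n > TOY_PAGES - TOY_CMA_START)
		return -1;
	for (int start = TOY_CMA_START; start + n <= TOY_PAGES; start++) {
		int movable = 0, busy = 0, free_outside = 0;

		for (int i = 0; i < TOY_PAGES; i++) {
			int in_run = i >= start && i < start + n;

			if (in_run && m->page[i] == TOY_CONTIG)
				busy = 1;
			if (in_run && m->page[i] == TOY_MOVABLE)
				movable++;
			if (!in_run && m->page[i] == TOY_FREE)
				free_outside++;
		}
		if (busy || movable > free_outside)
			continue;
		for (int i = start; i < start + n; i++) {
			if (m->page[i] == TOY_MOVABLE) {
				int dst = toy_find_free_outside(m, start, n);

				assert(dst >= 0); /* guaranteed by the count above */
				m->page[dst] = TOY_MOVABLE;
			}
			m->page[i] = TOY_CONTIG;
		}
		return start;
	}
	return -1;
}

/* Give the run back so that system pages can use it again. */
static void toy_free_contig(struct toy_mem *m, int start, int n)
{
	for (int i = start; i < start + n && i < TOY_PAGES; i++)
		if (i >= 0 && m->page[i] == TOY_CONTIG)
			m->page[i] = TOY_FREE;
}
```

For example, after two toy_alloc_movable() calls land in the reserved region, a toy_alloc_contig() request for all eight reserved pages still succeeds because the two movable pages are first moved to free pages outside the region.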
For more information you can refer to these LWN articles:
http://lwn.net/Articles/447405/ and http://lwn.net/Articles/450286/
as well as the links to previous versions of the CMA framework.
The CMA framework was initially developed by Michal Nazarewicz
at Samsung Poland R&D Center. Since version 9, I've taken over the
development, because Michal has left the company.
TODO (optional):
- implement support for contiguous memory areas placed in HIGHMEM zone
- resolve issue with movable pages with pending io operations
Best regards
Marek Szyprowski
Samsung Poland R&D Center
Links to previous versions of the patchset:
v16: <http://www.spinics.net/lists/linux-mm/msg25066.html>
v15: <http://www.spinics.net/lists/linux-mm/msg23365.html>
v14: <http://www.spinics.net/lists/linux-media/msg36536.html>
v13: (internal, intentionally not released)
v12: <http://www.spinics.net/lists/linux-media/msg35674.html>
v11: <http://www.spinics.net/lists/linux-mm/msg21868.html>
v10: <http://www.spinics.net/lists/linux-mm/msg20761.html>
v9: <http://article.gmane.org/gmane.linux.kernel.mm/60787>
v8: <http://article.gmane.org/gmane.linux.kernel.mm/56855>
v7: <http://article.gmane.org/gmane.linux.kernel.mm/55626>
v6: <http://article.gmane.org/gmane.linux.kernel.mm/55626>
v5: (intentionally left out as CMA v5 was identical to CMA v4)
v4: <http://article.gmane.org/gmane.linux.kernel.mm/52010>
v3: <http://article.gmane.org/gmane.linux.kernel.mm/51573>
v2: <http://article.gmane.org/gmane.linux.kernel.mm/50986>
v1: <http://article.gmane.org/gmane.linux.kernel.mm/50669>
Changelog:
v17:
1. Replaced the whole CMA core memory migration code with new code kindly
provided by Michal Nazarewicz. The new code is based on the memory
compaction framework rather than memory hotplug, as it was before. This
change was suggested by Mel Gorman.
2. Addressed most of the comments from Andrew Morton and Mel Gorman in
the rest of the CMA code.
3. Fixed broken initialization on ARM systems with DMA zone enabled.
4. Rebased onto v3.2-rc2 kernel.
v16:
1. merged a fixup from Michal Nazarewicz to address comments from Dave
Hansen about checking if pfns belong to the same memory zone
2. merged a fix from Michal Nazarewicz for incorrect handling of pages
which belong to page block that is in MIGRATE_ISOLATE state, in very
rare cases the migrate type of page block might have been changed
from MIGRATE_CMA to MIGRATE_MOVABLE because of this bug
3. moved some common code to include/asm-generic
4. added support for x86 DMA-mapping framework for pci-dma hardware,
CMA can be now even more widely tested on KVM/QEMU and a lot of common
x86 boxes
5. rebased onto next-20111005 kernel tree, which includes changes in ARM
DMA-mapping subsystem (CONSISTENT_DMA_SIZE removal)
6. removed patch for CMA s5p-fimc device private regions (served only as
example) and provided the one that matches real life case - s5p-mfc
device
v15:
1. fixed calculation of the total memory after activating CMA area (was
broken from v12)
2. more code cleanup in drivers/base/dma-contiguous.c
3. added address limit for default CMA area
4. rewrote ARM DMA integration:
- removed "ARM: DMA: steal memory for DMA coherent mappings" patch
- kept current DMA mapping implementation for coherent, nommu and
ARMv4/ARMv5 systems
- enabled CMA for all ARMv6+ systems
- added separate, small pool for coherent atomic allocations, defaults
to CONSISTENT_DMA_SIZE/8, but can be changed with kernel parameter
coherent_pool=[size]
v14:
1. Merged with "ARM: DMA: steal memory for DMA coherent mappings"
patch, added support for GFP_ATOMIC allocations.
2. Added checks for NULL device pointer
v13: (internal, intentionally not released)
v12:
1. Fixed 2 nasty bugs in dma-contiguous allocator:
- alignment argument was not passed correctly
- range for dma_release_from_contiguous was not checked correctly
2. Added support for architecture specific dma_contiguous_early_fixup()
function
3. CMA and DMA-mapping integration for ARM architecture has been
rewritten to take care of the memory aliasing issue that might
happen for newer ARM CPUs (mapping of the same pages with different
cache attributes is forbidden). TODO: add support for GFP_ATOMIC
allocations based on the "ARM: DMA: steal memory for DMA coherent
mappings" patch and implement support for contiguous memory areas
that are placed in HIGHMEM zone
v11:
1. Removed genalloc usage and replaced it with direct calls to
bitmap_* functions, dropped patches that are not needed
anymore (genalloc extensions)
2. Moved all contiguous area management code from mm/cma.c
to drivers/base/dma-contiguous.c
3. Renamed cm_alloc/free to dma_alloc/release_from_contiguous
4. Introduced global, system wide (default) contiguous area
configured with kernel config and kernel cmdline parameters
5. Simplified initialization to just one function:
dma_declare_contiguous()
6. Added example of device private memory contiguous area
v10:
1. Rebased onto 3.0-rc2 and resolved all conflicts
2. Simplified CMA to be just a pure memory allocator, for use
with platform/bus specific subsystems, like dma-mapping.
Removed all device specific functions and calls.
3. Integrated with ARM DMA-mapping subsystem.
4. Code cleanup here and there.
5. Removed private context support.
v9: 1. Rebased onto 2.6.39-rc1 and resolved all conflicts
2. Fixed a bunch of nasty bugs that happened when the allocation
failed (mainly kernel oops due to NULL ptr dereference).
3. Introduced testing code: cma-regions compatibility layer and
videobuf2-cma memory allocator module.
v8: 1. The alloc_contig_range() function has now been separated from
CMA and put in page_allocator.c. This function tries to
migrate all LRU pages in specified range and then allocate the
range using alloc_contig_freed_pages().
2. Support for MIGRATE_CMA has been separated from the CMA code.
I have not tested if CMA works with ZONE_MOVABLE but I see no
reasons why it shouldn't.
3. I have added a @private argument when creating CMA contexts so
that one can reserve memory and not share it with the rest of
the system. This way, CMA acts only as allocation algorithm.
v7: 1. A lot of functionality that handled driver->allocator_context
mapping has been removed from the patchset. This is not to say
that this code is not needed, it's just not worth posting
everything in one patchset.
Currently, CMA is "just" an allocator. It uses its own
migratetype (MIGRATE_CMA) for defining ranges of pageblocks
which behave just like ZONE_MOVABLE but, unlike the latter, can
be put in arbitrary places.
2. The migration code that was introduced in the previous version
actually started working.
v6: 1. Most importantly, v6 introduces support for memory migration.
The implementation is not yet complete though.
Migration support means that when CMA is not using memory
reserved for it, page allocator can allocate pages from it.
When CMA wants to use the memory, the pages have to be moved
and/or evicted as to make room for CMA.
To make it possible it must be guaranteed that only movable and
reclaimable pages are allocated in CMA controlled regions.
This is done by introducing a MIGRATE_CMA migrate type that
guarantees exactly that.
Some of the migration code is "borrowed" from Kamezawa
Hiroyuki's alloc_contig_pages() implementation. The main
difference is that thanks to MIGRATE_CMA migrate type CMA
assumes that memory controlled by CMA is always movable or
reclaimable so that it makes allocation decisions regardless of
whether some pages are actually allocated and migrates them
if needed.
The most interesting patches from the patchset that implement
the functionality are:
09/13: mm: alloc_contig_free_pages() added
10/13: mm: MIGRATE_CMA migration type added
11/13: mm: MIGRATE_CMA isolation functions added
12/13: mm: cma: Migration support added [wip]
Currently, kernel panics in some situations which I am trying
to investigate.
2. cma_pin() and cma_unpin() functions have been added (after
a conversation with Johan Mossberg). The idea is that whenever
hardware does not use the memory (no transaction is on) the
chunk can be moved around. This would allow defragmentation to
be implemented if desired. No defragmentation algorithm is
provided at this time.
3. Sysfs support has been replaced with debugfs. I always felt
unsure about the sysfs interface and when Greg KH pointed it
out I finally got to rewrite it to debugfs.
v5: (intentionally left out as CMA v5 was identical to CMA v4)
v4: 1. The "asterisk" flag has been removed in favour of requiring
that platform will provide a "*=<regions>" rule in the map
attribute.
2. The terminology has been changed slightly renaming "kind" to
"type" of memory. In the previous revisions, the documentation
indicated that device drivers define memory kinds and now,
v3: 1. The command line parameters have been removed (and moved to
a separate patch, the fourth one). As a consequence, the
cma_set_defaults() function has been changed -- it no longer
accepts a string with list of regions but an array of regions.
2. The "asterisk" attribute has been removed. Now, each region
has an "asterisk" flag which lets one specify whether this
region should be considered an "asterisk" region.
3. SysFS support has been moved to a separate patch (the third one
in the series) and now also includes list of regions.
v2: 1. The "cma_map" command line parameter has been removed. In exchange,
a SysFS entry has been created under kernel/mm/contiguous.
The intended way of specifying the attributes is
a cma_set_defaults() function called by platform initialisation
code. "regions" attribute (the string specified by "cma"
command line parameter) can be overwritten with command line
parameter; the other attributes can be changed during run-time
using the SysFS entries.
2. The behaviour of the "map" attribute has been modified
slightly. Currently, if no rule matches given device it is
assigned regions specified by the "asterisk" attribute. It is
by default built from the region names given in "regions"
attribute.
3. Devices can register private regions as well as regions that
can be shared but are not reserved using standard CMA
mechanisms. A private region has no name and can be accessed
only by devices that have the pointer to it.
4. The way allocators are registered has changed. Currently,
a cma_allocator_register() function is used for that purpose.
Moreover, allocators are attached to regions the first time
memory is registered from the region or when allocator is
registered which means that allocators can be dynamic modules
that are loaded after the kernel booted (of course, it won't be
possible to allocate a chunk of memory from a region if
allocator is not loaded).
5. Index of new functions:
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size,
+ dma_addr_t alignment)
+static inline int
+cma_info_about(struct cma_info *info, const char *regions)
+int __must_check cma_region_register(struct cma_region *reg);
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+ size_t size, dma_addr_t alignment);
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions,
+ size_t size, dma_addr_t alignment);
+int cma_allocator_register(struct cma_allocator *alloc);
Patches in this patchset:
Marek Szyprowski (4):
drivers: add Contiguous Memory Allocator
X86: integrate CMA with DMA-mapping subsystem
ARM: integrate CMA with DMA-mapping subsystem
ARM: Samsung: use CMA for 2 memory banks for s5p-mfc device
Michal Nazarewicz (7):
mm: page_alloc: handle MIGRATE_ISOLATE in free_pcppages_bulk()
mm: compaction: introduce isolate_{free,migrate}pages_range().
mm: mmzone: introduce zone_pfn_same_memmap()
mm: compaction: export some of the functions
mm: page_alloc: introduce alloc_contig_range()
mm: mmzone: MIGRATE_CMA migration type added
mm: page_isolation: MIGRATE_CMA isolation functions added
Documentation/kernel-parameters.txt | 9 +
arch/Kconfig | 3 +
arch/arm/Kconfig | 2 +
arch/arm/include/asm/dma-contiguous.h | 16 ++
arch/arm/include/asm/mach/map.h | 1 +
arch/arm/kernel/setup.c | 8 +-
arch/arm/mm/dma-mapping.c | 368 +++++++++++++++++++++++++------
arch/arm/mm/init.c | 20 ++-
arch/arm/mm/mm.h | 3 +
arch/arm/mm/mmu.c | 29 ++-
arch/arm/plat-s5p/dev-mfc.c | 51 +----
arch/x86/Kconfig | 1 +
arch/x86/include/asm/dma-contiguous.h | 13 +
arch/x86/include/asm/dma-mapping.h | 4 +
arch/x86/kernel/pci-dma.c | 18 ++-
arch/x86/kernel/pci-nommu.c | 8 +-
arch/x86/kernel/setup.c | 2 +
drivers/base/Kconfig | 89 ++++++++
drivers/base/Makefile | 1 +
drivers/base/dma-contiguous.c | 396 +++++++++++++++++++++++++++++++++
include/asm-generic/dma-contiguous.h | 27 +++
include/linux/device.h | 4 +
include/linux/dma-contiguous.h | 110 +++++++++
include/linux/mmzone.h | 57 ++++-
include/linux/page-isolation.h | 27 ++-
mm/Kconfig | 2 +-
mm/Makefile | 3 +-
mm/compaction.c | 230 +++++++++++--------
mm/internal.h | 35 +++
mm/memory-failure.c | 2 +-
mm/memory_hotplug.c | 6 +-
mm/page_alloc.c | 315 ++++++++++++++++++++++++--
mm/page_isolation.c | 15 +-
33 files changed, 1591 insertions(+), 284 deletions(-)
create mode 100644 arch/arm/include/asm/dma-contiguous.h
create mode 100644 arch/x86/include/asm/dma-contiguous.h
create mode 100644 drivers/base/dma-contiguous.c
create mode 100644 include/asm-generic/dma-contiguous.h
create mode 100644 include/linux/dma-contiguous.h
--
1.7.1.569.g6f426