Hello,
Russell King recently noticed that limiting the default CMA region to low memory on the ARM architecture causes serious memory management issues on machines with a lot of memory (most of which is available as high memory). More information can be found in the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
These two patches remove this limit, letting the kernel put the default CMA region into high memory when that is possible (there is enough high memory available and the architecture-specific DMA limit allows it).
This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and a large (>256M) CMA area.
Best regards
Marek Szyprowski
Samsung R&D Institute Poland
Marek Szyprowski (2):
  mm: cma: adjust address limit to avoid hitting low/high memory boundary
  ARM: mm: don't limit default CMA region only to low memory
 arch/arm/mm/init.c |  2 +-
 mm/cma.c           | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)
Automatically allocated regions should not cross the low/high memory boundary, because such regions cannot later be correctly initialized, as they would span two memory zones. This patch adds a check for this case and simple code that moves the region to low memory if the automatically selected address would not fit completely into high memory.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com
---
 mm/cma.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
diff --git a/mm/cma.c b/mm/cma.c
index c17751c0dcaf..4acc6aa4a086 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -32,6 +32,7 @@
 #include <linux/slab.h>
 #include <linux/log2.h>
 #include <linux/cma.h>
+#include <linux/highmem.h>
 
 struct cma {
 	unsigned long	base_pfn;
@@ -163,6 +164,8 @@ int __init cma_declare_contiguous(phys_addr_t base,
 			bool fixed, struct cma **res_cma)
 {
 	struct cma *cma;
+	phys_addr_t memblock_end = memblock_end_of_DRAM();
+	phys_addr_t highmem_start = __pa(high_memory);
 	int ret = 0;
 
 	pr_debug("%s(size %lx, base %08lx, limit %08lx alignment %08lx)\n",
@@ -196,6 +199,24 @@ int __init cma_declare_contiguous(phys_addr_t base,
 	if (!IS_ALIGNED(size >> PAGE_SHIFT, 1 << order_per_bit))
 		return -EINVAL;
 
+	/*
+	 * adjust limit to avoid crossing low/high memory boundary for
+	 * automatically allocated regions
+	 */
+	if (((limit == 0 || limit > memblock_end) &&
+	     (memblock_end - size < highmem_start &&
+	      memblock_end > highmem_start)) ||
+	    (!fixed && limit > highmem_start && limit - size < highmem_start)) {
+		limit = highmem_start;
+	}
+
+	if (fixed && base < highmem_start && base+size > highmem_start) {
+		ret = -EINVAL;
+		pr_err("Region at %08lx defined on low/high memory boundary (%08lx)\n",
+			(unsigned long)base, (unsigned long)highmem_start);
+		goto err;
+	}
+
 	/* Reserve memory */
 	if (base && fixed) {
 		if (memblock_is_region_reserved(base, size) ||
On Thu, Aug 21 2014, Marek Szyprowski m.szyprowski@samsung.com wrote:
Automatically allocated regions should not cross the low/high memory boundary, because such regions cannot later be correctly initialized, as they would span two memory zones. This patch adds a check for this case and simple code that moves the region to low memory if the automatically selected address would not fit completely into high memory.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com
Acked-by: Michal Nazarewicz mina86@mina86.com
 mm/cma.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
diff --git a/mm/cma.c b/mm/cma.c
index c17751c0dcaf..4acc6aa4a086 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -32,6 +32,7 @@
 #include <linux/slab.h>
 #include <linux/log2.h>
 #include <linux/cma.h>
+#include <linux/highmem.h>
 
 struct cma {
 	unsigned long	base_pfn;
@@ -163,6 +164,8 @@ int __init cma_declare_contiguous(phys_addr_t base,
 			bool fixed, struct cma **res_cma)
 {
 	struct cma *cma;
+	phys_addr_t memblock_end = memblock_end_of_DRAM();
+	phys_addr_t highmem_start = __pa(high_memory);
 	int ret = 0;
 
 	pr_debug("%s(size %lx, base %08lx, limit %08lx alignment %08lx)\n",
@@ -196,6 +199,24 @@ int __init cma_declare_contiguous(phys_addr_t base,
 	if (!IS_ALIGNED(size >> PAGE_SHIFT, 1 << order_per_bit))
 		return -EINVAL;
 
+	/*
+	 * adjust limit to avoid crossing low/high memory boundary for
+	 * automatically allocated regions
+	 */
+	if (((limit == 0 || limit > memblock_end) &&
+	     (memblock_end - size < highmem_start &&
+	      memblock_end > highmem_start)) ||
+	    (!fixed && limit > highmem_start && limit - size < highmem_start)) {
+		limit = highmem_start;
+	}
+
+	if (fixed && base < highmem_start && base+size > highmem_start) {
+		ret = -EINVAL;
+		pr_err("Region at %08lx defined on low/high memory boundary (%08lx)\n",
+			(unsigned long)base, (unsigned long)highmem_start);
+		goto err;
+	}
+
 	/* Reserve memory */
 	if (base && fixed) {
 		if (memblock_is_region_reserved(base, size) ||
-- 1.9.2
DMA-mapping supports CMA regions places either in low or high memory, so there is no longer any need to limit the default CMA region to low memory only. The real limit is still defined by the architecture-specific DMA limit.
Reported-by: Russell King - ARM Linux linux@arm.linux.org.uk
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com
---
 arch/arm/mm/init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 659c75d808dc..c1b513555786 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -322,7 +322,7 @@ void __init arm_memblock_init(const struct machine_desc *mdesc)
 	 * reserve memory for DMA contigouos allocations,
 	 * must come from DMA area inside low memory
 	 */
-	dma_contiguous_reserve(min(arm_dma_limit, arm_lowmem_limit));
+	dma_contiguous_reserve(arm_dma_limit);
 
 	arm_memblock_steal_permitted = false;
 	memblock_dump_all();
Hi Marek,
On Thu, Aug 21, 2014 at 9:45 AM, Marek Szyprowski m.szyprowski@samsung.com wrote:
DMA-mapping supports CMA regions places either in low or high memory, so there is no longer any need to limit the default CMA region to low memory only. The real limit is still defined by the architecture-specific DMA limit.
Thanks for working on this!
I think you need to update the comment here though, which still says:
/*
 * reserve memory for DMA contigouos allocations,
 * must come from DMA area inside low memory
 */
If you're making a second version, as a minor nitpick you could also s/places/placed in the commit message.
Daniel
On Thu, Aug 21 2014, Marek Szyprowski m.szyprowski@samsung.com wrote:
DMA-mapping supports CMA regions places either in low or high memory, so there is no longer any need to limit the default CMA region to low memory only. The real limit is still defined by the architecture-specific DMA limit.
Reported-by: Russell King - ARM Linux linux@arm.linux.org.uk
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com
Acked-by: Michal Nazarewicz mina86@mina86.com
 arch/arm/mm/init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 659c75d808dc..c1b513555786 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -322,7 +322,7 @@ void __init arm_memblock_init(const struct machine_desc *mdesc)
 	 * reserve memory for DMA contigouos allocations,
 	 * must come from DMA area inside low memory
 	 */
-	dma_contiguous_reserve(min(arm_dma_limit, arm_lowmem_limit));
+	dma_contiguous_reserve(arm_dma_limit);
 
 	arm_memblock_steal_permitted = false;
 	memblock_dump_all();
-- 1.9.2
Hello,
On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
Hello,
Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
Agreed. It should have been this way from the beginning, because a CMA page is effectively pinned if it is an anonymous page and the system has no swap.
This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
I totally agree with the patchset, although I didn't review the code at all.
Another topic: does it mean there would still be a problem if the system has CMA in lowmem for some reason (e.g. a hardware limit, or CMA used for something other than the DMA subsystem)?
In that case, an idea that just popped into my head is to migrate pages from the CMA area to the highest zone, because they are all userspace pages which should live there anyway. I am not sure it is worth implementing at this point, though, since it depends on how many such crippled platforms there are.
Just for the record.
Best regards Marek Szyprowski Samsung R&D Institute Poland
Marek Szyprowski (2): mm: cma: adjust address limit to avoid hitting low/high memory boundary ARM: mm: don't limit default CMA region only to low memory
arch/arm/mm/init.c | 2 +- mm/cma.c | 21 +++++++++++++++++++++ 2 files changed, 22 insertions(+), 1 deletion(-)
-- 1.9.2
Hello,
On 2014-08-25 03:26, Minchan Kim wrote:
Hello,
On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
Hello,
Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
Agreed. It should be from the beginning because CMA page is effectly pinned if it is anonymous page and system has no swap.
Nope. Even without swap, an anonymous page can be correctly migrated to another location. The migration code doesn't depend on the presence of swap.
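For reference, here is a very rough sketch of the path being discussed, simplified from __alloc_contig_migrate_range() in mm/page_alloc.c of that era (reconstructed from memory, so treat the details as approximate, not the exact mainline code). The point it illustrates: migration copies each isolated page into a freshly allocated target page, so swap is not needed, but a free target page must be obtainable from the allowed zones.

/* simplified sketch, not the exact kernel code of that version */
static int __alloc_contig_migrate_range(struct compact_control *cc,
					unsigned long start, unsigned long end)
{
	int ret;

	migrate_prep();

	/* ... isolate the LRU pages in [start, end) onto cc->migratepages ... */

	/*
	 * Each isolated page is copied into a page returned by the
	 * allocation callback; no swap I/O is involved, but a free
	 * target page must be allocatable.
	 */
	ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
			    NULL, 0, cc->mode, MR_CMA);
	if (ret)
		putback_movable_pages(&cc->migratepages);
	return ret;
}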
This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
I totally agree with the patchset although I didn't review code at all.
Another topic: It means it should be a problem still if system has CMA in lowmem by some reason(ex, hardware limit or other purpose of CMA rather than DMA subsystem)?
In that case, an idea that just popped in my head is to migrate pages from cma area to highest zone because they are all userspace pages which should be in there but not sure it's worth to implement at this point because how many such cripple platform are.
Just for the recording.
Moving pages between the low and high zones is not that easy. If I remember correctly, you cannot migrate a page from low memory to the high zone in the generic case, although it should be possible to add an exception for anonymous pages. This would definitely improve the poor handling of low memory in the low zone when CMA is enabled.
Best regards
On Mon, Aug 25, 2014 at 10:00:32AM +0200, Marek Szyprowski wrote:
Hello,
On 2014-08-25 03:26, Minchan Kim wrote:
Hello,
On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
Hello,
Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
Agreed. It should be from the beginning because CMA page is effectly pinned if it is anonymous page and system has no swap.
Nope. Even without swap, anonymous page can be correctly migrated to other location. Migration code doesn't depend on presence of swap.
It could be possible only if the zone has freeable pages (i.e. free pages + shrinkable pages like page cache). IOW, if the zone is full of anon pages, it's effectively pinned.
This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
I totally agree with the patchset although I didn't review code at all.
Another topic: It means it should be a problem still if system has CMA in lowmem by some reason(ex, hardware limit or other purpose of CMA rather than DMA subsystem)?
In that case, an idea that just popped in my head is to migrate pages from cma area to highest zone because they are all userspace pages which should be in there but not sure it's worth to implement at this point because how many such cripple platform are.
Just for the recording.
Moving pages between low and high zone is not that easy. If I remember correctly you cannot migrate a page from low memory to high zone in generic case, although it should be possible to add exception for anonymous pages. This will definitely improve poor low memory handling in low zone when CMA is enabled.
Yeah, it's possible for anonymous pages, but I just wondered whether it's worth adding more complexity to mm, and you are answering that it is. Okay. May I take your positive feedback to mean that such platforms (i.e. ones where DMA works only with lowmem) are still common?
Thanks.
Best regards
Marek Szyprowski, PhD Samsung R&D Institute Poland
Hello,
On 2014-08-25 10:18, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:00:32AM +0200, Marek Szyprowski wrote:
On 2014-08-25 03:26, Minchan Kim wrote:
On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
Agreed. It should be from the beginning because CMA page is effectly pinned if it is anonymous page and system has no swap.
Nope. Even without swap, anonymous page can be correctly migrated to other location. Migration code doesn't depend on presence of swap.
I could be possible only if the zone has freeable page(ie, free pages
- shrinkable page like page cache). IOW, if the zone is full with
anon pages, it's efffectively pinned.
Why? __alloc_contig_migrate_range() uses the alloc_migrate_target() function, which can take a free page from any zone matching the given flags.
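For reference, a simplified sketch of alloc_migrate_target() (mm/page_isolation.c) as it looked around that time, with hugepage handling dropped; written from memory, so treat it as approximate rather than the exact mainline code. It also shows the zone restriction discussed below: highmem is only added to the allowed zones when the source page itself is a highmem page.

/* simplified sketch, details approximate */
struct page *alloc_migrate_target(struct page *page, unsigned long private,
				  int **resultp)
{
	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE;

	/*
	 * Without __GFP_HIGHMEM the target page can only come from the
	 * source page's zone or a lower one.
	 */
	if (PageHighMem(page))
		gfp_mask |= __GFP_HIGHMEM;

	return alloc_page(gfp_mask);
}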
This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
I totally agree with the patchset although I didn't review code at all.
Another topic: It means it should be a problem still if system has CMA in lowmem by some reason(ex, hardware limit or other purpose of CMA rather than DMA subsystem)?
In that case, an idea that just popped in my head is to migrate pages from cma area to highest zone because they are all userspace pages which should be in there but not sure it's worth to implement at this point because how many such cripple platform are.
Just for the recording.
Moving pages between low and high zone is not that easy. If I remember correctly you cannot migrate a page from low memory to high zone in generic case, although it should be possible to add exception for anonymous pages. This will definitely improve poor low memory handling in low zone when CMA is enabled.
Yeb, it's possible for anonymous pages but I just wonder it's worth to add more complexitiy to mm and and you are answering it's worth. Okay. May I understand your positive feedback means such platform( ie, DMA works with only lowmem) are still common?
There are still some platforms which have limited DMA capabilities. However, the ability to move an anonymous page from lowmem to highmem will be a benefit in any case, as low memory is really much more precious.
It also doesn't look really hard to add this exception for anonymous pages from low memory. It should just be a matter of setting the __GFP_HIGHMEM flag in alloc_migrate_target() if the source page is an anonymous page. Am I right?
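Expressed against the alloc_migrate_target() sketch above, the idea would look roughly like this. This is a completely untested sketch of the suggestion, not a real or posted patch; PageAnon()/PageHighMem() are existing helpers, but whether this is sufficient (locking, CMA-vs-movable placement, etc.) is an open question.

 	if (PageHighMem(page))
 		gfp_mask |= __GFP_HIGHMEM;
+	/* untested idea: anonymous pages can always be placed in highmem */
+	else if (PageAnon(page))
+		gfp_mask |= __GFP_HIGHMEM;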
Best regards
Hey Marek,
On Mon, Aug 25, 2014 at 10:33:50AM +0200, Marek Szyprowski wrote:
Hello,
On 2014-08-25 10:18, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:00:32AM +0200, Marek Szyprowski wrote:
On 2014-08-25 03:26, Minchan Kim wrote:
On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
Agreed. It should be from the beginning because CMA page is effectly pinned if it is anonymous page and system has no swap.
Nope. Even without swap, anonymous page can be correctly migrated to other location. Migration code doesn't depend on presence of swap.
I could be possible only if the zone has freeable page(ie, free pages
- shrinkable page like page cache). IOW, if the zone is full with
anon pages, it's efffectively pinned.
Why? __alloc_contig_migrate_range() uses alloc_migrate_target() function, which can take free page from any zone matching given flags.
Strictly speaking, it's not any zone. It only allows zones that are equal to or lower than the zone of the source page.
Please look at Russell's case. pgd_alloc() is trying to allocate an order-2 page from the normal zone, which is the lowest zone, so there are no fallback zones to migrate the normal zone's anonymous pages out to, and alloc_migrate_target() does not currently allocate the target page from zones higher than the source page's. That's why I call it effectively pinned.
This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
I totally agree with the patchset although I didn't review code at all.
Another topic: It means it should be a problem still if system has CMA in lowmem by some reason(ex, hardware limit or other purpose of CMA rather than DMA subsystem)?
In that case, an idea that just popped in my head is to migrate pages from cma area to highest zone because they are all userspace pages which should be in there but not sure it's worth to implement at this point because how many such cripple platform are.
Just for the recording.
Moving pages between low and high zone is not that easy. If I remember correctly you cannot migrate a page from low memory to high zone in generic case, although it should be possible to add exception for anonymous pages. This will definitely improve poor low memory handling in low zone when CMA is enabled.
Yeb, it's possible for anonymous pages but I just wonder it's worth to add more complexitiy to mm and and you are answering it's worth. Okay. May I understand your positive feedback means such platform( ie, DMA works with only lowmem) are still common?
There are still some platforms, which have limited DMA capabilities. However
Thanks for your comment. I just wanted to know whether it's worth it before I dive into that, but it seems I was heading the wrong way. See below.
the ability to move anonymous a page from lowmem to highmem will be a benefit in any case, as low memory is really much more precious.
Maybe, but in the case of this report, even if we move anonymous pages into higher zones, the problem (i.e. OOM) is still there, because pgd_alloc wanted a high-order page outside the CMA area in the normal zone.
A feature which moves CMA pages into higher zones would help CMA allocation latency when there are lots of free pages in a higher zone but no freeable pages in the zone the source page is located in. But it wouldn't help this OOM problem.
It also doesn't look to be really hard to add this exception for anonymous pages from low memory. It will be just a matter of setting __GFP_HIGHMEM flag if source page is anonymous page in alloc_migrate_target() function. Am i right?
When I read the source code: yes, it might work for anonymous pages (though I'm not sure, I didn't test it). But as I read Russell's report in detail, the fundamental problem is why compaction didn't work although there are lots of free pages in the normal zone.
Normal free:159400kB min:3440kB low:4300kB high:5160kB active_anon:54336kB inactive_anon:2580kB active_file:56kB inactive_file:204kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:778240kB managed:740044kB mlocked:0kB dirty:0kB writeback:0kB mapped:5336kB shmem:5428kB slab_reclaimable:14420kB slab_unreclaimable:383976kB kernel_stack:2512kB pagetables:1088kB unstable:0kB bounce:0kB free_cma:150788kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
As you can see, there are lots of free CMA pages in the normal zone, and there are (54M + 2M) of anon pages, so compaction should be able to migrate that roughly 56M of anon pages into the CMA area, but it didn't.
I think that is the primary reason for the problem; moving the CMA area into a higher zone by default can hide the problem, but it is not an exact solution to it. (Still, I support this patch; the default CMA area should be in higher zones if possible.)
These days there are several reports of compaction failing although there are lots of free pages. I think this report is just one example.
Best regards
Marek Szyprowski, PhD Samsung R&D Institute Poland
Hello,
On 2014-08-26 04:43, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:33:50AM +0200, Marek Szyprowski wrote:
On 2014-08-25 10:18, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:00:32AM +0200, Marek Szyprowski wrote:
On 2014-08-25 03:26, Minchan Kim wrote:
On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
Agreed. It should be from the beginning because CMA page is effectly pinned if it is anonymous page and system has no swap.
Nope. Even without swap, anonymous page can be correctly migrated to other location. Migration code doesn't depend on presence of swap.
I could be possible only if the zone has freeable page(ie, free pages
- shrinkable page like page cache). IOW, if the zone is full with
anon pages, it's efffectively pinned.
Why? __alloc_contig_migrate_range() uses alloc_migrate_target() function, which can take free page from any zone matching given flags.
Strictly speaking, it's not any zones. It allows zones which are equal or lower with zone of source page.
Pz, look at Russell's case. The pgd_alloc is trying to allocate order 2 page on normal zone, which is lowest zone so there is no fallback zones to migrate anonymous pages in normal zone out and alloc_migrate_target doesn't allocate target page from higher zones of source page at the moment. That's why I call it as effectively pinned.
In Russell's case the issue is related to the compaction code. It should still be able to compact the low zone and get some free pages. It is not a matter of alloc_migrate_target(); I mentioned this function because I wanted to show that it is possible to move pages out of that zone when doing a CMA allocation with no swap.
This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
I totally agree with the patchset although I didn't review code at all.
Another topic: It means it should be a problem still if system has CMA in lowmem by some reason(ex, hardware limit or other purpose of CMA rather than DMA subsystem)?
In that case, an idea that just popped in my head is to migrate pages from cma area to highest zone because they are all userspace pages which should be in there but not sure it's worth to implement at this point because how many such cripple platform are.
Just for the recording.
Moving pages between low and high zone is not that easy. If I remember correctly you cannot migrate a page from low memory to high zone in generic case, although it should be possible to add exception for anonymous pages. This will definitely improve poor low memory handling in low zone when CMA is enabled.
Yeb, it's possible for anonymous pages but I just wonder it's worth to add more complexitiy to mm and and you are answering it's worth. Okay. May I understand your positive feedback means such platform( ie, DMA works with only lowmem) are still common?
There are still some platforms, which have limited DMA capabilities. However
Thanks for your comment. I just wanted to know it's worth before I dive into that but it seems I was driving wrong way. See below.
the ability to move anonymous a page from lowmem to highmem will be a benefit in any case, as low memory is really much more precious.
Maybe, but in case of this report, even if we move anonymous pages into higher zones, the problem(ie, OOM) is still there because pgd_alloc wanted high order page in no cma area in normal zone.
The feature which move CMA pages into higher zones would help CMA alloc latency if there are lots of free pages in higher zone but no freeable page in the zone which source page located in. But it wouldn't help this OOM problem.
Right. The mentioned OOM problem shows that compaction fails in some cases for unknown reasons. The question here is whether the compaction_alloc() function is able to get free CMA pages or not. Right now I'm not sure if it will take pages from the right list or not. This case should definitely be investigated.
It also doesn't look to be really hard to add this exception for anonymous pages from low memory. It will be just a matter of setting __GFP_HIGHMEM flag if source page is anonymous page in alloc_migrate_target() function. Am i right?
When I read source code, yes, it might work(but not sure I didn't test it) for anonymous page but as I read report from Russell in detail, fundamental problem is that why compaction didn't work although it has lots of free pages in normal zone.
Normal free:159400kB min:3440kB low:4300kB high:5160kB active_anon:54336kB inactive_anon:2580kB active_file:56kB inactive_file:204kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:778240kB managed:740044kB mlocked:0kB dirty:0kB writeback:0kB mapped:5336kB shmem:5428kB slab_reclaimable:14420kB slab_unreclaimable:383976kB kernel_stack:2512kB pagetables:1088kB unstable:0kB bounce:0kB free_cma:150788kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
As you can see, there are lots of CMA free pages in normal zone and there are (54M + 2M) anon pages so compaction should migrate 64M anon pages into CMA area but it didn't.
I think it's primary reason of the problem and moving CMA area into higher zone by default can hide the problem and it's not a exact solution of this problem. (But still I support this patch, default CMA area should be higher zones if possible)
Thesedays, there are some of reports about compaction fail although there are lots of free pages. I think the report is just one example.
OK. Thanks for supporting the change of the zone for the default CMA region. May I add your Acked-by?
The OOM issue caused by the failed compaction case should definitely be investigated; I will check whether it can be easily reproduced.
Best regards
Hey Marek,
On Tue, Aug 26, 2014 at 02:34:24PM +0200, Marek Szyprowski wrote:
Hello,
On 2014-08-26 04:43, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:33:50AM +0200, Marek Szyprowski wrote:
On 2014-08-25 10:18, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:00:32AM +0200, Marek Szyprowski wrote:
On 2014-08-25 03:26, Minchan Kim wrote:
On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
>Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
>
>Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
Agreed. It should be from the beginning because CMA page is effectly pinned if it is anonymous page and system has no swap.
Nope. Even without swap, anonymous page can be correctly migrated to other location. Migration code doesn't depend on presence of swap.
I could be possible only if the zone has freeable page(ie, free pages
- shrinkable page like page cache). IOW, if the zone is full with
anon pages, it's efffectively pinned.
Why? __alloc_contig_migrate_range() uses alloc_migrate_target() function, which can take free page from any zone matching given flags.
Strictly speaking, it's not any zones. It allows zones which are equal or lower with zone of source page.
Pz, look at Russell's case. The pgd_alloc is trying to allocate order 2 page on normal zone, which is lowest zone so there is no fallback zones to migrate anonymous pages in normal zone out and alloc_migrate_target doesn't allocate target page from higher zones of source page at the moment. That's why I call it as effectively pinned.
In Russell's case the issue is related to compaction code. It should still be able to compact low zone and get some free pages. It is not a case of alloc_migrate_target. I mentioned this function because I wanted to show that it is possible to move pages out of that zone in case of doing CMA alloc and having no swap.
>This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
I totally agree with the patchset although I didn't review code at all.
Another topic: It means it should be a problem still if system has CMA in lowmem by some reason(ex, hardware limit or other purpose of CMA rather than DMA subsystem)?
In that case, an idea that just popped in my head is to migrate pages from cma area to highest zone because they are all userspace pages which should be in there but not sure it's worth to implement at this point because how many such cripple platform are.
Just for the recording.
Moving pages between low and high zone is not that easy. If I remember correctly you cannot migrate a page from low memory to high zone in generic case, although it should be possible to add exception for anonymous pages. This will definitely improve poor low memory handling in low zone when CMA is enabled.
Yeb, it's possible for anonymous pages but I just wonder it's worth to add more complexitiy to mm and and you are answering it's worth. Okay. May I understand your positive feedback means such platform( ie, DMA works with only lowmem) are still common?
There are still some platforms, which have limited DMA capabilities. However
Thanks for your comment. I just wanted to know it's worth before I dive into that but it seems I was driving wrong way. See below.
the ability to move anonymous a page from lowmem to highmem will be a benefit in any case, as low memory is really much more precious.
Maybe, but in case of this report, even if we move anonymous pages into higher zones, the problem(ie, OOM) is still there because pgd_alloc wanted high order page in no cma area in normal zone.
The feature which move CMA pages into higher zones would help CMA alloc latency if there are lots of free pages in higher zone but no freeable page in the zone which source page located in. But it wouldn't help this OOM problem.
Right. The mentioned OOM problem shows that compaction fails in some cases for unknown reasons. The question here is weather compaction_alloc() function is able to get free CMA pages or not. Right now I'm not sure if it will take pages from the right list or not. This case definitely should be investigated.
I think it can, because suitable_migrate_target and migrate_async_suitable take CMA into account. That's why I think the culprit is the compaction deferring logic, and I sent a patch to detect it: http://www.spinics.net/lists/kernel/msg1812538.html
I should have Cced you. Sorry for that!
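For reference, roughly the two pieces being referred to, as they looked in mm/compaction.c around that time (reconstructed from memory, so details are approximate, not the exact mainline code). The first check is why CMA pageblocks are acceptable targets for compaction's free pages; the second is the deferring logic that can silently skip direct compaction after a few failures, which is the suspected culprit here.

/* simplified sketches, details approximate */
static inline bool migrate_async_suitable(int migratetype)
{
	/* CMA pageblocks count as suitable, just like MIGRATE_MOVABLE */
	return is_migrate_cma(migratetype) || migratetype == MIGRATE_MOVABLE;
}

/* Returns true if compaction should be skipped this time */
bool compaction_deferred(struct zone *zone, int order)
{
	unsigned long defer_limit = 1UL << zone->compact_defer_shift;

	if (order < zone->compact_order_failed)
		return false;

	/* avoid possible overflow */
	if (++zone->compact_considered > defer_limit)
		zone->compact_considered = defer_limit;

	return zone->compact_considered < defer_limit;
}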
It also doesn't look to be really hard to add this exception for anonymous pages from low memory. It will be just a matter of setting __GFP_HIGHMEM flag if source page is anonymous page in alloc_migrate_target() function. Am i right?
When I read source code, yes, it might work(but not sure I didn't test it) for anonymous page but as I read report from Russell in detail, fundamental problem is that why compaction didn't work although it has lots of free pages in normal zone.
Normal free:159400kB min:3440kB low:4300kB high:5160kB active_anon:54336kB inactive_anon:2580kB active_file:56kB inactive_file:204kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:778240kB managed:740044kB mlocked:0kB dirty:0kB writeback:0kB mapped:5336kB shmem:5428kB slab_reclaimable:14420kB slab_unreclaimable:383976kB kernel_stack:2512kB pagetables:1088kB unstable:0kB bounce:0kB free_cma:150788kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
As you can see, there are lots of CMA free pages in normal zone and there are (54M + 2M) anon pages so compaction should migrate 64M anon pages into CMA area but it didn't.
I think it's primary reason of the problem and moving CMA area into higher zone by default can hide the problem and it's not a exact solution of this problem. (But still I support this patch, default CMA area should be higher zones if possible)
Thesedays, there are some of reports about compaction fail although there are lots of free pages. I think the report is just one example.
Ok. Thanks for supporting the change the zone for default CMA region. May I add your Acked-by?
I tend to stay quiet, without a Reviewed-by or Acked-by, on arch-specific stuff, because I know there are other people who understand that part better than me. :) Anyway, Andrew has already merged it, so you don't need my sign-off any more.
One thing I'd like to change is the description, if Andrew accepts that or you respin. The description should say that Russell's problem is a compaction problem, not a CMA one, and that moving the default CMA area into a higher zone only mitigates it; but it's still better than what we have now.
The issue with OOM caused by failed compaction case should be definitely investigated, I will check if it can be easily reproduced or not.
I hope the patch I mentioned above helps you. Thanks.
Best regards
Marek Szyprowski, PhD Samsung R&D Institute Poland
On Wed, Aug 27, 2014 at 09:36:11AM +0900, Minchan Kim wrote:
Hey Marek,
On Tue, Aug 26, 2014 at 02:34:24PM +0200, Marek Szyprowski wrote:
Hello,
On 2014-08-26 04:43, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:33:50AM +0200, Marek Szyprowski wrote:
On 2014-08-25 10:18, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:00:32AM +0200, Marek Szyprowski wrote:
On 2014-08-25 03:26, Minchan Kim wrote:
>On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
>>Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
>>
>>Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
>Agreed. It should be from the beginning because CMA page is effectly pinned if it is anonymous page and system has no swap.
Nope. Even without swap, anonymous page can be correctly migrated to other location. Migration code doesn't depend on presence of swap.
I could be possible only if the zone has freeable page(ie, free pages
- shrinkable page like page cache). IOW, if the zone is full with
anon pages, it's efffectively pinned.
Why? __alloc_contig_migrate_range() uses alloc_migrate_target() function, which can take free page from any zone matching given flags.
Strictly speaking, it's not any zones. It allows zones which are equal or lower with zone of source page.
Pz, look at Russell's case. The pgd_alloc is trying to allocate order 2 page on normal zone, which is lowest zone so there is no fallback zones to migrate anonymous pages in normal zone out and alloc_migrate_target doesn't allocate target page from higher zones of source page at the moment. That's why I call it as effectively pinned.
In Russell's case the issue is related to compaction code. It should still be able to compact low zone and get some free pages. It is not a case of alloc_migrate_target. I mentioned this function because I wanted to show that it is possible to move pages out of that zone in case of doing CMA alloc and having no swap.
>>This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
>I totally agree with the patchset although I didn't review code at all.
>
>Another topic: It means it should be a problem still if system has CMA in lowmem by some reason(ex, hardware limit or other purpose of CMA rather than DMA subsystem)?
>
>In that case, an idea that just popped in my head is to migrate pages from cma area to highest zone because they are all userspace pages which should be in there but not sure it's worth to implement at this point because how many such cripple platform are.
>
>Just for the recording.
Moving pages between low and high zone is not that easy. If I remember correctly you cannot migrate a page from low memory to high zone in generic case, although it should be possible to add exception for anonymous pages. This will definitely improve poor low memory handling in low zone when CMA is enabled.
Yeb, it's possible for anonymous pages but I just wonder it's worth to add more complexitiy to mm and and you are answering it's worth. Okay. May I understand your positive feedback means such platform( ie, DMA works with only lowmem) are still common?
There are still some platforms, which have limited DMA capabilities. However
Thanks for your comment. I just wanted to know it's worth before I dive into that but it seems I was driving wrong way. See below.
the ability to move anonymous a page from lowmem to highmem will be a benefit in any case, as low memory is really much more precious.
Maybe, but in case of this report, even if we move anonymous pages into higher zones, the problem(ie, OOM) is still there because pgd_alloc wanted high order page in no cma area in normal zone.
The feature which move CMA pages into higher zones would help CMA alloc latency if there are lots of free pages in higher zone but no freeable page in the zone which source page located in. But it wouldn't help this OOM problem.
Right. The mentioned OOM problem shows that compaction fails in some cases for unknown reasons. The question here is weather compaction_alloc() function is able to get free CMA pages or not. Right now I'm not sure if it will take pages from the right list or not. This case definitely should be investigated.
Hello, Minchan and Marek.
IIUC, compaction_alloc() can get free CMA pages.
I think it can because suitable_migrate_target and migrate_async_suitable consider CMA. That's why I think the culprit is cmpaction deferring logic and sent a patch to detect it. http://www.spinics.net/lists/kernel/msg1812538.html
I guess that this problem is related to CMA. When direct compaction begins, the compaction logic checks whether the zone is suitable via compaction_suitable(). In this function we check fragmentation_index(), which does not consider whether the free blocks counted for free_blocks_suitable are in CMA or not. So, in Russell's case, it would always return -1000 and compaction_suitable() would then return COMPACT_PARTIAL. As a result, compaction wouldn't actually happen, and the allocation request would fail, too.
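For reference, a simplified sketch of the relevant part of compaction_suitable() (mm/compaction.c) around that time, reconstructed from memory, so constants and flow are approximate. fragmentation_index() (mm/vmstat.c) returns -1000 as soon as any free block of the requested order exists, without checking whether that block is MIGRATE_CMA and therefore unusable for an unmovable request such as pgd_alloc(); COMPACT_PARTIAL then means "the allocation should already succeed", so compaction is not actually run.

/* simplified sketch, details approximate */
unsigned long compaction_suitable(struct zone *zone, int order)
{
	unsigned long watermark = low_wmark_pages(zone) + (2UL << order);
	int fragindex;

	/* Is there enough free memory for compaction to work with at all? */
	if (!zone_watermark_ok(zone, 0, watermark, 0, 0))
		return COMPACT_SKIPPED;

	/*
	 * -1000 means "a suitable free block already exists", but the
	 * computation behind it counts CMA blocks like any others.
	 */
	fragindex = fragmentation_index(zone, order);
	if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
		return COMPACT_SKIPPED;

	if (fragindex == -1000 &&
	    zone_watermark_ok(zone, order, watermark, 0, 0))
		return COMPACT_PARTIAL;	/* assume allocation will succeed */

	return COMPACT_CONTINUE;
}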
I should note that there is one more flaw, in zone_watermark_ok(). zone_watermark_ok() doesn't handle order > 0 allocations correctly when there is free CMA memory, so we can easily pass the watermark check in this case.
I have a plan to fix it, but it will take some time. :)
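To illustrate the flaw being described, here is a simplified sketch of __zone_watermark_ok() (mm/page_alloc.c) from roughly that kernel version, reconstructed from memory, so details are approximate. Free CMA pages are subtracted from the order-0 total when ALLOC_CMA is not set, but the per-order loop still counts CMA free blocks in free_area[o].nr_free, so a high-order unmovable request can pass the check even when most of the remaining high-order blocks are CMA.

/* simplified sketch, details approximate */
static bool __zone_watermark_ok(struct zone *z, unsigned int order,
			unsigned long mark, int classzone_idx,
			int alloc_flags, long free_pages)
{
	long min = mark;
	long free_cma = 0;
	int o;

	free_pages -= (1 << order) - 1;
	if (alloc_flags & ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & ALLOC_HARDER)
		min -= min / 4;
#ifdef CONFIG_CMA
	/* order-0 accounting excludes CMA for non-CMA-eligible requests */
	if (!(alloc_flags & ALLOC_CMA))
		free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
#endif
	if (free_pages - free_cma <= min + z->lowmem_reserve[classzone_idx])
		return false;

	for (o = 0; o < order; o++) {
		/* CMA vs. non-CMA free blocks are not distinguished here */
		free_pages -= z->free_area[o].nr_free << o;
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}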
Thanks.
On Wed, Aug 27, 2014 at 10:42:54AM +0900, Joonsoo Kim wrote:
On Wed, Aug 27, 2014 at 09:36:11AM +0900, Minchan Kim wrote:
Hey Marek,
On Tue, Aug 26, 2014 at 02:34:24PM +0200, Marek Szyprowski wrote:
Hello,
On 2014-08-26 04:43, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:33:50AM +0200, Marek Szyprowski wrote:
On 2014-08-25 10:18, Minchan Kim wrote:
On Mon, Aug 25, 2014 at 10:00:32AM +0200, Marek Szyprowski wrote:
>On 2014-08-25 03:26, Minchan Kim wrote:
>>On Thu, Aug 21, 2014 at 10:45:12AM +0200, Marek Szyprowski wrote:
>>>Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
>>>
>>>Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
>>Agreed. It should be from the beginning because CMA page is effectly pinned if it is anonymous page and system has no swap.
>Nope. Even without swap, anonymous page can be correctly migrated to other location. Migration code doesn't depend on presence of swap.
I could be possible only if the zone has freeable page(ie, free pages + shrinkable page like page cache). IOW, if the zone is full with anon pages, it's efffectively pinned.
Why? __alloc_contig_migrate_range() uses alloc_migrate_target() function, which can take free page from any zone matching given flags.
Strictly speaking, it's not any zones. It allows zones which are equal or lower with zone of source page.
Pz, look at Russell's case. The pgd_alloc is trying to allocate order 2 page on normal zone, which is lowest zone so there is no fallback zones to migrate anonymous pages in normal zone out and alloc_migrate_target doesn't allocate target page from higher zones of source page at the moment. That's why I call it as effectively pinned.
In Russell's case the issue is related to compaction code. It should still be able to compact low zone and get some free pages. It is not a case of alloc_migrate_target. I mentioned this function because I wanted to show that it is possible to move pages out of that zone in case of doing CMA alloc and having no swap.
>>>This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
>>I totally agree with the patchset although I didn't review code at all.
>>
>>Another topic: It means it should be a problem still if system has CMA in lowmem by some reason(ex, hardware limit or other purpose of CMA rather than DMA subsystem)?
>>
>>In that case, an idea that just popped in my head is to migrate pages from cma area to highest zone because they are all userspace pages which should be in there but not sure it's worth to implement at this point because how many such cripple platform are.
>>
>>Just for the recording.
>Moving pages between low and high zone is not that easy. If I remember correctly you cannot migrate a page from low memory to high zone in generic case, although it should be possible to add exception for anonymous pages. This will definitely improve poor low memory handling in low zone when CMA is enabled.
Yeb, it's possible for anonymous pages but I just wonder it's worth to add more complexitiy to mm and and you are answering it's worth. Okay. May I understand your positive feedback means such platform( ie, DMA works with only lowmem) are still common?
There are still some platforms, which have limited DMA capabilities. However
Thanks for your comment. I just wanted to know it's worth before I dive into that but it seems I was driving wrong way. See below.
the ability to move anonymous a page from lowmem to highmem will be a benefit in any case, as low memory is really much more precious.
Maybe, but in case of this report, even if we move anonymous pages into higher zones, the problem(ie, OOM) is still there because pgd_alloc wanted high order page in no cma area in normal zone.
The feature which move CMA pages into higher zones would help CMA alloc latency if there are lots of free pages in higher zone but no freeable page in the zone which source page located in. But it wouldn't help this OOM problem.
Right. The mentioned OOM problem shows that compaction fails in some cases for unknown reasons. The question here is weather compaction_alloc() function is able to get free CMA pages or not. Right now I'm not sure if it will take pages from the right list or not. This case definitely should be investigated.
Hello, Minchan and Marek.
IIUC, compaction_alloc() can get free CMA pages.
I think it can because suitable_migrate_target and migrate_async_suitable consider CMA. That's why I think the culprit is cmpaction deferring logic and sent a patch to detect it. http://www.spinics.net/lists/kernel/msg1812538.html
I guess that this problem is related to CMA. When direct_compaction begins, compaction logic check whether this zone is suitable or not by compaction_suitable(). In this function, we check fragmentation_index() and it didn't consider whether free_blocks is on CMA or not for free_blocks_suitable. So, in Russell's case, it would always return -1000 and then return COMPACT_PARTIAL. After all, compaction wouldn't actually happen and allocation request would fail, too.
Good catch! Actually I checked that, but I thought COMPACT_PARTIAL would still go ahead with compaction. My mistake.
I should note that there is one more flaw on zone_watermark_ok(). zone_watermark_ok() doesn't handle > 0 allocation correctly if there is free CMA memory so we can easily pass this watermakr check in this case.
I have a plan to fix it, but, it will takes some time. :)
Okay, I am looking forward to seeing that. Thanks Joonsoo!
Thanks.
On Thu, 21 Aug 2014 10:45:12 +0200 Marek Szyprowski m.szyprowski@samsung.com wrote:
Russell King recently noticed that limiting default CMA region only to low memory on ARM architecture causes serious memory management issues with machines having a lot of memory (which is mainly available as high memory). More information can be found the following thread: http://thread.gmane.org/gmane.linux.ports.arm.kernel/348441/
Those two patches removes this limit letting kernel to put default CMA region into high memory when this is possible (there is enough high memory available and architecture specific DMA limit fits).
This should solve strange OOM issues on systems with lots of RAM (i.e. >1GiB) and large (>256M) CMA area.
What do we think is the priority on these fixes? 3.17 or 3.18?