It has been observed that the system tends to keep a lot of free CMA pages even in very high memory pressure use cases. The CMA fallback for movable pages is used very rarely, only when the system has completely run out of MOVABLE pages, which usually means that an out-of-memory event will be triggered very soon. To avoid such situations and make better use of CMA pages, this patch introduces a heuristic that turns on the CMA fallback for movable pages when the real number of free pages (excluding free CMA pages) approaches the low watermark.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com
Reviewed-by: Kyungmin Park kyungmin.park@samsung.com
CC: Michal Nazarewicz mina86@mina86.com
---
 mm/page_alloc.c | 9 +++++++++
 1 file changed, 9 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fcb9719..90b51f3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1076,6 +1076,15 @@ static struct page *__rmqueue(struct zone *zone, unsigned int order,
 {
 	struct page *page;

+#ifdef CONFIG_CMA
+	unsigned long nr_free = zone_page_state(zone, NR_FREE_PAGES);
+	unsigned long nr_cma_free = zone_page_state(zone, NR_FREE_CMA_PAGES);
+
+	if (migratetype == MIGRATE_MOVABLE && nr_cma_free &&
+	    nr_free - nr_cma_free < 2 * low_wmark_pages(zone))
+		migratetype = MIGRATE_CMA;
+#endif /* CONFIG_CMA */
+
 retry_reserve:
 	page = __rmqueue_smallest(zone, order, migratetype);
On Mon, 12 Nov 2012 09:59:42 +0100 Marek Szyprowski m.szyprowski@samsung.com wrote:
It has been observed that the system tends to keep a lot of free CMA pages even in very high memory pressure use cases. The CMA fallback for movable pages is used very rarely, only when the system has completely run out of MOVABLE pages, which usually means that an out-of-memory event will be triggered very soon. To avoid such situations and make better use of CMA pages, this patch introduces a heuristic that turns on the CMA fallback for movable pages when the real number of free pages (excluding free CMA pages) approaches the low watermark.

...

+#ifdef CONFIG_CMA
+	unsigned long nr_free = zone_page_state(zone, NR_FREE_PAGES);
+	unsigned long nr_cma_free = zone_page_state(zone, NR_FREE_CMA_PAGES);
+
+	if (migratetype == MIGRATE_MOVABLE && nr_cma_free &&
+	    nr_free - nr_cma_free < 2 * low_wmark_pages(zone))
+		migratetype = MIGRATE_CMA;
+#endif /* CONFIG_CMA */
+
 retry_reserve:
 	page = __rmqueue_smallest(zone, order, migratetype);
erk, this is right on the page allocator hotpath. Bad.
At the very least, we could code it so it is not quite so dreadfully inefficient:
	if (migratetype == MIGRATE_MOVABLE) {
		unsigned long nr_cma_free;

		nr_cma_free = zone_page_state(zone, NR_FREE_CMA_PAGES);
		if (nr_cma_free) {
			unsigned long nr_free;

			nr_free = zone_page_state(zone, NR_FREE_PAGES);
			if (nr_free - nr_cma_free < 2 * low_wmark_pages(zone))
				migratetype = MIGRATE_CMA;
		}
	}
but it still looks pretty bad.
Hello,
On 11/14/2012 11:58 PM, Andrew Morton wrote:
On Mon, 12 Nov 2012 09:59:42 +0100 Marek Szyprowski m.szyprowski@samsung.com wrote:
...
erk, this is right on the page allocator hotpath. Bad.
Yes, I know that it adds overhead to the allocation hot path, but I found no other place for such a change. Do you have any suggestion where this change could be applied to avoid additional load on the hot path?
At the very least, we could code it so it is not quite so dreadfully inefficient:
...
but it still looks pretty bad.
Do you want me to resend such a patch?
Best regards
On Mon, 19 Nov 2012 16:38:18 +0100 Marek Szyprowski m.szyprowski@samsung.com wrote:
Hello,
On 11/14/2012 11:58 PM, Andrew Morton wrote:
On Mon, 12 Nov 2012 09:59:42 +0100 Marek Szyprowski m.szyprowski@samsung.com wrote:
It has been observed that the system tends to keep a lot of free CMA pages even in very high memory pressure use cases. ...
erk, this is right on the page allocator hotpath. Bad.
Yes, I know that it adds overhead to the allocation hot path, but I found no other place for such a change. Do you have any suggestion where this change could be applied to avoid additional load on the hot path?
Do the work somewhere else, not on a hot path? Somewhere on the page reclaim path sounds appropriate. How messy would it be to perform some sort of balancing at reclaim time?
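To make the reclaim-time idea concrete, here is a rough, untested sketch, not taken from any patch in this thread: the heuristic could be evaluated off the allocation path, wherever the free-page counters are already being examined (for example on the reclaim path), and cached in a per-zone hint that __rmqueue() reads with a single test. The use_cma_first field and the function name below are invented for illustration.

/*
 * Hypothetical sketch only: "use_cma_first" is an invented per-zone
 * flag and this helper is not part of any posted patch. The idea is
 * to recompute the heuristic from Marek's patch off the hot path,
 * e.g. from the reclaim/watermark-checking code, and cache the result.
 */
static void update_cma_alloc_hint(struct zone *zone)
{
	unsigned long nr_free = zone_page_state(zone, NR_FREE_PAGES);
	unsigned long nr_cma_free = zone_page_state(zone, NR_FREE_CMA_PAGES);

	/* Same condition as in the patch, evaluated outside __rmqueue(). */
	zone->use_cma_first = nr_cma_free &&
		nr_free - nr_cma_free < 2 * low_wmark_pages(zone);
}

/*
 * __rmqueue() would then shrink to a single cheap test:
 *
 *	if (migratetype == MIGRATE_MOVABLE && zone->use_cma_first)
 *		migratetype = MIGRATE_CMA;
 */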
Hi Marek,
On Mon, Nov 12, 2012 at 09:59:42AM +0100, Marek Szyprowski wrote:
It has been observed that the system tends to keep a lot of free CMA pages even in very high memory pressure use cases. The CMA fallback for movable
CMA free pages are just a fallback for movable pages, so if the user requires many user pages, the system ends up consuming free CMA pages after running out of movable pages. What do you mean by the system tending to keep free pages even under very high memory pressure?
pages is used very rarely, only when the system has completely run out of MOVABLE pages, which usually means that an out-of-memory event will be triggered very soon. To avoid such situations and make better use of CMA
Why would OOM be triggered very soon if movable pages are exhausted while there are many CMA pages?
It seems I can't quite understand your point. Please make the problem clear enough for silly me to understand.
Thanks.
pages, this patch introduces a heuristic that turns on the CMA fallback for movable pages when the real number of free pages (excluding free CMA pages) approaches the low watermark.
...
Hello,
On 11/20/2012 1:01 AM, Minchan Kim wrote:
Hi Marek,
On Mon, Nov 12, 2012 at 09:59:42AM +0100, Marek Szyprowski wrote:
...

CMA free pages are just a fallback for movable pages, so if the user requires many user pages, the system ends up consuming free CMA pages after running out of movable pages. What do you mean by the system tending to keep free pages even under very high memory pressure?

...

Why would OOM be triggered very soon if movable pages are exhausted while there are many CMA pages?

It seems I can't quite understand your point. Please make the problem clear enough for silly me to understand.
Right now, running out of 'plain' movable pages is the only way to get movable pages allocated from CMA. On the other hand, running out of 'plain' movable pages is very deadly for the system, as movable pageblocks are also the main fallback for reclaimable and non-movable pages.

Then, once we run out of movable pages and the kernel needs a non-movable or reclaimable page (which happens quite often), it usually triggers an OOM to satisfy the memory needs. Such an OOM is very strange, especially on a system with dozens of megabytes of CMA memory, most of it free at the OOM event. By high memory pressure I mean high memory usage.

This patch introduces a heuristic that lets the kernel consume free CMA pages before it runs out of 'plain' movable pages, which is usually enough to keep some spare movable pages for emergency cases before reclaim occurs.
Best regards
On Tue, Nov 20 2012, Marek Szyprowski wrote:
Right now, running out of 'plain' movable pages is the only way to get movable pages allocated from CMA. On the other hand, running out of 'plain' movable pages is very deadly for the system, as movable pageblocks are also the main fallback for reclaimable and non-movable pages.

Then, once we run out of movable pages and the kernel needs a non-movable or reclaimable page (which happens quite often), it usually triggers an OOM to satisfy the memory needs. Such an OOM is very strange, especially on a system with dozens of megabytes of CMA memory, most of it free at the OOM event. By high memory pressure I mean high memory usage.
Would it make sense to *always* use MIGRATE_CMA for movable allocations before MIGRATE_MOVABLE? I.e., how about this patch (not tested):
------------------------- >8 -------------------------------------------------
From 790a3b5743414f2770e413e5e8866679de2920b4 Mon Sep 17 00:00:00 2001
Message-Id: 790a3b5743414f2770e413e5e8866679de2920b4.1353425911.git.mina86@mina86.com
From: Michal Nazarewicz mina86@mina86.com
Date: Tue, 20 Nov 2012 16:37:50 +0100
Subject: [PATCH] mm: cma: on movable allocations try MIGRATE_CMA first
It has been observed that the system tends to keep a lot of free CMA pages even in very high memory pressure use cases. The CMA fallback for movable pages is used very rarely, only when the system has completely run out of MOVABLE pages. This means that out-of-memory is triggered for unmovable allocations even when there are many CMA pages available. This problem was not observed previously, since movable pages were used as a fallback for unmovable allocations.

To avoid this situation, this commit changes the allocation order so that MIGRATE_CMA pageblocks are used first for movable allocations.

As a result, MIGRATE_CMA can be removed from the fallback path of the MIGRATE_MOVABLE type. The __rmqueue_fallback() function will then never deal with CMA pages, and all the checks around MIGRATE_CMA can be removed from that function.
Signed-off-by: Michal Nazarewicz mina86@mina86.com
Reported-by: Marek Szyprowski m.szyprowski@samsung.com
Cc: Kyungmin Park kyungmin.park@samsung.com
---
 mm/page_alloc.c | 55 +++++++++++++++++++++++++------------------------------
 1 files changed, 25 insertions(+), 30 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bb90971..b60bd75 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -893,14 +893,12 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
  * This array describes the order lists are fallen back to when
  * the free lists for the desirable migrate type are depleted
  */
-static int fallbacks[MIGRATE_TYPES][4] = {
+static int fallbacks[MIGRATE_TYPES][3] = {
 	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,     MIGRATE_RESERVE },
 	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,     MIGRATE_RESERVE },
+	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE,   MIGRATE_RESERVE },
 #ifdef CONFIG_CMA
-	[MIGRATE_MOVABLE]     = { MIGRATE_CMA,         MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_RESERVE },
 	[MIGRATE_CMA]         = { MIGRATE_RESERVE }, /* Never used */
-#else
-	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE,   MIGRATE_RESERVE },
 #endif
 	[MIGRATE_RESERVE]     = { MIGRATE_RESERVE }, /* Never used */
 	[MIGRATE_ISOLATE]     = { MIGRATE_RESERVE }, /* Never used */
@@ -1019,17 +1017,10 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype)
 			 * pages to the preferred allocation list. If falling
 			 * back for a reclaimable kernel allocation, be more
 			 * aggressive about taking ownership of free pages
-			 *
-			 * On the other hand, never change migration
-			 * type of MIGRATE_CMA pageblocks nor move CMA
-			 * pages on different free lists. We don't
-			 * want unmovable pages to be allocated from
-			 * MIGRATE_CMA areas.
 			 */
-			if (!is_migrate_cma(migratetype) &&
-			    (unlikely(current_order >= pageblock_order / 2) ||
-			     start_migratetype == MIGRATE_RECLAIMABLE ||
-			     page_group_by_mobility_disabled)) {
+			if (unlikely(current_order >= pageblock_order / 2) ||
+			    start_migratetype == MIGRATE_RECLAIMABLE ||
+			    page_group_by_mobility_disabled) {
 				int pages;
 				pages = move_freepages_block(zone, page,
 							     start_migratetype);
@@ -1048,14 +1039,12 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype)
 			rmv_page_order(page);

 			/* Take ownership for orders >= pageblock_order */
-			if (current_order >= pageblock_order &&
-			    !is_migrate_cma(migratetype))
+			if (current_order >= pageblock_order)
 				change_pageblock_range(page, current_order,
 						       start_migratetype);

 			expand(zone, page, order, current_order, area,
-			       is_migrate_cma(migratetype)
-			     ? migratetype : start_migratetype);
+			       start_migratetype);

 			trace_mm_page_alloc_extfrag(page, order, current_order,
 						    start_migratetype, migratetype);
@@ -1076,21 +1065,27 @@ static struct page *__rmqueue(struct zone *zone, unsigned int order,
 {
 	struct page *page;

-retry_reserve:
-	page = __rmqueue_smallest(zone, order, migratetype);
+#ifdef CONFIG_CMA
+	if (migratetype == MIGRATE_MOVABLE)
+		migratetype = MIGRATE_CMA;
+#endif

-	if (unlikely(!page) && migratetype != MIGRATE_RESERVE) {
-		page = __rmqueue_fallback(zone, order, migratetype);
+	for (;;) {
+		page = __rmqueue_smallest(zone, order, migratetype);
+		if (likely(page) || migratetype == MIGRATE_RESERVE)
+			break;

-		/*
-		 * Use MIGRATE_RESERVE rather than fail an allocation. goto
-		 * is used because __rmqueue_smallest is an inline function
-		 * and we want just one call site
-		 */
-		if (!page) {
-			migratetype = MIGRATE_RESERVE;
-			goto retry_reserve;
+		if (is_migrate_cma(migratetype)) {
+			migratetype = MIGRATE_MOVABLE;
+			continue;
 		}
+
+		page = __rmqueue_fallback(zone, order, migratetype);
+		if (page)
+			break;
+
+		/* Use MIGRATE_RESERVE rather than fail an allocation. */
+		migratetype = MIGRATE_RESERVE;
 	}

 	trace_mm_page_alloc_zone_locked(page, order, migratetype);
On Tue, Nov 20, 2012 at 03:49:35PM +0100, Marek Szyprowski wrote:
Hello,
On 11/20/2012 1:01 AM, Minchan Kim wrote:
...
Right now, running out of 'plain' movable pages is the only way to get movable pages allocated from CMA. On the other hand, running out of 'plain' movable pages is very deadly for the system, as movable pageblocks are also the main fallback for reclaimable and non-movable pages.

Then, once we run out of movable pages and the kernel needs a non-movable or reclaimable page (which happens quite often), it usually triggers an OOM to satisfy the memory needs. Such an OOM is very strange, especially on a system with dozens of megabytes of CMA memory, most of it free at the OOM event. By high memory pressure I mean high memory usage.
So your concern is that having too many free pages in MIGRATE_CMA when OOM happens is odd? That's natural given the CMA design, in which the kernel never falls back to the CMA area for non-movable page allocations. But I guess that's not your concern.

Let's consider the extreme cases below.
= Before =
* 1000M DRAM system.
* 400M kernel used pages.
* 300M movable used pages.
* 300M CMA free pages.

1. The kernel requests an additional 400M of non-movable memory.
2. The VM starts to reclaim the 300M of movable pages.
3. But that's not enough to meet the 400M request.
4. Go to OOM. (It's natural.)
= After(with your patch) =
* 1000M DRAM system.
* 400M kernel used pages.
* 300M movable *free* pages.
* 300M CMA used pages (by your patch; I simplified your concept).

1. The kernel requests 400M of non-movable memory.
2. The 300M of free movable pages isn't enough to meet the 400M request.
3. Also, there is no point in reclaiming CMA pages for a non-movable allocation.
4. Go to OOM. (It's natural.)
There is no difference between before and after from the allocation POV. Let's consider another example.
= Before =
* 1000M DRAM system.
* 400M kernel used pages.
* 300M movable used pages.
* 300M CMA free pages.

1. The kernel requests 300M of non-movable memory.
2. The VM starts to reclaim the 300M of movable pages.
3. That's enough to meet the 300M request.
4. Happy end.
= After(with your patch) =
* 1000M DRAM system.
* 400M kernel used pages.
* 300M movable *free* pages.
* 300M CMA used pages (by your patch; I simplified your concept).

1. The kernel requests 300M of non-movable memory.
2. The 300M of free movable pages is enough to meet the 300M request.
3. Happy end.

There is no difference from the allocation POV here, either.
So I guess that if you see OOM while there are many movable pages, the principal problem is the VM reclaimer, which should make its best effort to reclaim freeable movable pages. If the VM reclaimer has some problem with your workload, we should first try to fix that rather than adding such a heuristic to the hot path. On the other hand, if you see OOM while there are many free CMA pages, that's not odd to me.
This patch introduces a heuristic that lets the kernel consume free CMA pages before it runs out of 'plain' movable pages, which is usually enough to keep some spare movable pages for emergency cases before reclaim occurs.
Best regards
Marek Szyprowski
Samsung Poland R&D Center
On Wed, Nov 21 2012, Minchan Kim wrote:
So your concern is that having too many free pages in MIGRATE_CMA when OOM happens is odd? That's natural given the CMA design, in which the kernel never falls back to the CMA area for non-movable page allocations. But I guess that's not your concern.
...
There is no difference from the allocation POV here, either.
The difference, though, is that before, 30% of memory is wasted (i.e. free), whereas after, all memory is used. The main point of CMA is to make the memory useful when devices are not using it. Leaving it unallocated defeats that purpose.
On Wed, Nov 21, 2012 at 02:07:04PM +0100, Michal Nazarewicz wrote:
On Wed, Nov 21 2012, Minchan Kim wrote:
...
The difference, though, is that before, 30% of memory is wasted (i.e. free), whereas after, all memory is used. The main point of CMA is to make the memory useful when devices are not using it. Leaving it unallocated defeats that purpose.
I think it's not a waste, because if the reclaimed movable pages are part of the working set, they will soon be reloaded, this time into MIGRATE_CMA.
Hello,
On 11/21/2012 2:05 AM, Minchan Kim wrote:
...
So your concern is that having too many free pages in MIGRATE_CMA when OOM happens is odd? That's natural given the CMA design, in which the kernel never falls back to the CMA area for non-movable page allocations. But I guess that's not your concern.
My concern is how to minimize memory waste with CMA.
...
Those cases are purely theoretical, not real-life examples. In the real world the kernel allocates (and frees) non-movable memory in small portions while the system is running. Typically, keeping some amount of free 'plain' movable pages is enough to keep the kernel happy about any kind of allocation (especially non-movable ones). This requirement is in complete contrast to the current fallback mechanism, which activates only when the kernel runs out of movable pages completely.
So I guess that if you see OOM while there are many movable pages, the principal problem is the VM reclaimer, which should make its best effort to reclaim freeable movable pages. If the VM reclaimer has some problem with your workload, we should first try to fix that rather than adding such a heuristic to the hot path. On the other hand, if you see OOM while there are many free CMA pages, that's not odd to me.
Frankly, I don't see how the reclaim procedure can ensure that it will always be possible to allocate non-movable pages with the current fallback mechanism, which is used only when the kernel runs out of pages of a given type. Could you explain how you would like to change the reclaim procedure to avoid the above situation?
Best regards
Hi Marek,
On Wed, Nov 21, 2012 at 04:50:45PM +0100, Marek Szyprowski wrote:
...
Those cases are purely theoretical, not real-life examples. In the real world the kernel allocates (and frees) non-movable memory in small portions while the system is running. Typically, keeping some amount of free 'plain' movable pages is enough to keep the kernel happy about any kind of allocation (especially non-movable ones). This requirement is in complete contrast to the current fallback mechanism, which activates only when the kernel runs out of movable pages completely.
So I guess that if you see OOM while there are many movable pages, the principal problem is the VM reclaimer, which should make its best effort to reclaim freeable movable pages. If the VM reclaimer has some problem with your workload, we should first try to fix that rather than adding such a heuristic to the hot path. On the other hand, if you see OOM while there are many free CMA pages, that's not odd to me.
Frankly, I don't see how the reclaim procedure can ensure that it will always be possible to allocate non-movable pages with the current fallback mechanism, which is used only when the kernel runs out of pages of a given type. Could you explain how you would like to change the reclaim procedure to avoid the above situation?
What I have in mind is the following (a rough illustrative sketch appears after the list):
1. The reclaimer should migrate MIGRATE_MOVABLE pages into MIGRATE_CMA if there is free space in MIGRATE_CMA, so that the VM can allocate non-movable pages via the MIGRATE_MOVABLE fallback.
2. The reclaimer should consider non-movable page allocations. I mean, the reclaimer can reclaim MIGRATE_CMA pages when memory pressure comes from a non-movable page request, but that is useless, and such unnecessary reclaim hurts performance. So the reclaimer should reclaim the targeted pages (i.e., MIGRATE_MOVABLE).
3. If that reclaim fails for some reason (e.g., the pages are part of the working set), we should reclaim MIGRATE_CMA pages and migrate MIGRATE_MOVABLE pages into MIGRATE_CMA, so that the kernel allocation can succeed.
The above migration scheme is important for embedded systems without swap, because they are limited in how many anonymous pages in MIGRATE_MOVABLE they can reclaim.
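As a rough illustration of the scheme above (a sketch only, not code from this thread; balance_movable_into_cma() and migrate_movable_pages_to_cma() are invented names, not existing kernel functions):

/*
 * Hypothetical sketch of the reclaim-side balancing described above.
 * Both function names are invented for illustration; no such helpers
 * exist in the kernel at this point.
 */
static unsigned long balance_movable_into_cma(struct zone *zone,
					      unsigned long nr_requested)
{
	unsigned long nr_cma_free = zone_page_state(zone, NR_FREE_CMA_PAGES);
	unsigned long moved = 0;

	/*
	 * Step 1: move movable pages into free CMA space so that the
	 * vacated MIGRATE_MOVABLE blocks can satisfy non-movable
	 * allocations through the normal fallback path.
	 */
	if (nr_cma_free)
		moved = migrate_movable_pages_to_cma(zone,
				min(nr_requested, nr_cma_free));

	/*
	 * Steps 2 and 3 (not shown): when reclaiming on behalf of a
	 * non-movable request, target MIGRATE_MOVABLE pages first, and
	 * only if that fails reclaim MIGRATE_CMA pages and migrate
	 * movable pages into the space they leave behind.
	 */
	return moved;
}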
I will take a look when I have time.