Hello,
On Monday, January 30, 2012 2:06 PM Mel Gorman wrote:
On Thu, Jan 26, 2012 at 10:00:53AM +0100, Marek Szyprowski wrote:
alloc_contig_range() performs memory allocation, so it should also keep the memory watermarks at the correct level. This commit adds a call to *_slowpath-style reclaim to grab enough pages so that the final collection of contiguous pages from the freelists will not starve the system.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
CC: Michal Nazarewicz <mina86@mina86.com>
 mm/page_alloc.c |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+), 0 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e35d06b..05eaa82 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5613,6 +5613,34 @@ static int __alloc_contig_migrate_range(unsigned long start, unsigned long end)
 	return ret;
 }
 
+/*
+ * Trigger memory pressure bump to reclaim some pages in order to be able to
+ * allocate 'count' pages in single page units. Does similar work as
+ * __alloc_pages_slowpath() function.
+ */
+static int __reclaim_pages(struct zone *zone, gfp_t gfp_mask, int count)
+{
+	enum zone_type high_zoneidx = gfp_zone(gfp_mask);
+	struct zonelist *zonelist = node_zonelist(0, gfp_mask);
+	int did_some_progress = 0;
+	int order = 1;
+	unsigned long watermark;
+
+	/* Obey watermarks as if the page was being allocated */
+	watermark = low_wmark_pages(zone) + count;
+	while (!zone_watermark_ok(zone, 0, watermark, 0, 0)) {
+		wake_all_kswapd(order, zonelist, high_zoneidx, zone_idx(zone));
+
+		did_some_progress = __perform_reclaim(gfp_mask, order, zonelist,
+						      NULL);
+		if (!did_some_progress) {
+			/* Exhausted what can be done so it's blamo time */
+			out_of_memory(zonelist, gfp_mask, order, NULL);
+		}
There are three problems here:

1) CMA can trigger the OOM killer.
That seems like overkill to me but as I do not know the consequences of CMA failing, it's your call.
This behavior is intended; we agreed that contiguous allocations should have higher priority than other allocations.
2) You cannot guarantee that try_to_free_pages will free pages from the zone you care about, or that kswapd will do anything.
You check the watermarks and take into account the size of the pending CMA allocation. kswapd in vmscan.c, on the other hand, will simply check the watermarks and probably go back to sleep. You should be aware of this in case you ever get bug reports that CMA takes too long: it may be stuck in this loop with kswapd staying asleep.
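For illustration, the mismatch can be made concrete by putting the two checks side by side. This is only a sketch, not code from the patch; the kswapd-side test is simplified from the zone balancing checks in mm/vmscan.c of this era.

	/* The patch's loop condition: demands headroom for the pending
	 * CMA allocation on top of the low watermark. */
	bool ok_for_cma = zone_watermark_ok(zone, 0,
					    low_wmark_pages(zone) + count, 0, 0);

	/* What kswapd effectively tests before going back to sleep: the
	 * plain high watermark, with no knowledge of 'count'. Once this
	 * passes, kswapd sleeps even though ok_for_cma may still be false,
	 * leaving the __reclaim_pages() loop spinning. */
	bool ok_for_kswapd = zone_watermark_ok(zone, 0,
					       high_wmark_pages(zone), 0, 0);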
Right, I experienced this problem today. The simplest workaround I've found is to adjust the watermarks before calling kswapd, but I'm not sure that increasing min_free_kbytes and calling setup_per_zone_wmarks() is the nicest approach for it.
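A sketch of that workaround follows; the helper name and its structure are assumptions, not code from any posted patch. min_free_kbytes and setup_per_zone_wmarks() are the existing kernel global and function in mm/page_alloc.c.

	/*
	 * Hypothetical helper: temporarily raise min_free_kbytes so that the
	 * recomputed watermarks include the CMA headroom, which keeps kswapd
	 * working until enough pages are actually free. The caller would have
	 * to restore the old value afterwards; locking is omitted here.
	 */
	static void __cma_adjust_watermarks(int count)
	{
		min_free_kbytes += count << (PAGE_SHIFT - 10);	/* pages -> KiB */
		setup_per_zone_wmarks();	/* recompute min/low/high marks */
	}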
3) You reclaim from zones other than your target zone.
try_to_free_pages is not necessarily going to free pages in the zone you are checking for. It'll work on ARM in many cases because there will be only one zone, but on other arches this logic will be problematic and can potentially livelock. You need to pass in a zonelist that contains only the zone that CMA cares about. If it cannot reclaim, did_some_progress == 0 and the loop will exit. Otherwise there is a possibility that this will loop forever, reclaiming pages from the wrong zones.
Right. I tested it on a system with only one zone, so I never experienced such a problem. For the first version I think we can assume that the buffer allocated by alloc_contig_range() must fit within a single zone. I will add some comments about it. Later we can extend it for more advanced cases.
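For reference, a sketch of the single-zone zonelist Mel suggests. The helper is hypothetical; the _zonerefs layout follows the struct zonelist definition in include/linux/mmzone.h of this era, where a NULL zone entry terminates iteration.

	/*
	 * Hypothetical helper: build a zonelist that contains only the CMA
	 * target zone, so that try_to_free_pages() cannot wander off and
	 * reclaim pages from zones the allocation does not care about.
	 */
	static void cma_build_single_zonelist(struct zonelist *zl,
					      struct zone *zone)
	{
		zl->_zonerefs[0].zone = zone;
		zl->_zonerefs[0].zone_idx = zone_idx(zone);
		zl->_zonerefs[1].zone = NULL;	/* terminates the iteration */
		zl->_zonerefs[1].zone_idx = 0;
	}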
I won't ack this particular patch, but I am not going to insist that you fix these issues prior to merging either. If you leave problem 3 as it is, I would really like to see a comment explaining the problem for future users of CMA on other arches (if they exist).
I will add more comments about the issues you have pointed out to make life easier for other arch developers.
Best regards