On 11/20/2017 08:39 PM, Mike Kravetz wrote:
If the call to __alloc_contig_migrate_range() in alloc_contig_range() returns -EBUSY, processing continues so that test_pages_isolated() is called, where there is a tracepoint to identify the busy pages. However, it is possible for busy pages to become available between the calls to these two routines. In this case, the range of pages may be allocated. Unfortunately, the original return code (ret == -EBUSY) is still set and returned to the caller. Therefore, the caller believes the pages were not allocated and they are leaked.
Update the return code with the value from test_pages_isolated().
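The race described above can be modeled in user space. The following is only a sketch: model_migrate_range(), model_test_pages_isolated(), the pages_busy flag and the two model_alloc_*() functions are hypothetical stand-ins, not the kernel routines, and the "pages become available" step is forced by hand rather than raced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state: true while some page in the range is still busy. */
static bool pages_busy;

/* Stand-in for __alloc_contig_migrate_range(): -EBUSY while pages are busy. */
static int model_migrate_range(void)
{
	return pages_busy ? -EBUSY : 0;
}

/* Stand-in for test_pages_isolated(): 0 if isolated, -EBUSY otherwise. */
static int model_test_pages_isolated(void)
{
	return pages_busy ? -EBUSY : 0;
}

/* Control flow before the patch: ret from migration is never refreshed. */
static int model_alloc_before(void)
{
	int ret;

	pages_busy = true;
	ret = model_migrate_range();
	if (ret && ret != -EBUSY)
		return ret;

	pages_busy = false;	/* busy pages become available in between */

	if (model_test_pages_isolated()) {
		ret = -EBUSY;
		return ret;
	}

	assert(!pages_busy);	/* the range would be allocated here */
	return ret;		/* stale -EBUSY reaches the caller */
}

/* Control flow after the patch: ret takes the isolation test result. */
static int model_alloc_after(void)
{
	int ret;

	pages_busy = true;
	ret = model_migrate_range();
	if (ret && ret != -EBUSY)
		return ret;

	pages_busy = false;	/* same forced transition as above */

	ret = model_test_pages_isolated();
	if (ret)
		return ret;

	assert(!pages_busy);	/* the range would be allocated here */
	return ret;		/* 0, matching what actually happened */
}
```

In this model, model_alloc_before() reports -EBUSY even though the allocation path was taken, which is the leak described in the changelog; model_alloc_after() reports 0 for the same sequence of events.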
Good catch and seems ok for a stable fix. But it's another indication that this area needs some larger rewrite.
For example, it seems that the tracepoints in test_pages_isolated() will report not only pages which were busy during the migration attempt, but also pages for which migration was never attempted, because __alloc_contig_migrate_range() gave up?
Fixes: 8ef5849fa8a2 ("mm/cma: always check which page caused allocation failure")
Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 77e4d3c5c57b..3605ca82fd29 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7632,10 +7632,10 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	}
 
 	/* Make sure the range is really isolated. */
-	if (test_pages_isolated(outer_start, end, false)) {
+	ret = test_pages_isolated(outer_start, end, false);
+	if (ret) {
 		pr_info_ratelimited("%s: [%lx, %lx) PFNs busy\n",
 			__func__, outer_start, end);
-		ret = -EBUSY;
 		goto done;
 	}
 