On Tue 21-11-17 08:53:11, Vlastimil Babka wrote:
On 11/20/2017 08:39 PM, Mike Kravetz wrote:
If the call to __alloc_contig_migrate_range() in alloc_contig_range() returns -EBUSY, processing continues so that test_pages_isolated() is called, since that routine contains a tracepoint identifying the busy pages. However, busy pages can become available between the calls to these two routines, in which case the range of pages may be allocated after all. Unfortunately, the original return code (ret == -EBUSY) is still set and returned to the caller. The caller therefore believes the pages were not allocated, and they are leaked.
Update the return code with the value from test_pages_isolated().
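For illustration only, the control flow described above can be sketched in userspace C roughly as follows. This is not the kernel code: stub_migrate_range() and stub_test_pages_isolated() are hypothetical stand-ins for __alloc_contig_migrate_range() and test_pages_isolated(), and the "allocated" flag is an assumption used to make the leak visible.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stubs, not the real kernel routines. */
static int stub_migrate_range(void)
{
	return -EBUSY;	/* some pages were busy during migration */
}

static int stub_test_pages_isolated(void)
{
	return 0;	/* the pages became available in the meantime */
}

/* Before the change: a stale -EBUSY survives a successful isolation test. */
static int alloc_range_buggy(int *allocated)
{
	int ret;

	assert(allocated != NULL);
	*allocated = 0;

	ret = stub_migrate_range();
	if (ret && ret != -EBUSY)
		return ret;

	/* Called even on -EBUSY so the busy pages can be reported. */
	if (stub_test_pages_isolated())
		return -EBUSY;

	*allocated = 1;	/* the range is handed out ... */
	return ret;	/* ... but the caller still sees -EBUSY */
}

/* After the change: the result of the isolation test becomes the return code. */
static int alloc_range_fixed(int *allocated)
{
	int ret;

	assert(allocated != NULL);
	*allocated = 0;

	ret = stub_migrate_range();
	if (ret && ret != -EBUSY)
		return ret;

	ret = stub_test_pages_isolated();
	if (ret)
		return ret;

	*allocated = 1;
	return ret;
}
```

With these stubs, alloc_range_buggy() sets the flag yet returns -EBUSY, which is the leak scenario; alloc_range_fixed() sets the flag and returns 0.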
Good catch and seems ok for a stable fix. But it's another indication that this area needs some larger rewrite.
Absolutely. The whole thing is subtle as hell. And shaping the code just around the tracepoint here smells like the whole design could be thought through much more.