On Mon, Nov 23, 2020 at 03:15:37PM +0100, David Hildenbrand wrote:
On 17.11.20 19:19, Minchan Kim wrote:
There is a need from special HW that requires bulk allocation of high-order pages. For example, 4800 * order-4 pages as a minimum; sometimes it requires more.
To meet the requirement, one option is to reserve a 300M CMA area and request the whole 300M as contiguous memory. However, that doesn't work if even one of the pages in the range is long-term pinned, directly or indirectly. The other option is to repeatedly ask for a higher-order size (e.g., 2M) than the requested order (64K) until the driver has gathered the necessary amount of memory. Basically, this approach makes the allocation very slow because cma_alloc itself is slow, and it could get stuck on one of the pageblocks if it encounters an unmigratable page.
To solve the issue, this patch introduces cma_alloc_bulk.
int cma_alloc_bulk(struct cma *cma, unsigned int align, gfp_t gfp_mask,
		unsigned int order, size_t nr_requests,
		struct page **page_array, size_t *nr_allocated);
Most parameters are the same as cma_alloc, but it additionally takes a page array to store the allocated memory. What's different from cma_alloc is that it skips a pageblock without waiting/stopping if the pageblock has an unmovable page, so the API keeps scanning other pageblocks to find pages of the requested order.
cma_alloc_bulk is a best-effort approach in that, unlike cma_alloc, it skips pageblocks that contain unmovable pages; it doesn't need to be perfect from the beginning at the cost of performance. Thus, the API takes a gfp_t to support __GFP_NORETRY, which is propagated into alloc_contig_range to avoid the high-overhead operations used to increase the CMA allocation success ratio (e.g., migration retries, PCP and LRU draining per pageblock), at the cost of a lower allocation success ratio. If the caller couldn't allocate enough pages with __GFP_NORETRY, they could call it again without __GFP_NORETRY to increase the success ratio, if they are okay with paying the overhead for it.
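To illustrate the two-pass usage described above, here is a minimal caller-side sketch (the driver function, CMA area, and counts are made up for illustration; only cma_alloc_bulk's signature is taken from this patch):

static int hw_alloc_order4_pages(struct cma *hw_cma, struct page **pages,
				 size_t nr_needed)
{
	size_t got = 0, done = 0;
	int ret;

	/* First pass: fast, best effort; skips pageblocks with unmovable pages. */
	ret = cma_alloc_bulk(hw_cma, 4, GFP_KERNEL | __GFP_NORETRY, 4,
			     nr_needed, pages, &got);
	done = got;
	if (!ret)
		return 0;

	/* Second pass: pay the extra overhead for the remaining pages. */
	ret = cma_alloc_bulk(hw_cma, 4, GFP_KERNEL, 4, nr_needed - done,
			     pages + done, &got);
	return ret;
}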
I'm not a friend of connecting __GFP_NORETRY to PCP and LRU draining.
I was also not a fan of the gfp flags in the cma functions since they could easily cause misunderstanding, but I saw talk about bringing gfp_t back into cma_alloc. Since that seems to have been dropped, let's use a different term.
diff --git a/mm/cma.c b/mm/cma.c
index 7c11ec2dc04c..806280050307 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -505,7 +505,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
  *
  * @cma: contiguous memory region for which the allocation is performed.
  * @align: requested alignment of pages (in PAGE_SIZE order).
- * @gfp_mask: memory allocation flags
+ * @fast: will skip costly opeartions if it's true.
  * @order: requested page order
  * @nr_requests: the number of 2^order pages requested to be allocated as input,
  * @page_array: page_array pointer to store allocated pages (must have space
@@ -513,10 +513,10 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
  * @nr_allocated: the number of 2^order pages allocated as output
  *
  * This function tries to allocate up to @nr_requests @order pages on specific
- * contiguous memory area. If @gfp_mask has __GFP_NORETRY, it will avoid costly
- * functions to increase allocation success ratio so it will be fast but might
- * return less than requested number of pages. User could retry with
- * !__GFP_NORETRY if it is needed.
+ * contiguous memory area. If @fast has true, it will avoid costly functions
+ * to increase allocation success ratio so it will be faster but might return
+ * less than requested number of pages. User could retry it with true if it is
+ * needed.
  *
  * Return: it will return 0 only if all pages requested by @nr_requestsed are
  * allocated. Otherwise, it returns negative error code.
@@ -525,7 +525,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
  * how many @order pages are allocated and free those pages when they are not
  * needed.
  */
-int cma_alloc_bulk(struct cma *cma, unsigned int align, gfp_t gfp_mask,
+int cma_alloc_bulk(struct cma *cma, unsigned int align, bool fast,
 		unsigned int order, size_t nr_requests,
 		struct page **page_array, size_t *nr_allocated)
 {
@@ -538,8 +538,8 @@ int cma_alloc_bulk(struct cma *cma, unsigned int align, gfp_t gfp_mask,
 	unsigned long start = 0;
 	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
 	struct page *page = NULL;
-	gfp_t gfp = GFP_KERNEL|__GFP_NOWARN|gfp_mask;
-
+	enum alloc_contig_mode mode = fast ? ALLOC_CONTIG_FAST :
+					ALLOC_CONTIG_NORMAL;
 	*nr_allocated = 0;
 	if (!cma || !cma->count || !cma->bitmap || !page_array)
 		return -EINVAL;
@@ -576,7 +576,8 @@ int cma_alloc_bulk(struct cma *cma, unsigned int align, gfp_t gfp_mask,
 	mutex_unlock(&cma->lock);

 	pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
-	ret = alloc_contig_range(pfn, pfn + nr_pages, MIGRATE_CMA, gfp);
+	ret = alloc_contig_range(pfn, pfn + nr_pages, MIGRATE_CMA,
+				GFP_KERNEL|__GFP_NOWARN, mode);
 	if (ret) {
 		cma_clear_bitmap(cma, pfn, nr_pages);
 		if (ret != -EBUSY)
Also, gfp flags apply mostly to compaction (e.g., how to allocate free pages for migration), so this seems a little wrong.
Can we instead introduce
enum alloc_contig_mode {
	/*
	 * Normal mode:
	 *
	 * Retry page migration 5 times, ... TBD
	 *
	 */
	ALLOC_CONTIG_NORMAL = 0,
	/*
	 * Fast mode: e.g., used for bulk allocations.
	 *
	 * Don't retry page migration if it fails, don't drain PCP
	 * lists, don't drain LRU.
	 */
	ALLOC_CONTIG_FAST,
};
Yeah, the mode is better. Let's have it as a preparation patch.
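A rough sketch of what the mode could gate inside alloc_contig_range() / __alloc_contig_migrate_range() (illustrative only; the helper names are made up and this is not the actual preparation patch):

/*
 * Illustrative sketch only: the mode gates the per-range draining and the
 * migration retry budget. Helper names are made up.
 */
static void alloc_contig_maybe_drain(struct zone *zone,
				     enum alloc_contig_mode mode)
{
	if (mode == ALLOC_CONTIG_FAST)
		return;			/* fast mode: skip PCP/LRU draining */

	lru_add_drain_all();		/* flush per-CPU LRU pagevecs */
	drain_all_pages(zone);		/* flush per-CPU free (PCP) lists */
}

static unsigned int alloc_contig_max_migrate_tries(enum alloc_contig_mode mode)
{
	/* normal mode keeps the existing 5 retries; fast mode gives up early */
	return mode == ALLOC_CONTIG_NORMAL ? 5 : 1;
}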