Re: [Linaro-mm-sig] [PATCH v2 2/4] mm: introduce cma_alloc_bulk API

2 Dec 2020


On 02.12.20 16:49, Michal Hocko wrote:
...
On Wed 02-12-20 10:14:41, David Hildenbrand wrote:
...
On 01.12.20 18:51, Minchan Kim wrote:
...
Special HW sometimes requires bulk allocation of
high-order pages, for example 4800 order-4 pages as a
minimum, and sometimes more.
To meet the requirement, one option is to reserve a 300M CMA area and
request the whole 300M as contiguous memory. However, that doesn't
work if even one of the pages in the range is long-term pinned,
directly or indirectly. The other option is to ask for a higher-order
My latest knowledge is that pages in the CMA area are never long term
pinned.
https://lore.kernel.org/lkml/20201123090129.GD27488@dhcp22.suse.cz/
"gup already tries to deal with long term pins on CMA regions and migrate
to a non CMA region. Have a look at __gup_longterm_locked."
We should rather identify ways how that is still possible and get rid of
them.
Now, short-term pinnings and PCP are other issues where
alloc_contig_range() could be improved (e.g., in contrast to a FAST
mode, a HARD mode which temporarily disables the PCP, ...).
Agreed!
...
...
size (e.g., 2M) than the requested order (64K) repeatedly until the driver
has gathered the necessary amount of memory. This approach
makes the allocation very slow because cma_alloc itself is
slow, and it can get stuck on a pageblock if it
encounters an unmigratable page.
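The sizes quoted in the description can be checked with a little arithmetic. The sketch below assumes 4 KiB base pages, which the thread does not state; the helper names are illustrative and are not kernel functions. Under that assumption an order-4 allocation is 64K, 2M corresponds to order 9, and 4800 order-4 allocations add up to exactly 300M.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages; the thread does not state the page size. */
#define ASSUMED_PAGE_SIZE 4096UL

/* Bytes covered by one allocation of the given order (2^order base pages). */
static unsigned long order_bytes(unsigned int order)
{
	assert(order < 20); /* keep the shift well inside unsigned long */
	return ASSUMED_PAGE_SIZE << order;
}

/* Total bytes for nr allocations of the given order. */
static unsigned long bulk_bytes(unsigned long nr, unsigned int order)
{
	return nr * order_bytes(order);
}
```

Calling bulk_bytes(4800, 4) gives the size of the whole request in bytes, which is where the 300M CMA reservation mentioned above comes from.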
To solve the issue, this patch introduces cma_alloc_bulk.
int cma_alloc_bulk(struct cma *cma, unsigned int align,
                   bool fast, unsigned int order, size_t nr_requests,
                   struct page **page_array, size_t *nr_allocated);
Most parameters are the same as for cma_alloc, but it additionally takes
an array in which to store the allocated pages. Unlike cma_alloc,
it skips a pageblock that contains an unmovable page without
waiting or stopping, so the API continues scanning other pageblocks
for pages of the requested order.
cma_alloc_bulk is a best-effort approach in that, unlike cma_alloc, it
skips pageblocks that have unmovable pages. It doesn't need to be
perfect from the beginning at the cost of performance. Thus, the API
takes a "bool fast" parameter, which is propagated into alloc_contig_range to
avoid functions with significant overhead that increase the CMA allocation
success ratio (e.g., migration retrial, PCP and LRU draining per pageblock),
at the cost of a lower allocation success ratio. If the caller couldn't
allocate enough, it can call the API again with "false" to increase the
success ratio, if it is okay to pay the overhead for it.
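The skip-and-continue behaviour and the two-pass use of the "fast" flag described above can be modelled in plain userspace C. This is only a sketch under simplifying assumptions: the block states, the function name model_alloc_bulk and the one-allocation-per-block rule are all hypothetical stand-ins, and none of it is the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a CMA pageblock; not a kernel type. */
enum block_state {
	BLOCK_FREE,       /* can be claimed immediately */
	BLOCK_NEEDS_WORK, /* claimable only via the slow path (retries, draining) */
	BLOCK_UNMOVABLE,  /* holds an unmovable page; never claimable */
	BLOCK_TAKEN,      /* already handed out */
};

/*
 * Scan the blocks once, claim up to nr_requests of them and record their
 * indices in out[]. Blocks that cannot be claimed are skipped, not waited on.
 * Returns 0 if every request was satisfied, -1 otherwise; *nr_allocated
 * always reports how many blocks were claimed.
 */
static int model_alloc_bulk(enum block_state *blocks, size_t nr_blocks,
			    bool fast, size_t nr_requests,
			    size_t *out, size_t *nr_allocated)
{
	size_t got = 0;

	assert(blocks && out && nr_allocated);
	for (size_t i = 0; i < nr_blocks && got < nr_requests; i++) {
		bool claimable = blocks[i] == BLOCK_FREE ||
				 (!fast && blocks[i] == BLOCK_NEEDS_WORK);

		if (!claimable)
			continue; /* skip this block and keep scanning */
		blocks[i] = BLOCK_TAKEN;
		out[got++] = i;
	}
	*nr_allocated = got;
	return got == nr_requests ? 0 : -1;
}
```

A caller in this model would first call the function with fast set to true and, if it comes back short, call it again with fast set to false for the remaining count, appending to the same output array.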
Just so I understand what the idea is:
alloc_contig_range() sometimes fails on CMA regions when trying to
allocate big chunks (e.g., 300M). Instead of tackling that issue, you
rather allocate plenty of small chunks, and make these small allocations
fail faster/ make the allocations less reliable. Correct?
I don't really have a strong opinion on that. Giving up fast rather than
trying for longer sounds like a useful thing to have - but I wonder if
it's strictly necessary for the use case you describe.
I'd like to hear Michal's opinion on that.
Well, what I can see is that this new interface is an antipattern to our
allocation routines. We tend to control allocations by gfp mask yet you
are introducing a bool parameter to make something faster... What that
really means is rather arbitrary. Would it make more sense to teach
cma_alloc resp. alloc_contig_range to recognize GFP_NOWAIT, GFP_NORETRY resp.
GFP_RETRY_MAYFAIL instead?
Minchan did that before, but I disliked gluing things like "don't drain
lru, don't drain pcp" to GFP_NORETRY and shifting responsibility to the
user.
-- 
Thanks,

David / dhildenb
