Looping in the linaro-mm-sig ML.
Hi guys,

I've been working for a few days on getting a proper rmmod with the remoteproc/rpmsg modules, and I stumbled upon an interesting issue.

When doing successive memory allocations and releases in the CMA reservation (by loading/unloading the firmware several times), the following messages show up:

[ 119.908477] cma: dma_alloc_from_contiguous(cma ed10ad00, count 256, align 8)
[ 119.908843] cma: dma_alloc_from_contiguous(): memory range at c0dfb000 is busy, retrying
[ 119.909698] cma: dma_alloc_from_contiguous(): returned c0dfd000

dma_alloc_from_contiguous() then tries to allocate the following range, 0xc0dfd000, successfully this time.

In some cases, the allocation fails after trying several ranges:

[ 119.912231] cma: dma_alloc_from_contiguous(cma ed10ad00, count 768, align 8)
[ 119.912719] cma: dma_alloc_from_contiguous(): memory range at c0dff000 is busy, retrying
[ 119.913055] cma: dma_alloc_from_contiguous(): memory range at c0e01000 is busy, retrying
[ 119.913055] rproc remoteproc0: dma_alloc_coherent failed: 3145728

Here is my understanding so far:

First, even though we made a CMA reservation, the kernel can still allocate pages in this area, but these pages must be movable (user process pages, for example).

When dma_alloc_from_contiguous() is called to allocate X pages, it looks for the next X contiguous free pages in its CMA bitmap (respecting the requested memory alignment). Then alloc_contig_range() is called to allocate the given range of pages. alloc_contig_range() examines the pages we want to allocate, and if a page is already in use, it is migrated to a new page outside the range we want to reserve. This is done using isolate_migratepages_range() to list the pages to migrate and migrate_pages() to try to migrate them, and that's where it fails.
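
To make the retry behaviour described above concrete, here is a small user-space model in C. It is only a sketch, not kernel code: struct sim_cma, the sim_* functions, the 64-page area size and the "pinned" flag (standing in for a page whose migration fails) are all hypothetical names and simplifications of mine, and the alignment is expressed directly in pages rather than as an order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_CMA_PAGES 64 /* hypothetical area size, in pages */

/* Toy model of a CMA area; this is NOT the kernel's struct cma. */
struct sim_cma {
    bool reserved[SIM_CMA_PAGES]; /* stands in for the CMA bitmap */
    bool pinned[SIM_CMA_PAGES];   /* pages whose migration would fail */
};

/*
 * Find the first aligned run of `count` unreserved pages at or after
 * `start`. Returns the index of the first page, or -1 if there is none.
 */
static long sim_find_free_run(const struct sim_cma *cma, size_t start,
                              size_t count, size_t align)
{
    size_t base = (start + align - 1) / align * align; /* round up */

    for (; base + count <= SIM_CMA_PAGES; base += align) {
        size_t i = 0;

        while (i < count && !cma->reserved[base + i])
            i++;
        if (i == count)
            return (long)base;
    }
    return -1;
}

/*
 * Stand-in for the migration step: the range can only be claimed if
 * none of its pages is pinned (i.e. every busy page could be moved).
 */
static bool sim_range_is_migratable(const struct sim_cma *cma, size_t base,
                                    size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (cma->pinned[base + i])
            return false;
    return true;
}

/*
 * Model of the allocation loop: pick a candidate range from the bitmap,
 * try to claim it, and on a "busy" range retry further into the area.
 * Returns the first page index of the allocation, or -1 on failure.
 */
long sim_alloc_from_contiguous(struct sim_cma *cma, size_t count, size_t align)
{
    size_t start = 0;

    assert(count > 0 && align > 0);

    for (;;) {
        long base = sim_find_free_run(cma, start, count, align);

        if (base < 0)
            return -1; /* no candidate left: allocation fails */

        if (sim_range_is_migratable(cma, (size_t)base, count)) {
            for (size_t i = 0; i < count; i++)
                cma->reserved[(size_t)base + i] = true;
            return base;
        }

        /* "memory range ... is busy, retrying": skip this candidate. */
        start = (size_t)base + 1;
    }
}
```

In this model a single pinned page is enough to make a candidate range "busy", so a request that is large compared to the area can run out of candidates, which matches the failing case in the log above.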

The failing path goes through the following function calls:

fallback_migrate_page() --> migrate_page() --> try_to_release_page() --> try_to_free_buffers() --> drop_buffers() --> buffer_busy()

My understanding is that the page contains buffers that are still in use and can't be dropped, so the page can't be migrated. Well, I must admit that at this point I'm feeling a little lost in this ocean of memory management code ;).

After some research, I found the following thread on the linux-arm-kernel ML discussing the same issue:

http://lists.infradead.org/pipermail/linux-arm-kernel/2012-June/102844.html

with the following patch:

 mm/page_alloc.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e1c6f5..c9a6483 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1310,7 +1310,8 @@ void free_hot_cold_page(struct page *page, int cold)
 	 * excessively into the page allocator
 	 */
 	if (migratetype >= MIGRATE_PCPTYPES) {
-		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
+		if (unlikely(migratetype == MIGRATE_ISOLATE)
+		    || is_migrate_cma(migratetype)) {
 			free_one_page(zone, page, 0, migratetype);
 			goto out;
 		}

I tried the patch and it seems to work (I didn't get any "memory range busy" message in 5000+ tests), but I'm afraid that it could have some nasty side effects.

Any ideas?

Thanks in advance,
Guillaume

Texas Instruments France SA, 821 Avenue Jack Kilby, 06270 Villeneuve Loubet. 036 420 040 R.C.S Antibes. Capital de EUR 753.920