On Thu, Aug 30, 2012 at 6:40 PM, Laura Abbott lauraa@codeaurora.org wrote:
Hi,
On 8/30/2012 8:45 AM, Aubertin, Guillaume wrote:
looping-in the linaro-mm-sig ML.
On Thu, Aug 30, 2012 at 4:47 PM, Aubertin, Guillaume <g-aubertin@ti.com> wrote:
hi guys,

I've been working for a few days on getting a proper rmmod with the remoteproc/rpmsg modules, and I stumbled upon an interesting issue. When doing successive memory allocations and releases in the CMA reservation (by loading/unloading the firmware several times), the following message shows up:

[ 119.908477] cma: dma_alloc_from_contiguous(cma ed10ad00, count 256, align 8)
[ 119.908843] cma: dma_alloc_from_contiguous(): memory range at c0dfb000 is busy, retrying
[ 119.909698] cma: dma_alloc_from_contiguous(): returned c0dfd000

dma_alloc_from_contiguous() then tries to allocate the following range, 0xc0dfd000, successfully this time.

In some cases, the allocation fails after trying several ranges:

[ 119.912231] cma: dma_alloc_from_contiguous(cma ed10ad00, count 768, align 8)
[ 119.912719] cma: dma_alloc_from_contiguous(): memory range at c0dff000 is busy, retrying
[ 119.913055] cma: dma_alloc_from_contiguous(): memory range at c0e01000 is busy, retrying
[ 119.913055] rproc remoteproc0: dma_alloc_coherent failed: 3145728

Here is my understanding so far:

First, even if we made a CMA reservation, the kernel can still allocate pages in this area, but these pages must be movable (user process pages, for example).

When dma_alloc_from_contiguous() is called to allocate X pages, it looks for the next X contiguous free pages in its CMA bitmap (with respect to the memory alignment). Then, alloc_contig_range() is called to allocate the given range of pages. alloc_contig_range() analyses the pages we want to allocate, and if a page is already used, it is migrated to a new page outside the range we want to reserve. This is done using isolate_migratepages_range() to list the pages to migrate, and migrate_pages() to try to migrate the pages, and that's where it fails.
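The search-and-retry behaviour described above can be sketched as a small user-space model. This is not the kernel implementation: `struct cma_model`, `CMA_PAGES`, the `pinned` array and all function names are hypothetical, and the retry step (skip to the next aligned position after a busy range) is an assumption chosen to match the "busy, retrying" pattern in the logs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CMA_PAGES 1024              /* hypothetical size of the reserved area */

struct cma_model {
    bool allocated[CMA_PAGES];      /* stands in for the CMA bitmap */
    bool pinned[CMA_PAGES];         /* page holds data that cannot be migrated */
};

/* Find the first aligned run of `count` free bitmap entries at or after
 * `start`. Returns CMA_PAGES when no such run exists. */
static size_t find_free_run(const struct cma_model *cma, size_t start,
                            size_t count, size_t align_mask)
{
    size_t base = (start + align_mask) & ~align_mask;

    while (base + count <= CMA_PAGES) {
        size_t i;

        for (i = 0; i < count; i++)
            if (cma->allocated[base + i])
                break;
        if (i == count)
            return base;
        /* Skip past the allocated entry and realign. */
        base = (base + i + 1 + align_mask) & ~align_mask;
    }
    return CMA_PAGES;
}

/* Stand-in for alloc_contig_range(): the range can only be claimed if
 * every page in it is free or can be migrated away. */
static bool range_can_be_claimed(const struct cma_model *cma, size_t base,
                                 size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (cma->pinned[base + i])
            return false;
    return true;
}

/* Returns the first page index of the allocation, or -1 on failure. */
static long cma_model_alloc(struct cma_model *cma, size_t count,
                            unsigned int align_order)
{
    size_t mask = ((size_t)1 << align_order) - 1;
    size_t start = 0;

    assert(count > 0 && count <= CMA_PAGES);

    for (;;) {
        size_t base = find_free_run(cma, start, count, mask);

        if (base >= CMA_PAGES)
            return -1;              /* no candidate range left */
        if (range_can_be_claimed(cma, base, count)) {
            for (size_t i = 0; i < count; i++)
                cma->allocated[base + i] = true;
            return (long)base;
        }
        /* "memory range is busy, retrying": try the next aligned range. */
        start = base + mask + 1;
    }
}
```

In this model a single unmigratable page is enough to push an allocation to the next aligned range, and the request fails once every candidate range contains at least one such page.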
Below is the list of the next function calls:

fallback_migrate_page() --> migrate_page() --> try_to_release_page() --> try_to_free_buffers() --> drop_buffers() --> buffer_busy()

I understand here that the page contains used buffers that can't be dropped, and so the page can't be migrated. Well, I must admit that once here, I'm feeling a little lost in this ocean of memory management code ;).

After some research, I found the following thread on the linux-arm-kernel ML talking about the same issue:

http://lists.infradead.org/pipermail/linux-arm-kernel/2012-June/102844.html

with the following patch:
 mm/page_alloc.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e1c6f5..c9a6483 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1310,7 +1310,8 @@ void free_hot_cold_page(struct page *page, int cold)
 	 * excessively into the page allocator
 	 */
 	if (migratetype >= MIGRATE_PCPTYPES) {
-		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
+		if (unlikely(migratetype == MIGRATE_ISOLATE)
+			|| is_migrate_cma(migratetype)) {
 			free_one_page(zone, page, 0, migratetype);
 			goto out;
 		}

I tried the patch, and it seems to work (I didn't get any "memory range busy" in 5000+ tests), but I'm afraid that this could have some nasty side effects. Any idea?

Thanks in advance,
Guillaume
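The end of the call chain in this message, where drop_buffers() gives up because buffer_busy() reports a buffer in use, can be illustrated with a simplified user-space model. The struct and function names below are hypothetical. The busy condition (a held reference, or a dirty or locked buffer) is an assumption based on a reading of fs/buffer.c, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer head attached to a page. */
struct bh_model {
    int ref_count;              /* models the buffer's reference count */
    bool dirty;                 /* buffer has unwritten data */
    bool locked;                /* buffer is under I/O */
    struct bh_model *next;      /* circular list of buffers on the same page */
};

/* A buffer is busy if anyone still holds a reference to it, or if it is
 * dirty or locked (assumed condition, see the note above). */
static bool bh_model_busy(const struct bh_model *bh)
{
    return bh->ref_count > 0 || bh->dirty || bh->locked;
}

/* Walk every buffer on the page. One busy buffer is enough to refuse
 * dropping them all, which in turn blocks migration of the page. */
static bool page_model_can_drop_buffers(const struct bh_model *head)
{
    const struct bh_model *bh = head;

    assert(head != NULL);
    do {
        if (bh_model_busy(bh))
            return false;
        bh = bh->next;
    } while (bh != head);
    return true;
}
```

With two buffers linked in a ring, raising the reference count on either one makes `page_model_can_drop_buffers()` return false, which corresponds to the "page can't be migrated" situation described above.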
Hi,
Speaking as the author of that patch, I agree that it does have some nasty side effects and is not the right approach. I finally got down to debugging where the extra references came from and asked about it last night (http://lists.linaro.org/pipermail/linaro-mm-sig/2012-August/002510.html).
Basically, as long as the page cache buffers exist on the LRU list they can't be migrated away. I think the fix should be to drop the buffers from the LRU list when migrating away, but as mentioned there I don't really know the filesystem layer well enough to know if that is the right approach.
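The idea that an LRU list keeps buffers pinned, and that emptying it would release them, can be sketched as follows. This is only a toy model of the proposal, not a tested fix. `LRU_SLOTS`, the structs and the function names are hypothetical, and the model assumes that each LRU entry holds exactly one extra reference on its buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LRU_SLOTS 8                 /* hypothetical LRU capacity */

struct buf_model {
    int ref_count;                  /* references held on this buffer */
};

struct lru_model {
    struct buf_model *slot[LRU_SLOTS];  /* slot[0] is the most recent */
};

/* Insert a buffer at the front of the LRU. The LRU takes a reference on
 * the new entry and drops the one it held on the evicted entry. */
static void lru_model_install(struct lru_model *lru, struct buf_model *buf)
{
    struct buf_model *evicted;

    for (int i = 0; i < LRU_SLOTS; i++)
        if (lru->slot[i] == buf)
            return;                 /* already cached, reference already held */

    evicted = lru->slot[LRU_SLOTS - 1];
    for (int i = LRU_SLOTS - 1; i > 0; i--)
        lru->slot[i] = lru->slot[i - 1];
    buf->ref_count++;
    lru->slot[0] = buf;
    if (evicted != NULL)
        evicted->ref_count--;
}

/* Empty the LRU, dropping the reference held on every entry. */
static void lru_model_invalidate(struct lru_model *lru)
{
    for (int i = 0; i < LRU_SLOTS; i++) {
        if (lru->slot[i] != NULL) {
            lru->slot[i]->ref_count--;
            lru->slot[i] = NULL;
        }
    }
}

/* In this model a buffer can be migrated only when nobody references it. */
static bool buf_model_migratable(const struct buf_model *buf)
{
    return buf->ref_count == 0;
}
```

A buffer that is otherwise idle stops being migratable as soon as it is installed in the LRU, and becomes migratable again after `lru_model_invalidate()`. Whether dropping the entries at migration time is acceptable for the filesystem layer is exactly the open question raised in the message.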
Unfortunately all the suggested fixes I've seen have nasty side effects so I don't think there has been any consensus on a good solution.
Thanks, Laura
Thanks a lot for the clarification. I'll give your quick hack in find_or_create_page() a try to make sure that I don't have any other underlying issue, and wait until a proper solution is defined.
Guillaume
_______________________________________________
Linaro-mm-sig mailing list
Linaro-mm-sig@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-mm-sig
-- Sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.