On 10/11/2011 09:17 AM, Marek Szyprowski wrote:
On Monday, October 10, 2011 2:08 PM Maxime Coquelin wrote:
During our stress tests, we encountered some problems:

1) Contiguous allocation lockup: When system RAM is full of anonymous pages, if we try to allocate a contiguous buffer greater than the min_free value, dma_alloc_from_contiguous() locks up. The expected result would be for dma_alloc_from_contiguous() to fail. The problem is reproduced systematically on our side.

Thanks for the report. Do you use Android's lowmemorykiller? I haven't tested CMA on an Android kernel yet. I have no idea how it will interfere with the Android patches.
The software used for this test (v16) is a generic 3.0 kernel and a minimal filesystem using BusyBox.
With the v15 patchset, I also tested it with Android. IIRC, sometimes the lowmemorykiller succeeded in freeing space and the contiguous allocation succeeded; at other times we hit the lockup.
2) Contiguous allocation failure: We have developed a small driver and a shell script to allocate/release contiguous buffers. Sometimes, dma_alloc_from_contiguous() fails to allocate the contiguous buffer (about once every 30 runs). We have 270 MB of memory passed to the kernel in our configuration, and the CMA pool is 90 MB. In this setup, the overall memory is either free or full of reclaimable pages.
Yeah. We also did such stress tests recently and faced this issue. I've spent some time investigating it but I have no solution yet.
The problem is caused by a page which is placed in the CMA area. This page is movable, but its address space provides no 'migratepage' method. In such a case the mm subsystem uses the fallback_migrate_page() function. Sadly this function only returns -EAGAIN. The migration loops over the page a few times and then gives up, which causes the allocation to fail.
We are now investigating which kernel code created/allocated such problematic pages and how to add real migration support for them.
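To make the failure mode above concrete, here is a minimal userspace sketch in C (not kernel code). The struct layouts, the helper names (fallback_migrate, migrate_one, migrate_with_retries) and the MAX_PASSES value are assumptions made for illustration only. The one thing taken from the description above is the behaviour: a page whose address space has no 'migratepage' method falls back to a function that keeps returning -EAGAIN until the retry loop gives up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct page;

/* Hypothetical stand-in for an address space's operations table. */
struct address_space_ops {
	int (*migratepage)(struct page *page);	/* may be NULL */
};

/* Hypothetical, heavily simplified page: only the ops pointer matters here. */
struct page {
	const struct address_space_ops *ops;
};

/* Models the fallback described above: it never makes progress. */
static int fallback_migrate(struct page *page)
{
	(void)page;
	return -EAGAIN;
}

/* Models an address space that really can migrate its pages. */
static int real_migrate(struct page *page)
{
	(void)page;
	return 0;
}

static const struct address_space_ops no_migrate_ops = { NULL };
static const struct address_space_ops movable_ops = { real_migrate };

/* Use the address space's method if present, otherwise the fallback. */
static int migrate_one(struct page *page)
{
	if (page->ops && page->ops->migratepage)
		return page->ops->migratepage(page);
	return fallback_migrate(page);
}

/* Assumed retry budget, chosen for illustration. */
#define MAX_PASSES 10

/* Retry while the page reports -EAGAIN, up to MAX_PASSES attempts. */
static int migrate_with_retries(struct page *page, int *passes_used)
{
	int rc = -EAGAIN;
	int pass;

	assert(page != NULL && passes_used != NULL);

	for (pass = 0; pass < MAX_PASSES && rc == -EAGAIN; pass++)
		rc = migrate_one(page);

	*passes_used = pass;
	return rc;
}
```

Calling migrate_with_retries() on a page that uses no_migrate_ops burns the whole retry budget and still returns -EAGAIN, so a caller that needs the range emptied has to report failure. A page that uses movable_ops succeeds on the first pass.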
Ok, thanks for pointing this out.
Regards, Maxime