On Tue, Sep 4, 2012 at 1:01 PM, Nishanth Peethambaran nishanth.peethu@gmail.com wrote:
On Mon, Sep 3, 2012 at 12:46 PM, Haojian Zhuang haojian.zhuang@gmail.com wrote:
On Mon, Sep 3, 2012 at 2:38 PM, Nishanth Peethambaran nishanth.peethu@gmail.com wrote:
The lowmemkiller should kick in once the free space goes below a threshold or the heap gets fragmented, but in the attached patch it kicks in only when an allocation fails. Refer to the comparison of free space to lowmem_minfree[n] done in lowmem_shrink(). Even for CMA, a lowmemkiller would be needed unless a huge area is reserved for CMA, which would restrict the area available for non-movable allocations.
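For reference, the check in lowmem_shrink() is roughly the following (simplified excerpt from lowmemorykiller.c; details vary slightly between kernel versions):

        static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
        {
                int other_free = global_page_state(NR_FREE_PAGES);
                int other_file = global_page_state(NR_FILE_PAGES) -
                                        global_page_state(NR_SHMEM);
                int min_score_adj = OOM_SCORE_ADJ_MAX + 1;
                int array_size = ARRAY_SIZE(lowmem_adj);
                int i;

                /* pick the oom_score_adj threshold matching how low free memory is */
                for (i = 0; i < array_size; i++) {
                        if (other_free < lowmem_minfree[i] &&
                            other_file < lowmem_minfree[i]) {
                                min_score_adj = lowmem_adj[i];
                                break;
                        }
                }
                /* ... then scan tasks and kill one with oom_score_adj >= min_score_adj ... */
        }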
Could you help correct my understanding?
Though the lowmemorykiller provides lowmem_minfree and lowmem_adj, lowmem_shrink is registered as a shrinker callback and is only called when an allocation fails.
In 3.4:

        mm/page_alloc.c:      __alloc_pages_nodemask
          -> if (unlikely(!page)) __alloc_pages_slowpath
            -> wake_all_kswapd
        mm/vmscan.c:          kswapd
          -> do_shrinker_shrink
        lowmemorykiller.c:    lowmem_shrink
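And lowmem_shrink() only ever runs because the driver registers it as a shrinker, roughly like this (from lowmemorykiller.c):

        static struct shrinker lowmem_shrinker = {
                .shrink = lowmem_shrink,
                .seeks = DEFAULT_SEEKS * 16
        };

        static int __init lowmem_init(void)
        {
                /* hooks lowmem_shrink into the reclaim path (kswapd / direct reclaim) */
                register_shrinker(&lowmem_shrinker);
                return 0;
        }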
Simple experiment in my environment, with the defaults:

        static int lowmem_minfree[6] = {
                3 * 512,        /* 6MB */
                2 * 1024,       /* 8MB */
                4 * 1024,       /* 16MB */
                16 * 1024,      /* 64MB */
        };

        # cd /tmp
        # dd if=/dev/zero of=del bs=1M count=350
        # free
                     total       used       free     shared    buffers
        Mem:        384576     377588       6988          0          0
        Swap:            0          0          0
        Total:      384576     377588       6988
lowmem_shrink is not triggered even when free memory is 6MB; it only happens when free memory drops to about 3MB, presumably when an allocation fails.
The conclusion is that lowmem_shrink is only triggered when an allocation fails, so lowmem_minfree and lowmem_adj only take effect at that point, and there is no daemon monitoring whether system memory has dropped to the threshold.
I am not sure whether this understanding is correct; however, in our ION OOM killer we also trigger the shrink function when an allocation fails, which is simple. Otherwise we would have to start a kernel task that keeps monitoring the ION heap size, which looks complicated.
I may be wrong here; I have also only recently started looking into Linux mm. My understanding is that kswapd is a daemon and will kick in periodically as well, not only when alloc_page fails. So, to be more proactive, the min_free threshold can be kept higher, which ensures background apps get killed when free memory falls below the threshold, and so the allocation won't fail.
I have a different opinion on this. If we set a threshold, it means the heap size is monitored. For example, we create a 100MB heap, and if the available heap size drops below 16MB, the OOM killer is triggered. But most failure cases come from allocating larger blocks.
Suppose we try to allocate a 20MB buffer and the available heap size is 50MB. Although there is enough total memory, we may not get a large enough contiguous block because of fragmentation.
And how do we tell whether the heap is fragmented? If there are a lot of free 2MB blocks and we allocate 512KB, there is no fragmentation problem; with the same 2MB blocks, an 8MB allocation would fail because of fragmentation. We can't use a simple formula to judge which scenario counts as fragmentation. And the killer would kill background tasks, so we should avoid triggering it frequently.
You are right. I did not mean we should avoid having an OOM killer that is reactive and more aggressive - it could kill tasks up to an oom_adj of 1/0, depending on whether we want to force-kill the foreground app or want the user-space app to take action on the allocation failure. I am just exploring the option of also having a lowmem killer that is more proactive and need not be aggressive (maybe killing only apps with oom_adj >= 6). But I see a few issues with that.
- Some of the heaps do not need an explicit killer: (kmalloc/vmalloc) types could work with the default Android lowmem killer.
  - Having a lowmemkiller/shrink specific to the heap will solve it.
I think we can add a flags field to struct ion_platform_data and define ION_HEAP_OOM_KILLER for it. If the user sets ION_HEAP_OOM_KILLER in the platform data, the OOM killer would be enabled for that heap; otherwise the OOM killer would be skipped.
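A rough sketch of the idea (ION_HEAP_OOM_KILLER and the flags field are hypothetical, not part of the existing ION interface; the other fields follow the 3.4-era struct ion_platform_heap):

        /* hypothetical per-heap flag */
        #define ION_HEAP_OOM_KILLER     (1 << 0)

        struct ion_platform_heap {
                enum ion_heap_type type;
                unsigned int id;
                const char *name;
                ion_phys_addr_t base;
                size_t size;
                unsigned long flags;    /* new field, e.g. ION_HEAP_OOM_KILLER */
        };

        /* board file: enable the killer only for the carveout heap */
        {
                .type  = ION_HEAP_TYPE_CARVEOUT,
                .id    = 1,
                .name  = "carveout",
                .base  = 0x20000000,    /* example address */
                .size  = SZ_64M,
                .flags = ION_HEAP_OOM_KILLER,
        },

The heap creation path could then check heap_data->flags & ION_HEAP_OOM_KILLER and only register the killer for heaps that ask for it.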
- Keeping track of heap usage and the total size available.
  - Adding variables to store the reserved size and to track current usage of the heap, and updating them on alloc/free, will solve it.
It's OK to expose this kind of information in debugfs; it could help the user know the current memory usage. We have a similar patch that could be submitted later.
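As a sketch of that kind of accounting (names such as my_heap_stats and my_heap_usage_show are made up for illustration; the real patch may look different):

        struct my_heap_stats {
                size_t total;   /* reserved size of the heap */
                size_t used;    /* updated on every alloc/free */
        };

        static int my_heap_usage_show(struct seq_file *s, void *unused)
        {
                struct my_heap_stats *stats = s->private;

                seq_printf(s, "total: %zu\nused:  %zu\nfree:  %zu\n",
                           stats->total, stats->used, stats->total - stats->used);
                return 0;
        }

        static int my_heap_usage_open(struct inode *inode, struct file *file)
        {
                return single_open(file, my_heap_usage_show, inode->i_private);
        }

        static const struct file_operations my_heap_usage_fops = {
                .open    = my_heap_usage_open,
                .read    = seq_read,
                .llseek  = seq_lseek,
                .release = single_release,
        };

        /* debugfs_create_file("usage", 0444, heap_dir, stats, &my_heap_usage_fops); */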
- Fragmentation as you pointed out.
  - Solution 1 is to not let the heap fragment. We use a user-space allocator which avoids a number of system calls and always requests same-sized slabs from ION.
It's hard to say which size should be the standard size. Maybe the camera driver needs a 10MB block and the LCD driver needs a 16MB block. If we define the size too large, it wastes memory; if we define it too small, it doesn't benefit us.
  - Solution 2 is to check the biggest contiguous chunk available and have a separate threshold for it (the Android lowmem killer also checks multiple items). The major issue with this is that the entire buffer list of the heap would have to be walked each time, which is not good.
It could be implemented, but it's a little complex. In this case the killer would be triggered after the allocation. To implement it, we would need to define a threshold for monitoring chunk usage. If we only use solution 3, it would be much simpler.
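For what it's worth, the scan itself is not much code; the cost is holding the heap lock and walking every buffer on each check. A hypothetical sketch (struct and function names are made up; it assumes the heap keeps its buffers in a list sorted by offset):

        #include <linux/list.h>

        struct my_heap_buffer {
                struct list_head node;          /* sorted by offset */
                unsigned long offset;
                unsigned long size;
        };

        /* largest free gap in a carveout of heap_size bytes */
        static unsigned long largest_free_chunk(struct list_head *buffers,
                                                unsigned long heap_size)
        {
                struct my_heap_buffer *buf;
                unsigned long pos = 0, largest = 0;

                list_for_each_entry(buf, buffers, node) {
                        if (buf->offset > pos && buf->offset - pos > largest)
                                largest = buf->offset - pos;
                        pos = buf->offset + buf->size;
                }
                if (heap_size > pos && heap_size - pos > largest)
                        largest = heap_size - pos;

                return largest;
        }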
  - Solution 3 is for the lowmem killer not to take care of the fragmentation check and to let the OOM killer handle it.
It's simple and effective. We like this one. What's your opinion?
- Defining a general policy is where I am struggling.
  - One possibility is to treat ION heaps similarly to Linux mm zones (dma, normal, highmem, cma, ...): each zone has a threshold, and once a zone goes below its threshold, shrinking starts. For example, vmalloc is similar to highmem and has id=0, CMA/carveout is similar to normal and has id=1, and a specific CMA/carveout is similar to dma and has id=2.
What's the definition of id? In ion_platform_data, the comment says "unique identifier for heap. When allocating (lower numbers will be allocated from first)".
How about defining two CMA or carveout heaps at the same time?
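Nothing stops a board file from describing two carveout heaps as long as their ids differ. For example (everything below - names, addresses, sizes - is an illustration only, using the 3.4-era ion_platform_data layout):

        static struct ion_platform_data example_ion_pdata = {
                .nr = 3,
                .heaps = {
                        {
                                .type = ION_HEAP_TYPE_SYSTEM,
                                .id   = 0,
                                .name = "vmalloc",
                        },
                        {
                                .type = ION_HEAP_TYPE_CARVEOUT,
                                .id   = 1,
                                .name = "carveout-general",
                                .base = 0x20000000,
                                .size = SZ_32M,
                        },
                        {
                                .type = ION_HEAP_TYPE_CARVEOUT,
                                .id   = 2,
                                .name = "carveout-camera",
                                .base = 0x22000000,
                                .size = SZ_16M,
                        },
                },
        };

Allocation requests then select heaps via the heap mask, and lower-numbered ids are allocated from first, per the comment quoted above, which is why the id ordering matters if the heaps are meant to act like zones.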