On Mon, Mar 30, 2015 at 05:08:18PM +0200, Michal Hocko wrote:
> On Sat 28-03-15 10:53:22, Peter Zijlstra wrote:
> [...]
> > Alternatively the thing Hocko suggests is an utter fail too. You cannot stuff that into hardirq context, that's insane.
> I guess you are referring to http://article.gmane.org/gmane.linux.kernel.mm/127569, right?
> Why can't we do something like refresh_cpu_vm_stats from the IRQ context? Especially the first zone stat part.
Big machines have big zone counts. There are machines with >200 nodes. Although with the current trend of bigger nodes, the number of nodes seems to come down as well. Still.
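To make the cost argument concrete, here is a minimal userspace sketch of folding per-CPU stat deltas into global counters. It is not the kernel's refresh_cpu_vm_stats; NR_ZONES, NR_ITEMS, struct pcp_stats and fold_cpu_stats are hypothetical names chosen for illustration. The point is only that the loop visits every zone/item slot, so the work done in one call grows with the zone count.

```c
#include <assert.h>
#include <stddef.h>

#define NR_ZONES 4  /* hypothetical; a big machine may have hundreds of nodes */
#define NR_ITEMS 3  /* hypothetical number of counters per zone */

/* Per-CPU deltas that have not yet been folded into the global counters. */
struct pcp_stats {
    signed char diff[NR_ZONES][NR_ITEMS];
};

/* Global counters shared by all CPUs (no locking in this sketch). */
static long global_stat[NR_ZONES][NR_ITEMS];

/*
 * Fold one CPU's pending deltas into the global counters and clear them.
 * Returns the number of non-zero deltas folded. The loops always visit
 * NR_ZONES * NR_ITEMS slots, even when almost all of them are zero.
 */
static size_t fold_cpu_stats(struct pcp_stats *p)
{
    size_t folded = 0;

    assert(p != NULL);
    for (size_t z = 0; z < NR_ZONES; z++) {
        for (size_t i = 0; i < NR_ITEMS; i++) {
            if (p->diff[z][i] != 0) {
                global_stat[z][i] += p->diff[z][i];
                p->diff[z][i] = 0;
                folded++;
            }
        }
    }
    return folded;
}
```

With NR_ZONES in the hundreds, each call walks hundreds of slots per counter, which is the concern with running something like this from interrupt context.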
> The per-cpu pagesets are more costly and would need special treatment, alright. A simple way would be to splice the lists from the per-cpu context and then free those pages from the kthread context.
> I am still wondering why those two things were squashed into a single place. Why is kswapd not doing the pcp cleanup?
Probably because they could be. The problem with kswapd is that it's per node, not per cpu.
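
For reference, the splice-then-free idea quoted above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not kernel code: struct page, struct page_list, list_push, list_splice_all and drain_deferred are hypothetical stand-ins, and there is no locking or interrupt masking here. The cheap step detaches the whole per-CPU list in constant time; the expensive per-page freeing happens later, in what would be the kthread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a page cached on a per-CPU list. */
struct page {
    struct page *next;
};

/* Singly linked list with a tail pointer so splicing is O(1). */
struct page_list {
    struct page *head, *tail;
    size_t count;
};

/* Append one page to the end of the list. */
static void list_push(struct page_list *l, struct page *p)
{
    assert(p != NULL);
    p->next = NULL;
    if (l->tail)
        l->tail->next = p;
    else
        l->head = p;
    l->tail = p;
    l->count++;
}

/* Cheap step: move every entry from src to the end of dst in O(1). */
static void list_splice_all(struct page_list *src, struct page_list *dst)
{
    if (!src->head)
        return;
    if (dst->tail)
        dst->tail->next = src->head;
    else
        dst->head = src->head;
    dst->tail = src->tail;
    dst->count += src->count;
    src->head = src->tail = NULL;
    src->count = 0;
}

/* Expensive step: free each deferred page; returns how many were freed. */
static size_t drain_deferred(struct page_list *l)
{
    size_t freed = 0;
    struct page *p = l->head;

    while (p) {
        struct page *next = p->next;
        free(p);
        freed++;
        p = next;
    }
    l->head = l->tail = NULL;
    l->count = 0;
    return freed;
}
```

In this model the per-CPU side would call list_splice_all() and return immediately, and a worker would call drain_deferred() later. It does not address which context does the freeing when the worker is per node rather than per cpu.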