On 10/03/2014 10:46 AM, Peter Zijlstra wrote:
> On Fri, Oct 03, 2014 at 10:28:42AM -0400, Rik van Riel wrote:
>> The current code has the potential to be quite painful on systems with a large number of cores per chip, so we will have to change things anyway...
>
> What I said.. so far we've failed at coming up with anything sane though, so far we've found that 2 cpus is too small a slice to look at and we're fairly sure 18/36 is too large :-)

Some more brainstorming points...
1) We should probably (lazily/batched?) propagate load information up the sched_group tree. This will be useful for wake_affine, load_balancing, find_idlest_cpu, and select_idle_sibling.
2) With both find_idlest_cpu and select_idle_sibling walking down the tree from the LLC level, they could probably share code.
3) Counting both blocked and runnable load may give better long-term stability of loads. That would make behaviour less work-conserving but improve locality. This could be worthwhile, but it is hard to say in advance.
4) We can be pretty sure that CPU makers are not going to stop at a mere 18 cores. We need to subdivide things below the LLC level, turning select_idle_sibling and find_idlest_cpu into a tree walk.
This means whatever selection criteria are used by these need to be propagated up the sched_group tree. This, in turn, means we probably need to restrict ourselves to things that do not get changed/updated too often.
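
To make points 1 and 4 a bit more concrete, here is a rough userspace sketch, not kernel code. Everything in it is an assumption for illustration: struct group_node is a hypothetical stand-in for a sched_group tree node, the fan-out of two and the BATCH threshold are arbitrary, and plain summed load is just one possible selection criterion. Load changes are batched at the leaf and only pushed up the tree once they exceed the threshold, and the idle-CPU search becomes a walk down the tree using the propagated sums.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_group tree node (not the kernel's struct). */
struct group_node {
	struct group_node *parent;
	struct group_node *child[2];	/* both NULL for a leaf (a single CPU) */
	int cpu;			/* only meaningful for leaves */
	long load;			/* propagated load, summed over the subtree */
	long pending;			/* leaf only: change not yet propagated */
};

#define BATCH 16L	/* assumed threshold; smaller deltas stay local */

/* Push a leaf's accumulated delta into itself and all of its ancestors. */
static void flush_load(struct group_node *leaf)
{
	long delta = leaf->pending;

	leaf->pending = 0;
	for (struct group_node *n = leaf; n; n = n->parent)
		n->load += delta;
}

/* Record a load change on one CPU; propagate only once it is large enough. */
static void update_cpu_load(struct group_node *leaf, long delta)
{
	assert(leaf && !leaf->child[0] && !leaf->child[1]);

	leaf->pending += delta;
	if (leaf->pending >= BATCH || leaf->pending <= -BATCH)
		flush_load(leaf);
}

/* Walk down from a group, always taking the child with less propagated load. */
static int walk_to_idlest_cpu(const struct group_node *root)
{
	const struct group_node *n = root;

	assert(n);
	while (n->child[0]) {
		const struct group_node *l = n->child[0];
		const struct group_node *r = n->child[1];

		n = (r && r->load < l->load) ? r : l;
	}
	return n->cpu;
}
```

The walk costs O(depth) instead of scanning every CPU below the LLC, at the price of acting on slightly stale sums: a delta smaller than BATCH is invisible to the walk until it is flushed. That is the same trade-off as above, where the criteria have to be things that do not change too often.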
Am I overlooking anything?
--
All rights reversed