On 7 January 2014 14:22, Peter Zijlstra <peterz@infradead.org> wrote:
On Tue, Jan 07, 2014 at 09:32:04AM +0100, Vincent Guittot wrote:
On 6 January 2014 17:31, Peter Zijlstra <peterz@infradead.org> wrote:
On Mon, Jan 06, 2014 at 02:41:31PM +0100, Vincent Guittot wrote:
IMHO, these settings will disappear sooner or later; for example, the idle/busy _idx are going to be removed by Alex's patch.
Well I'm still entirely unconvinced by them..
removing the cpu_load array makes sense, but I'm starting to doubt the removal of the _idx things.. I think we want to retain them in some form, it simply makes sense to look at longer term averages when looking at larger CPU groups.
So maybe we can express the things in log_2(group-span) or so, but we need a working replacement for the cpu_load array. Ideally some expression involving the blocked load.
Using the blocked load can surely benefit load balancing because it gives a view of the potential load on a core, but it still decays at the same speed as the runnable load average, so it doesn't solve the issue of a longer-term average. One way is to have a runnable load average with a longer time window.
Ah, another way of looking at it is that the avg without blocked component is a 'now' picture. It is the load we are concerned with right now.
The more blocked we add the further out we look; with the obvious limit of the entire averaging period.
So the avg that is runnable is right now, t_0; the avg that is runnable + blocked is t_0 + p, where p is the avg period over which we expect the blocked contribution to appear.
So something like:
avg = runnable + p(i) * blocked; where p(i) \in [0,1]
could maybe be used to replace the cpu_load array and still represent the concept of looking at a bigger picture for larger sets. Leaving open the details of the map p.
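To make the formula above concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: blended_load(), p_of_level(), P_SHIFT and P_ONE are hypothetical names, not kernel symbols; p is kept in fixed point with 1024 standing for 1.0; and the linear map over domain levels is just a placeholder, since the actual map p is left open.

```c
#include <stdint.h>

/* Hypothetical fixed-point scale for p: P_ONE represents 1.0. */
#define P_SHIFT 10
#define P_ONE   (1u << P_SHIFT)

/*
 * Hypothetical map p(i): 0 at the lowest level (the 'now' picture),
 * growing linearly with the domain level and saturating at 1.0 for
 * max_level and above. Placeholder only; the real map is undecided.
 */
static inline uint32_t p_of_level(unsigned int level, unsigned int max_level)
{
	if (max_level == 0)
		return 0;
	if (level >= max_level)
		return P_ONE;
	return (level * P_ONE) / max_level;
}

/* avg = runnable + p * blocked, with p clamped to [0, P_ONE]. */
static inline uint64_t blended_load(uint64_t runnable, uint64_t blocked,
				    uint32_t p)
{
	if (p > P_ONE)
		p = P_ONE;
	return runnable + ((blocked * p) >> P_SHIFT);
}
```

With p = 0 this reduces to the runnable load alone, and with p = P_ONE it is the full runnable + blocked sum, which matches the two ends of the range described above.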
That needs to be studied more deeply, but it could be a way to get a larger picture.
Another point is that we are using the runnable and blocked load averages, which are sums of the tasks' load_avg_contrib, but we are not using the runnable_avg_sum of the CPUs, which is not the 'now' picture but an average of the past running time (without taking task weight into account).
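The difference between the two quantities can be sketched roughly as below. This is an illustration, not the kernel code: struct avg_sketch and both helpers are hypothetical names, and it assumes the contribution is the weight scaled by runnable_avg_sum / (runnable_avg_period + 1), while the per-CPU figure is that same ratio without any weight.

```c
#include <stdint.h>

/* Hypothetical stand-in for the tracked sums; not the kernel struct. */
struct avg_sketch {
	uint32_t runnable_avg_sum;    /* decayed time spent runnable */
	uint32_t runnable_avg_period; /* decayed total time observed */
};

/* Weight-scaled contribution (assumed form of load_avg_contrib). */
static inline uint64_t contrib_sketch(const struct avg_sketch *a,
				      uint64_t weight)
{
	return ((uint64_t)a->runnable_avg_sum * weight) /
	       ((uint64_t)a->runnable_avg_period + 1);
}

/* Unweighted runnable fraction, scaled so that 1024 means always runnable. */
static inline uint32_t runnable_ratio_sketch(const struct avg_sketch *a)
{
	return (uint32_t)(((uint64_t)a->runnable_avg_sum << 10) /
			  ((uint64_t)a->runnable_avg_period + 1));
}
```

For the same sums, doubling the weight doubles the contribution while the unweighted ratio stays the same, which is the distinction being pointed at here.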
Vincent