On Thu, 2013-04-18 at 18:34 +0200, Vincent Guittot wrote:
The current update of the rq's load can be erroneous when RT tasks are involved.
The load of a rq that becomes idle is updated only if avg_idle is not less than sysctl_sched_migration_cost. If RT tasks and short idle durations alternate, the runnable_avg will not be updated correctly and the time will be accounted as idle time when a CFS task wakes up.
A new idle_enter function is called when the next task is the idle task, so the elapsed time is accounted as run time in the load of the rq, whatever the average idle time is. The call to update_rq_runnable_avg is removed from idle_balance.
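As a rough illustration of the idle_enter idea, here is a minimal sketch in plain C. It is not the kernel code: struct toy_rq, toy_update_rq_runnable_avg and toy_idle_enter are hypothetical names, and the decayed average is replaced by simple run/idle counters so that the accounting is easy to follow.

```c
#include <assert.h>
#include <stdint.h>

/* Toy run queue: hypothetical, not the kernel's struct rq. */
struct toy_rq {
	uint64_t last_update;	/* time of the last load update */
	uint64_t run_time;	/* elapsed time accounted as running */
	uint64_t idle_time;	/* elapsed time accounted as idle */
};

/* Account the time elapsed since the last update as running or idle. */
static void toy_update_rq_runnable_avg(struct toy_rq *rq, uint64_t now,
				       int runnable)
{
	uint64_t delta;

	assert(now >= rq->last_update);
	delta = now - rq->last_update;

	if (runnable)
		rq->run_time += delta;
	else
		rq->idle_time += delta;
	rq->last_update = now;
}

/*
 * Called when the next task is the idle task. There is no avg_idle
 * check here: the elapsed time is always accounted as run time.
 */
static void toy_idle_enter(struct toy_rq *rq, uint64_t now)
{
	toy_update_rq_runnable_avg(rq, now, 1);
}
```

In this sketch, calling toy_idle_enter(&rq, 500) on a zero-initialized rq moves all 500 time units into run_time, however short the preceding idle periods were.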
When an RT task is scheduled on an idle CPU, the rq's load is not updated when the rq exits the idle state, because the CFS functions are not called. Then idle_balance, which is called just before entering the idle function, updates the rq's load and assumes that the elapsed time since the last update was only running time.
As a consequence, the rq's load of a CPU that only runs a periodic RT task is close to LOAD_AVG_MAX, whatever the running duration of the RT task is.
A new idle_exit function is called when the prev task is the idle task, so the elapsed time is accounted as idle time in the rq's load.
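The effect on the load can be sketched with a toy model in C. This is an assumption-laden simplification, not the kernel's implementation: it uses floating point, a decay factor of about 0.5^(1/32) per 1024us period, and hypothetical names (toy_avg, toy_account, toy_simulate). A periodic RT task runs for `run` periods and the CPU then idles for `idle` periods. Without an update at idle exit, the only update happens at idle entry and treats the whole cycle as running time.

```c
#include <assert.h>

/* Assumed decay per period, chosen so that TOY_DECAY^32 is about 0.5. */
#define TOY_DECAY 0.97857206

struct toy_avg {
	double runnable_sum;	/* decayed sum of periods accounted as running */
	double period_sum;	/* decayed sum of all elapsed periods */
};

/* Account `periods` elapsed periods, all running or all idle. */
static void toy_account(struct toy_avg *avg, unsigned int periods, int runnable)
{
	for (unsigned int i = 0; i < periods; i++) {
		avg->runnable_sum = avg->runnable_sum * TOY_DECAY +
				    (runnable ? 1024.0 : 0.0);
		avg->period_sum = avg->period_sum * TOY_DECAY + 1024.0;
	}
}

/* Return the runnable ratio after `cycles` idle/run cycles. */
static double toy_simulate(unsigned int run, unsigned int idle,
			   unsigned int cycles, int with_idle_exit)
{
	struct toy_avg avg = { 0.0, 0.0 };

	assert(cycles > 0 && run + idle > 0);

	for (unsigned int c = 0; c < cycles; c++) {
		if (with_idle_exit) {
			/* idle exit: the elapsed time was idle time */
			toy_account(&avg, idle, 0);
			/* idle entry: the elapsed time was run time */
			toy_account(&avg, run, 1);
		} else {
			/* single update at idle entry: all seen as running */
			toy_account(&avg, idle + run, 1);
		}
	}
	return avg.runnable_sum / avg.period_sum;
}
```

With run = 1 and idle = 9, the variant without the idle-exit update saturates at a ratio of 1 (the analogue of a load close to LOAD_AVG_MAX), while the variant with it stays near the fraction of time the RT task actually runs.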
Changes since V5:
- Rename the idle_enter/exit functions to idle_enter/exit_fair
Changes since V4:
- Rebase on v3.9-rc6 instead of Steven Rostedt's patches
Acked-by: Steven Rostedt <rostedt@goodmis.org>
-- Steve
- Create the post_schedule_idle function that was previously created by Steven's patches
Changes since V3:
- Remove the dependency on CONFIG_FAIR_GROUP_SCHED
- Add a new idle_enter function and create a post_schedule callback for the idle class
- Remove the update_runnable_avg from idle_balance
Changes since V2:
- Remove a useless definition for UP platforms
- Rebase on top of Steven Rostedt's patches:
https://lkml.org/lkml/2013/2/12/558
Changes since V1:
- Move the code out of the schedule function and create a pre_schedule callback for the idle class instead.
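The callback arrangement described in the changelog can be pictured roughly as follows. This is only a sketch with hypothetical toy_* names, not the patch itself: a scheduling class carries optional pre_schedule and post_schedule hooks, and the idle class uses them to signal idle exit and idle entry to the fair class (here reduced to counters).

```c
#include <assert.h>
#include <stddef.h>

/* Toy run queue that only counts hook invocations. */
struct toy_hook_rq {
	int idle_enter_calls;
	int idle_exit_calls;
};

/* Toy stand-in for a scheduling class with optional hooks. */
struct toy_sched_class {
	void (*pre_schedule)(struct toy_hook_rq *rq);	/* class of prev */
	void (*post_schedule)(struct toy_hook_rq *rq);	/* class of next */
};

/* prev was the idle task: the rq leaves idle (idle_exit_fair role). */
static void toy_pre_schedule_idle(struct toy_hook_rq *rq)
{
	rq->idle_exit_calls++;
}

/* next is the idle task: the rq enters idle (idle_enter_fair role). */
static void toy_post_schedule_idle(struct toy_hook_rq *rq)
{
	rq->idle_enter_calls++;
}

static const struct toy_sched_class toy_idle_class = {
	.pre_schedule = toy_pre_schedule_idle,
	.post_schedule = toy_post_schedule_idle,
};

/* An RT-like class with no hooks, so no CFS accounting of its own. */
static const struct toy_sched_class toy_rt_class = { NULL, NULL };

static void toy_schedule(struct toy_hook_rq *rq,
			 const struct toy_sched_class *prev,
			 const struct toy_sched_class *next)
{
	if (prev->pre_schedule)
		prev->pre_schedule(rq);
	/* the context switch from prev to next would happen here */
	if (next->post_schedule)
		next->post_schedule(rq);
}
```

Switching from the idle class to the RT-like class triggers only the idle-exit hook, and switching back triggers only the idle-enter hook, so the accounting no longer depends on CFS code paths being run.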
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>