Hi Joel,
On Friday 08 Jun 2018 at 08:58:39 (-0700), Joel Fernandes wrote:
On Fri, Jun 08, 2018 at 08:54:57AM -0700, Joel Fernandes wrote:
From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Here's a very rough patch just to discuss preventing the decay of a CPU's/task's util_avg signal in case it's preempted by RT or DL. It's likely not correct and needs more work, but it solves the issue I see with my synthetic test.
To reproduce the issue, I wrote a synthetic rt-app test with an RT task preempting a 100% CFS task for 300ms:
https://pastebin.com/raw/rXNmRUZY
I have seen in traces that the util_avg decays quickly, even before the RT task sleeps.
Just to clarify, the issue is:
If a long-running CFS task is preempted briefly by RT, then by the time RT returns, util_avg will have crashed. The correct behavior IMO is that util_avg should not change and should just continue accounting from where we left off.
Yeah, so this idea looks a little bit like Patrick's proposal from one week ago or so. I think there is an issue that was mentioned with an example like this:
Only 1 CPU in the system, 1 CFS task (50%), 1 sporadic RT task

_: sleeping   R: running   #: preempted

Scenario 1:

RT:  ____________RRRR____________
CFS: __RRRRRRRR________RRRRRRRR__
               |       |
               t1      t2

Scenario 2:

RT:  ____RRRR____________________
CFS: __RR####RRRRRR____RRRRRRRR__
         |   |     |   |
         t3  t4    t5  t6
So in scenario 1, (t2-t1) is the time during which the CFS task decays. In scenario 2, IIUC what you propose to do is to remove (t4-t3) from the decay time, so the task is decayed only for (t6-t5). So, even if you don't increase the util_avg between t3 and t4, the task will look bigger than it really is because you decay it less.
With the current implementation, you would decay the task for (t4-t3) and (t6-t5), which is correct for this 50% task. The fact that the util_avg of the task decays during (t4-t3) doesn't matter, since you would have decayed the task for that long at the end of its execution anyway if it hadn't been preempted.
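To make the comparison concrete, here is a rough sketch that replays the two CFS timelines above through a simplified PELT-like geometric average. It is not the kernel code: the 1 ms per character scale, the starting value of 0 and the helper names (decay_time, util_at_second_wakeup) are assumptions made for illustration only. The only kernel fact relied on is that PELT halves a contribution every 32 ms.

```python
# Simplified PELT-like model for illustration only; not the kernel code.
HALF_LIFE_MS = 32                  # PELT halves a contribution every 32 ms
Y = 0.5 ** (1 / HALF_LIFE_MS)      # per-ms decay factor
MAX_UTIL = 1024

# CFS timelines copied from the two scenarios above.
# Assumption: each character stands for 1 ms.
SCENARIO_1 = "__RRRRRRRR________RRRRRRRR__"
SCENARIO_2 = "__RR####RRRRRR____RRRRRRRR__"


def second_wakeup(timeline):
    """Index at which the second activation starts (t2 or t6)."""
    return timeline.rindex("_R") + 1


def decay_time(timeline, freeze_when_preempted):
    """Non-running ms counted as decay between first run and second wake-up."""
    window = timeline[timeline.index("R"):second_wakeup(timeline)]
    decayed = window.count("_")
    if not freeze_when_preempted:
        decayed += window.count("#")   # current behavior: '#' decays too
    return decayed


def util_at_second_wakeup(timeline, freeze_when_preempted, util=0.0):
    """Replay the timeline up to the second wake-up and return util."""
    for state in timeline[:second_wakeup(timeline)]:
        if state == "#" and freeze_when_preempted:
            continue                   # proposal: signal frozen while preempted
        util *= Y                      # one ms of decay
        if state == "R":
            util += (1 - Y) * MAX_UTIL # running time adds a contribution
    return util


for name, timeline, freeze in [
    ("scenario 1", SCENARIO_1, False),
    ("scenario 2, current", SCENARIO_2, False),
    ("scenario 2, proposal", SCENARIO_2, True),
]:
    print(name,
          decay_time(timeline, freeze),
          round(util_at_second_wakeup(timeline, freeze), 1))
```

In this model the decay times come out as 8, 8 and 4 ms, matching (t2-t1), (t4-t3)+(t6-t5) and (t6-t5). The frozen variant also ends up with the largest util at the second wake-up, even though the task ran for the same 8 ms in every case.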
What I say above is true only if there is at least a little idle time in the system, I think.
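The case without idle time is the one from your test: a 100% CFS task has no sleep time in which it would have decayed anyway, so whatever decay happens during preemption is simply lost. A back-of-the-envelope sketch, assuming the task had fully ramped up to 1024 before the RT task ran and using the 32 ms PELT half-life (the function name is made up for this example):

```python
HALF_LIFE_MS = 32      # PELT halves a contribution every 32 ms
MAX_UTIL = 1024
PREEMPT_MS = 300       # RT run time used in the rt-app test


def util_after_preemption(util, preempt_ms):
    """util left after preempt_ms of not running (pure geometric decay)."""
    return util * 0.5 ** (preempt_ms / HALF_LIFE_MS)


# Assumption: the 100% task was at MAX_UTIL when it got preempted.
remaining = util_after_preemption(MAX_UTIL, PREEMPT_MS)
print(f"util_avg after {PREEMPT_MS} ms of preemption: {remaining:.1f}")
```

Under these assumptions the signal drops from 1024 to about 1.5, which would be consistent with the quick decay reported in the traces.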
Does that make any sense?
Thanks,
Quentin