Hi Quentin,
On Mon, Jun 11, 2018 at 09:40:04AM +0100, Quentin Perret wrote:
> Hi Joel,
>
> On Friday 08 Jun 2018 at 08:58:39 (-0700), Joel Fernandes wrote:
> > On Fri, Jun 08, 2018 at 08:54:57AM -0700, Joel Fernandes wrote:
> > > From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > >
> > > Here's a very rough patch just to discuss prevention of decay of a
> > > CPU/task's util_avg signal in case it's preempted by RT or DL. It's
> > > likely not correct and needs more work, but it solves the issue I
> > > see with my synthetic test.
> > >
> > > To reproduce the issue, I wrote a synthetic rt-app test with an RT
> > > task preempting a 100% CFS task for 300ms:
> > > https://pastebin.com/raw/rXNmRUZY
> > > I have seen in traces that the util_avg decays quickly, even before
> > > the RT task sleeps.
> > Just to clarify, the issue is:
> >
> > If a long-running CFS task is briefly preempted by RT, then by the
> > time the RT task finishes, the CFS task's util_avg will already have
> > crashed. IMO the correct behavior is that util_avg should not change,
> > and accounting should just continue from where we left off.
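To put a number on "crashed": here is a quick standalone sketch (not
kernel code; it just assumes the stock PELT half-life of 32 periods,
i.e. y^32 = 0.5, ~1ms periods, and a task that had built util_avg up
to ~1024 before being preempted) of how far the signal falls while it
keeps decaying across a 300ms preemption:

/*
 * Standalone sketch, NOT kernel code: how far does util_avg fall if
 * PELT keeps decaying it across a 300ms preemption?  Assumes the stock
 * PELT half-life of 32 periods (y^32 = 0.5), ~1ms periods, and a task
 * that had built up util_avg ~= 1024 (100%) before being preempted.
 */
#include <math.h>
#include <stdio.h>

int main(void)
{
	double y = pow(0.5, 1.0 / 32.0);	/* per-period decay factor */
	double util = 1024.0;			/* util before preemption */

	for (int ms = 1; ms <= 300; ms++) {
		util *= y;
		if (ms % 50 == 0)
			printf("preempted for %3d ms: util_avg ~= %6.1f\n",
			       ms, util);
	}
	return 0;
}

By 300ms the value is down to roughly 1-2 out of 1024, i.e. effectively
zero, which is consistent with the quick decay I see in the traces.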
> Yeah, so this idea looks a little bit like Patrick's proposal from one
> week ago or so. I think there is an issue that was mentioned with an
I don't think this patch is similar to that one. My patch only affects
CFS tasks being preempted by a higher class; for CFS tasks preempting
other CFS tasks, the intention is that the behavior remains the same.
> example like this:
>
> Only 1 CPU in the system, 1 CFS task (50%), 1 sporadic RT task
> _: sleeping
> R: running
> #: preempted
>
> Scenario 1:
> RT:  ____________RRRR____________
> CFS: __RRRRRRRR________RRRRRRRR__
>               |        |
>               t1       t2
>
> Scenario 2:
> RT:  ____RRRR____________________
> CFS: __RR####RRRRRR____RRRRRRRR__
>          |   |        |  |
>          t3  t4       t5 t6
> So in scenario 1, (t2-t1) is the time during which the CFS task decays.
> In scenario 2, IIUC, what you propose is to remove (t4-t3) from the
> decay time, so the task is decayed only for (t6-t5). So even if you
> don't increase the util_avg between t3 and t4, the task will look
> bigger than it really is, because you decay it less.
I don't see how. The task shouldn't be decayed between t3 and t4 unless
it went to sleep. Say the RT task was not running: the CFS task's
utilization should look the same as if the RT task WAS running. Just
because the RT task ran, that shouldn't change the utilization of the
CFS task. So scenario 2 should decay the task only between t5 and t6,
whether the RT task ran or not. Between t3 and t4, my patch attempts to
not change the signal at all for the CFS task, i.e. to hit the pause
button on it. And scenario 1 isn't affected by my patch.
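To make that concrete, here is a standalone toy simulation of the two
behaviors over the scenario-2 timeline above. Again, this is NOT kernel
code and only roughly approximates PELT: one update per ~1ms period,
y^32 = 0.5, running contribution scaled to 1024:

/*
 * Toy simulation, NOT kernel code: compares today's behaviour (decay
 * while preempted) against the "pause" idea over the scenario-2
 * timeline.  One update per period, y^32 = 0.5, running contribution
 * scaled to 1024; a rough approximation of PELT, nothing more.
 */
#include <math.h>
#include <stdio.h>

static double step(double util, char state, int pause_when_preempted)
{
	double y = pow(0.5, 1.0 / 32.0);	/* per-period decay */

	switch (state) {
	case 'R':				/* running: decay + accrue */
		return util * y + (1.0 - y) * 1024.0;
	case '#':				/* preempted by RT */
		if (pause_when_preempted)
			return util;		/* hold the signal */
		/* fall through: today this decays like sleep */
	default:				/* '_': sleeping, decay only */
		return util * y;
	}
}

int main(void)
{
	const char *cfs = "__RR####RRRRRR____RRRRRRRR__";
	double cur = 512.0, paused = 512.0;	/* arbitrary starting util */

	for (const char *s = cfs; *s; s++) {
		cur = step(cur, *s, 0);
		paused = step(paused, *s, 1);
	}
	printf("end of window: today ~= %.1f, paused ~= %.1f\n", cur, paused);
	return 0;
}

The point is just that with the pause, the value at t4 equals the value
at t3, so the signal resumes where it left off instead of having to be
rebuilt.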
> With the current implementation, you would decay the task for (t4-t3)
> and (t6-t5), which is correct for this 50% task. The fact that the
> util_avg of the task decays during (t4-t3) doesn't matter, since you
> would have decayed the task by that much at the end of its execution
> anyway if it hadn't been preempted.
>
> What I say above is true only if there is at least a little idle time
> in the system, I think.
>
> Does that make any sense?
Sorry, no :(

I think your example is different from Vincent's. Vincent was talking
about some of the "sleep time" of a task actually being counted as
"preempted time", which would prevent its decay and make the task look
big. Basically, if a task is sleeping, then it has to be decayed. The
patch I wrote attempts to ensure exactly that: if the task is sleeping,
then it will be decayed. Can you show a case where this invariant isn't
satisfied by the patch?
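For what it's worth, here is the distinction the invariant relies on,
as a standalone toy sketch. The field names are illustrative, not the
kernel's, and a real check would also have to confirm the task was
displaced by a higher class (RT/DL), which this toy omits:

/* Toy sketch, NOT kernel code: sleeping vs. preempted. */
#include <stdbool.h>
#include <stdio.h>

struct toy_task {
	bool on_rq;	/* still enqueued (runnable)? */
	bool running;	/* currently on the CPU? */
};

/* Sleeping: dequeued entirely; this is the case that must decay. */
static bool should_decay(const struct toy_task *t)
{
	return !t->on_rq;
}

/* Preempted: still runnable but pushed off the CPU; pause instead. */
static bool should_pause(const struct toy_task *t)
{
	return t->on_rq && !t->running;
}

int main(void)
{
	struct toy_task sleeping  = { .on_rq = false, .running = false };
	struct toy_task preempted = { .on_rq = true,  .running = false };

	printf("sleeping:  decay=%d pause=%d\n",
	       should_decay(&sleeping), should_pause(&sleeping));
	printf("preempted: decay=%d pause=%d\n",
	       should_decay(&preempted), should_pause(&preempted));
	return 0;
}

So a dequeued task always decays, and only a still-runnable task that
lost the CPU gets paused, which is why the sleep-time accounting should
be unaffected.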
thanks!
- Joel