Hi Mike,
On 06/05/15 01:58, Mike Turquette wrote:
On Tue, May 5, 2015 at 3:12 AM, Juri Lelli <juri.lelli@arm.com> wrote:
Hi Ashwin,
On 04/05/15 14:41, Ashwin Chaugule wrote:
Hi Juri,
On 29 April 2015 at 05:39, Juri Lelli <juri.lelli@arm.com> wrote:
On 29/04/15 09:32, Michael Turquette wrote:
Quoting Juri Lelli (2015-04-28 10:48:27)
Hi Mike,
I apologize in advance for the long email, but I'd still like to share today's thoughts with you :).
On 28/04/15 05:02, Michael Turquette wrote:
> Quoting Juri Lelli (2015-04-27 10:09:50)

[snip]

>>> +
>>> +       wake_up_process(gd->task);
>>
>> So, we always wake up the kthread, even when we know that we won't
>> need a freq change. This might be, I fear, an almost certain source of
>> reasonable complain and pushback. I understand that we might not want
>> to start optimizing things, but IMHO this point deserves some more
>> thought before posting. Don't you think we could do some level of
>> aggregation before kicking the kthread? In task_tick_fair(), for
>> example, we could just check if we are beyond the 25% threshold and kick
>> the kthread only in that case.
>
> This patch does not check against a threshold. It always requests a rate
> based on the current utilization plus 25%.
>
> On systems with discretized cpu frequencies (opps) we will often target
> the same opp, occasionally crossing the boundary into another opp. On
> systems with continuous cpu frequencies we will continually give
> ourselves "room to grow".
>
Can you give an example of such systems?
CPPC-based systems.
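To make the "current utilization plus 25%" request from the quoted text a bit more concrete, here is a rough sketch of how such a request might map onto a discrete OPP table. The function names, the capacity scale of 1024 and the OPP values are assumptions made for illustration only; none of them are taken from the patch.

```python
from bisect import bisect_left

# Hypothetical OPP table in kHz, sorted ascending (not from any real platform).
OPPS_KHZ = [500_000, 800_000, 1_100_000, 1_400_000]


def requested_freq(util, max_capacity, max_freq_khz):
    """Frequency needed to serve `util` with 25% headroom, capped at max."""
    target = util + util // 4  # utilization plus 25%, integer arithmetic
    return min(target * max_freq_khz // max_capacity, max_freq_khz)


def pick_opp(freq_khz, opps_khz):
    """Lowest OPP that satisfies the request (top OPP if none does)."""
    idx = bisect_left(opps_khz, freq_khz)
    return opps_khz[min(idx, len(opps_khz) - 1)]


def opp_for_util(util, max_capacity=1024, opps_khz=OPPS_KHZ):
    """Convenience wrapper: utilization in, OPP out."""
    return pick_opp(requested_freq(util, max_capacity, opps_khz[-1]), opps_khz)
```

With this table, nearby utilization values such as 400 and 450 (out of 1024) resolve to the same OPP, while 480 crosses into the next one. On a continuous system there is no pick_opp() step, so every change in utilization changes the request.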
I thought a lot about all of the feedback that my v1 patchset got last week on eas-dev. Two comments in particular colored my views on supporting continuous frequency bands and not relying on a threshold.
First is Ashwin's comment here: https://lists.linaro.org/pipermail/eas-dev/2015-April/000093.html
Second is Morten's reply here: https://lists.linaro.org/pipermail/eas-dev/2015-April/000094.html
If we decide that we only care about opps then it is easy to create a threshold for the opp "bucket" that we are currently in. But in a continuous system creating a threshold is more difficult. E.g. if we decide to use an 80% threshold for a continuous system, we can easily determine if our current utilization exceeds this threshold at our current capacity/frequency. But what is the new frequency target? Without a table to guide us we have to just make something up!
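A small sketch of the difference, under assumed names and an assumed capacity scale of 1024: the threshold check itself is the same in both cases, but only the table-based system has an obvious next target. The continuous rule below is one possible "made up" choice (pick the smallest capacity that puts utilization back at or under the threshold) and is not something proposed in the patches.

```python
def over_threshold(util, cur_capacity, threshold_pct=80):
    """True if utilization exceeds threshold_pct of the current capacity."""
    return util * 100 > cur_capacity * threshold_pct


def next_table_capacity(cur_capacity, opp_capacities):
    """With a table: step up to the next larger entry (or stay at the top)."""
    for cap in opp_capacities:
        if cap > cur_capacity:
            return cap
    return opp_capacities[-1]


def continuous_target(util, threshold_pct=80, max_capacity=1024):
    """Without a table (assumed rule): smallest capacity at which `util`
    no longer exceeds the threshold, using ceiling division."""
    needed = -(-util * 100 // threshold_pct)
    return min(needed, max_capacity)
```

As a side note on the arithmetic, dividing by 0.8 is the same as multiplying by 1.25, so this particular rule ends up requesting roughly "utilization plus 25%".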
Right, but I'm still not sure that we want to continuously adapt to the current usage (plus the margin), as we might introduce too much overhead. Also, is it really worth it when we have to activate all this just to save a little more power or go a little faster? This is really blue sky, but maybe a trade-off would be to try to discretize such systems (if it makes sense to control them from the scheduler). Yes, we already have an activation threshold, but I'm not sure this is enough.
IIUC, the optimization you're getting at is to suppress CPU freq requests when they fall within some range of the current OPP? I think this may hamper certain latency-sensitive workloads, since the freq ramp-up could potentially be slowed down. So, there's some merit in making the request path as quick as possible and allowing for continuous adaptation. I need to look at your patches in more detail, but from eyeballing them it seems like you're trying to achieve that.
So, the energy model (and please mind that the patches on top of Mike's patchset don't have that yet) currently gives you these "capacity bands". The idea is to try to adapt the OPP selection to the usage you see on your CPU/cluster. Since the usage signal is subject to saturation, what I'm trying to do is to avoid this condition by jumping up to the max available OPP when we realize that we are going to saturate a particular OPP. After we run for a small interval of time (say a tick) at that max OPP we can better estimate the real usage and directly select an OPP ("capacity band") that suits it.
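The "jump to max, then settle" idea described above could be sketched roughly as follows. The band values, the 95% saturation margin and all names are assumptions chosen for illustration; they do not come from the energy model or from the patches.

```python
# Hypothetical capacity bands, ascending (not from any real energy model).
BANDS = [256, 512, 768, 1024]


def band_for(usage, bands, sat_pct=95):
    """Smallest band that `usage` does not (nearly) saturate."""
    for cap in bands:
        if usage * 100 < cap * sat_pct:
            return cap
    return bands[-1]


def next_capacity(observed_usage, cur_cap, bands=BANDS, sat_pct=95):
    """Pick the capacity to run at for the next interval."""
    max_cap = bands[-1]
    if cur_cap < max_cap and observed_usage * 100 >= cur_cap * sat_pct:
        # The signal is saturating at this band, so the real demand is
        # unknown: go to max and measure again from there.
        return max_cap
    return band_for(observed_usage, bands, sat_pct)


def observe(demand, cur_cap):
    """The usage signal cannot exceed the capacity it was measured at."""
    return min(demand, cur_cap)
```

For a task whose real demand is 600 while running at capacity 256, the observed usage clamps at 256, so the first decision is the max band; after one interval at max the usage reads 600 and the 768 band is selected directly.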
I'm not sure about jumping to the max frequency when we detect that the signal is saturated.
Ondemand has similar behavior to this and many vendors have implemented out-of-tree solutions that do something like setting the frequency to an "intermediate" rate (maybe 2/3 of the total performance band) and then re-evaluate if they need to jump to max performance after another sampling period.
So at some point you might face the same issue, where vendors find this approach too aggressive and too wasteful of power, and thus some intermediate level will be introduced. I'm not providing you any solutions here, but I'm saying that designing a policy algorithm that works well for everyone is super hard.
No doubt about this :).
I get your point, but I guess it should be fairly easy to make the freq we jump to somewhat "configurable". That makes sense to me, considering the variety of shapes power-perf curves can have, for example.
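As a sketch of what a "configurable" jump frequency might look like, the snippet below goes to an intermediate rate first and only to max if the signal is still saturated at the next sampling period. Interpreting "2/3 of the performance band" as two thirds of the way between min and max is an assumption, as are all names and the kHz values in the usage note.

```python
def intermediate_freq(min_khz, max_khz, num=2, den=3):
    """Frequency num/den of the way through the band (2/3 by default)."""
    return min_khz + (max_khz - min_khz) * num // den


def next_freq(cur_khz, saturated, min_khz, max_khz, num=2, den=3):
    """Two-stage ramp: intermediate rate first, max only if still saturated."""
    if not saturated:
        return cur_khz
    mid_khz = intermediate_freq(min_khz, max_khz, num, den)
    if cur_khz < mid_khz:
        return mid_khz
    return max_khz
```

With a 500000 to 1400000 kHz band, the first saturated sample moves to 1100000 kHz and a second consecutive one moves to 1400000 kHz. Setting num=1, den=1 gives the jump-straight-to-max behaviour.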
I see your point, though. I think the two approaches differ in how we get to the desired capacity: ramping up from the bottom vs. selecting from the top.
From the energy model perspective, can a continuous performance band be supported at all or is it a hard requirement to have a discretized table?
I don't think it's a hard requirement (Morten or Dietmar may correct me here), but just an abstraction of the systems we develop on today. I guess we would need to compute some formulas at run time, instead of reading tabular values, if we want to have continuous performance bands. Food for thought :).
We could also tablify continuous frequency domains based on some reasonable factor like 50 MHz or something. I guess that factor could even be supplied by the driver.
Agreed. That's what I meant by "discretize continuous systems".
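For what it's worth, "tablifying" a continuous domain could be as simple as the sketch below. The function name, the kHz units and the default step are assumptions; in practice the step would presumably come from the driver, as suggested above.

```python
def tablify(min_khz, max_khz, step_khz=50_000):
    """Build a discrete frequency table over [min_khz, max_khz].

    Entries are spaced step_khz apart starting from min_khz; max_khz is
    always included as the last entry, even if it is not on a step boundary.
    """
    if step_khz <= 0 or min_khz > max_khz:
        raise ValueError("invalid frequency range or step")
    table = list(range(min_khz, max_khz, step_khz))
    table.append(max_khz)
    return table
```

The resulting list can then be treated like any other OPP table, so the bucket-style threshold logic discussed earlier in the thread would apply unchanged.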
Best,
- Juri