On Fri, Apr 10, 2015 at 03:43:45PM +0100, Vincent Guittot wrote:
On 10 April 2015 at 16:14, Morten Rasmussen morten.rasmussen@arm.com wrote:
On Fri, Apr 10, 2015 at 02:51:18PM +0100, Vincent Guittot wrote:
On 10 April 2015 at 15:36, Morten Rasmussen morten.rasmussen@arm.com wrote:
On Wed, Apr 08, 2015 at 05:47:33PM +0100, Morten Rasmussen wrote:
On Wed, Apr 08, 2015 at 04:16:15PM +0100, Mike Turquette wrote:
On Wed, Apr 8, 2015 at 3:38 AM, Morten Rasmussen morten.rasmussen@arm.com wrote: > Hi Mike, > > On Tue, Apr 07, 2015 at 09:53:42PM +0100, Mike Turquette wrote:
[...]
>> #13 "cpufreq: Architecture specific callback for frequency changes" >> >> Vincent's fix for setting current freq (should be rolled into #13 above): >> https://git.linaro.org/people/vincent.guittot/kernel.git/commitdiff/3ac2b6a0... >> >> #14 "arm: Frequency invariant scheduler load-tracking support" > > Yes, these two are the essential ones I think. Since we no longer use a > __weak function to implement arch_scale_freq_capacity() they will have > to be modified a bit. I should be able to that today and provide you > with new versions of those two.
Great!
I have pushed two updated patches to the usual repo based on a fairly recent snapshot of tip/sched/core. I rolled in an optimization as well so the code looks slightly different now. I did a quick test on TC2 and it appears to be working.
Please let me know if it doesn't :)
http://www.linux-arm.org/git?p=linux-power.git%3Ba=shortlog%3Bh=refs/heads/w...
I looked into the issue with setting the current frequency at boot time where I included Vincent's fix in the branch above. It turns out that it isn't necessary with the new implementation of arch_scale_freq_capacity() as it will never return 0. If current frequency hasn't been reported by cpufreq (setting the default governor to userspace is the easiest test case) asfc() will return SCHED_CAPACITY_SCALE until the first frequency change occur. That should be fine I think.
One could argue that we are getting the scaling wrong until the first frequency change which should happen fairly soon after boot for any real
This can even never happen according to the governor you use
system. If the frequency doesn't change, say stays at 80% (like it does for TC2 with userspace as default), there isn't much reason to track frequency changes in the first place.
With current implementation, if the freq of a cpu is not set by cpufreq, arch_scale_freq_capacity will stay with default value whatever the real freq. If a cluster boots with powersave governor and the current freq is already min freq for this cluster, arch_scale_freq_capacity will return max capacity instead of min
All agreed. But do you think it is a problem?
If you run at a fixed frequency (like the performance and powersave, and even userspace in case the user does requrest any chances), does it really matter that arch_scale_freq_capacity is returning max capacity? It will artificially inflate the utilization/usage tracking, but since we aren't changing frequency the scaling is the same for everyone.
IHMO, it's matter because the utilization/usage and the weighted_load sooner or later will not track the right value which will disturb the load balancing.
One could of course come up with scenarios where you choose the powersave governor as default on big.LITTLE-like systems where one cluster would have a very wide frequency range and the other didn't. In that case your scaling could be far off the actual capacity. I'm not sure if it worth fixing that one. It shouldn't break anything.
i agree that the system will not crash but it will change the scheduler behavior which will use inaccurate figures. Then, it's not break anything for now but we don't know what could be the future use of this value so we should fix it now that we have noticed it instead of waiting more visible damage linked to a new use of the value
I will put it back in then :)
However, it doesn't mean that we are all good for all cpufreq drivers. AFAICT, drivers are not required to export a 'get' function which means that set policy->cur might not get set. Also, the drivers using 'set_policy' instead of target, might not fire the change notifiers. So there are plenty of other cases to worry about that are worse I think.
Morten