Re: [Linaro-acpi] [RFC 0/3] Experimental patchset for CPPC

15 Aug 2014

Hello,
On 15 August 2014 11:47, Arjan van de Ven arjan@linux.intel.com wrote:
...
On 8/15/2014 7:24 AM, Ashwin Chaugule wrote:
...
...
...
we've found so far that there are two reasonable options:

Let the OS decide (old style)

Let the hardware decide (new style)

The second is there in practice today in the turbo range (which is
increasingly the whole thing), and the hardware can make decisions about
power budgeting on a timescale the OS can never even dream of, so once you
give control to the hardware (with CPPC or native) it's normally better to
just get out of the way as the OS.
Interesting. This sounds like X86 plans to use the Autonomous bits
that got added to the CPPC spec (v5.1)?
if and when x86/Intel implement that, we will certainly evaluate it to see
how it behaves... but based on today's use of the hw control of the actual
p-state... I would expect that evaluation to pass.
note that on today's multi-core x96 systems, in practice you operate mostly
in the turbo range (I am ignoring mostly-idle workloads since there the
p-state isn't nearly as relevant anyway); all it takes is for one of the
cores to request a turbo-range state, and the whole chip operates in turbo
mode... and in turbo mode the hardware already picks the frequency/voltage.
x96 - Wonder what that has! ;)
So this, I think, brings back my point about Freq domain awareness (or
the lack of it) in today's governors. On X86, it seems as though the h/w
can take care of "Freq voting rights" among CPUs, and it knows to ignore a
request after the requestor goes to sleep. That way the other CPUs in
the domain don't unnecessarily operate at a higher freq/voltage and
their vote can become current. Also, on X86, are all CPUs assumed to
have the same min and max operating points?
This may not be true on ARM (or others). So if the h/w isn't capable of
automatically updating freq/voltage for a domain, then the OS needs to
provide that. And I think we can achieve that through knowledge of the
system topology and a centralized CPU policy governor for each
domain. If each CPU in the domain is capable of making decisions on
behalf of everyone in that domain, then we can at least get past the
problem of "stale CPU freq votes" (replace freq with performance in
CPPC terms).
E.g., to make my point clear, assume there are 3 CPUs in the system.
C0 and C1 are in one domain and C2 is in another.
If C0 asks for 3 GHz and C1 asks for 1 GHz, the h/w delivers 3 GHz. But
now C0 goes to sleep. With today's governors, we don't reevaluate, and
so C1 continues to get 3 GHz even though it doesn't need it. Maybe X86
can figure out that C0 is asleep and that it should now deliver 1 GHz,
but ARM does not have that AFAIK. So we need the governor to
reevaluate between C0 and C1 (preferably through aperf/mperf-like
ratios, rather than the broken p-state assumptions) and send a new
request asking for 1 GHz.
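
A minimal sketch of that re-evaluation, in Python for brevity. FreqDomain,
request() and sleep() are hypothetical names made up for this mail, not an
existing cpufreq or CPPC interface, and the MHz numbers are just the ones
from the example above. The assumption is that the governor keeps one vote
per awake CPU in a domain and asks the platform for the highest live vote:

```python
from dataclasses import dataclass, field


@dataclass
class FreqDomain:
    """Hypothetical per-domain governor state (illustration only)."""
    cpus: set                                   # CPUs sharing one freq/voltage domain
    votes: dict = field(default_factory=dict)   # awake cpu -> requested perf (MHz here)

    def request(self, cpu, perf):
        """Record a CPU's vote and return the new domain-wide target."""
        if cpu not in self.cpus:
            raise ValueError(f"{cpu} is not in this domain")
        self.votes[cpu] = perf
        return self.target()

    def sleep(self, cpu):
        """Drop the sleeping CPU's vote so it cannot go stale, then re-evaluate."""
        self.votes.pop(cpu, None)
        return self.target()

    def target(self):
        """Highest live vote in the domain, or None if every CPU is asleep."""
        return max(self.votes.values(), default=None)


# The C0/C1 case from above; C2 would sit in a FreqDomain of its own.
d01 = FreqDomain({"C0", "C1"})
d01.request("C0", 3000)   # domain target is 3000
d01.request("C1", 1000)   # still 3000, C0's vote wins
d01.sleep("C0")           # C0's vote is dropped, target falls to 1000
```

Any CPU in the domain can call sleep() or request() and get the same
answer, which is the "decide on behalf of everyone" part. A real governor
would feed request() from measured ratios rather than fixed numbers.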
...
with the current (and more so, past) Linux behavior, even at moderate loads
you end up there; the more cores you have the more true that becomes.
...
I agree that the platform can
make decisions on a much finer timescale. But even in the
non-Autonomous mode, by providing the bounds around a Desired Value,
the OS can get out of the way knowing that the platform would deliver
something in the range it requested. If the OS can provide bounds, it
seems to me that the platform can make better decisions, rather
than trying to guess what's running (or not).
I highly question that the OS can provide intelligent bounds.
Agreed. This is a challenging problem. Hence the wider discussion.
...
When are you going to request an upper bound that is lower than maximum?
(don't say thermals, there are other mechanisms for controlling thermals
that work much better than direct P state control). Are you still going to
do that even if sometimes lower frequencies end up costing more battery?
(race-to-halt and all that)
Maybe the answer is that in the short term, we always request MAX
in the (Max, Min, Desired) tuple. Although I suspect some platforms
will still use P state controls for thermal mitigation.
...
I can see cases where you bump the minimum for QoS reasons, but even there I
would dare to say that at best the OS will be doing wild-ass guesses.
Right. I see Min being used for QoS too.
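
To make the tuple concrete, a rough sketch in Python. PerfRequest,
short_term_request() and clamp_to_request() are hypothetical names, not
kernel or ACPI interfaces; the values are on an abstract performance scale,
and "platform_pick" stands for whatever the h/w would have chosen by itself.
The only assumptions are that the platform stays within [Min, Max] and
treats Desired as a hint:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfRequest:
    """Hypothetical (Min, Desired, Max) request on an abstract perf scale."""
    minimum: int
    desired: int
    maximum: int

    def __post_init__(self):
        if not self.minimum <= self.desired <= self.maximum:
            raise ValueError("need minimum <= desired <= maximum")


def short_term_request(lowest, highest, desired, qos_floor=None):
    """Always ask for the platform's highest as Max; raise Min only for QoS."""
    floor = lowest if qos_floor is None else max(lowest, qos_floor)
    floor = min(floor, highest)                   # a QoS floor cannot exceed Max
    desired = min(max(desired, floor), highest)   # keep Desired inside the bounds
    return PerfRequest(minimum=floor, desired=desired, maximum=highest)


def clamp_to_request(req, platform_pick):
    """What the platform may deliver: its own pick, kept within the OS bounds."""
    return min(max(platform_pick, req.minimum), req.maximum)
```

With Max pinned to the highest level, the only knob the OS really turns here
is Min, which matches the QoS use above; everything between the bounds is
left to the platform.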
Cheers,
Ashwin
