On Fri, 10 May 2013, Leo Yan wrote:
On 05/08/2013 11:40 PM, Nicolas Pitre wrote:
On Wed, 8 May 2013, Leo Yan wrote:
- When the outbound core wakes up the inbound core, the outbound core's thread
will sleep until the inbound core uses MCPM's early poke to send an IPI;
a) It looks like this method is mainly due to the TC2 board having a long latency to power the cluster and core on/off; right? How about using a polling method instead? On our own SoC, the wakeup interval takes _only_ about 10 ~ 20us.
There is no need to poll anything. If your SOC is fast enough in all cases, then the outbound may simply go ahead and let the inbound resume with the saved context whenever it is ready.
Yes, I went through the code and this should be fine; let's keep the current simple code.
There is a corner case here: the outbound core sets the power controller's register for power down, then flushes its L1 cache; if it is the last man of the cluster it needs to flush the L2 cache as well, so this operation may take a long time (about 2ms for a 512KB L2 cache).
In the meantime, the inbound core is running concurrently and may trigger another switch and call *mcpm_cpu_power_up()* to set some power controller registers for the outbound core; so when the outbound core finally calls "WFI", it cannot really be powered off by the power controller. So the polling here is the inbound core waiting until the outbound core is really powered off.
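The polling Leo describes could look roughly like this. This is a minimal userspace sketch, not kernel code: the "register" is a plain variable and both helper names are made up, just to show the inbound core spinning until the power controller reports the outbound core as down before issuing a new power-up:

```c
#include <stdbool.h>

/* Hypothetical power-controller status: on real hardware this would be a
 * memory-mapped register read; here it is a plain variable so the sketch
 * is self-contained. */
static volatile bool outbound_powered_off;

static bool power_controller_core_is_down(void)
{
	return outbound_powered_off;
}

/* Inbound core: spin until the power controller confirms the outbound
 * core is really off, so a new power-up request cannot race with the
 * still-pending power-down described above. */
static void wait_for_outbound_powerdown(void)
{
	while (!power_controller_core_is_down())
		; /* cpu_relax() in real kernel code */
}
```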
That should be handled with the code in mcpm_head.S that waits for the CLUSTER_GOING_DOWN state to go away.
What your MCPM backend can do as an optimization is to check the inbound cluster state once in a while during its L2 flush procedure, and abort the flush halfway when it sees INBOUND_COMING_UP.
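The abort-halfway optimization Nicolas suggests could be sketched like this. Everything here is illustrative (the state values, the chunked flush, the function names are all assumptions): the point is only that the slow set/way L2 flush checks the inbound cluster state periodically and bails out when it sees INBOUND_COMING_UP, since the cluster will not be powered off after all:

```c
#include <stdbool.h>

/* Illustrative subset of MCPM-style cluster states. */
enum cluster_state { CLUSTER_DOWN, CLUSTER_UP, INBOUND_COMING_UP };

/* Hypothetical: a real backend would read the MCPM cluster state. */
static volatile enum cluster_state inbound_state = CLUSTER_DOWN;

/* Flush the L2 in chunks, polling the inbound cluster state once per
 * chunk; abort early if another switch has started, because finishing
 * the (~2ms) flush would be wasted work.  Returns the number of chunks
 * actually flushed. */
static int flush_l2_with_abort(int nr_chunks)
{
	int chunk;

	for (chunk = 0; chunk < nr_chunks; chunk++) {
		if (inbound_state == INBOUND_COMING_UP)
			return chunk;	/* aborted early */
		/* flush_one_l2_chunk(chunk); -- set/way ops elided */
	}
	return nr_chunks;		/* flushed everything */
}
```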
Even if the outbound core has not been powered off, this will not introduce any issue, because if the outbound core is woken up from the "WFI" state, it will run the software reset sequence.
There is ONLY one thing to confirm: that the state machine of the SoC's power controller is not disturbed at all by the corner case above. :-)
Indeed. This is not trivial to get everything right.
The switcher is _just_ a specialized CPU hotplug agent with a special side effect. What it does is to tell the MCPM layer to shut CPU x down, power up CPU y, etc. It happens that cpuidle may be doing the same thing in parallel, and so does the classical CPU hotplug.
So you must add your voltage policy into the MCPM backend for your platform instead, _irrespective_ of the switcher presence.
MCPM is a basic framework for cpuidle/IKS/hotplug; all low power modes should go through MCPM's general APIs, so it makes sense to add the related code to the MCPM backend.
Let's look at another scenario. At the beginning, logical core 0 is running on the A7 core. If the profiling governor (such as a cpufreq governor) decides the performance is not high enough, it will call *bL_switch_request()* to switch to the A15 core, and *bL_switch_request()* will return immediately; but from this point the governor assumes it is already running on the A15, so it will do its profiling based on the A15's frequency while it is actually still running on the A7. So the switcher's async operation may mislead the governor.
What do you think about this?
This is indeed the reason why a switch completion callback facility was recently added: to notify cpufreq governors that the operation has completed. The cpufreq layer has pre and post frequency change callbacks and obviously the post callback should be invoked only when the switch is complete.
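The completion-callback idea can be modeled in a few lines. This is a hypothetical single-threaded sketch, not the actual switcher API: the requester returns immediately, and the cpufreq "post change" notification only fires from the callback once the switch has really completed, which is exactly what fixes the misprofiling Leo describes:

```c
#include <stddef.h>

/* Hypothetical async switch with a completion callback. */
typedef void (*switch_done_cb)(void *arg);

static switch_done_cb pending_cb;
static void *pending_arg;
static int post_change_notified;

/* Queue the switch request and return right away; the actual switch
 * happens later on the switcher thread. */
static void switch_request_sketch(switch_done_cb cb, void *arg)
{
	pending_cb = cb;
	pending_arg = arg;
}

/* Called by the switcher when the hand-over is complete. */
static void switch_complete_sketch(void)
{
	if (pending_cb)
		pending_cb(pending_arg);
	pending_cb = NULL;
}

/* Stand-in for the cpufreq post-change notification. */
static void cpufreq_post_notify(void *arg)
{
	(void)arg;
	post_change_notified = 1;
}
```

The governor only starts profiling against the new cluster's frequency after `cpufreq_post_notify()` runs, so the window where it misjudges which core it is on disappears.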
- After the switcher is enabled, it disables hotplug.
Actually, the current code can support hotplug with IKS, because with IKS the logical core is mapped to the corresponding physical core id and GIC interface id; so the system can track which physical core has been hot-unplugged, and later the kernel can hotplug that physical core back in. So could you give more hints on why IKS needs to disable hotplug?
The problem is to decide what the semantic of a hotplug request would be.
Let's use an example. The system boots with CPUs 0,1,2,3. When the switcher initializes, it itself hot-unplugs CPUs 2,3 and only CPUs 0,1 remain. Of course physical CPUs 2 and 3 are used when a switch happens, but even if physical CPU 0 is switched to physical CPU 2, the logical CPU number as far as Linux is concerned remains CPU 0. So even if CPU 2 is running, Linux thinks this is still CPU 0.
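The example above boils down to a small mapping table. This sketch is illustrative only (the pairing rule "logical n <-> physical n and n+2" is taken from the 2+2 example, not from the real switcher code): the physical CPU backing a logical CPU flips on each switch, while the logical number Linux sees never changes:

```c
/* 2+2 system: Linux sees logical CPUs 0 and 1, each backed by one
 * A7/A15 physical pair.  Boots with physical CPUs 0 and 1 active. */
static int logical_to_phys[2] = { 0, 1 };

/* A switch swaps a logical CPU onto the other member of its pair
 * (hypothetical pairing: logical n <-> physical n and physical n+2). */
static void switch_cluster(int logical_cpu)
{
	logical_to_phys[logical_cpu] ^= 2;
}
```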
Now, when the switcher is active, we must forbid any hotplugging of logical CPUs 2 and 3, or the semantic of the switcher would be broken. So that means keeping track of CPUs that can and cannot be hotplugged, and that lack of uniformity is likely to cause confusion in user space already.
But if you really want to hot-unplug CPU 0, this might correspond to either physical CPU0 or physical CPU2. What physical CPU should be brought back in when a hotplug request comes in?
At the functionality level: whichever physical CPU was hot-unplugged is the one that should be brought back.
That might be easy to implement by doing a slight tweaking of the request vetoing performed by bL_switcher_hotplug_callback().
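The shape of that vetoing could look like the following. This is a sketch of the *kind* of check involved, not the actual bL_switcher_hotplug_callback() code (the names and the nr_logical constant are assumptions): with the switcher active, only the logical CPUs the switcher exposes may be hotplugged, and requests against the hidden half of each pair are refused:

```c
#include <stdbool.h>

/* Stand-ins for the kernel's notifier return codes. */
#define NOTIFY_OK   0
#define NOTIFY_BAD  1

static bool switcher_active = true;
static const int nr_logical = 2;	/* 2+2 example: CPUs 0,1 visible */

/* Veto hotplug of any CPU the switcher has hidden; once the switcher
 * is disabled, all CPUs become hotpluggable again. */
static int hotplug_callback_sketch(int cpu)
{
	if (switcher_active && cpu >= nr_logical)
		return NOTIFY_BAD;	/* CPU belongs to the switcher */
	return NOTIFY_OK;
}
```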
And if the switcher is disabled after hot-unplugging CPU 0 while it was still active, should both physical CPUs 0 and 2 be left disabled, or should logical CPU 2 be brought back online nevertheless?
This is a hard decision for dynamically enabling/disabling IKS. Maybe when disabling IKS we need to go back to the state before IKS was enabled: hotplug all the cores back in, and then disable IKS.
Yes, but that doesn't look pretty. Hence I want to be convinced of the value of hotplug with the switcher active before making such compromises.
So please tell me: why do you want CPU hotplug in combination with the switcher in the first place? Using hotplug for power management is already a bad idea to start with given the cost and overhead associated with it. The switcher does perform CPU hotplugging behind the scene but it avoids all the extra costs from a hotplug operation of a logical CPU in the core kernel.
Sometimes the customer has strict power requirements for a phone. We found we can get some benefit from hotplug/hot-unplug when the system has a low load; the basic reason is that we can reduce the number of times a core enters/exits low power modes.
If the system has very few tasks to run but there is more than one task on a core's runqueue, the kernel will send an IPI to another core to reschedule and run the thread; if the thread has a very low workload, most of the time is spent entering/exiting the low power mode rather than on the actual task, so hot-unplugging is the better choice.
Can't you use cgroups for this instead?
The per-core timer has the same issue: if the core is powered off, its local timer cannot be used anymore, so the kernel needs the broadcast timer to wake up the core so it can handle the timer event.
Patches are being pushed forward by Viresh Kumar to prevent work queues from waking up idle CPUs. I didn't look into the details myself, but I'm sure the timer events are in the same boat.
So if we hot-unplug the cores, we can avoid many of these IPIs, and finally we can get some power benefit when the system has a low workload.
Let's use TC2 as the example to describe the hotplug implementation: logical cpu 0 has the virtual frequency points 175MHz/200MHz/250MHz/300MHz/350MHz/400MHz/450MHz/500MHz for the A7 core, and 600MHz/700MHz/800MHz/900MHz/1000MHz/1100MHz/1200MHz for the A15 core;
When the system can meet the performance requirement, cpufreq will first ask IKS to switch to the A7 core; if the core then runs at <= the virtual frequency 200MHz, the system can hot-unplug the core. If the kernel needs to improve performance, it executes the reverse flow: hotplug the A7 core back in -> switch to the A15 core.
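The policy described above can be written down as a small decision function. This is purely a sketch of Leo's proposal (all names are invented; the 200MHz threshold comes from the virtual OPP table in his example), not code from any kernel tree:

```c
#include <stdbool.h>

enum cluster { A7, A15 };
enum action {
	NONE,
	SWITCH_TO_A7,			/* step down to the little core first */
	HOT_UNPLUG,			/* low enough load: unplug the core */
	SWITCH_TO_A15,			/* need more performance */
	PLUG_IN_AND_SWITCH_TO_A15,	/* reverse flow from unplugged state */
};

#define UNPLUG_THRESHOLD_MHZ 200	/* from the TC2 virtual OPP table */

static enum action pick_action(enum cluster cur, int virt_mhz,
			       bool perf_ok, bool unplugged)
{
	if (!perf_ok)
		return unplugged ? PLUG_IN_AND_SWITCH_TO_A15 : SWITCH_TO_A15;
	if (cur == A15)
		return SWITCH_TO_A7;
	if (virt_mhz <= UNPLUG_THRESHOLD_MHZ && !unplugged)
		return HOT_UNPLUG;
	return NONE;
}
```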
As I said earlier, CPU hotplug is a very heavyweight operation in the core of the kernel. Hot-plugging a CPU back into the system may experience significant latency if the system is loaded, and a loaded system is exactly what normally triggers the need to bring it back in. It is far better to identify bad sources of CPU wakeups and fix them so to leave the unneeded CPU into deep idle mode.
Nicolas