On Wed, 8 May 2013, Leo Yan wrote:
> Hi Nico & all,
>
> After studying the IKS code, we believe it is general and clean and can
> largely meet the requirements of our own SoC. We also have some questions
> we would like to confirm with you:
Good. We're certainly looking forward to applying this code to other SOCs.
> - When the outbound core wakes up the inbound core, the outbound core's
>   thread sleeps until the inbound core uses MCPM's early poke to send an
>   IPI.
>
>   a) It looks like this approach exists because the TC2 board has a long
>      latency to power the cluster and cores on/off, right? How about using
>      a polling method instead? On our own SoC the wake-up interval takes
>      _only_ about 10 ~ 20 us.
There is no need to poll anything. If your SOC is fast enough in all cases, then the outbound may simply go ahead and let the inbound resume with the saved context whenever it is ready.
>   b) The inbound core sends an IPI to the outbound core for
>      synchronization, but at that point the inbound core's GIC CPU
>      interface is disabled. Can a core still send an SGI to other cores
>      even though its own CPU interface is disabled?
It must, otherwise the switch as implemented would never complete.
>   c) The MCPM patch set merged into mainline has no function for the early
>      poke; will the early poke related functions be submitted to mainline
>      later?
The early poke mechanism is only needed by the switcher. This is why it is not submitted yet.
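For reference, here is a minimal sketch of the outbound side of that handshake as I understand it from the switcher tree, to illustrate why no polling is needed. The mcpm_set_early_poke() and gic_get_sgir_physaddr() names come from that tree (not yet mainline, as said above); outbound_sgi_value() and the IPI handler that completes inbound_alive are hypothetical placeholders, so treat this as an illustration of the event-driven design rather than the exact upstream code:

/*
 * Illustrative sketch only: outbound side of the switch handshake.
 * The IPI handler that calls complete(&inbound_alive) is not shown, and
 * outbound_sgi_value() is a hypothetical helper standing in for the code
 * that computes the GIC SGI register value targeting the outbound CPU.
 */
#include <linux/completion.h>
#include <linux/irqchip/arm-gic.h>
#include <asm/mcpm.h>

extern unsigned long outbound_sgi_value(void);	/* hypothetical */

static DECLARE_COMPLETION(inbound_alive);

static int outbound_wait_for_inbound(unsigned int ib_cpu,
				     unsigned int ib_cluster)
{
	int ret;

	/*
	 * Arm the "early poke": as soon as the inbound CPU runs the MCPM
	 * early entry code, it writes this value to the GIC SGI register,
	 * raising an IPI back at the outbound CPU.
	 */
	mcpm_set_early_poke(ib_cpu, ib_cluster, gic_get_sgir_physaddr(),
			    outbound_sgi_value());

	ret = mcpm_cpu_power_up(ib_cpu, ib_cluster);
	if (ret)
		return ret;

	/*
	 * No polling: the outbound thread simply sleeps here, and other
	 * tasks may run on this CPU while the inbound powers up, however
	 * long that takes.  The IPI handler completes inbound_alive.
	 */
	wait_for_completion(&inbound_alive);

	/* Disarm the poke once the inbound is known to be alive. */
	return mcpm_set_early_poke(ib_cpu, ib_cluster, 0, 0);
}

On an SoC where the inbound comes up in 10 ~ 20 us, the wait simply returns almost immediately; nothing in this scheme depends on the TC2 latencies.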
> - Switching is now an asynchronous operation, meaning that once
>   bL_switch_request returns we cannot assume the switch has completed, and
>   that concerns us.
>
>   For example, when switching from an A15 core to an A7 core we may want
>   to decrease the voltage to save power. Because switching is
>   asynchronous, software may lower the voltage right after
>   bL_switch_request returns while the real switch is still in progress on
>   another pair of cores.
>
>   I browsed the git log and saw that switching was initially synchronized
>   using the kernel's workqueue and was later changed to a dedicated FIFO
>   kernel thread. Do you think it would be better to add a synchronous
>   switching method?
No. This is absolutely the wrong way to look at things.
The switcher is _just_ a specialized CPU hotplug agent with a special side effect. What it does is tell the MCPM layer to shut CPU x down, power up CPU y, etc. It happens that cpuidle may be doing the same thing in parallel, and so may the classical CPU hotplug.
So you must add your voltage policy into the MCPM backend for your platform instead, _irrespective_ of the switcher presence.
First, the switcher is aware of the state on a per-logical-CPU basis. It knows when its logical CPU0 has switched from the A15 to the A7. That logical CPU0 instance doesn't know, and doesn't have to know, what is happening with logical CPU1. The switcher does not perform cluster-wide switching, so it does not know when all the A7 or all the A15 cores are down. That's the job of the MCPM layer.
Another example: suppose that logical CPU0 is running on the A7 and logical CPU1 is running on the A15, but cpuidle on the latter decides to shut it down. The cpuidle driver will ask MCPM to shut down logical CPU1, which happens to be the last A15, and therefore no A15 will be alive at that moment, even though the switcher still considers logical CPU1 tied to the A15. You certainly want to lower the voltage in that case too.
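To make this concrete, here is a rough sketch of where such a policy would sit in an MCPM backend. The my_soc_* helpers are hypothetical SoC-specific placeholders, and the locking and cluster state machine a real backend needs (see the TC2 backend for a complete example) are omitted; the point is only that the last-CPU-in-cluster decision lives here, where every shutdown path converges:

/*
 * Illustrative sketch only.  my_soc_set_cluster_voltage() and
 * my_soc_release_cpu() are hypothetical; proper locking and the MCPM
 * cluster state machine are omitted for brevity.
 */
#include <linux/types.h>
#include <asm/mcpm.h>
#include <asm/cputype.h>

extern void my_soc_set_cluster_voltage(unsigned int cluster, bool low);	/* hypothetical */
extern int my_soc_release_cpu(unsigned int cpu, unsigned int cluster);	/* hypothetical */

static unsigned int my_soc_cpus_up[MAX_NR_CLUSTERS];

static int my_soc_power_up(unsigned int cpu, unsigned int cluster)
{
	/* First CPU coming up in this cluster: raise the voltage first. */
	if (my_soc_cpus_up[cluster]++ == 0)
		my_soc_set_cluster_voltage(cluster, false);
	return my_soc_release_cpu(cpu, cluster);
}

static void my_soc_power_down(void)
{
	unsigned int mpidr = read_cpuid_mpidr();
	unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);

	/*
	 * Last CPU of the cluster going down: this is the single place that
	 * sees every shutdown path (switcher, cpuidle, CPU hotplug), so the
	 * voltage decision belongs here, not after bL_switch_request().
	 */
	if (--my_soc_cpus_up[cluster] == 0)
		my_soc_set_cluster_voltage(cluster, true);

	/* ... clean caches, exit coherency, WFI ... */
}

static const struct mcpm_platform_ops my_soc_power_ops = {
	.power_up	= my_soc_power_up,
	.power_down	= my_soc_power_down,
};

Such a backend would be registered at init time with mcpm_platform_register(), and then the A15 cluster voltage gets lowered whenever its last CPU goes down, whoever asked for it.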
> - After the switcher is enabled, hotplug is disabled.
>
>   Actually, the current code could support hotplug with IKS: with IKS,
>   each logical core is mapped to the corresponding physical core id and
>   GIC CPU interface id, so the kernel can track which physical core has
>   been hot-unplugged and later hotplug that same physical core back in.
>   Could you give more hints on why IKS needs to disable hotplug?
The problem is to decide what the semantics of a hotplug request would be.
Let's use an example. The system boots with CPUs 0,1,2,3. When the switcher initializes, it itself hot-unplugs CPUs 2,3 and only CPUs 0,1 remain. Of course physical CPUs 2 and 3 are used when a switch happens, but even if physical CPU 0 is switched to physical CPU 2, the logical CPU number as far as Linux is concerned remains CPU 0. So even if CPU 2 is running, Linux thinks this is still CPU 0.
Now, when the switcher is active, we must forbid any hotplugging of logical CPUs 2 and 3, or the semantic of the switcher would be broken. So that means keeping track of CPUs that can and cannot be hotplugged, and that lack of uniformity is likely to cause confusion in user space already.
But if you really want to hot-unplug CPU 0, this might correspond to either physical CPU0 or physical CPU2. What physical CPU should be brought back in when a hotplug request comes in?
And if the switcher is disabled after CPU0 was hot-unplugged while it was still active, should both physical CPUs 0 and 2 be left disabled, or should logical CPU2 be brought back online nevertheless?
There are many issues to cover, and the code needed to deal with them becomes increasingly complex. And this is not only about the switcher: some other parts of the kernel, such as the PMU driver, might expect to shut down physical CPU0 when their hotplug callback is invoked for logical CPU0, etc. etc.
So please tell me: why do you want CPU hotplug in combination with the switcher in the first place? Using hotplug for power management is already a bad idea to start with given the cost and overhead associated with it. The switcher does perform CPU hotplugging behind the scenes, but it avoids all the extra costs of a hotplug operation on a logical CPU in the core kernel.
But if you _really_ insist on performing CPU hotplug while using the switcher, you can still disable the switcher via sysfs, hot-unplug a CPU (also via sysfs), and re-enable the switcher. When the switcher reinitializes, it goes through its pairing with the available CPUs, and if there is no available pairing for a logical CPU because one of the physical CPUs has been hot-unplugged, then that logical CPU won't be available with the switcher.
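For completeness, the sequence looks roughly like this from user space. This is a sketch under the assumption that the switcher's control attribute lives at /sys/kernel/bL_switcher/active, as in the switcher tree, and CPU 2 is used purely as an example:

/* Illustrative user-space sketch of the disable / unplug / re-enable
 * sequence; the sysfs paths are assumptions based on the switcher tree. */
#include <stdio.h>

static int write_sysfs(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fputs(val, f);
	return fclose(f);
}

int main(void)
{
	/* 1. Disable the switcher: the hidden logical CPUs come back. */
	write_sysfs("/sys/kernel/bL_switcher/active", "0");

	/* 2. Hot-unplug the CPU you want to keep offline. */
	write_sysfs("/sys/devices/system/cpu/cpu2/online", "0");

	/*
	 * 3. Re-enable the switcher: it re-pairs the remaining CPUs, and a
	 *    logical CPU without an available pairing simply stays offline.
	 */
	write_sysfs("/sys/kernel/bL_switcher/active", "1");

	return 0;
}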
Nicolas