On Mon, Jul 27, 2015 at 12:43:28AM -0400, Nicolas Pitre wrote:
On Sat, 25 Jul 2015, Chen-Yu Tsai wrote:
On Fri, Jul 24, 2015 at 11:44 PM, Nicolas Pitre nicolas.pitre@linaro.org wrote:
On Fri, 24 Jul 2015, Chen-Yu Tsai wrote:
Hi,
On Sat, May 2, 2015 at 12:06 AM, Nicolas Pitre nicolas.pitre@linaro.org wrote:
This concerns the following helpers:
__mcpm_cpu_going_down() __mcpm_cpu_down() __mcpm_outbound_enter_critical() __mcpm_outbound_leave_critical() __mcpm_cluster_state()
They are and should only be used by the core code now. Therefore their declarations are removed from mcpm.h and their definitions are made static, hence the need to move them before their users which accounts for the bulk of this patch.
I'm looking for some advice. On the Allwinner A80, at least on mainline, there is no external PMU or embedded controller in charge of power controls. What this means is that I'm doing power sequencing in the kernel as part of the MCPM calls, specifically powering down cores and clusters in the .wait_for_powerdown callback. (I don't think it's reasonable or even possible to power down stuff in .*_powerdown_prepare)
Can you tell me more about the power control knobs at your disposal? Do power gates become effective immediately or only when WFI is asserted?
And can you configure things so a core may be powered up asynchronously from an IRQ?
The above probably wasn't clear enough. Power gates, reset controls and SMP/WFI/WFE status are mapped to various mmio registers. The controls are effective immediately.
The power gates and reset controls can only be manually controlled. There is no mainline support for the embedded controller yet, and I doubt Allwinner's firmware supports it either, as their kernel also does power sequencing itself. In a nutshell, the kernel is on its own, we do not support wakeups with IRQs.
Previously I was using __mcpm_cluster_state() to check if the last core in a cluster was to be powered off, and thus the whole cluster could be turned off as well. I could also check if the individual power gates or resets are asserted, but if a core was already scheduled to be brought up, and MCPM common framework didn't call .cluster_powerup, there might be a problem.
I fail to see how a core could be scheduled to be brought up without deasserting its reset line somehow though.
My point is could there be a race condition in the sequence of events? Say .*_powerup() deasserted the reset lines _after_ we checked them in .wait_for_powerdown(). As Dave mentioned, .wait_for_powerdown() is not called with the MCPM lock held.
In theory this should never happen. Even if .wait_for_powerdown() was prevented from running concurrently with .*_powerup(), nothing would prevent .*_powerup() from running the moment .wait_for_powerdown() has returned. So it is up to the higher level not to power up a CPU and wait for it to be down at the same time since this simply makes no sense... unless it really wants the CPU back right away.
But I've resolved to waiting for L2 WFI before powering off clusters.
Any suggestions? Maybe export __mcpm_cluster_state() so platform code can know what's going to happen?
The cluster state may change unexpectedly. There is a special locking sequence and state machine needed to make this information reliable. Simply returning the current state wouldn't be enough to ensure it can be used race free.
I see.
As Dave stated, we might have to supplement the MCPM core code with special methods involving a surviving CPU to perform the power-down operation on the dying CPU's behalf. Doing this in .wait_for_powerdown is just an abuse of the API.
The other users I looked at all had other pieces of hardware taking care of this, so I couldn't really understand where I could put this.
I think what should be done in this case is simply to put the task of killing the CPU power on a work queue that gets executed by another CPU. Queueing the necessary work could be done from the MCPM core code the moment a special method exists in the machine backend structure. The work callback would take the MCPM lock, make sure the CPU/cluster state is "DOWN" and call that special method.
A call to .wait_for_powerdown() should not be responsible for the actual power down. It should remain optional.
If adding another callback to handle this is acceptable, then I can look into it.
Please be my guest.
I think I originally named .wait_for_powerdown ".power_down_finish", or something like that, with the expectation that this would both do the serialisation and the last rites.
The dual role was a bit confusing though, and doesn't necessarily fit the framework perfectly.
Having an optional callback for the "last rites" part probably would be clearer.
Cheers ---Dave