On Tue, 10 Sep 2013, Catalin Marinas wrote:
On Mon, Sep 09, 2013 at 07:02:50PM +0100, Nicolas Pitre wrote:
On Mon, 9 Sep 2013, Lorenzo Pieralisi wrote:
On Mon, Sep 09, 2013 at 02:02:47PM +0100, Catalin Marinas wrote:
Taking the TC2 code example (it may be extended, I don't know the plans here), it seems that the cpuidle driver is only concerned with the C1 state (CPU rather than cluster suspend). IIUC, cpuidle is not aware of deeper sleep states. The MCPM back-end would get expected residency information and make another decision for deeper sleep states. Where does it get the residency information from? Isn't this the estimation done by the cpuidle governor? At this point you pretty much move part of the cpuidle governor functionality (and its concepts like target residency) down to the MCPM back-end level. Such a split will have bad consequences longer term: code duplication between back-ends, and cpuidle decision points that are harder to understand and maintain.
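[Editorial illustration, not part of the original message.] The residency comparison referred to above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: `struct idle_state` and `pick_idle_state()` are hypothetical names invented for illustration, not kernel structures or functions, and the real cpuidle governors weigh more inputs than this. It only shows the kind of decision that would be duplicated if each back-end repeated it.

```c
#include <stdint.h>

/* Hypothetical description of one idle state (not a kernel structure). */
struct idle_state {
    const char *name;
    uint32_t target_residency_us; /* minimum idle time that makes entry worthwhile */
};

/*
 * Pick the deepest state whose target residency fits the expected idle time.
 * Assumes states[] is ordered from shallowest (index 0) to deepest.
 * Falls back to index 0 when no deeper state qualifies.
 */
static int pick_idle_state(const struct idle_state *states, int count,
                           uint32_t expected_idle_us)
{
    int chosen = 0;

    for (int i = 1; i < count; i++) {
        if (states[i].target_residency_us <= expected_idle_us)
            chosen = i;
    }
    return chosen;
}
```

With two made-up states, say a CPU-level state with a 1 us target residency and a cluster-level state with a 2000 us target residency, an expected idle time of 500 us selects the first and 5000 us selects the second.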
IMHO the subject of this thread should not be related to power management policy decisions and where they should live. The goal of MCPM and PSCI was not to define power management policy but to provide a mechanism, and I agree with Catalin on this: we have to keep them separate.
I do agree as well. That's not where my argument fundamentally is. Please let's not divert the discussion again.
To avoid diverting the discussion, we first need to agree on what this discussion is about. As Lorenzo said, the goal of MCPM or PSCI was not to define policy but to coordinate the C states in a multi-cluster context (with or without security implications). Policy in MCPM is a new thing you brought up, and if that's the future plan I want to stay far away from it. Selling MCPM as the next idle governor does not work for me, sorry.
That's not what I wanted to convey at all. If that's the impression I gave you then I sincerely apologize. Lorenzo probably more appropriately designated it as a mechanism that also has the ability to demote policy requests which is, as far as I know, not contradicting the longer description I made of it. Let's not twist my words any further please.
Going back to the original topic, when we talk about MCPM vs PSCI in the arm64 context, we need to be clear on _which_ parts of MCPM to consider. PSCI is clearly defined, MCPM is not (as you said, it's work in progress but you can probably set high-level goals). So let's start with defining MCPM and please correct my understanding (based on what's currently in mainline):
1. Front-end to CPU hotplug and cpuidle.
2. Common back-end interface to low-level SoC power management. I would add mcpm_entry_vectors setting here.
3. 'last man' state machine for CPU/cluster power coordination.
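[Editorial illustration, not part of the original message.] The 'last man' idea in the list above can be sketched as a per-cluster count of CPUs that are up. This is a deliberately naive sketch: `cpus_up`, `cpu_power_up()` and `cpu_power_down()` are hypothetical names, not the MCPM API, and the sketch omits locking and the races with a CPU powering up while the cluster is going down, which is the part the thread describes as hard to get right.

```c
#include <stdbool.h>

#define MAX_CLUSTERS 2 /* arbitrary value chosen for illustration */

/* Hypothetical bookkeeping: number of CPUs currently up in each cluster. */
static int cpus_up[MAX_CLUSTERS];

/* Record that one more CPU in the given cluster is up. */
static void cpu_power_up(int cluster)
{
    cpus_up[cluster]++;
}

/*
 * Record that one CPU in the given cluster is going down.
 * Returns true when the caller is the last man, i.e. no CPU remains up
 * in the cluster and the cluster itself may be powered down.
 */
static bool cpu_power_down(int cluster)
{
    cpus_up[cluster]--;
    return cpus_up[cluster] == 0;
}
```

With two CPUs up in cluster 0, the first call to `cpu_power_down(0)` returns false and the second returns true.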
I hope it is clear by now that if you care about _proper_ security in the power management context, point 3 above must be handled in firmware. That's independent of how complex the firmware needs to be for such a task. PSCI as an API and the Generic Firmware implementation are addressing this (and at the same time providing a common framework for secure OS vendors to rely on). Generic Firmware will be further developed to address other concerns but that's not the point.
Indeed it is the point! And I keep repeating this over and over in various ways because no one has addressed my concern so far:
- The MCPM state machine is _not_ as obvious to implement as it may seem. Either that, or those of us who took a couple of months to get it right are totally incompetent idiots. The biggest idiot amongst those people was certainly myself, who clearly missed the mark by _far_ in my estimation of the effort required to implement this.
- Implementing non-trivial functionality in the secure firmware does increase risk. Risk of a buggy implementation that may not be detected by testing, since tests tend not to simulate real-life usage patterns accurately. Or risk of compromising the secure part of the system in some unforeseen ways. Please don't tell me you're above those concerns.
- There is always a cost associated with such risks, which is unfortunately (or conveniently, due to hardware restrictions or political reasons) seemingly dismissed at the moment.
- Yet the best way to mitigate the risk is to have a flexible update mechanism providing incentive to address issues quickly.
Now... will someone clearly tell me if those concerns have been addressed yet? Is there something besides the PSCI document and the generic firmware implementation in the plans for covering those aspects? Because I've asked this very question, using the TC2 firmware bug that still has not been fixed six months after being reported as an example of why I'm even more concerned by even more complex firmware intermingling security operations with power management. It seems that answers to those very concerns are carefully avoided in _all_ the replies I've received so far.
And that has direct influence on whether people will opt for the complex firmware model or keep cheap escape hatches for themselves in case something doesn't go according to plan. And as you said yourself, PSCI is clearly defined, but that's (relatively speaking -- no pun intended on Charles' excellent work) the easy part. The implementation behind it (just like MCPM core and backends today) might not necessarily always be as easy, or even complete.
And please let's drop the MCPM vs smp_ops interface argument. This is only bikeshedding over implementation details. I personally don't care what the actual frontend is called.
I and others in this thread have stated the security requirements for such complexity in firmware several times already. You can complain about the ARM secure architecture but, again, that's not something to be addressed in this thread.
All right. Thank you for stating it so bluntly. There probably is no point discussing this any further in that case.
Nicolas