Re: [Question] MCPM Supporting For ARM64

5 Sep 2013


I'll let the firmware guys discuss what is and is not upgradable at some
point in the future (when the generic firmware will be available). This
is not related just to PSCI but to other things like secure-only errata
bits.
In the meantime I'll try to focus on the Linux interaction with the
secure firmware (and PSCI). Also cc'ing Mark Rutland since he's working
on CPU hotplug for arm64.
On Wed, Sep 04, 2013 at 02:48:48PM +0100, Nicolas Pitre wrote:
...
On Wed, 4 Sep 2013, Catalin Marinas wrote:
...
On 3 Sep 2013, at 19:53, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
...
On Tue, 3 Sep 2013, Catalin Marinas wrote:
...
On Tue, Sep 03, 2013 at 05:16:17AM +0100, Nicolas Pitre wrote:
...
...
For example, MCPM provides callbacks into the platform
code when a CPU goes down to disable coherency, flush caches etc. and
this code must call back into the MCPM to complete the CPU tear-down. If
you want such thing, you need a different PSCI specification.
Hmmm... The above statement makes no sense to me.  Sorry I must have
missed something.
OK, let me clarify. I took the dcscb.c example for CPU going down
(please correct me if I got it wrong):
mcpm_cpu_power_down()
 dcscb_power_down()
   flush_cache_all()                         - both secure and non-secure
   set_auxcr()                                       - secure-only
   cci_disable_port_by_cpu()                 - secure-only?
   (__mcpm_outbound_leave_critical())
   __mcpm_cpu_down()
   wfi()
So the dcscb back-end calls back into MCPM to update the state machine.
If you are to run Linux in non-secure mode, set_auxcr() and CCI would
need secure access. Cache flushing also needs to affect the secure
cachelines and disable the caches on the secure side. So from the
flush_cache_all() point, you need to get into EL3. But MCPM requires (as
I see in the dcscb back-end) a return from such EL3 code to call
__mcpm_cpu_down() and do WFI on the non-secure side. This is
incompatible with the (current) PSCI specification.
The MCPM backend doesn't _need_ to call __mcpm_cpu_down() and friends.
Those are helpers for when there is no firmware and proper
synchronization needs to be done between different cores.
The reason people currently ask for MCPM is exactly this synchronisation
which they don't want to do in the firmware.
And this is a hell of a good reason.  I'm scared to death by the
prospect of seeing this kind of algorithmic complexity shoved into
closed firmware.
OK, so if I understand you correctly, you don't want to see PSCI
firmware in use at all or at least not in its current form (which *you*
also reviewed) because you think it is too complex to be bug-free or
impractical to upgrade.
I respect your opinion but do you have a more concrete proposal? The
options so far:
1. (current status) Don't use PSCI firmware, let Linux handle all the
   CPU power management (possibly under the MCPM framework). If not all
   power-related actions can be done at the non-secure level, just
   implement non-standard SMC calls as needed. If these are changed
   (because in time a vendor may have other security needs), add them to
   the driver and hope they have a way to detect or just not upstream
   the #ifdef'ed code.
2. New standard firmware interface, simpler and less error-prone. Handle
   most power management in Linux (with an MCPM-like state machine) and
   have guaranteed race-free calls to the firmware. In the process, also
   convince the secure OS guys that Linux is part of their trusted
   environment (I personally trust Linux more than the trusted OS ;) but
   this doesn't hold any water for certification purposes). Basically if
   you can disable coherency from the non-secure OS (e.g. CCI or just
   powering down a CPU without the secure OS having the chance to flush
   its caches), the only workaround for the secure OS would be to run UP
   (which is probably the case now) or flush caches at every return to
   the non-secure world.
3. Very similar to 2, with PSCI firmware interface but without the
   requirements to do cluster state coordination in firmware (with some
   semantic changes to the affinity arguments). Linux handles the state
   coordination (MCPM state machine) but PSCI firmware does the
   necessary flushing, coherency disabling based on the specified
   affinity level (it doesn't question that because it does not track
   the "last man"). Slightly better security model than 2 as it can
   flush the secure OS caches but I'm not entirely sure PSCI can avoid a
   state machine and whether this has other security implications.
4. MCPM state machine on top of full PSCI. Here I don't see the point of
   tracking cluster/last-man state in Linux if PSCI does it already. If
   PSCI got it wrong (broken coherency, deadlocks), MCPM cannot really
   solve it. Also, are there additional races by having two separate
   state machines for the same thing (I can't think of any now)?
5. Full PSCI with a light wrapper (smp_operations) allowing SoC to hook
   additional code (only if needed for parts which aren't handled by
   firmware). This is similar to the mcpm_platform_ops registration but
   *without* the MCPM state machine and without the separate cpu/cluster
   arguments (just linearise this space for flexibility). Other key
   point is that the re-entry address is driven via PSCI rather than
   mcpm_entry_vectors. Platform ops registration is optional, just
   available for flexibility (which could also mean that it is not the
   platform ops driving the final PSCI call, different from the MCPM
   framework). This approach does not enforce a secure model, it's up to
   the SoC vendor to allow/prevent non-secure access to power
   controller, CCI etc. But it still mandates common kernel entry/exit
   path via PSCI.
6. Full PSCI with generic CPU hotplug and cpuidle drivers. I won't list
   the pros/cons, that's what this thread is mainly about.
Any other options?
My goal is for 6 but 5 could be a more practical/flexible approach.
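To make option 5 a bit more concrete, here is a minimal user-space sketch
of such a light wrapper. All names (sketch_soc_power_hooks,
sketch_register_hooks, sketch_cpu_die, sketch_psci_cpu_off) are
hypothetical and are not existing kernel interfaces; the firmware call is
a stub that only counts invocations. The sketch assumes a single optional
hook taking a linear CPU index, and it is the wrapper, not the hook, that
makes the final PSCI call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical optional SoC hooks; not an existing kernel structure. */
struct sketch_soc_power_hooks {
	void (*pre_cpu_off)(unsigned int cpu);	/* cpu is a linear index */
};

static const struct sketch_soc_power_hooks *sketch_hooks; /* NULL: none */
static int sketch_firmware_calls;
static unsigned int sketch_hook_last_cpu;

/* Stand-in for the PSCI CPU_OFF call; a real one would trap to firmware. */
static int sketch_psci_cpu_off(void)
{
	sketch_firmware_calls++;
	return 0;
}

/* Registration is optional; passing NULL removes the hooks again. */
static void sketch_register_hooks(const struct sketch_soc_power_hooks *hooks)
{
	sketch_hooks = hooks;
}

/* Example hook a SoC might supply; here it only records its argument. */
static void sketch_example_hook(unsigned int cpu)
{
	sketch_hook_last_cpu = cpu;
}

/* Wrapper: run the optional hook, then always make the PSCI call itself. */
static int sketch_cpu_die(unsigned int cpu)
{
	if (sketch_hooks && sketch_hooks->pre_cpu_off)
		sketch_hooks->pre_cpu_off(cpu);
	return sketch_psci_cpu_off();
}
```

With no hooks registered, sketch_cpu_die() goes straight to the firmware
stub; a SoC that needs extra steps registers a hook and the exit path
stays the same.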
...
...
As I said in a previous
post, I'm not against MCPM as such but against the back-ends which will
eventually get non-standard secure calls.
Maybe the secure call standardization should focus on providing less
abstract and more low-level interfaces then.  Right now, the PSCI
definition implies that everything is performed behind the interface.
Sorry but you are late for this call. PSCI has been discussed for nearly
two years and you have been involved.
...
But again this would be against the spirit of a secure layer with veto
power on everything happening in non secure world.
The veto power is one of the approaches to reduce security risks. It may
be doable in other ways but it either complicates certification or
reduces the secure OS functionality (like UP only).
...
...
...
If you have PSCI then the MCPM call graph is roughly:
mcpm_cpu_power_down()
       psci_power_down()
               psci_ops.cpu_off(power_state)
That's it.  Nothing has to call back into the kernel.
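The call graph above can be modelled in plain C as a chain of thin
wrappers ending in a function pointer. This is only an illustrative
sketch: the sketch_ prefixed names, the power_state parameter on every
level and the stub firmware call are assumptions, not the kernel's actual
signatures. A real cpu_off would enter the firmware and not return on
success.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PSCI operations table. */
struct sketch_psci_operations {
	int (*cpu_off)(unsigned int power_state);
};

static unsigned int sketch_last_power_state;
static int sketch_cpu_off_calls;

/* Stub firmware entry: records the request instead of powering down. */
static int sketch_fake_cpu_off(unsigned int power_state)
{
	sketch_last_power_state = power_state;
	sketch_cpu_off_calls++;
	return 0;
}

static struct sketch_psci_operations sketch_psci_ops = {
	.cpu_off = sketch_fake_cpu_off,
};

/* Back-end: forwards to the firmware, with no call back into the caller. */
static int sketch_psci_power_down(unsigned int power_state)
{
	if (!sketch_psci_ops.cpu_off)
		return -1;	/* no firmware support in this sketch */
	return sketch_psci_ops.cpu_off(power_state);
}

/* Front-end: the only thing it does is invoke the back-end. */
static int sketch_mcpm_cpu_power_down(unsigned int power_state)
{
	return sketch_psci_power_down(power_state);
}
```

Each level only forwards the request downwards, which is the point being
made: with firmware behind the interface there is no state for the
front-end to update on the way back.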
So for arm64 we expose PSCI functionality via smp_operations (cpu
up/down, suspend is work in progress).  Populating smp_operations is
driven from DT and it has been decoupled from the SoC code.  What would
the MCPM indirection bring here?
This looks as if you have just re-invented the MCPM high-level interface.
From this quick description, the SMP ops would do the same as MCPM under
a different name.
The MCPM high level interface is nothing other than smp_ops. It's not a
new API but the same smp_ops. I don't think there is much to re-invent
here. The mcpm_platform_ops callbacks and registration is indeed a
unification but see my point 5 above about what would be different in
the context of PSCI.
MCPM state machine is the biggest innovation of this framework. While it
happens to be implemented in the same MCPM code, I see it as more of a
locking library that can exist outside the MCPM front- or back-end
interface (but in the actual back-end implementation if PSCI isn't
present). You may see a common pattern and move such a state machine into
the front-end but then you get to point 4 above if you do it on top of PSCI.
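As a rough illustration of what "last man" tracking means when treated as
a small library, here is a deliberately simplified sketch using C11
atomics. It is not the MCPM algorithm: it assumes coherent atomic memory
is available to every CPU at all times, which a CPU tearing down its
caches may not have, and that is a large part of why the real state
machine is more involved. All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_MAX_CLUSTERS 2

/* Number of CPUs currently up in each cluster (zero-initialised). */
static atomic_int sketch_cpus_up[SKETCH_MAX_CLUSTERS];

/* Record that one more CPU in the given cluster is running. */
static void sketch_cpu_up(unsigned int cluster)
{
	atomic_fetch_add(&sketch_cpus_up[cluster], 1);
}

/*
 * Record that a CPU is going down. Returns true when the caller was the
 * last running CPU in its cluster, i.e. the one that would also have to
 * take care of cluster-level teardown.
 */
static bool sketch_cpu_down_is_last_man(unsigned int cluster)
{
	/* atomic_fetch_sub returns the value held before the decrement. */
	return atomic_fetch_sub(&sketch_cpus_up[cluster], 1) == 1;
}
```

Only the CPU that sees the count drop from one to zero is told it is the
last man; callers in other clusters are unaffected.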
-- 
Catalin
