On 7 Sep 2013, at 21:31, Nicolas Pitre nicolas.pitre@linaro.org wrote:
On Sat, 7 Sep 2013, Catalin Marinas wrote:
On 6 Sep 2013, at 20:52, Nicolas Pitre nicolas.pitre@linaro.org wrote:
On Fri, 6 Sep 2013, Catalin Marinas wrote:
On Fri, Sep 06, 2013 at 02:50:45AM +0100, Nicolas Pitre wrote:
I understand the issue with having a secure OS that needs to protect itself from the nasty Linux world. However, if I understand the model right, the secure OS is there to provide special services to the non-secure OS and not the reverse. Therefore the secure OS should simply pack and hide its things when told to do so, right?
The problem is when it is *not* told to do so.
Well, just halt the whole system in that case. Or raise a fault if you want to be nice.
I don't think you got my point. How can the secure OS force the whole system to halt when the non-secure OS controls what gets halted? The non-secure OS decides which CPUs to halt and when to disable cluster coherency. Normally this is for good reasons like power management, but a malicious non-secure OS may also use these controls to cause data corruption on the secure side, which opens various avenues for attack.
What I meant is:
- Secure OS traps on any attempt from the non secure OS to disable
coherency or halt CPUs while it is active.
Good. So we agree that the non-secure OS cannot freely disable coherency or halt CPUs without the secure OS being informed first.
The architecture does not allow such trapping to EL3; trapping is normally a hypervisor (EL2) feature. The firmware can indeed (subject to the SoC implementation) block access to certain peripherals, in which case the non-secure OS (at EL1) most likely takes a synchronous external abort.
- Non-secure OS wants to do some power management so it tells secure OS
to pack its things and remove its hands from the hardware controls.
OK, so let's assume the non-secure OS does an SMC #PREPARE_* so that the firmware enables non-secure access to such hardware after packing its things.
- While non-secure OS has control over the hardware knobs, secure OS
refuses to operate.
- Non-secure OS tells secure OS to come back. Secure OS reinstates its
watch guard on the hardware control knobs.
- If non-secure OS attempts to touch the hardware knobs without telling
secure OS to get away first, secure OS takes offence and either hangs the system or signals a fault.
What you are missing here is that secure OS "packing its things" is a lot more complex than simply "refusing to operate". Let's consider some scenarios:
First scenario, the non-secure OS tells the secure one to pack all its things, no matter whether it's CPU suspend or power-down:
1. Non-secure OS issues SMC #SECURE_PACK_ALL.
2. Secure OS needs to issue IPIs to all the CPUs that may be running secure code.
3. Non-secure OS performs CPU or cluster power down.
This is either inefficient (the secure OS waking up all the CPUs that may be in suspend) or simply impossible if some CPUs were in power-down mode rather than suspend. In addition, if one CPU is merely in suspend, you still want the secure OS to keep working on the other CPUs. We can just dismiss the "pack all things" scenario.
Second scenario, per-CPU SMC #PREPARE_CPU_DOWN (or SUSPEND):
1. Non-secure OS issues SMC #PREPARE_CPU_DOWN.
2. Secure OS disables the MMU (coherency) and flushes its caches on that CPU. It then enables non-secure access to the power controller.
3. Non-secure OS performs CPU power down.
This scenario only works if the secure firmware can control CPU and cluster down independently. Let's assume this is doable, so we move to the next scenario.
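Under those assumptions, the per-CPU handshake can be sketched as a toy Python model. All names here (SecureFirmware, smc_prepare_cpu_down, power_down_allowed) are illustrative, not a real PSCI/SMC API:

```python
class SecureFirmware:
    def __init__(self, num_cpus):
        # Per-CPU flag: has this CPU packed its secure state
        # (caches flushed, coherency/MMU disabled)?
        self.prepared = [False] * num_cpus

    def smc_prepare_cpu_down(self, cpu):
        # Step 2: on this CPU the secure OS flushes its caches,
        # disables the MMU and only then enables non-secure
        # access to the power controller.
        self.prepared[cpu] = True

    def power_down_allowed(self, cpu):
        # Step 3 is only safe after the handshake; an attempt to
        # power the CPU down without it should be refused.
        return self.prepared[cpu]

fw = SecureFirmware(num_cpus=4)
assert not fw.power_down_allowed(0)   # no handshake yet: refuse
fw.smc_prepare_cpu_down(0)
assert fw.power_down_allowed(0)       # CPU0 has packed its things
assert not fw.power_down_allowed(1)   # CPU1 unaffected
```

The per-CPU state is the point: each CPU goes through the handshake independently, so no IPI broadcast is needed.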
Third scenario, per-cluster SMC #PREPARE_CLUSTER_DOWN (or SUSPEND) as a result of a 'last man' detection (in the non-secure OS):
1. Non-secure OS issues SMC #PREPARE_CLUSTER_DOWN.
2. Secure OS disables the MMU on that CPU, flushes L1 and L2 caches, enables non-secure access to power controller.
3. Non-secure OS performs cluster power down.
At point 2 above, the secure OS has 3 options:
2.a) trusts the non-secure OS to have shut down the other CPUs.
2.b) issues IPI to the other CPUs in the cluster to pack things (flush caches, disable MMU).
2.c) refuses to enable non-secure access to power controller.
2.a breaks the security model. 2.b has the same issues as the first scenario (which CPUs do you send the IPI to?). 2.c is the safest but it *requires* a 'last man' state machine in the *secure* firmware (same with 2.b: it would need to track which CPUs in the cluster are still up).
You can do the above exercise again but, instead of enabling non-secure access to the power controller, have the firmware perform the actual power controller action itself. You'll see that the cluster scenario still requires the firmware to track the 'last man' state.
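The 'last man' tracking that option 2.c forces into the firmware can be sketched like this (a toy Python model; the class and method names are illustrative only):

```python
class ClusterFirmware:
    def __init__(self, cpus_in_cluster):
        # CPUs the firmware believes are still up in this cluster.
        self.up = set(cpus_in_cluster)

    def smc_prepare_cpu_down(self, cpu):
        # Per-CPU handshake: this CPU has flushed its caches and
        # disabled coherency, so the firmware marks it down.
        self.up.discard(cpu)

    def smc_prepare_cluster_down(self, cpu):
        # Cluster power-down is only safe if every *other* CPU in
        # the cluster has already gone through the per-CPU
        # handshake; otherwise refuse (option 2.c).
        if self.up - {cpu}:
            return False          # not the last man: refuse
        self.up.discard(cpu)
        return True

fw = ClusterFirmware({0, 1, 2, 3})
assert fw.smc_prepare_cluster_down(0) is False  # CPUs 1-3 still up
fw.smc_prepare_cpu_down(1)
fw.smc_prepare_cpu_down(2)
fw.smc_prepare_cpu_down(3)
assert fw.smc_prepare_cluster_down(0) is True   # genuine last man
```

Note that the firmware never trusts the caller's claim of being last: it checks against its own record of which CPUs have done the handshake.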
You still want the power decision (policy) to happen in the non-secure OS but with the actual hardware access in firmware.
That's where things get murky. The policy comes as a result of last man determination, etc. In other words, the policy is not only about "I want to save power now". It is also "what kind of power saving I can afford now". And that's basically what MCPM does. With an abstract interface such as PSCI, that policy decision is moved into firmware.
Wrong. PSCI ops take an affinity parameter indicating whether it is a CPU or a cluster power down/suspend. Of course, you can always ask for cluster level if you are only interested in power saving, and PSCI can then choose what is safe. There is nothing in PSCI that takes the CPU vs cluster decision away from the non-secure OS.
And the MCPM framework is not the place for such CPU vs cluster policy either. This needs to be decided higher up, in the cpuidle subsystem, and in abstract terms like target residency and the time taken to recover from the various low power states. You may go for cluster down directly if you are the 'last man', but you may just as well go for CPU down first even as the 'last man'. This is a decision to be taken by the cpuidle governor and *not* by MCPM. PSCI already allows this via the affinity parameter.
Of course, we could have specified that a PSCI CPU_POWER_DOWN with cluster affinity returns an error if the caller is not the 'last man'. But that would have required a duplicate 'last man' state machine in the non-secure OS.
For the same malicious use-case reasons, the firmware cannot afford to rely on the non-secure OS to prevent "last man" races.
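The affinity idea can be sketched roughly as follows, loosely modelled on PSCI semantics. The names, constants and the silent-downgrade policy here are assumptions for illustration, not the actual specification:

```python
AFFINITY_CPU, AFFINITY_CLUSTER = 0, 1

def power_down(requested_affinity, cpu, cpus_still_up):
    """Return the affinity level the firmware actually enters.

    The non-secure OS *asks* for a CPU- or cluster-level state;
    the firmware honours the request only when it is safe.
    """
    if requested_affinity == AFFINITY_CLUSTER and cpus_still_up - {cpu}:
        # Not the last man: cluster-down would be unsafe, so fall
        # back to a CPU-level power down rather than trusting the
        # caller's view of the cluster.
        return AFFINITY_CPU
    return requested_affinity

assert power_down(AFFINITY_CLUSTER, 0, {0, 1}) == AFFINITY_CPU
assert power_down(AFFINITY_CLUSTER, 0, {0}) == AFFINITY_CLUSTER
assert power_down(AFFINITY_CPU, 0, {0, 1}) == AFFINITY_CPU
```

The design choice shown is the one argued for above: the policy (which level to request) stays in the non-secure OS, while the safety check (whether that level is actually achievable) stays in firmware.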
Again, by the time non-secure OS attempts to determine the last man, it should have told the secure OS to take cover.
It cannot take cover entirely just because a CPU is going into idle. See above.
Shifting the privilege levels down, a better analogy would be a user application (non-root, since root has a special 'trusted' status in Linux) being able to control coherency and CPU/cluster shutdown *without* having to make system calls. Would you feel comfortable with that?
Let me propose a counter-example: having PSCI and the power management in secure firmware is like having the GNOME Power Manager compiled into the kernel. It would work, of course, but the GNOME developers would probably prefer dealing with it in user space instead, without having to update the kernel every time there is a bug in the Power Manager.
I understand your uneasiness with more complex firmware, but I now wonder whether you completely missed the point of PSCI. I'll restate it: PSCI does *not* take the power management policy away from the non-secure, high-level OS. It does what it is *asked* to do, in a safe, secure manner. This safety *requires* a 'last man' state machine in the firmware. You can have another state machine in the non-secure OS if you want or need to but, as I said above, the CPU vs cluster choice should be decided based on cost and residency, and that is the cpuidle governor's job. MCPM or PSCI can only choose the safest state for the security level they run at.
The only workaround is not to trust anything controlled from outside that privilege level, in this case coherency. That pretty much means UP only.
And why is that a problem? Certainly the secure OS shouldn't need to be that CPU hungry, just as the Linux kernel is not supposed to take too many CPU cycles away from user space.
It's not about CPU-intensive tasks. It is about the secure OS being available on all CPUs in an MP system. The secure OS can just be a library with a big lock to serialise MP access, but as long as it needs to run code on more than one CPU it has to rely on cache coherency.
Imagine a secure OS which gets some data from secure storage and provides it to the non-secure OS:
1. Such data is copied from a PIO secure device as a result of an FIQ. If the FIQ happens on CPU0, the secure firmware would dirty the caches on that CPU.
2. A non-secure OS asks for that data on CPU1 via an SMC.
3. The secure OS performs a memcpy from the buffer previously allocated for the FIQ copy to the non-secure buffer.
The above is a normal secure service provided to the non-secure OS, and the memcpy in step 3 requires cache coherency, otherwise CPU1 can read stale data and leak information.
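To make the stale-data point concrete, here is a deliberately simplified Python model of two CPUs with private write-back caches over shared memory. There is no real cache protocol here; the snoop is hardwired to cpu0 purely for illustration:

```python
shared_mem = {"buf": "old"}          # backing memory, not yet updated

class Cpu:
    def __init__(self):
        self.cache = {}              # private write-back cache

    def write(self, key, val):
        self.cache[key] = val        # dirty line, not written back

    def read(self, key, coherent):
        if coherent and key in cpu0.cache:
            return cpu0.cache[key]   # snoop CPU0's dirty line
        # Without coherency, fall back to our own cache or memory.
        return self.cache.get(key, shared_mem[key])

cpu0, cpu1 = Cpu(), Cpu()
cpu0.write("buf", "secret-from-FIQ")  # step 1: FIQ fills the buffer on CPU0

# Step 3 done on CPU1: with coherency the memcpy sees CPU0's data,
# without it CPU1 copies the stale contents of memory instead.
assert cpu1.read("buf", coherent=True) == "secret-from-FIQ"
assert cpu1.read("buf", coherent=False) == "old"
```

The second assert is exactly the failure mode described above: once coherency is gone, CPU1 silently works on old data.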
You can think of other scenarios where a (malicious) non-secure OS shuts down a full cluster but 'misses' one of the CPUs (third scenario above). Same loss of data, same information leak.
Catalin