Re: [Question] MCPM Supporting For ARM64

4 Sep 2013

      On Wed, 4 Sep 2013, Catalin Marinas wrote:
...
On 3 Sep 2013, at 19:53, Nicolas Pitre nicolas.pitre@linaro.org wrote:
...
On Tue, 3 Sep 2013, Catalin Marinas wrote:
...
On Tue, Sep 03, 2013 at 05:16:17AM +0100, Nicolas Pitre wrote
...
...
As a past example, how many SoC maintainers were willing to
convert their code to DT if there wasn't a clear statement from arm-soc
guys mandating this?
This is a wrong comparison.  DT is more or less a more structured way to
describe hardware.  Just like ACPI.  The kernel is not *forced* to
follow DT information if that turns out to be wrong.  And again
experience has shown that this does happen.
The comparison is not meant for DT vs PSCI but an example that even
though DT has benefits and less risks, people didn't rush into adopting
it unless it was mandated for new platforms. For arm64 we try not to get
SoC code under arch/arm64/ (mach-virt like approach) but I still get
people asking in private about copying code into arch/arm64/mach-*
directories for the same easy path reasons.
That's fine, obviously.
But reality will mess that up somewhat eventually.  You'll have to
accept machine specific quirks for firmware bugs because the firmware
update is not forthcoming if at all.  Same story everywhere.
It's one thing to accept machine specific quirks for firmware bugs (but
I'll push back as much as possible) and entirely different to accept
temporary code because the firmware features are not ready yet.
You know that if you push back too hard then people will simply stop 
submitting their code for upstream inclusion and keep it into private 
trees.  There is a fine balance to reach.
...
...
...
I agree with your arguments that Linux is more flexible and easily
upgradable. However, the point I want to emphasize is that unless Linux
is playing all the roles of firmware/secure/non-secure code, you must
have firmware and calls to it from the non-secure OS. On ARMv8, EL3 is
the first mode CPU is running out of reset and it needs to get back to
this mode for power-related (e.g. coherency) settings. Whether we (Linux
people) like it or not, that's the reality.
I know, I know...  From my own point of view this is rather sad.
Whether it's sad for Linux or not, be aware that ARM Ltd is not a
Linux-only shop.  There are other OSes, secure or non-secure, there are
vendors that ask for these features (and this includes vendors that run
Linux/Android).
Obviously.  And those vendors would be the happier if Linux was BSD 
licensed and they didn't have any GPL obligations to follow.  But I'm 
sure you know that already.
Code cleanliness and maintainability is a concern for developers and 
engineers only.  Product vendors don't usually care as much as their 
primary concern is the customer end result.
...
...
...
With proper education, SoC vendors can learn to allow upgradable
(parts of) firmware.
Is this education in ARM's plans?  Is someone working on recommendations
about proper design for fail-safe firmware upgrades via separate
firmware components?
The generic firmware is probably a better place to provide such
functionality rather than a recommendations document.  But I haven't
followed its development closely enough to comment (I know I raised this
exact issue in the past, primarily for handling undocumented CPU errata
bits accessible only in secure mode).
Could this aspect officially be put on the firmware agenda at ARM?
I think this is much more important than people may realize.
...
...
...
But if you have a better proposal and can get all the parts (including
secure OS people) to agree, I'm open to it.
I think that Linux has gained its dominant position for one fundamental
reason: source availability.  No company could ever match the work force
that gathered around that source code.  If secure OS people would agree
to this principle then things could end up more secure and more
efficient.  But that's not something I have influence over.
It's not about secure OS, for many reasons this will probably remain
closed source.  But the firmware and UEFI are a different story and most
of it can be open.  I can see vendors keeping parts of the firmware
closed but I would hope those are minimal (it already happens, it's not
something introduced by PSCI requirements).
Again the issue is not PSCI per se.
If the source is available, then anyone may fix bugs independently from 
the OEM who might have different priorities.  The firmware may evolve 
with advancements and innovations in the kernel.
But given the firmware is going to run in secure world I don't think any 
OEM will allow for easy upgrade as this would dillute the very principle 
of a secure world.  So even if the source is available, this is not 
going to help much.
...
...
...
...
...
For example, MCPM provides callbacks into the platform
code when a CPU goes down to disable coherency, flush caches etc. and
this code must call back into the MCPM to complete the CPU tear-down. If
you want such thing, you need a different PSCI specification.
Hmmm... The above statement makes no sense to me.  Sorry I must have
missed something.
OK, let me clarify. I took the dcscb.c example for CPU going down
(please correct me if I got it wrong):
mcpm_cpu_power_down()
 dcscb_power_down()
   flush_cache_all()                         - both secure and non-secure
   set_auxcr()                                       - secure-only
   cci_disable_port_by_cpu()                 - secure-only?
   (__mcpm_outbound_leave_critical())
   __mcpm_cpu_down()
   wfi()
So the dcscb back-end calls back into MCPM to update the state machine.
If you are to run Linux in non-secure mode, set_auxcr() and CCI would
need secure access. Cache flushing also needs to affect the secure
cachelines and disable the caches on the secure side. So from the
flush_cache_all() point, you need to get into EL3. But MCPM requires (as
I see in the dcscb back-end) a return from such EL3 code to call
__mcpm_cpu_down() and do WFI on the non-secure side. This is
incompatible with the (current) PSCI specification.
The MCPM backend doesn't _need_ to call __mcpm_cpu_down() and friends.
Those are helpers for when there is no firmware and proper
synchronization needs to be done between different cores.
The reason people currently ask for MCPM is exactly this synchronisation
which they don't want to do in the firmware.
And this is a hell of a good reason.  I'm scared to death by the 
prospect of seeing this kind of algorithmic complexity shoved into 
closed firmware.
...
As I said in a previous
post, I'm not against MCPM as such but against the back-ends which will
eventually get non-standard secure calls.
Maybe the secure call standardization should focus on providing less 
abstract and more low-level interfaces then.  Right now, the PSCI 
definition implies that everything is performed behind the interface.
But again this would be against the spirit of a secure layer with veto 
power on everything happening in non secure world.
...
One think I don't like about MCPM (and I raised it during review) is the
cluster/cpu separation with hard-coded number of clusters.  I would have
really liked a linear view of the CPUs and let the back-end (or MCPM
library itself) handle the topology.  I don't think it's hard to change
anyway.
Indeed, that's actually a simple implementation detail.  And MCPM being 
entirely a Linux internal API, we have the freedom to change it at will.
...
...
If you have PSCI then the MCPM call graph is roughly:
mcpm_cpu_power_down()
       psci_power_down()
               psci_ops.cpu_off(power_state)
That's it.  Nothing has to call back into the kernel.
So for arm64 we expose PSCI functionality via smp_operations (cpu
up/down, suspend is work in progress).  Populating smp_operations is
driven from DT and it has been decoupled from the SoC code.  What would
the MCPM indirection bring here?
This looks like if you just re-invented the MCPM high-level interface.
...
From this quick description, the SMP ops would do the same as MCPM under
a different name.
...
In the presence of PSCI firmware, do you agree that a potential MCPM
back-end should be generic (not tied to an SoC)?
Absolutely.
...
In such case, what would the MCPM front-end bring which cannot be 
currently handled by smp_operations (or an extension to it)?
Appart from having a common interface between PSCI and non-PSCI systems, 
the MCPM frontend is likely to handle CPU/cluster synchronization and 
the last man determination in the frontend.  This is currently handled 
by backends, but we now see a common pattern which should be factored 
out.  Not that this is highly useful to a PSCI system though.
The MCPM API also implies additional expectations that are not (and 
should not) be provided by PSCI, such as the ability to have unordered 
cpu_up/cpu_down calls because this greatly helps solving race conditions 
at the caller level.  The reason why PSCI should not carry these 
expectations is because this is a Linux specific implementation detail 
and it is best to have a shim layer to isolate this from a lower level 
interface.
...
I don't (yet?) see the point of PSCI back-end to arm64 MCPM since all
people are asking MCPM for is exactly to avoid the PSCI implementation.
And again there are very legitimate reasons for that.  Having the 
cluster synchronization complexity locked behind a PSCI interface 
implies it has to be done in hard-to-fix firmware.  It is not the fault 
of PSCI, but rather the notion of having algorithmic complexity into 
firmware which is a problem when firmware complexity should be kept to 
a minimum while the easily replaceable OS is responsible for it.
...
Couple of comments on the patch below. Not aimed as a proper review:
...
--- /dev/null
+++ b/arch/arm/mach-vexpress/tc2_pm_psci.c
[…]
...
+static void tc2_pm_psci_power_down(void)
+{

  struct psci_power_state power_state;

  unsigned int mpidr, cpu, cluster;

  mpidr = read_cpuid_mpidr();

  cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);

  cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);

  BUG_ON(!psci_ops.cpu_off);

  switch (atomic_dec_return(&tc2_pm_use_count[cpu][cluster])) {

  case 1:

          /*

           * Overtaken by a power up. Flush caches, exit coherency,

           * return & fake a reset

           */

          asm volatile (

                  "mrc    p15, 0, ip, c1, c0, 0 \n\t"

                  "bic    ip, ip, #(1 << 2) @ clear C bit \n\t"

                  "mcr    p15, 0, ip, c1, c0, 0 \n\t"

                  "dsb \n\t"

                  "isb"

                  : : : "ip" );

          flush_cache_louis();

          asm volatile (

                  "clrex  \n\t"

                  "mrc    p15, 0, ip, c1, c0, 1 \n\t"

                  "bic    ip, ip, #(1 << 6) @ clear SMP bit \n\t"

                  "mcr    p15, 0, ip, c1, c0, 1 \n\t"

                  "isb \n\t"

                  "dsb"

                  : : : "ip" );

          return;

The above part needs to be done on the secure side, ACTLR.SMP bit cannot
be cleared on the non-secure side.  Is this return with coherency
disabled required by MCPM?
The requirement here is that the system should be put in the same state 
as if the kernel was re-entered from the firmware after a cpu_up 
operation i.e. MMU and cache off.  So in this very case I don't think 
the code has to care about ACTLR.SMP as this would be handled by the 
firmware and the kernel would already be entered with this bit set.
...
...

  case 0:

          /* A normal request to possibly power down the cluster */

          power_state.id = PSCI_POWER_STATE_ID;

          power_state.type = PSCI_POWER_STATE_TYPE_POWER_DOWN;

          power_state.affinity_level = PSCI_POWER_STATE_AFFINITY_LEVEL1;

          psci_ops.cpu_off(power_state);

          /* On success this function never returns */

  default:

          /* Any other value is a bug */

          BUG();

  }

+}

+static void tc2_pm_psci_suspend(u64 unused)
+{

  struct psci_power_state power_state;

  BUG_ON(!psci_ops.cpu_suspend);

  /* On TC2 always attempt to power down the cluster */

  power_state.id = PSCI_POWER_STATE_ID;

  power_state.type = PSCI_POWER_STATE_TYPE_POWER_DOWN;

  power_state.affinity_level = PSCI_POWER_STATE_AFFINITY_LEVEL1;

  psci_ops.cpu_suspend(power_state, virt_to_phys(bL_entry_point));

  /* On success this function never returns */

  BUG();

+}
CPU_SUSPEND is allowed to return if there is a pending interrupt.
Yes, but not through this path.  If an interrupt is pending, the PSCI 
implementation should immediately return through 
virt_to_phys(bL_entry_point) (this is mcpm_entry_point in mainline).
Nicolas

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Re: [Question] MCPM Supporting For ARM64