On Thu, 12 Sep 2013, Catalin Marinas wrote:
On Tue, Sep 10, 2013 at 08:07:17PM +0100, Nicolas Pitre wrote:
On Tue, 10 Sep 2013, Catalin Marinas wrote:
I hope it is clear by now that if you care about _proper_ security in the power management context, point 3 above must be handled in firmware. That's independent of how complex the firmware needs to be for such a task. PSCI as an API and the Generic Firmware implementation are addressing this (and at the same time providing a common framework for secure OS vendors to rely on). Generic Firmware will be further developed to address other concerns but that's not the point.
Indeed it is the point! I keep repeating this over and over in various ways because no one has addressed my actual concern so far:
It's not hard to understand: the generic firmware is *not* public yet. You will be able to review it and discuss your concerns at the right time.
And how will reviewing it alleviate my concerns? If you or ARM have no answers then please say so rather than continuously ignoring the issues.
I'm asking if there is a _plan_ to produce _recommendations_ for best practices about firmware update deployment. Or do you not recognize the need for them? Answers to that don't have to wait until a piece of code is published, do they?
I *do* appreciate the complexities of MCPM but that doesn't make it write once only (it's not even software only, part of the state coordination may be handled in hardware). But the experience gained from developing it is definitely not lost.
It is not "write once" for sure. The gained experience shows that this code is going to be an _evolving_ target that cannot be cast into static firmware just like a done job.
We clearly have a different understanding of the ARM security model and I've already gone to great lengths explaining it. I won't go over those arguments again, you seem to have ignored them so far.
I didn't ignore them. They simply failed to address my point. Repeating them won't make them any more relevant.
Let me summarize the situation one more time:
- The market is asking for security sensitive code to be executed in perfect isolation from the standard OS. Hence TrustZone / Secure World.
- The _design_ of the current Secure World architecture implies that the code running there has no choice but to concern itself with non security related operations as well simply to _preserve_ its secure attributes. In other words, the fact that secure code has to implement e.g. power management mechanisms and cache management operations is a consequence of the architecture design and not a secure service that the market asked for.
- For secure code to be truly secure, the code itself has to be unalterable and inaccessible, especially if it carries encryption keys, etc. What we've seen so far is that the secure code is getting burned directly into the SoC for those reasons.
- And Secure World is there to stay.
Do we agree so far?
What I'm claiming is that adding more and more complexity to non-alterable firmware code is bad bad bad. You may repeat over and over that the ARM security model requires that complexity in the secure firmware. I never denied that fact either. But I will continue to assert that this is still the unfortunate _consequence_ of a bad architecture model and not something that should be promoted as a design feature.
The decision for adopting the ARM generic firmware or other firmware lies entirely with the ARMv8 SoC vendors and they should know better what their security needs are (or will be). It's not the role of the Linux community to mandate certain firmware. However, we *do* have the right to mandate certain standards like booting protocols, DT, ACPI etc. for code aimed to be included in mainline.
Standards are _*NOT*_ the problem. Please drop this argument.
Linux interaction with the firmware is another area which badly needs standardisation, whether it is secure firmware or not, simply because that's the first code a CPU executes when out of reset. Such standardisation is even more important in the presence of secure firmware (and given that AArch64 is new, companies will have to write new firmware and there is little legacy to carry).
What I'm saying is that _complexity_ in the firmware is _*THE*_ problem. Whether it is old or new, AArch64 or AArch32, PSCI or whatnot.
The _increasing_ interaction between any firmware/bootloader and Linux _is_ a serious problem. It is a problem because it _will_ have bugs. It _will_ have version and implementation skews. It will require _more_ coordination between different software parts beyond any standard interfaces. And that is _costly_ and a total nightmare to manage. And even more so when those bugs are in the (possibly unalterable) firmware.
You can bury your head in the sand as you wish and conveniently downplay those facts. I personally do care greatly and I do have sympathy for vendors who might wish to pursue a different path than this single everything-in-firmware solution.
The first interaction with firmware (EL3 or not, boot loader) is the booting protocol (primary and secondary CPUs). This is defined by Documentation/arm64/booting.txt and will also cover PSCI. It can be extended to other protocols in the future as long as they follow simple rules:
- Existing protocols are not feasible/sufficient (with good arguments, EFI_STUB for example)
"Good" is pretty subjective.
"I don't want complex firmware in my SoC" could be a "good" reasons according to certain point of views.
- There are no SoC or CPU implementation specific quirks required before the FDT is parsed and the (secondary) CPUs have enabled the MMU. IOW:
- caches and TLBs clean&invalidated
- full CPU/cluster coherency enabled
- errata workaround bits set
Items 1 and 2 are normally easy. Item 3 is often known only _after_ product deployment. What do you do if this is the secure firmware's responsibility? How do you manage re-certification of the secure code? How do you provide secure code updates to products in the field? What if the EL3 firmware is not alterable? Are there recommendations in ARM's plans to address this?
Power management is not covered by the above document, though there is a relation between secondary CPU booting and hotplug. To be clear, as the arm64 kernel _gatekeeper_ I set the basic rules for ARMv8 SoC power management code aimed for mainline (I'll capture them in a soc.txt document):
- If EL3 is present, standard EL3 firmware interface required
Fair enough for booting.
- New EL3 interface can be accepted if the existing interfaces are not feasible (with good arguments, it is properly documented and widely accepted)
How can an interface be widely accepted if it is not accepted first?
- CPUs coming out of power down or idle need to have all the SoC or implementation specific quirks enabled:
What if they're not all known up front the day the SoC goes into production? In practice, far more often than we would like, those quirks turn out to be errata workarounds discovered only after products are deployed.
- caches and TLBs clean & invalidated
- coherency enabled
This looks like a simple enumeration, doesn't it?
What if the above implies the equivalent of MCPM, whose complexities you said you do appreciate? What if its hardware-specific implementation (the backend in MCPM parlance) is non-trivial to implement optimally and requires updating?
- errata workaround bits set
Yada.
ARM provides PSCI as such standard API in the presence of EL3 but I'm _open_ to other _well-thought_ firmware API proposals that can gain wider acceptance.
Thank you.
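To be concrete about what "calling into the firmware" means from the kernel's side, a PSCI-style CPU_ON request essentially boils down to a single SMC trap into EL3. A minimal sketch, not any particular implementation -- the function ID below is purely illustrative, since with PSCI v0.1 the real value is advertised by the firmware through the DT:

/* Illustrative function ID only: with PSCI v0.1 the real value comes
 * from the "cpu_on" property of the psci node in the device tree. */
#define PSCI_FN_CPU_ON		0x84000003UL

static unsigned long psci_smc_call(unsigned long fn, unsigned long arg0,
				   unsigned long arg1, unsigned long arg2)
{
	register unsigned long x0 asm("x0") = fn;
	register unsigned long x1 asm("x1") = arg0;
	register unsigned long x2 asm("x2") = arg1;
	register unsigned long x3 asm("x3") = arg2;

	/* Trap into the EL3 firmware; the PSCI return code comes back in x0. */
	asm volatile("smc #0"
		     : "+r" (x0)
		     : "r" (x1), "r" (x2), "r" (x3)
		     : "memory");
	return x0;
}

/* Ask the firmware to power up the CPU identified by its MPIDR value and
 * have it enter the kernel at the given physical entry point. */
static int psci_cpu_on(unsigned long target_mpidr, unsigned long entry_point)
{
	return psci_smc_call(PSCI_FN_CPU_ON, target_mpidr, entry_point, 0);
}

Everything interesting happens on the other side of that "smc #0", which is exactly where the complexity I'm worried about ends up.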
One such alternative API might simply be a small EL3 stub whose only purpose in life is to proxy system control accesses that EL1 cannot perform otherwise, especially if there is no need for a secure OS on the system. This is likely to free some vendors from the risk of not getting their firmware right for all cases.
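To illustrate, here is a sketch of the secure-side dispatch such a stub could implement. Everything in it (the function ID, the register addresses, the names) is invented for the sake of the example, and a real stub would of course also need the EL3 vector table and world-switch glue in assembly:

#include <stdint.h>

#define STUB_FN_WRITE_PWR_REG	0x82000001UL	/* invented function ID */

/* The only secure-side registers the non-secure OS is allowed to poke
 * (addresses invented for the sake of the example). */
static volatile uint32_t *const allowed_regs[] = {
	(volatile uint32_t *)0x7fff0000UL,	/* cluster power control */
	(volatile uint32_t *)0x7fff0004UL,	/* CPU reset control */
};

/* Called from the EL3 SMC exception vector with the SMC arguments in
 * x0..x2.  Nothing else lives at EL3: no PM policy, no state machine. */
uint64_t el3_stub_smc_handler(uint64_t fn, uint64_t reg_index, uint64_t value)
{
	if (fn != STUB_FN_WRITE_PWR_REG)
		return (uint64_t)-1;			/* unknown call */

	if (reg_index >= sizeof(allowed_regs) / sizeof(allowed_regs[0]))
		return (uint64_t)-1;			/* not whitelisted */

	*allowed_regs[reg_index] = (uint32_t)value;	/* proxy the write */
	return 0;
}

Something this small is easy to audit, easy to certify once, and unlikely to ever need a field update. All the policy and sequencing stays in the kernel where it can evolve.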
Now let's be clear on what my position is: I do agree on the value of a standard booting interface for the kernel or even bootloaders, etc. This really helps in having a common distro procedure for different hardware, etc.
But once the kernel is booted, it does require hardware specific drivers to work properly in all cases. No one is ever going to accept abstracting ethernet hardware into firmware (virtual machines notwithstanding). Specialized disk arrays will also require custom drivers -- I really doubt AHCI will cut it for them all. Furthermore, improvements in kernel subsystems often imply modifications to those drivers (e.g. when NAPI was introduced), etc.
Therefore... there is no reason why _conceptually_ the same principle could not be applied to power management. If specific _drivers_ are needed to support this or that platform then this shouldn't be a problem, just like it is not a problem for ethernet interfaces. Once the kernel is booted via the standard firmware interface, then the kernel should be provided with the right modules to drive the rest of the system in the best possible way. The only reason why this wouldn't work on ARM is because of the security model.
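This is essentially what an MCPM backend already is on ARMv7. A rough sketch of the shape such a driver takes, modelled loosely on the 3.x-era arch/arm/include/asm/mcpm.h interface (only a subset of the ops is shown, details and error handling are elided, and the my_soc_pwrc_* helpers are made-up placeholders for whatever the SoC's power controller actually needs):

#include <linux/init.h>
#include <linux/smp.h>
#include <asm/mcpm.h>

/* Made-up accessors for the SoC's power controller. */
extern int my_soc_pwrc_cpu_on(unsigned int cpu, unsigned int cluster);
extern void my_soc_pwrc_cpu_off(unsigned int cpu);

static int my_soc_pm_power_up(unsigned int cpu, unsigned int cluster)
{
	/* Release the target CPU from reset via the power controller. */
	return my_soc_pwrc_cpu_on(cpu, cluster);
}

static void my_soc_pm_power_down(void)
{
	/* Flush caches and exit coherency (elided here), then ask the
	 * power controller to cut power to this CPU. */
	my_soc_pwrc_cpu_off(smp_processor_id());
}

static void my_soc_pm_powered_up(void)
{
	/* Undo any power-down bookkeeping once the CPU is back up. */
}

static const struct mcpm_platform_ops my_soc_pm_ops = {
	.power_up	= my_soc_pm_power_up,
	.power_down	= my_soc_pm_power_down,
	.powered_up	= my_soc_pm_powered_up,
};

static int __init my_soc_pm_init(void)
{
	return mcpm_platform_register(&my_soc_pm_ops);
}
early_initcall(my_soc_pm_init);

When such a backend turns out to be suboptimal or buggy, fixing it is a normal kernel update, not a firmware re-certification exercise.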
Of course it is a good idea to have DT or ACPI. Those standards are very useful for the factoring of integration differences on otherwise common hardware blocks. They are _informative_ and they allow the kernel to bypass them when they turn out to be insufficient. Firmware calls do not have that flexibility.
Hint: Linaro is a good forum for wider SoC vendor and Linux community discussions, I would expect concrete proposals rather than complaints.
They might come sooner than you'd expect.
(BTW, my impression from the last Connect was that LEG is adopting PSCI for the ACPI work)
PSCI has its place, there's no doubt about that. It can't be a one-size-fits-all solution though.
Note that the above rules don't have anything to do with MCPM. That's a SoC power driver implementation detail (and I already suggested turning it into a library if needed to avoid duplication).
I do agree. However, these are again implementation details. I'm more concerned about the larger picture.
But the above firmware API rules still apply and if PSCI is present you have the advantage of generic support in Linux.
Easier said than done. Linux code is cheaper to write and maintain than firmware code, _even_ if it has to stay out of mainline because you'd be opposing it.
(as a side note, generalising your TC2 MMC experience to _any_ firmware is unprofessional IMHO. You keep repeating it and to me it starts sounding like FUD)
I'm a pragmatist. I don't believe in magic and wishful thinking.
Ignoring reported firmware bugs for months may have its share of unprofessionalism too. But instead of going down the route of name calling, I prefer to believe that you have good reasons at ARM to explain this situation, such as resource shortage and/or higher priorities. And from experience, having worked at several different companies, I can tell you that resource shortages and priority shifts do happen everywhere.
Hence my assertion that complex firmware is not cost effective. If a simpler (aka cheaper) solution exists, you must expect vendors to embrace it, whether or not you like it.
Nicolas