On Mon, 2 Sep 2013, Catalin Marinas wrote:
(sorry if you get a duplicate email, the SMTP server I was using keeps reporting delays, so I'm resending with different settings)
This is the first time I've seen your reply.
Hi Nico,
On Sat, Aug 31, 2013 at 03:19:46AM +0100, Nicolas Pitre wrote:
On Fri, 30 Aug 2013, Catalin Marinas wrote:
My position for the arm64 kernel support is to use the PSCI and implement the cluster power synchronisation in the firmware. IOW, no MCPM in the arm64 kernel :(. To help with this, ARM is going to provide a generic firmware implementation that SoC vendors can expand for their needs.
I am open for discussing a common API that could be shared between MCPM-based code and the PSCI one. But I'm definitely not opting for a light-weight PSCI back-end to a heavy-weight MCPM implementation.
Also note that IKS won't be supported on arm64.
I find the above statements a bit rigid.
While I think the reasoning behind PSCI is sound, I suspect some people will elect not to add it to their system, either because the hardware doesn't support all the necessary privilege levels, or simply because they prefer the easiest solution in terms of maintenance and upgradability, which means putting the kernel in charge. And that may imply MCPM. I know that ARM would like to see PSCI adopted everywhere but I doubt it'll be easy.
I agree PSCI is not always possible because of technical reasons (missing certain exception levels), so let me refine the above statement: if EL3 mode is available on a CPU implementation, PSCI must be supported (which I think is the case for Marvell).
I may agree to that.
It indeed sounds rigid but what's the point of trying to standardise something if you don't enforce it? People usually try to find the easiest path for them which may not necessarily be the cleanest longer term.
I'll say this only once, and then I'll shut up on this very issue.
<<< Beginning of my editorial on firmware >>>
The firmware _interface_ standard is not a problem. So this is not about PSCI in particular. I'm glad I was amongst the early reviewers of the specification and therefore I would be the first to blame if the interface looked wrong to me. In fact there was something in the very first draft that did look wrong to me, and I'm glad it was removed from subsequent revisions. This is just to say that I was involved with PSCI to some degree.
The problem is in the _implementation_ of the firmware. Or more precisely its _maintenance_.
Because, as experience has shown, firmware is always *bad* and *buggy* at some point. No matter what you say, someone somewhere will screw it up. And we'll be stuck with it.
You only need to look at the abomination the BIOS on X86 can be. Or buggy ACPI tables requiring all sorts of quirks in the kernel. Or the catastrophe the APM BIOS was before it was later replaced by ACPI.
You might say: this is X86 and on ARM we can do better. Yeah... well... Look at our bootloader story for example. How many people are simply unable to upgrade their bootloader in order to be DT compliant? Or how many OEMs are too scared to let end users upgrade their bootloaders?
Upgrading system firmware is not much different from upgrading a bootloader. Something goes wrong during the upgrade and you have a nice paperweight. So OEMs simply won't be forthcoming with firmware upgrades as this is a huge risk with significant support costs when this goes wrong. And if the firmware is responsible for the secure world as well then you have QA and certification costs to consider, etc.
And even if the upgrade is easy, the OEM might not necessarily be that interested in fixing buggy firmware, for many reasons. A good example of that is the very Versatile Express for which we (Linaro) reported issues with the MMC slot many months ago and no fix has come back yet:
https://bugs.launchpad.net/linaro-big-little-system/+bug/1166246
Achin was experimenting with a modified firmware before priorities shifted and people at ARM moved around, and then nothing. OEMs will have priority shifts as well -- this is the hard reality of this business. This means unfixed firmware for older devices that users can't fix, unlike with kernel solutions.
You might say: We'll then help people make their firmware perfect from the start. Well... I have enough software engineering experience to know this is simply impossible. Software by definition is always going to have bugs. Especially in the market ARM is after where time to market is paramount. So pressure to have the software done quickly is real. And even if you provide an example implementation for customers to use and extend, then they'll extend it all right and introduce new bugs, possibly even breaking the previously working generic code.
Heck, I've privately reviewed a few MCPM backends so far, and they all had serious flaws. The kernel has the advantage that any extension is reviewed by people from the outside. Example firmware being extended by ARM partners will not be. And I don't see why those people wouldn't make the same kind of mistakes in their forks of the generic firmware implementation. This stuff is simply too complex to always be right on the first attempt, even when testing suggests it is.
So no, I don't have this faith in firmware like you do. And I've discussed this with you in the past as well, and my mind on this issue has not changed. In particular, when you give a presentation on ARMv8 and your answer to a question from the audience is: "firmware will take care of this" I may only think you'll regret this answer one day.
As a past example, how many SoC maintainers were willing to convert their code to DT if there wasn't a clear statement from arm-soc guys mandating this?
This is a wrong comparison. DT is more or less a more structured way to describe hardware. Just like ACPI. The kernel is not *forced* to follow DT information if that turns out to be wrong. And again experience has shown that this does happen.
Firmware calls such as those provided by the PC BIOS, APM, UEFI runtime services or PSCI are black boxes you have to trust, while the code being called doesn't know what is really happening on the system.
Oh and by the way people are now realizing all the issues with DT where the maintainers want only stable final bindings but developers can't tell when a binding is final unless it has seen wide testing. So DT is not the greatest thing since sliced bread anymore.
But let's get back to firmware. Delegating power management to firmware is a completely different type of contract. In that case you're moving a critical functionality with potentially significant algorithmic complexity (as evidenced by the MCPM development effort) out of the kernel's control. The more functionality and complexity you delegate to the firmware, the greater the risk you'll end up with a crippled system the kernel will have to cope with eventually. Because this will happen at some point, there is no doubt about that.
Now what is the piece of software people are not afraid of upgrading? Where is it easier to upgrade functionality beyond the initial shipping state because time to market pressure drove products on the street before the software was ready? Yes, it is the kernel. And that's exactly what IKS was about: being able to drive a big.LITTLE system early on before the scheduler was properly enhanced for big.LITTLE. And only a simple kernel upgrade is needed to move from IKS to full MP.
Think for a minute of when the kernel used to call the APM BIOS for power management in the old days, and then the kernel was enhanced to be preemptive. APM was not re-entrant and therefore conceptually incompatible with that new kernel context at the time. Obviously things were even more difficult when SMP showed up. And just ask the X86 maintainers what they think about firmware-induced NMIs *today*.
So, is e.g. PSCI compatible with Linux-RT now? Is there another kernel execution model innovation coming up for which the firmware will be an obstacle on ARM?
I do understand that the way the ARM architecture is designed, especially the latest revisions with TZ and all, does require some sort of firmware. And since there is no other way but to have it, then of course a standardized firmware interface is a must. I'm not disputing this at all.
I'm just extremely worried about the simple presence of firmware and the need to call into it for complex operations such as intelligent power management, when the kernel would be a much better location to perform such operations. The kernel is also the only place where those things may be improved over time, even for free from third parties, whereas the firmware is going to be locked down with no possible evolution beyond its shipping state.
For example, after working on MCPM for a while, I do have ideas for additional performance and power usage optimization. But if that functionality is getting burned into firmware controlled by secure world then there is just no hope for _me_ to optimize things any further.
Not that I can do anything about it anyway. But I had to vent my discomfort about anything firmware-like. Yet I was the one who suggested to Will Deacon and Marc Zyngier that they should use PSCI to control KVM instances instead of their ad hoc interface. However, if someone has a legitimate case for _not_ using firmware calls and using machine-specific extensions in the kernel instead, then I'll support them.
<<< End of my editorial on firmware >>>
I don't have a problem with MCPM itself. It is nicely written and can probably be abstracted into a library. My concern is how MCPM is going to be used by the SoC code. The example I have so far is dcscb.c and I really want such code moved out of the kernel into firmware (or even better, under hardware control if available). It is highly dependent on the SoC and CPU implementation, setting/clearing bits like ACTLR.SMP which is not possible in non-secure mode. I'm pretty sure that such code will end up doing non-standard SMC calls to the secure firmware and we lose any hope of standardisation.
Standardization and innovation are often opposing each other. And yes that sucks.
The PSCI API can evolve in time (it has version information) but so far there is no such thing as "lightweight" PSCI that could be used as an MCPM back-end.
Such a thing does exist, written by Achin Gupta. His 6th version is sitting in my mbox and is dated 12 Mar 2013.
For example, MCPM provides callbacks into the platform code when a CPU goes down to disable coherency, flush caches etc. and this code must call back into the MCPM to complete the CPU tear-down. If you want such thing, you need a different PSCI specification.
Hmmm... The above statement makes no sense to me. Sorry I must have missed something.
Nicolas