Re: [Question] MCPM Supporting For ARM64

3 Sep 2013

On Tue, 3 Sep 2013, Catalin Marinas wrote:
...
On Tue, Sep 03, 2013 at 05:16:17AM +0100, Nicolas Pitre wrote:
...
I'll say this only once, and then I'll shut up on this very issue.
I actually enjoy this discussion ;)
OK then.  That's fine by me.
...
...
So no, I don't have this faith in firmware like you do.  And I've
discussed this with you in the past as well, and my mind on this issue
has not changed.  In particular, when you give a presentation on ARMv8
and your answer to a question from the audience is: "firmware will take
care of this" I may only think you'll regret this answer one day.
Well, I don't trust firmware either and I've seen it causing hard to
debug issues in the past. But actually a better statement would be that
I don't trust any software (not even Linux) unless I can see the source.
The big advantage Linux has is that it requires (well, unless you are
NVidia ;)) opening the code and more people look at it.
I can't agree more.
...
My hope for the generic firmware is that it will turn into a proper
open-source project (with a friendly license). This would make it
different from old BIOS implementations.
I'm afraid this won't happen, unfortunately.  "Friendly license" really 
depends who you talk to.  Many people in the industry don't consider the 
GPL as very "friendly" because of the source code requirement.  They 
much prefer the BSD license which is in their position much friendlier.
And if vendors are not constrained to provide their source code, they 
simply won't.  Look at all the high profile BSD derrived deployments to 
see that they've all gone closed source.
...
Of course, bugs can happen and the firmware is harder to update but
really not impossible.
Sure.  Same can be said about bootloaders today.  Yet they simply are 
not updated in most cases because it is less risk to work around them.
...
...
...
As a past example, how many SoC maintainers were willing to
convert their code to DT if there wasn't a clear statement from arm-soc
guys mandating this?
This is a wrong comparison.  DT is more or less a more structured way to
describe hardware.  Just like ACPI.  The kernel is not *forced* to
follow DT information if that turns out to be wrong.  And again
experience has shown that this does happen.
The comparison is not meant for DT vs PSCI but an example that even
though DT has benefits and less risks, people didn't rush into adopting
it unless it was mandated for new platforms. For arm64 we try not to get
SoC code under arch/arm64/ (mach-virt like approach) but I still get
people asking in private about copying code into arch/arm64/mach-*
directories for the same easy path reasons.
That's fine, obviously.
But reality will mess that up somewhat eventually.  You'll have to 
accept machine specific quirks for firmware bugs because the firmware 
update is not forthcoming if at all.  Same story everywhere.
...
...
But let's get back to firmware. Delegating power management to firmware
is a completely different type of contract.  In that case you're moving
a critical functionality with potentially significant algorithmic
complexity (as evidenced by the MCPM development effort) out of the
kernel's control.  The more functionality and complexity you delegate to
the firmware, the greater the risk you'll end up with a crippled system
the kernel will have to cope with eventually.  Because this will happen
at some point there is no doubt about that.
I agree with your arguments that Linux is more flexible and easily
upgradable. However, the point I want to emphasize is that unless Linux
is playing all the roles of firmware/secure/non-secure code, you must
have firmware and calls to it from the non-secure OS. On ARMv8, EL3 is
the first mode CPU is running out of reset and it needs to get back to
this mode for power-related (e.g. coherency) settings. Whether we (Linux
people) like it or not, that's the reality.
I know, I know...  From my own point of view this is rather sad.
...
How much you leave on the secure or the non-secure side goes through a
great deal of discussions/reviews both with the general purpose OS and
secure OS people, hence the creation of PSCI (really, it's not just
about standardising the SoC support on arm64 Linux).
Again I have absolutely nothing against PSCI as in the interface 
specification.
[...]
...
...
Is there another kernel execution model innovation coming up for which
the firmware will be an obstacle on ARM?
Firmwares evolve in time but as I said, even if we would like to, we
can't eliminate them.
...
I'm just extremely worried about the simple presence of firmware and the
need to call into it for complex operations such as intelligent power
management when the kernel would be a much better location to perform
such operations.  The kernel is also the only place where those things
may be improved over time, even for free from third parties, , whereas
the firmware is going to be locked down with no possible evolution
beyond its shipping state.
For example, after working on MCPM for a while, I do have ideas for
additional performance and power usage optimization.  But if that
functionality is getting burned into firmware controlled by secure world
then there is just no hope for _me_ to optimize things any further.
I don't dispute the above but I don't have a better solution that would
accommodate secure/non-secure worlds.
So if a conclusion can be drawn out of this is that we're in agreement 
here.
It is on the firmware implementation front that we probably have 
diverging expectations.
...
With proper education, SoC vendors can learn to allow upgradable 
(parts of) firmware.
Is this education in ARM's plans?  Is someone working on recommendations 
about proper design for fail-safe firmware upgrades via separate 
firmware components?
If there is anything I might have contributed to the the DT on ARM story 
it is the insistence for the need to be able to update the DTB 
separately from the bootloader, and so by end users without special 
equipment.  Given OEM's reticense to let users perform bootloader 
upgrades, it was important that the DTB provided alongside the 
bootloader doesn't get the same treatment.
A broken bootloader is bad but once the kernel has booted the bootloader 
is out of the way.  A broken DTB is a real PITA since that is what the 
kernel has to work with.  Same goes for firmware.
So if firmware is unavoidable in ARM's future, at least there must be 
some contingency plans to mitigate the unavoidable buggy firmware issues 
to come.
...
But if you have a better proposal and can get all the parts (including 
secure OS people) to agree, I'm open to it.
I think that Linux has gained its dominant position for one fundamental 
reason: source availability.  No company could ever match the work force 
that gathered around that source code.  If secure OS people would agree 
to this principle then things could end up more secure and more 
efficient.  But that's not something I have influence over.
...
...
...
For example, MCPM provides callbacks into the platform
code when a CPU goes down to disable coherency, flush caches etc. and
this code must call back into the MCPM to complete the CPU tear-down. If
you want such thing, you need a different PSCI specification.
Hmmm... The above statement makes no sense to me.  Sorry I must have
missed something.
OK, let me clarify. I took the dcscb.c example for CPU going down
(please correct me if I got it wrong):
mcpm_cpu_power_down()
  dcscb_power_down()
    flush_cache_all()				- both secure and non-secure
    set_auxcr()					- secure-only
    cci_disable_port_by_cpu()			- secure-only?
    (__mcpm_outbound_leave_critical())
    __mcpm_cpu_down()
    wfi()
So the dcscb back-end calls back into MCPM to update the state machine.
If you are to run Linux in non-secure mode, set_auxcr() and CCI would
need secure access. Cache flushing also needs to affect the secure
cachelines and disable the caches on the secure side. So from the
flush_cache_all() point, you need to get into EL3. But MCPM requires (as
I see in the dcscb back-end) a return from such EL3 code to call
__mcpm_cpu_down() and do WFI on the non-secure side. This is
incompatible with the (current) PSCI specification.
The MCPM backend doesn't _need_ to call __mcpm_cpu_down() and friends.  
Those are helpers for when there is no firmware and proper 
synchronization needs to be done between different cores.
If you have PSCI then the MCPM call graph is roughly:
mcpm_cpu_power_down()
    psci_power_down()
    	psci_ops.cpu_off(power_state)
That's it.  Nothing has to call back into the kernel.
Of course this is missing many details.  See below for the full patch.
Please note that this might not apply on top of current mainline and it 
shouldn't have to be TC2 specific.  But that should give you a good idea 
of what a PSCI backend for MCPM entails.
----- >8
Date: Tue, 12 Mar 2013 15:41:57 +0000
From: Achin Gupta achin.gupta@arm.com
To: dave.martin@arm.com, nicolas.pitre@linaro.org, charles.garcia-tobin@arm.com
X-Mailer: git-send-email 1.7.9.5
Message-ID: 1363102919-27822-15-git-send-email-achin.gupta@arm.com
Subject: [RFC PATCH v6 13/15] ARM: vexpress: add shim layer for psci backend on TC2
This patch introduces a shim layer for the TC2 platform which converts
'bL_platform_power_ops' routines to their psci counterparts. The psci
counterparts are implemented by the secure firmware. The shim layer
is used only when Linux is running in non-secure world and the secure
firmware implements psci.
It also introduces the use of a reference count to allow a power up call
to go ahead of a power down call.
Change-Id: I6a582bfac9aa7dc2f3e6ef0aaa71a0036457311f
Signed-off-by: Achin Gupta achin.gupta@arm.com
---
 arch/arm/mach-vexpress/Makefile      |    4 +
 arch/arm/mach-vexpress/tc2_pm_psci.c |  179 ++++++++++++++++++++++++++++++++++
 2 files changed, 183 insertions(+)
 create mode 100644 arch/arm/mach-vexpress/tc2_pm_psci.c

diff --git a/arch/arm/mach-vexpress/Makefile b/arch/arm/mach-vexpress/Makefile
index ff2ba26..3822184 100644
--- a/arch/arm/mach-vexpress/Makefile
+++ b/arch/arm/mach-vexpress/Makefile
@@ -10,6 +10,10 @@ obj-$(CONFIG_ARCH_VEXPRESS_DCSCB)	+= dcscb.o	dcscb_setup.o
 CFLAGS_REMOVE_dcscb.o			= -pg
 obj-$(CONFIG_ARCH_VEXPRESS_TC2)		+= tc2_pm.o tc2_pm_setup.o
 CFLAGS_REMOVE_tc2_pm.o			= -pg
+ifeq ($(CONFIG_ARCH_VEXPRESS_TC2),y)
+obj-$(CONFIG_ARM_PSCI)			+= tc2_pm_psci.o
+CFLAGS_REMOVE_tc2_pm_psci.o		= -pg
+endif
 obj-$(CONFIG_SMP)			+= platsmp.o
 obj-$(CONFIG_HOTPLUG_CPU)		+= hotplug.o
 obj-$(CONFIG_VEXPRESS_TC2_CPUIDLE)	+= cpuidle-tc2.o
diff --git a/arch/arm/mach-vexpress/tc2_pm_psci.c b/arch/arm/mach-vexpress/tc2_pm_psci.c
new file mode 100644
index 0000000..c9715b8
--- /dev/null
+++ b/arch/arm/mach-vexpress/tc2_pm_psci.c
@@ -0,0 +1,179 @@
+/*
+ * arch/arm/mach-vexpress/tc2_pm_psci.c - TC2 PSCI support
+ *
+ * Created by: Achin Gupta, December 2012
+ * Copyright:  (C) 2012  ARM Limited
+ *
+ * Some portions of this file were originally written by Nicolas Pitre
+ * Copyright:   (C) 2012  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <linux/errno.h>
+
+#include <asm/bL_entry.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+#include <asm/psci.h>
+#include <asm/atomic.h>
+#include <asm/cputype.h>
+
+#include <mach/motherboard.h>
+#include <mach/tc2.h>
+
+#include <linux/vexpress.h>
+
+/*
+ * Platform specific state id understood by the firmware and used to
+ * program the power controller
+ */
+#define PSCI_POWER_STATE_ID           0
+
+static atomic_t tc2_pm_use_count[TC2_MAX_CPUS][TC2_MAX_CLUSTERS];
+
+static int tc2_pm_psci_power_up(unsigned int cpu, unsigned int cluster)
+{
+	unsigned int mpidr = (cluster << 8) | cpu;
+	int ret = 0;
+
+	BUG_ON(!psci_ops.cpu_on);
+
+	switch (atomic_inc_return(&tc2_pm_use_count[cpu][cluster])) {
+	case 1:
+		/*
+		 * This is a request to power up a cpu that linux thinks has
+		 * been powered down. Retries are needed if the firmware has
+		 * seen the power down request as yet.
+		 */
+		do
+			ret = psci_ops.cpu_on(mpidr,
+					      virt_to_phys(bL_entry_point));
+		while (ret == -EAGAIN);
+
+		return ret;
+	case 2:
+		/* This power up request has overtaken a power down request */
+		return ret;
+	default:
+		/* Any other value is a bug */
+		BUG();
+	}
+}
+
+static void tc2_pm_psci_power_down(void)
+{
+	struct psci_power_state power_state;
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	BUG_ON(!psci_ops.cpu_off);
+
+	switch (atomic_dec_return(&tc2_pm_use_count[cpu][cluster])) {
+	case 1:
+		/*
+		 * Overtaken by a power up. Flush caches, exit coherency,
+		 * return & fake a reset
+		 */
+		asm volatile (
+			"mrc	p15, 0, ip, c1, c0, 0 \n\t"
+			"bic	ip, ip, #(1 << 2) @ clear C bit \n\t"
+			"mcr	p15, 0, ip, c1, c0, 0 \n\t"
+			"dsb \n\t"
+			"isb"
+			: : : "ip" );
+
+		flush_cache_louis();
+
+		asm volatile (
+			"clrex	\n\t"
+			"mrc	p15, 0, ip, c1, c0, 1 \n\t"
+			"bic	ip, ip, #(1 << 6) @ clear SMP bit \n\t"
+			"mcr	p15, 0, ip, c1, c0, 1 \n\t"
+			"isb \n\t"
+			"dsb"
+			: : : "ip" );
+
+		return;
+	case 0:
+		/* A normal request to possibly power down the cluster */
+		power_state.id = PSCI_POWER_STATE_ID;
+		power_state.type = PSCI_POWER_STATE_TYPE_POWER_DOWN;
+		power_state.affinity_level = PSCI_POWER_STATE_AFFINITY_LEVEL1;
+
+		psci_ops.cpu_off(power_state);
+
+		/* On success this function never returns */
+	default:
+		/* Any other value is a bug */
+		BUG();
+	}
+}
+
+static void tc2_pm_psci_suspend(u64 unused)
+{
+	struct psci_power_state power_state;
+
+	BUG_ON(!psci_ops.cpu_suspend);
+
+	/* On TC2 always attempt to power down the cluster */
+	power_state.id = PSCI_POWER_STATE_ID;
+	power_state.type = PSCI_POWER_STATE_TYPE_POWER_DOWN;
+	power_state.affinity_level = PSCI_POWER_STATE_AFFINITY_LEVEL1;
+
+	psci_ops.cpu_suspend(power_state, virt_to_phys(bL_entry_point));
+
+	/* On success this function never returns */
+	BUG();
+}
+
+static const struct bL_platform_power_ops tc2_pm_power_ops = {
+	.power_up      = tc2_pm_psci_power_up,
+	.power_down    = tc2_pm_psci_power_down,
+	.suspend       = tc2_pm_psci_suspend,
+};
+
+static void __init tc2_pm_usage_count_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cluster >= TC2_MAX_CLUSTERS ||
+	       cpu >= vexpress_spc_get_nb_cpus(cluster));
+
+	atomic_set(&tc2_pm_use_count[cpu][cluster], 1);
+}
+
+static int __init tc2_pm_psci_init(void)
+{
+	int ret;
+
+	ret = psci_probe();
+	if (ret) {
+		pr_debug("psci not found. Aborting psci init\n");
+		return -ENODEV;
+	}
+
+	tc2_pm_usage_count_init();
+
+	ret = bL_platform_power_register(&tc2_pm_power_ops);
+	if (!ret)
+		ret = bL_cluster_sync_init(NULL);
+	if (!ret)
+		pr_info("TC2 power management initialized\n");
+	return ret;
+}
+
+early_initcall(tc2_pm_psci_init);
-- 
1.7.9.5





    

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Re: [Question] MCPM Supporting For ARM64