Hi,
This is v8 of the ACPI core patches for ARM64, based on ACPI 5.1. Updates since v7:
- Add two more documents: one by Grant explaining why we need ACPI on ARM64 servers, and one by Al Stone with recommendations and prohibitions on the use of the numerous ACPI tables and objects.
- Add two patches which are needed to map ACPI tables after acpi_gbl_permanent_mmap is set.
- Add another patch, "dt / chosen: Add linux,uefi-stub-generated-dtb property", so that if firmware provides no dtb, we can try ACPI configuration data even if "acpi=force" is not passed in the early parameters. (I think ACPI for Xen and kexec needs to be considered separately, as discussed; correct me if I'm wrong.)
- Add CCs for the subsystem maintainers to the patches and modify the patch subjects to explicitly show the subsystem each one touches. Please help us review and ack them if they make sense, thanks.
- Add Tested-by from Qualcomm and Red Hat;
- Make ACPI depend on PCI, as suggested by Catalin;
- Clean up the SMP init function as Lorenzo suggested, and remove the physical CPU hot-plug code from the patch;
- Address some comments from Marc and explicitly state that we will implement stacked irqdomain and a GIC init framework when GICv3, ITS and GICv2m are implemented;
- Rebased on top of 3.19-rc7.
Previous versions are here:
v7: https://lkml.org/lkml/2015/1/14/586
v6: https://lkml.org/lkml/2015/1/4/40
Any comments are welcome :)
Thanks
Hanjun
Al Stone (4):
  ARM64 / ACPI: Get RSDP and ACPI boot-time tables
  ARM64 / ACPI: Introduce early_param for "acpi" and pass acpi=force to enable ACPI
  ARM64 / ACPI: Select ACPI_REDUCED_HARDWARE_ONLY if ACPI is enabled on ARM64
  arm64: ACPI: additions of ACPI documentation for arm64

Graeme Gregory (6):
  acpi: add arm64 to the platforms that use ioremap
  ACPI / sleep: Introduce sleep_arm.c
  ARM64 / ACPI: If we chose to boot from acpi then disable FDT
  ARM64 / ACPI: Get PSCI flags in FADT for PSCI init
  ARM64 / ACPI: Enable ARM64 in Kconfig
  Documentation: ACPI for ARM64

Hanjun Guo (8):
  ARM64 / ACPI: Introduce PCI stub functions for ACPI
  dt / chosen: Add linux,uefi-stub-generated-dtb property
  ARM64 / ACPI: Disable ACPI if FADT revision is less than 5.1
  ACPI / table: Print GIC information when MADT is parsed
  ARM64 / ACPI: Parse MADT for SMP initialization
  ACPI / processor: Make it possible to get CPU hardware ID via GICC
  ARM64 / ACPI: Introduce ACPI_IRQ_MODEL_GIC and register device's gsi
  clocksource / arch_timer: Parse GTDT to initialize arch timer

Mark Salter (2):
  acpi: fix acpi_os_ioremap for arm64
  arm64: allow late use of early_ioremap

Tomasz Nowicki (1):
  irqchip: Add GICv2 specific ACPI boot support
 Documentation/arm/uefi.txt | 3 +
 Documentation/arm64/acpi_object_usage.txt | 592 ++++++++++++++++++++++++++++++
 Documentation/arm64/arm-acpi.txt | 506 +++++++++++++++++++++++++
 Documentation/arm64/why_use_acpi.txt | 231 ++++++++++++
 Documentation/kernel-parameters.txt | 3 +-
 arch/arm64/Kconfig | 3 +
 arch/arm64/include/asm/acenv.h | 18 +
 arch/arm64/include/asm/acpi.h | 103 ++++++
 arch/arm64/include/asm/cpu_ops.h | 1 +
 arch/arm64/include/asm/fixmap.h | 3 +
 arch/arm64/include/asm/pci.h | 6 +
 arch/arm64/include/asm/psci.h | 3 +-
 arch/arm64/include/asm/smp.h | 5 +-
 arch/arm64/kernel/Makefile | 1 +
 arch/arm64/kernel/acpi.c | 362 ++++++++++++++++++
 arch/arm64/kernel/cpu_ops.c | 2 +-
 arch/arm64/kernel/pci.c | 25 ++
 arch/arm64/kernel/psci.c | 78 ++--
 arch/arm64/kernel/setup.c | 58 ++-
 arch/arm64/kernel/smp.c | 2 +-
 arch/arm64/kernel/time.c | 7 +
 drivers/acpi/Kconfig | 3 +-
 drivers/acpi/Makefile | 4 +
 drivers/acpi/bus.c | 3 +
 drivers/acpi/osl.c | 6 +-
 drivers/acpi/processor_core.c | 37 ++
 drivers/acpi/sleep_arm.c | 28 ++
 drivers/acpi/tables.c | 43 +++
 drivers/clocksource/arm_arch_timer.c | 132 +++++--
 drivers/firmware/efi/libstub/fdt.c | 8 +
 drivers/irqchip/irq-gic.c | 102 +++++
 drivers/irqchip/irqchip.c | 3 +
 include/acpi/acpi_io.h | 6 +
 include/linux/acpi.h | 16 +
 include/linux/clocksource.h | 6 +
 include/linux/irqchip/arm-gic-acpi.h | 31 ++
 36 files changed, 2374 insertions(+), 66 deletions(-)
 create mode 100644 Documentation/arm64/acpi_object_usage.txt
 create mode 100644 Documentation/arm64/arm-acpi.txt
 create mode 100644 Documentation/arm64/why_use_acpi.txt
 create mode 100644 arch/arm64/include/asm/acenv.h
 create mode 100644 arch/arm64/include/asm/acpi.h
 create mode 100644 arch/arm64/kernel/acpi.c
 create mode 100644 drivers/acpi/sleep_arm.c
 create mode 100644 include/linux/irqchip/arm-gic-acpi.h
From: Graeme Gregory graeme.gregory@linaro.org
Now with the base changes to the arm memory mapping it is safe to convert to using ioremap to map in the tables after acpi_gbl_permanent_mmap is set.
CC: Rafael J Wysocki rjw@rjwysocki.net
Signed-off-by: Al Stone al.stone@linaro.org
Signed-off-by: Graeme Gregory graeme.gregory@linaro.org
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
---
 drivers/acpi/osl.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index f9eeae8..39748bb 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -336,11 +336,11 @@ acpi_map_lookup_virt(void __iomem *virt, acpi_size size)
 	return NULL;
 }

-#ifndef CONFIG_IA64
-#define should_use_kmap(pfn) page_is_ram(pfn)
-#else
+#if defined(CONFIG_IA64) || defined(CONFIG_ARM64)
 /* ioremap will take care of cache attributes */
 #define should_use_kmap(pfn) 0
+#else
+#define should_use_kmap(pfn) page_is_ram(pfn)
 #endif

 static void __iomem *acpi_map(acpi_physical_address pg_off, unsigned long pg_sz)
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net
Signed-off-by: Mark Salter msalter@redhat.com
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
---
 include/acpi/acpi_io.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h
index 444671e..9d573db 100644
--- a/include/acpi/acpi_io.h
+++ b/include/acpi/acpi_io.h
@@ -1,11 +1,17 @@
 #ifndef _ACPI_IO_H_
 #define _ACPI_IO_H_

+#include <linux/mm.h>
 #include <linux/io.h>

 static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
 					    acpi_size size)
 {
+#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
+
 	return ioremap_cache(phys, size);
 }
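To make the decision in this patch concrete, here is a minimal userspace sketch of the same logic. It is a model, not kernel code: model_is_ram(), model_acpi_os_ioremap(), the map_kind enum and the memory map are all hypothetical stand-ins for page_is_ram(), acpi_os_ioremap() and the real ioremap()/ioremap_cache() calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the two mapping flavours discussed above. */
enum map_kind {
	MAP_DEVICE_UNCACHED,	/* stands in for ioremap() */
	MAP_NORMAL_CACHED,	/* stands in for ioremap_cache() */
};

struct ram_range {
	uint64_t start;		/* inclusive */
	uint64_t end;		/* exclusive */
};

/* Illustrative memory map (an assumption): RAM from 2 GiB up to 4 GiB. */
static const struct ram_range ram_map[] = {
	{ 0x80000000ULL, 0x100000000ULL },
};

/* Userspace model of page_is_ram(): is this address backed by RAM? */
static bool model_is_ram(uint64_t phys)
{
	for (size_t i = 0; i < sizeof(ram_map) / sizeof(ram_map[0]); i++) {
		assert(ram_map[i].start < ram_map[i].end);
		if (phys >= ram_map[i].start && phys < ram_map[i].end)
			return true;
	}
	return false;
}

/* Same decision as the patch: non-RAM must not get a cacheable mapping. */
static enum map_kind model_acpi_os_ioremap(uint64_t phys)
{
	if (!model_is_ram(phys))
		return MAP_DEVICE_UNCACHED;
	return MAP_NORMAL_CACHED;
}
```

Calling model_acpi_os_ioremap() with an address inside the RAM range yields the cached flavour, and any other address yields the device flavour.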
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
include/acpi/acpi_io.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..9d573db 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -1,11 +1,17 @@ #ifndef _ACPI_IO_H_ #define _ACPI_IO_H_ +#include <linux/mm.h> #include <linux/io.h> static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { +#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
There are multiple examples of how things like this are done. Generally, the logic is "If the architecture provides its own function for this, use that one, or use the generic one provided here otherwise."
return ioremap_cache(phys, size);
}
On 2015年02月03日 06:14, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
include/acpi/acpi_io.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..9d573db 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -1,11 +1,17 @@ #ifndef _ACPI_IO_H_ #define _ACPI_IO_H_
+#include <linux/mm.h> #include <linux/io.h>
static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { +#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
There are multiple examples of how things like this are done. Generally, the logic is "If the architecture provides its own function for this, use that one, or use the generic one provided here otherwise."
OK. I think weak function would work.
Thanks Hanjun
On Tue, Feb 03, 2015 at 09:08:42AM +0000, Hanjun Guo wrote:
On 2015年02月03日 06:14, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
include/acpi/acpi_io.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..9d573db 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -1,11 +1,17 @@ #ifndef _ACPI_IO_H_ #define _ACPI_IO_H_
+#include <linux/mm.h> #include <linux/io.h>
static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { +#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
There are multiple examples of how things like this are done. Generally, the logic is "If the architecture provides its own function for this, use that one, or use the generic one provided here otherwise."
OK. I think weak function would work.
Probably not in a header file. It's better to define acpi_os_ioremap() in an arm64 kernel file, together with something like:
#define ARCH_HAS_ACPI_OS_IOREMAP
and the corresponding #ifdef's in the acpi_io.h file.
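The guard described here could look roughly like the sketch below. Both "headers" are collapsed into one file so that it compiles on its own; ARCH_HAS_ACPI_OS_IOREMAP comes from the suggestion above, while model_acpi_os_ioremap(), mapping_is_cached() and the string return values are hypothetical placeholders for the real mapping calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* --- what an arch header might provide (assumption) --- */
#define ARCH_HAS_ACPI_OS_IOREMAP

static inline const char *model_acpi_os_ioremap(bool is_ram)
{
	/* arm64-style policy: only RAM gets a cacheable mapping */
	return is_ram ? "cached" : "device";
}

/* --- what the generic header would do --- */
#ifndef ARCH_HAS_ACPI_OS_IOREMAP
static inline const char *model_acpi_os_ioremap(bool is_ram)
{
	(void)is_ram;
	return "cached";	/* generic default: always cacheable */
}
#endif

/* Callers look the same whichever definition was selected. */
static inline bool mapping_is_cached(bool is_ram)
{
	const char *kind = model_acpi_os_ioremap(is_ram);

	assert(kind != NULL);
	return strcmp(kind, "cached") == 0;
}
```

Removing the ARCH_HAS_ACPI_OS_IOREMAP define in this sketch makes the generic definition the only one compiled, which is the fallback behaviour the generic header is meant to keep.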
On arm64, could we make this function call ioremap() (nocache) all the time? We need to clarify the contexts where this is used in the core ACPI code. The acpi_map() function, for example, checks if the page is RAM and does a kmap(). Do we need to handle the NVS on arm64? AFAICT, we don't even compile drivers/acpi/sleep.c in.
Are there other cases where acpi_os_ioremap() is called directly and it needs a cacheable mapping?
On 3 February 2015 at 11:37, Catalin Marinas catalin.marinas@arm.com wrote:
On Tue, Feb 03, 2015 at 09:08:42AM +0000, Hanjun Guo wrote:
On 2015年02月03日 06:14, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
include/acpi/acpi_io.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..9d573db 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -1,11 +1,17 @@ #ifndef _ACPI_IO_H_ #define _ACPI_IO_H_
+#include <linux/mm.h> #include <linux/io.h>
static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { +#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
There are multiple examples of how things like this are done. Generally, the logic is "If the architecture provides its own function for this, use that one, or use the generic one provided here otherwise."
OK. I think weak function would work.
Probably not in a header file. It's better to define acpi_os_ioremap() in an arm64 kernel file, together with something like:
#define ARCH_HAS_ACPI_OS_IOREMAP
and the corresponding #ifdef's in the acpi_io.h file.
On arm64, could we make this function call ioremap() (nocache) all the time? We need to clarify the contexts where this is used in the core ACPI code. The acpi_map() function, for example, checks if the page is RAM and does a kmap(). Do we need to handle the NVS on arm64? AFAICT, we don't even compile drivers/acpi/sleep.c in.
Are there other cases where acpi_os_ioremap() is called directly and it needs a cacheable mapping?
The logic behind acpi_os_ioremap() could be based on the physmem series I am preparing for the 3.21 timeframe. It allows us to classify physical ranges as backed by RAM or not, and call the appropriate flavor of ioremap().
On Mon, 2015-02-02 at 23:14 +0100, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
include/acpi/acpi_io.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..9d573db 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -1,11 +1,17 @@ #ifndef _ACPI_IO_H_ #define _ACPI_IO_H_ +#include <linux/mm.h> #include <linux/io.h> static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { +#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
How about something like:
From: Mark Salter msalter@redhat.com
Date: Tue, 3 Feb 2015 10:51:16 -0500
Subject: [PATCH] acpi: fix acpi_os_ioremap for arm64
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net
Signed-off-by: Mark Salter msalter@redhat.com
---
 arch/arm64/include/asm/acpi.h | 14 ++++++++++++++
 include/acpi/acpi_io.h | 3 +++
 2 files changed, 17 insertions(+)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index ea4d2b3..db82bc3 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -14,6 +14,7 @@

 #include <linux/irqchip/arm-gic-acpi.h>

+#include <linux/mm.h>
 #include <asm/smp_plat.h>

 /* Basic configuration for ACPI */
@@ -100,4 +101,17 @@ static inline bool acpi_psci_use_hvc(void) { return false; }
 static inline void acpi_init_cpus(void) { }
 #endif /* CONFIG_ACPI */

+/*
+ * ACPI table mapping
+ */
+static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
+					    acpi_size size)
+{
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+
+	return ioremap_cache(phys, size);
+}
+#define acpi_os_ioremap acpi_os_ioremap
+
 #endif /*_ASM_ACPI_H*/
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h
index 444671e..48f504a 100644
--- a/include/acpi/acpi_io.h
+++ b/include/acpi/acpi_io.h
@@ -2,12 +2,15 @@
 #define _ACPI_IO_H_

 #include <linux/io.h>
+#include <asm/acpi.h>

+#ifndef acpi_os_ioremap
 static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
 					    acpi_size size)
 {
 	return ioremap_cache(phys, size);
 }
+#endif

 void __iomem *__init_refok acpi_os_map_iomem(acpi_physical_address phys,
 					     acpi_size size);
On Tuesday, February 03, 2015 12:29:36 PM Mark Salter wrote:
On Mon, 2015-02-02 at 23:14 +0100, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
include/acpi/acpi_io.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..9d573db 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -1,11 +1,17 @@ #ifndef _ACPI_IO_H_ #define _ACPI_IO_H_ +#include <linux/mm.h> #include <linux/io.h> static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { +#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
How about something like:
From: Mark Salter msalter@redhat.com Date: Tue, 3 Feb 2015 10:51:16 -0500 Subject: [PATCH] acpi: fix acpi_os_ioremap for arm64
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com
arch/arm64/include/asm/acpi.h | 14 ++++++++++++++ include/acpi/acpi_io.h | 3 +++ 2 files changed, 17 insertions(+)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h index ea4d2b3..db82bc3 100644 --- a/arch/arm64/include/asm/acpi.h +++ b/arch/arm64/include/asm/acpi.h @@ -14,6 +14,7 @@ #include <linux/irqchip/arm-gic-acpi.h> +#include <linux/mm.h> #include <asm/smp_plat.h> /* Basic configuration for ACPI */ @@ -100,4 +101,17 @@ static inline bool acpi_psci_use_hvc(void) { return false; } static inline void acpi_init_cpus(void) { } #endif /* CONFIG_ACPI */ +/*
+ * ACPI table mapping
+ */
+static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
+					    acpi_size size)
+{
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+
+	return ioremap_cache(phys, size);
+}
+#define acpi_os_ioremap acpi_os_ioremap
If you want to do it this way, use __weak. You won't need the #define then. Otherwise, please use a proper CONFIG_ARCH_ symbol.
#endif /*_ASM_ACPI_H*/ diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..48f504a 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -2,12 +2,15 @@ #define _ACPI_IO_H_ #include <linux/io.h> +#include <asm/acpi.h> +#ifndef acpi_os_ioremap static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { return ioremap_cache(phys, size); } +#endif void __iomem *__init_refok acpi_os_map_iomem(acpi_physical_address phys, acpi_size size);
On Tue, Feb 03, 2015 at 11:04:27PM +0100, Rafael J. Wysocki wrote:
On Tuesday, February 03, 2015 12:29:36 PM Mark Salter wrote:
On Mon, 2015-02-02 at 23:14 +0100, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
include/acpi/acpi_io.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..9d573db 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -1,11 +1,17 @@ #ifndef _ACPI_IO_H_ #define _ACPI_IO_H_ +#include <linux/mm.h> #include <linux/io.h> static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { +#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
How about something like:
From: Mark Salter msalter@redhat.com Date: Tue, 3 Feb 2015 10:51:16 -0500 Subject: [PATCH] acpi: fix acpi_os_ioremap for arm64
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com
arch/arm64/include/asm/acpi.h | 14 ++++++++++++++ include/acpi/acpi_io.h | 3 +++ 2 files changed, 17 insertions(+)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h index ea4d2b3..db82bc3 100644 --- a/arch/arm64/include/asm/acpi.h +++ b/arch/arm64/include/asm/acpi.h @@ -14,6 +14,7 @@ #include <linux/irqchip/arm-gic-acpi.h> +#include <linux/mm.h> #include <asm/smp_plat.h> /* Basic configuration for ACPI */ @@ -100,4 +101,17 @@ static inline bool acpi_psci_use_hvc(void) { return false; } static inline void acpi_init_cpus(void) { } #endif /* CONFIG_ACPI */ +/*
+ * ACPI table mapping
+ */
+static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
+					    acpi_size size)
+{
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+
+	return ioremap_cache(phys, size);
+}
+#define acpi_os_ioremap acpi_os_ioremap
If you want to do it this way, use __weak. You won't need the #define then. Otherwise, please use a proper CONFIG_ARCH_ symbol.
How does __weak work with inline functions? I don't believe it does.
Moreover, __weak is positively harmful when you consider it adds bloat and dead code - the overridden __weak function is left behind in the resulting final image.
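For reference, a weak default only works as an out-of-line function with external linkage, because the override is resolved by the linker between object files; a static inline function in a header has internal linkage, so there is no global symbol for the linker to override. A minimal single-file sketch of the weak-default half, assuming GCC or Clang, with the hypothetical names WEAK_DEFAULT, arch_wants_uncached_io() and pick_mapping():

```c
#include <assert.h>
#include <stdbool.h>

/* Spelled out here; the kernel's __weak expands to the same attribute. */
#define WEAK_DEFAULT __attribute__((weak))

/*
 * Generic fallback. Another object file may define a non-weak
 * arch_wants_uncached_io(); the linker would then bind callers to it.
 */
WEAK_DEFAULT bool arch_wants_uncached_io(void)
{
	return false;
}

/* Returns 1 for an uncached mapping, 0 for a cached one. */
int pick_mapping(bool is_ram)
{
	if (!is_ram && arch_wants_uncached_io())
		return 1;
	return 0;
}
```

With no strong definition linked in, pick_mapping() always takes the generic path. The code of the weak fallback is still emitted into the object file either way, which is the cost raised above.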
On Wednesday, February 04, 2015 10:48:32 AM Russell King - ARM Linux wrote:
On Tue, Feb 03, 2015 at 11:04:27PM +0100, Rafael J. Wysocki wrote:
On Tuesday, February 03, 2015 12:29:36 PM Mark Salter wrote:
On Mon, 2015-02-02 at 23:14 +0100, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
include/acpi/acpi_io.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h index 444671e..9d573db 100644 --- a/include/acpi/acpi_io.h +++ b/include/acpi/acpi_io.h @@ -1,11 +1,17 @@ #ifndef _ACPI_IO_H_ #define _ACPI_IO_H_ +#include <linux/mm.h> #include <linux/io.h> static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size) { +#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
How about something like:
From: Mark Salter msalter@redhat.com Date: Tue, 3 Feb 2015 10:51:16 -0500 Subject: [PATCH] acpi: fix acpi_os_ioremap for arm64
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net Signed-off-by: Mark Salter msalter@redhat.com
arch/arm64/include/asm/acpi.h | 14 ++++++++++++++ include/acpi/acpi_io.h | 3 +++ 2 files changed, 17 insertions(+)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h index ea4d2b3..db82bc3 100644 --- a/arch/arm64/include/asm/acpi.h +++ b/arch/arm64/include/asm/acpi.h @@ -14,6 +14,7 @@ #include <linux/irqchip/arm-gic-acpi.h> +#include <linux/mm.h> #include <asm/smp_plat.h> /* Basic configuration for ACPI */ @@ -100,4 +101,17 @@ static inline bool acpi_psci_use_hvc(void) { return false; } static inline void acpi_init_cpus(void) { } #endif /* CONFIG_ACPI */ +/*
+ * ACPI table mapping
+ */
+static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
+					    acpi_size size)
+{
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+
+	return ioremap_cache(phys, size);
+}
+#define acpi_os_ioremap acpi_os_ioremap
If you want to do it this way, use __weak. You won't need the #define then. Otherwise, please use a proper CONFIG_ARCH_ symbol.
How does __weak work with inline functions? I don't believe it does.
It doesn't work with inline functions, but the function here doesn't have to be inline.
Moreover, __weak is positively harmful when you consider it adds bloat and dead code - the overridden __weak function is left behind in the resulting final image.
Fair enough.
On Wed, Feb 4, 2015 at 4:48 AM, Russell King - ARM Linux linux@arm.linux.org.uk wrote:
Moreover, __weak is positively harmful when you consider it adds bloat and dead code - the overridden __weak function is left behind in the resulting final image.
Huh, I didn't realize that. Is that a linker bug, or is there some reason the weak function has to be in the final image? I tried a trivial test on x86 with gcc-4.8.2/ld-2.24, and I think the weak function text was omitted, but a string constant used only by the weak function was included.
Bjorn
On Wed, Feb 04, 2015 at 09:53:28AM -0600, Bjorn Helgaas wrote:
On Wed, Feb 4, 2015 at 4:48 AM, Russell King - ARM Linux linux@arm.linux.org.uk wrote:
Moreover, __weak is positively harmful when you consider it adds bloat and dead code - the overridden __weak function is left behind in the resulting final image.
Huh, I didn't realize that. Is that a linker bug, or is there some reason the weak function has to be in the final image? I tried a trivial test on x86 with gcc-4.8.2/ld-2.24, and I think the weak function text was omitted, but a string constant used only by the weak function was included.
Try this:
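t1.c:
/* Userspace spelling of the kernel's __weak, so the file builds on its own */
#define __weak __attribute__((weak))

int a;

/* Weak definition: overridden at link time by the one in t2.c */
void __weak function(void)
{
	a = 1;
}

int main()
{
	return 0;
}

t2.c:
extern int a;

/* Strong definition: the linker binds "function" to this one */
void function(void)
{
	a = 2;
}

gcc -O2 -o t12 t1.c t2.c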
What I get is:
08048370 <frame_dummy>:
...
 80483a0:	55                   	push   %ebp
 80483a1:	89 e5                	mov    %esp,%ebp
 80483a3:	c7 05 34 96 04 08 01 	movl   $0x1,0x8049634
 80483aa:	00 00 00
 80483ad:	5d                   	pop    %ebp
 80483ae:	c3                   	ret
 80483af:	90                   	nop
That's the code which used to be "function" in t1.c (notice it assigning 1 to 0x8049634).
080483b0 <main>:
 80483b0:	55                   	push   %ebp
 80483b1:	31 c0                	xor    %eax,%eax
 80483b3:	89 e5                	mov    %esp,%ebp
 80483b5:	5d                   	pop    %ebp
 80483b6:	c3                   	ret

080483c0 <function>:
 80483c0:	55                   	push   %ebp
 80483c1:	89 e5                	mov    %esp,%ebp
 80483c3:	c7 05 34 96 04 08 02 	movl   $0x2,0x8049634
 80483ca:	00 00 00
 80483cd:	5d                   	pop    %ebp
 80483ce:	c3                   	ret
There's the non-weak version, assigning 2 to 0x8049634.
You have to look carefully for the weak version, because the linker will omit its symbol.
The reason this happens is because normally, each function text is emitted into the .text section of the object file, one after each other. When the image is linked, the linker copies the contents of the complete input section to the output file, and then resolves the symbolic information and relocations.
There is a way around this - the gcc -ffunction-sections flag, which causes each function to be emitted into a separate section, and then in conjunction with the --gc-sections linker flag, the linker can remove unreferenced input sections from the output file.
This also has the effect that unreferenced functions will also be removed from the output file - using --gc-sections may also result in the linker-built sections (such as the initcall list) being gc'd away.
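A small way to try this from userspace, assuming GCC and GNU ld on Linux. The file name gc_demo.c and both function names are made up for the example; the flags are the ones named above, with the linker flag passed through the compiler driver using -Wl,.

```c
/*
 * gc_demo.c: link into a program whose main() calls only used_helper().
 *
 *   gcc -O2 -c gc_demo.c                       both functions share .text
 *   gcc -O2 -ffunction-sections -c gc_demo.c   one .text.<name> section each
 *
 * At link time, add -Wl,--gc-sections to let the linker drop unreferenced
 * input sections, then inspect the result with "nm" or "objdump -d".
 */
#include <assert.h>

/* Referenced by the program, so its section stays in the output. */
int used_helper(int x)
{
	return x + 1;
}

/*
 * Never referenced. With -ffunction-sections it lands in its own
 * .text.unused_helper input section, which --gc-sections may discard.
 * Without that flag it shares .text with used_helper() and is kept.
 */
int unused_helper(int x)
{
	return x * 2;
}
```

Whether the unused function actually disappears depends on the toolchain and the kind of link, so treat this as an experiment rather than a guarantee.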
I haven't experimented with it myself, but I think David Woodhouse has some experience in this area.
On Wed, 2015-02-04 at 16:25 +0000, Russell King - ARM Linux wrote:
I haven't experimented with it myself, but I think David Woodhouse has some experience in this area.
In many kernel configurations there are actually quite a lot of functions that are never called, and I was quite surprised the first time I played with this stuff.
There are a few ways of dealing with it. One is to use -ffunction-sections -fdata-sections --gc-sections as you noted. I once also played with using GCC's --combine during the brief period that it was supported and not *entirely* broken, with similar effects: https://lwn.net/Articles/197097/
These days, the better answer is probably LTO. We could potentially still look at --gc-sections, but I suspect we're better off using LTO and just filing toolchain bugs until everything that --gc-sections *would* have dropped is also dropped from the LTO build :)
Unless --gc-sections actually speeds up the build in a significant way; a full LTO link of the kernel takes insane amounts of memory IIRC.
On Wed, Feb 4, 2015 at 10:25 AM, Russell King - ARM Linux linux@arm.linux.org.uk wrote:
On Wed, Feb 04, 2015 at 09:53:28AM -0600, Bjorn Helgaas wrote:
On Wed, Feb 4, 2015 at 4:48 AM, Russell King - ARM Linux linux@arm.linux.org.uk wrote:
Moreover, __weak is positively harmful when you consider it adds bloat and dead code - the overridden __weak function is left behind in the resulting final image.
Huh, I didn't realize that. Is that a linker bug, or is there some reason the weak function has to be in the final image? I tried a trivial test on x86 with gcc-4.8.2/ld-2.24, and I think the weak function text was omitted, but a string constant used only by the weak function was included.
... The reason this happens is because normally, each function text is emitted into the .text section of the object file, one after each other. When the image is linked, the linker copies the contents of the complete input section to the output file, and then resolves the symbolic information and relocations.
OK, that makes sense. Thanks a lot for the detailed explanation!
On Tue, Feb 03, 2015 at 05:29:36PM +0000, Mark Salter wrote:
On Mon, 2015-02-02 at 23:14 +0100, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net
Signed-off-by: Mark Salter msalter@redhat.com
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org

 include/acpi/acpi_io.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h
index 444671e..9d573db 100644
--- a/include/acpi/acpi_io.h
+++ b/include/acpi/acpi_io.h
@@ -1,11 +1,17 @@
 #ifndef _ACPI_IO_H_
 #define _ACPI_IO_H_
+#include <linux/mm.h>
 #include <linux/io.h>
 static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
 					    acpi_size size)
 {
+#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
How about something like:
From: Mark Salter msalter@redhat.com
Date: Tue, 3 Feb 2015 10:51:16 -0500
Subject: [PATCH] acpi: fix acpi_os_ioremap for arm64

The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.

CC: Rafael J Wysocki rjw@rjwysocki.net
Signed-off-by: Mark Salter msalter@redhat.com

 arch/arm64/include/asm/acpi.h | 14 ++++++++++++++
 include/acpi/acpi_io.h        |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index ea4d2b3..db82bc3 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -14,6 +14,7 @@
 #include <linux/irqchip/arm-gic-acpi.h>
+#include <linux/mm.h>
 #include <asm/smp_plat.h>
 /* Basic configuration for ACPI */
@@ -100,4 +101,17 @@ static inline bool acpi_psci_use_hvc(void) { return false; }
 static inline void acpi_init_cpus(void) { }
 #endif /* CONFIG_ACPI */
+/*
+ * ACPI table mapping
+ */
+static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
+					    acpi_size size)
+{
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+	return ioremap_cache(phys, size);
+}
+#define acpi_os_ioremap acpi_os_ioremap
That's one way of doing this, I'm not too bothered with the approach (define the function name, an ARCH_HAS macro or a Kconfig option, it's up to Rafael).
But a question I already asked is what we need ioremap_cache() for? We don't use NVS on arm64 yet, so is there anything else requiring cacheable mapping?
On Wed, 2015-02-04 at 11:25 +0000, Catalin Marinas wrote:
But a question I already asked is what we need ioremap_cache() for? We don't use NVS on arm64 yet, so is there anything else requiring cacheable mapping?
acpi_os_ioremap() is used to map ACPI tables. These tables may be in RAM that is already included in the kernel's linear RAM mapping. So we need ioremap_cache() to avoid two mappings to the same physical page having different caching attributes.
On 02/04/2015 10:08 AM, Mark Salter wrote:
acpi_os_ioremap() is used to map ACPI tables. These tables may be in RAM that is already included in the kernel's linear RAM mapping. So we need ioremap_cache() to avoid two mappings to the same physical page having different caching attributes.
Would it be possible to modify ioremap() so that it can tell whether the memory is already mapped in some way, and then use a compatible remapping?
On Wed, Feb 04, 2015 at 04:16:34PM +0000, Timur Tabi wrote:
On 02/04/2015 10:08 AM, Mark Salter wrote:
acpi_os_ioremap() is used to map ACPI tables. These tables may be in RAM that is already included in the kernel's linear RAM mapping. So we need ioremap_cache() to avoid two mappings to the same physical page having different caching attributes.
Would it be possible to modify ioremap() so that it can tell whether the memory is already mapped in some way, and then use a compatible remapping?
No. We have some semantics for ioremap() and it should return non-cacheable mapping.
ioremap_cache() checks whether the page is RAM already and returns the existing kernel linear mapping on arm64.
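The decision made by the proposed arm64 acpi_os_ioremap() can be modelled in user space roughly as below. This is only a sketch: the RAM range is made up, model_is_ram() merely stands in for page_is_ram(), and the enum values are labels for the two mapping types rather than anything the kernel defines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* User-space model only: none of these are the kernel's definitions. */
enum map_kind { MAP_DEVICE, MAP_CACHED };

struct ram_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/* Hypothetical memory layout standing in for the kernel's view of RAM. */
static const struct ram_range model_ram[] = {
	{ 0x80000000ULL, 0xC0000000ULL },
};

/* Stand-in for page_is_ram(): is this physical address modelled RAM? */
static bool model_is_ram(uint64_t phys)
{
	for (size_t i = 0; i < sizeof(model_ram) / sizeof(model_ram[0]); i++)
		if (phys >= model_ram[i].start && phys < model_ram[i].end)
			return true;
	return false;
}

/* Same shape as the proposed arm64 acpi_os_ioremap(). */
enum map_kind model_acpi_os_ioremap(uint64_t phys)
{
	if (!model_is_ram(phys))
		return MAP_DEVICE;	/* ioremap(): non-cacheable */
	return MAP_CACHED;		/* ioremap_cache(): agrees with the linear map */
}
```

The point of the RAM check is that a table sitting in RAM gets the same attributes as the existing linear mapping, while everything else keeps the non-cacheable ioremap() semantics.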
On Wed, Feb 04, 2015 at 04:08:27PM +0000, Mark Salter wrote:
acpi_os_ioremap() is used to map ACPI tables. These tables may be in RAM that is already included in the kernel's linear RAM mapping. So we need ioremap_cache() to avoid two mappings to the same physical page having different caching attributes.
What's the call path to acpi_os_ioremap() on such tables already in the linear mapping? I can see an acpi_map() function which already takes care of the RAM mapping case but there are other cases where acpi_os_ioremap() is called directly. For example, acpi_os_read_memory(), can it be called on both RAM and I/O?
On Wed, 2015-02-04 at 17:57 +0000, Catalin Marinas wrote:
What's the call path to acpi_os_ioremap() on such tables already in the linear mapping? I can see an acpi_map() function which already takes care of the RAM mapping case but there are other cases where acpi_os_ioremap() is called directly. For example, acpi_os_read_memory(), can it be called on both RAM and I/O?
acpi_map() is the one I've seen. I'm not sure about others.
On Wed, Feb 04, 2015 at 06:58:14PM +0000, Mark Salter wrote:
On Wed, 2015-02-04 at 17:57 +0000, Catalin Marinas wrote:
On Wed, Feb 04, 2015 at 04:08:27PM +0000, Mark Salter wrote:
acpi_os_ioremap() is used to map ACPI tables. These tables may be in RAM that is already included in the kernel's linear RAM mapping. So we need ioremap_cache() to avoid two mappings to the same physical page having different caching attributes.
What's the call path to acpi_os_ioremap() on such tables already in the linear mapping? I can see an acpi_map() function which already takes care of the RAM mapping case but there are other cases where acpi_os_ioremap() is called directly. For example, acpi_os_read_memory(), can it be called on both RAM and I/O?
acpi_map() is the one I've seen.
By default, if should_use_kmap() is not patched for arm64, it translates to page_is_ram(); acpi_map() would simply use a kmap() which returns the current kernel linear mapping on arm64.
I'm not sure about others.
Question for the ARM ACPI guys: what happens if you implement acpi_os_ioremap() on arm64 as just ioremap()? Do you get any WARN_ON() (__ioremap_caller() checks whether the memory is RAM)?
On 5 February 2015 at 10:41, Catalin Marinas catalin.marinas@arm.com wrote:
Question for the ARM ACPI guys: what happens if you implement acpi_os_ioremap() on arm64 as just ioremap()? Do you get any WARN_ON() (__ioremap_caller() checks whether the memory is RAM)?
Regardless of whether you hit any WARN_ON()s now, we still need to distinguish between MMIO ranges with device semantics, and ACPI or other tables whose data may not be naturally aligned all the time, and hence requiring memory semantics. acpi_os_ioremap() may be used for both, afaik
On Thu, Feb 05, 2015 at 10:47:23AM +0000, Ard Biesheuvel wrote:
Regardless of whether you hit any WARN_ON()s now,
Actually following the WARN_ON(), ioremap() returns NULL, so it may not go entirely unnoticed.
we still need to distinguish between MMIO ranges with device semantics, and ACPI or other tables whose data may not be naturally aligned all the time, and hence requiring memory semantics. acpi_os_ioremap() may be used for both, afaik
Is acpi_os_ioremap() called directly (outside acpi_map()) to map RAM that is already part of the kernel linear memory? If yes, then I agree that we need to do such a check.
Another question, can we distinguish, in the ACPI core code, whether the mapping is for an ACPI table in RAM or some I/O space?
On Thu, Feb 05, 2015 at 10:59:45AM +0000, Catalin Marinas wrote:
Is acpi_os_ioremap() called directly (outside acpi_map()) to map RAM that is already part of the kernel linear memory? If yes, then I agree that we need to do such a check.
Another question, can we distinguish, in the ACPI core code, whether the mapping is for an ACPI table in RAM or some I/O space?
Yes I think we do,
acpi_os_map_memory() is called to map tables
acpi_os_map_iomem() is called to map device IO
Currently both end up in acpi_map(), but I guess they do not have to, or we can add extra arguments, as it's an internal API.
But I have not checked that no user sneaks in direct calls.
Graeme
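A rough user-space sketch of the "extra argument" idea follows. Everything here is hypothetical: model_acpi_map() and its is_table flag do not exist in the kernel, and the real acpi_map() has a different signature. It only illustrates how the two entry points could tell the common helper what the mapping is for.

```c
#include <stdbool.h>
#include <stdint.h>

/* Labels for the two kinds of mapping discussed in the thread. */
enum map_attr { ATTR_DEVICE, ATTR_NORMAL };

/*
 * Hypothetical common helper: the caller states its intent instead of
 * the helper guessing from the physical address.
 */
static enum map_attr model_acpi_map(uint64_t phys, bool is_table)
{
	(void)phys;	/* the address is not needed for this decision */
	return is_table ? ATTR_NORMAL : ATTR_DEVICE;
}

/* Stand-in for acpi_os_map_memory(): used for ACPI tables. */
enum map_attr model_map_memory(uint64_t phys)
{
	return model_acpi_map(phys, true);
}

/* Stand-in for acpi_os_map_iomem(): used for device I/O. */
enum map_attr model_map_iomem(uint64_t phys)
{
	return model_acpi_map(phys, false);
}
```

This only helps if every user goes through one of the two entry points, which is the open question about direct acpi_os_ioremap() callers.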
On Thu, Feb 05, 2015 at 11:14:43AM +0000, Graeme Gregory wrote:
Yes I think we do,
acpi_os_map_memory() is called to map tables
acpi_os_map_iomem() is called to map device IO
Currently both end up in acpi_map(), but I guess they do not have to, or we can add extra arguments, as it's an internal API.
Ending up in acpi_map() is ok as this function checks whether it should use kmap() or acpi_os_ioremap().
But I have not checked that no user sneaks in direct calls.
Grep'ing for acpi_os_ioremap():
suspend_nvs_save() - we don't care about this yet for arm64, as the function is only compiled in if CONFIG_ACPI_SLEEP is set
acpi_os_read_memory() and acpi_os_write_memory() - do you know what kind of memory these are used on?
a couple of Intel DRM files that are not used on ARM.
On Thu, Feb 05, 2015 at 12:07:20PM +0000, Catalin Marinas wrote:
acpi_os_read_memory() and acpi_os_write_memory() - do you know what kind of memory these are used on?
They are used when an operating region is set to SystemMemory type.
From table 19-326
Region Type: SystemMemory
Permitted Access Type: ByteAcc, WordAcc, DWordAcc, QWordAcc, or AnyAcc
Description: All access allowed
Graeme
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Thu, Feb 05, 2015 at 12:52:08PM +0000, Graeme Gregory wrote:
They are used when an operating region is set to SystemMemory type.
From table 19-326
Region Type: SystemMemory
Permitted Access Type: ByteAcc, WordAcc, DWordAcc, QWordAcc, or AnyAcc
Description: All access allowed
OK. So I guess these would fall under the page_is_ram() category in Linux.
On 5 February 2015 at 12:07, Catalin Marinas catalin.marinas@arm.com wrote:
On Thu, Feb 05, 2015 at 11:14:43AM +0000, Graeme Gregory wrote:
On Thu, Feb 05, 2015 at 10:59:45AM +0000, Catalin Marinas wrote:
On Thu, Feb 05, 2015 at 10:47:23AM +0000, Ard Biesheuvel wrote:
On 5 February 2015 at 10:41, Catalin Marinas catalin.marinas@arm.com wrote:
On Wed, Feb 04, 2015 at 06:58:14PM +0000, Mark Salter wrote:
On Wed, 2015-02-04 at 17:57 +0000, Catalin Marinas wrote: > On Wed, Feb 04, 2015 at 04:08:27PM +0000, Mark Salter wrote: > > acpi_os_remap() is used to map ACPI tables. These tables may be in ram > > which are already included in the kernel's linear RAM mapping. So we > > need ioremap_cache to avoid two mappings to the same physical page > > having different caching attributes. > > What's the call path to acpi_os_ioremap() on such tables already in the > linear mapping? I can see an acpi_map() function which already takes > care of the RAM mapping case but there are other cases where > acpi_os_ioremap() is called directly. For example, > acpi_os_read_memory(), can it be called on both RAM and I/O?
acpi_map() is the one I've seen.
By default, if should_use_kmap() is not patched for arm64, it translates to page_is_ram(); acpi_map() would simply use a kmap() which returns the current kernel linear mapping on arm64.
I'm not sure about others.
Question for the ARM ACPI guys: what happens if you implement acpi_os_ioremap() on arm64 as just ioremap()? Do you get any WARN_ON() (__ioremap_caller() checks whether the memory is RAM)?
Regardless of whether you hit any WARN_ON()s now,
Actually following the WARN_ON(), ioremap() returns NULL, so it may not go entirely unnoticed.
we still need to distinguish between MMIO ranges with device semantics, and ACPI or other tables whose data may not be naturally aligned all the time, and hence requiring memory semantics. acpi_os_ioremap() may be used for both, afaik
Is acpi_os_ioremap() called directly (outside acpi_map()) to map RAM that is already part of the kernel linear memory? If yes, then I agree that we need to do such a check.
Another question, can we distinguish, in the ACPI core code, whether the mapping is for an ACPI table in RAM or some I/O space?
Yes I think we do,
acpi_os_map_memory() is called to map tables
acpi_os_map_iomem() is called to map device IO
Currently both end up in acpi_map(), but I guess they do not have to, or we can add extra arguments as it's an internal API.
Ending up in acpi_map() is ok as this function checks whether it should use kmap() or acpi_os_ioremap().
This still only addresses the mismatched attributes part: regions that require memory semantics may still end up being mapped as device memory if they are not covered by the linear mapping, which could happen if the region resides below the kernel in memory, or if we passed a mem= parameter and it sits at the very top.
But I have not checked that no user sneaks in direct calls.
Grep'ing for acpi_os_ioremap():
suspend_nvs_save() - we don't care about this yet for arm64 as the function is only compiled in if CONFIG_ACPI_SLEEP
acpi_os_read_memory() and acpi_os_write_memory() - do you know what kind of memory these are used on?
couple of intel drm files that are not used on arm.
On Thu, 2015-02-05 at 10:41 +0000, Catalin Marinas wrote:
By default, if should_use_kmap() is not patched for arm64, it translates to page_is_ram(); acpi_map() would simply use a kmap() which returns the current kernel linear mapping on arm64.
The problem with kmap() is that it only maps a single page. I've seen tables over 4k which is why I patched acpi_map() not to use kmap() on arm64.
On 02/05/2015 06:54 AM, Mark Salter wrote:
The problem with kmap() is that it only maps a single page. I've seen tables over 4k which is why I patched acpi_map() not to use kmap() on arm64.
Right. Mark replied to this before I could; using kmap() enforced a 4k (one page) limit that we kept breaking with some ACPI tables being larger than that (DSDTs and SSDTs, fwiw). This would lead to some very odd behaviors when most but not all of a device definition was within the page; using the table checksums was one way of detecting the issues.
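To illustrate the checksum point, here is a minimal userspace sketch (not kernel code). It relies on the ACPI rule that all bytes of a valid table sum to zero modulo 256, with the checksum byte at offset 9 of the table header. The helper names and the 4k page size are assumptions made for this example only; the sketch models a one-page mapping by summing only the bytes such a mapping would expose.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4k pages, as discussed above */

/* Sum all bytes modulo 256; a valid ACPI table sums to zero. */
static uint8_t sketch_table_byte_sum(const uint8_t *table, size_t len)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = (uint8_t)(sum + table[i]);
	return sum;
}

/* Hypothetical helper: set the checksum byte so the whole table sums to zero. */
static void sketch_table_fix_checksum(uint8_t *table, size_t len, size_t csum_off)
{
	table[csum_off] = 0;
	table[csum_off] = (uint8_t)(0u - sketch_table_byte_sum(table, len));
}

/*
 * Returns 1 if the checksum validates when only map_len bytes of the
 * table are visible, 0 otherwise. A mapping shorter than the table
 * models the single-page kmap() case.
 */
static int sketch_table_checksum_ok(const uint8_t *table, size_t table_len,
				    size_t map_len)
{
	size_t visible = table_len < map_len ? table_len : map_len;

	return sketch_table_byte_sum(table, visible) == 0;
}
```

With a table larger than one page, validating over the full length succeeds while validating over only the first page generally fails, which is the kind of symptom described above.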
On Thu, Feb 05, 2015 at 04:42:19PM +0000, Al Stone wrote:
Right. Mark replied to this before I could; using kmap() enforced a 4k (one page) limit that we kept breaking with some ACPI tables being larger than that (DSDTs and SSDTs, fwiw). This would lead to some very odd behaviors when most but not all of a device definition was within the page; using the table checksums was one way of detecting the issues.
OK. So I think Mark's original patch was ok, assuming that the System Memory cases mentioned by Graeme are detected with page_is_ram().
On 5 February 2015 at 17:48, Catalin Marinas catalin.marinas@arm.com wrote:
OK. So I think Mark's original patch was ok, assuming that the System Memory cases mentioned by Graeme are detected with page_is_ram().
page_is_ram() returns whether a pfn is covered by the linear mapping, so memory before the kernel or after a mem= limit will be misidentified.
On Thu, Feb 05, 2015 at 10:16:03PM +0000, Ard Biesheuvel wrote:
page_is_ram() returns whether a pfn is covered by the linear mapping, so memory before the kernel or after a mem= limit will be misidentified.
OK. So in conclusion acpi_os_ioremap() may need to create a cacheable mapping even when !page_is_ram() but it has no way of knowing that unless we change the core ACPI code to differentiate between ioremap_cache and ioremap_nocache. Did I get it right?
On 6 February 2015 at 10:36, Catalin Marinas catalin.marinas@arm.com wrote:
OK. So in conclusion acpi_os_ioremap() may need to create a cacheable mapping even when !page_is_ram() but it has no way of knowing that unless we change the core ACPI code to differentiate between ioremap_cache and ioremap_nocache. Did I get it right?
Yes and no. Your analysis about the core issue is correct, but it is something we can fix on our end if we like. This issue has been on our radar for a while, and we proposed a way to fix it here
http://thread.gmane.org/gmane.linux.kernel.efi/5133
(The 'other series' the cover letter refers to is the virtmap series you pulled for 3.20)
There is a known issue on APM with this series, reported by Dave Young, and I was hoping to dig into that next week at Connect.
On Fri, Feb 06, 2015 at 11:08:51AM +0000, Ard Biesheuvel wrote:
Yes and no. Your analysis about the core issue is correct, but it is something we can fix on our end if we like. This issue has been on our radar for a while, and we proposed a way to fix it here
I looked at it briefly but it had ACPI in the subject and decided it's not urgent ;).
IIUC, it relies on the EFI system table to be available and the kernel will register the appropriate "System RAM" resources. This assumes in general that the kernel is booted via the EFI stub. Do we expect Xen or kexec to pass an EFI system table when not booting via EFI stub?
On 6 February 2015 at 14:16, Catalin Marinas catalin.marinas@arm.com wrote:
I looked at it briefly but it had ACPI in the subject and decided it's not urgent ;).
IIUC, it relies on the EFI system table to be available and the kernel will register the appropriate "System RAM" resources. This assumes in general that the kernel is booted via the EFI stub. Do we expect Xen or kexec to pass an EFI system table when not booting via EFI stub?
That's just one of the patches, and it is not actually the one that addresses this issue. (Registering the iomem resources is mainly to ensure MMIO regions for the NOR flash or RTC are not claimed by a kernel driver if they are being driven by the firmware at runtime)
The point of the series is to wire up the 'physmem' memblock table to record what we know is system RAM, and use that to decide what flavor of mapping to use. The series as-is addresses the non-UEFI case as well, the only thing missing is wiring up page_is_ram() to memblock_is_physmem() (the former is __weak already in the core code, but perhaps it would be better to just use the latter directly)
With these changes, we no longer have to care whether a reserved region sits below PHYS_OFFSET or above a mem= limit
Note that, in the non-UEFI case, we may need to consider removing memreserve regions from the linear mapping. Code that assumes it is mapped is broken anyway, due to the same concerns outlined above (i.e., < PHYS_OFFSET or > mem=).
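The difference between the two lookups can be sketched in userspace C. This is only a model of the decision discussed in this thread, not the kernel implementation: the range tables, type names and addresses are hypothetical. One table stands for what the linear mapping covers (what page_is_ram() is said to report here), the other for a 'physmem'-style record of everything known to be system RAM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping flavours: device semantics vs. cacheable memory. */
enum sketch_map_type { SKETCH_MAP_DEVICE, SKETCH_MAP_CACHED };

/* Hypothetical physical range record, standing in for a memblock region. */
struct sketch_phys_range {
	uint64_t base;
	uint64_t size;
};

/* Returns 1 if addr falls inside any of the n ranges. */
static int sketch_range_contains(const struct sketch_phys_range *r, size_t n,
				 uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= r[i].base && addr - r[i].base < r[i].size)
			return 1;
	return 0;
}

/* Pick a cacheable mapping only for addresses the table records as RAM. */
static enum sketch_map_type
sketch_pick_mapping(const struct sketch_phys_range *ram, size_t n, uint64_t phys)
{
	return sketch_range_contains(ram, n, phys) ? SKETCH_MAP_CACHED
						   : SKETCH_MAP_DEVICE;
}
```

If the table only describes the linear mapping (for example after a mem= limit), an ACPI table in RAM above the limit is classified as device memory; a table that records all system RAM classifies it as cacheable, while genuine MMIO stays device in both cases.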
On Tuesday, February 03, 2015 12:29:36 PM Mark Salter wrote:
On Mon, 2015-02-02 at 23:14 +0100, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:30 PM Hanjun Guo wrote:
From: Mark Salter msalter@redhat.com
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net
Signed-off-by: Mark Salter msalter@redhat.com
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
 include/acpi/acpi_io.h | 6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h
index 444671e..9d573db 100644
--- a/include/acpi/acpi_io.h
+++ b/include/acpi/acpi_io.h
@@ -1,11 +1,17 @@
 #ifndef _ACPI_IO_H_
 #define _ACPI_IO_H_
+#include <linux/mm.h>
 #include <linux/io.h>
 static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
 					    acpi_size size)
 {
+#ifdef CONFIG_ARM64
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+#endif
I don't want to see #ifdef CONFIG_ARM64 in this file.
How about something like:
From: Mark Salter msalter@redhat.com
Date: Tue, 3 Feb 2015 10:51:16 -0500
Subject: [PATCH] acpi: fix acpi_os_ioremap for arm64
The acpi_os_ioremap() function may be used to map normal RAM or IO regions. The current implementation simply uses ioremap_cache(). This will work for some architectures, but arm64 ioremap_cache() cannot be used to map IO regions which don't support caching. So for arm64, use ioremap() for non-RAM regions.
CC: Rafael J Wysocki rjw@rjwysocki.net
Signed-off-by: Mark Salter msalter@redhat.com
 arch/arm64/include/asm/acpi.h | 14 ++++++++++++++
 include/acpi/acpi_io.h        |  3 +++
 2 files changed, 17 insertions(+)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index ea4d2b3..db82bc3 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -14,6 +14,7 @@
 #include <linux/irqchip/arm-gic-acpi.h>
+#include <linux/mm.h>
 #include <asm/smp_plat.h>
 /* Basic configuration for ACPI */
@@ -100,4 +101,17 @@ static inline bool acpi_psci_use_hvc(void) { return false; }
 static inline void acpi_init_cpus(void) { }
 #endif /* CONFIG_ACPI */
+/*
+ * ACPI table mapping
+ */
+static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
+					    acpi_size size)
+{
+	if (!page_is_ram(phys >> PAGE_SHIFT))
+		return ioremap(phys, size);
+	return ioremap_cache(phys, size);
+}
+#define acpi_os_ioremap acpi_os_ioremap
Actually, I see that we use similar #defines in other places too, so the patch is fine by me as is (modulo the other concerns that people seem to have about this).
 #endif /*_ASM_ACPI_H*/
diff --git a/include/acpi/acpi_io.h b/include/acpi/acpi_io.h
index 444671e..48f504a 100644
--- a/include/acpi/acpi_io.h
+++ b/include/acpi/acpi_io.h
@@ -2,12 +2,15 @@
 #define _ACPI_IO_H_
 #include <linux/io.h>
+#include <asm/acpi.h>
+#ifndef acpi_os_ioremap
 static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
 					    acpi_size size)
 {
 	return ioremap_cache(phys, size);
 }
+#endif
 void __iomem *__init_refok acpi_os_map_iomem(acpi_physical_address phys, acpi_size size);
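The override idiom used by this patch (the arch header defines the function and then a macro of the same name, and the generic header only supplies its default under #ifndef) can be tried outside the kernel. The following is a minimal single-file sketch with hypothetical sketch_* names and a made-up RAM window; it returns a mapping kind instead of calling ioremap(), which does not exist in userspace.

```c
#include <stdint.h>

enum sketch_map_kind { SKETCH_MAP_CACHED, SKETCH_MAP_DEVICE };

/* ---- stand-in for the arch header (hypothetical values) ---- */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_RAM_START_PFN	0x80000ULL	/* RAM assumed at 0x80000000 */
#define SKETCH_RAM_END_PFN	0x100000ULL	/* up to 0xFFFFFFFF */

static inline int sketch_page_is_ram(uint64_t pfn)
{
	return pfn >= SKETCH_RAM_START_PFN && pfn < SKETCH_RAM_END_PFN;
}

/* Arch version: device mapping for non-RAM, cacheable mapping for RAM. */
static inline enum sketch_map_kind sketch_acpi_os_ioremap(uint64_t phys)
{
	if (!sketch_page_is_ram(phys >> SKETCH_PAGE_SHIFT))
		return SKETCH_MAP_DEVICE;
	return SKETCH_MAP_CACHED;
}
/* Self-referential define: marks the name as provided by the arch. */
#define sketch_acpi_os_ioremap sketch_acpi_os_ioremap

/* ---- stand-in for the generic header ---- */
#ifndef sketch_acpi_os_ioremap
/* Default version, compiled only when no arch override exists. */
static inline enum sketch_map_kind sketch_acpi_os_ioremap(uint64_t phys)
{
	(void)phys;
	return SKETCH_MAP_CACHED;
}
#endif
```

Because the macro expands to its own name, callers are unaffected, and the generic header needs no architecture-specific #ifdef.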
From: Mark Salter msalter@redhat.com
Commit 0e63ea48b4d8 (arm64/efi: add missing call to early_ioremap_reset()) added a missing call to early_ioremap_reset(). This triggers a BUG if code tries using early_ioremap() after the early_ioremap_reset(). This is a problem for some ACPI code which needs short-lived temporary mappings after paging_init() but before acpi_early_init() in start_kernel(). This patch adds definitions for the __late_set_fixmap() and __late_clear_fixmap() which avoids the BUG by allowing later use of early_ioremap().
CC: Leif Lindholm leif.lindholm@linaro.org
CC: Ard Biesheuvel ard.biesheuvel@linaro.org
Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com
Tested-by: Mark Langsdorf mlangsdo@redhat.com
Tested-by: Jon Masters jcm@redhat.com
Signed-off-by: Mark Salter msalter@redhat.com
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
---
 arch/arm64/include/asm/fixmap.h | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 9ef6eca..e629c70 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -61,6 +61,9 @@ void __init early_fixmap_init(void);
 #define __early_set_fixmap __set_fixmap
+#define __late_set_fixmap __set_fixmap
+#define __late_clear_fixmap(idx) __set_fixmap((idx), 0, FIXMAP_PAGE_CLEAR)
+
 extern void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
 #include <asm-generic/fixmap.h>
From: Al Stone al.stone@linaro.org
As we want to parse the ACPI tables and use that information for system initialization, we should get the RSDP (Root System Description Pointer) first. The RSDP locates the Extended System Description Table (XSDT), which contains the 64-bit physical addresses that point to the other boot-time tables.
Introduce acpi.c and its related header files in this patch to provide the extern variables and functions that the ACPI core needs, and then get the boot-time tables as needed:
- asm/acenv.h for arch-specific ACPICA environments and implementation; it is needed unconditionally by the ACPI core;
- asm/acpi.h for arch-specific variables and functions needed by the ACPI driver core;
- acpi.c for the ARM64-related ACPI implementation for the ACPI driver core.
acpi_boot_table_init() is introduced to get the RSDP and the boot-time tables. It will be called in setup_arch() before paging_init(), so we should use the early_memremap() mechanism here to get the RSDP and all the table pointers.
CC: Catalin Marinas catalin.marinas@arm.com CC: Will Deacon will.deacon@arm.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Al Stone al.stone@linaro.org Signed-off-by: Graeme Gregory graeme.gregory@linaro.org Signed-off-by: Tomasz Nowicki tomasz.nowicki@linaro.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org --- arch/arm64/include/asm/acenv.h | 18 +++++++++++ arch/arm64/include/asm/acpi.h | 45 +++++++++++++++++++++++++++ arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/acpi.c | 69 ++++++++++++++++++++++++++++++++++++++++++ arch/arm64/kernel/setup.c | 4 +++ 5 files changed, 137 insertions(+) create mode 100644 arch/arm64/include/asm/acenv.h create mode 100644 arch/arm64/include/asm/acpi.h create mode 100644 arch/arm64/kernel/acpi.c
diff --git a/arch/arm64/include/asm/acenv.h b/arch/arm64/include/asm/acenv.h new file mode 100644 index 0000000..b49166f --- /dev/null +++ b/arch/arm64/include/asm/acenv.h @@ -0,0 +1,18 @@ +/* + * ARM64 specific ACPICA environments and implementation + * + * Copyright (C) 2014, Linaro Ltd. + * Author: Hanjun Guo hanjun.guo@linaro.org + * Author: Graeme Gregory graeme.gregory@linaro.org + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#ifndef _ASM_ACENV_H +#define _ASM_ACENV_H + +/* It is required unconditionally by ACPI core, update it when needed. */ + +#endif /* _ASM_ACENV_H */ diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h new file mode 100644 index 0000000..8b837ab --- /dev/null +++ b/arch/arm64/include/asm/acpi.h @@ -0,0 +1,45 @@ +/* + * Copyright (C) 2013-2014, Linaro Ltd. + * Author: Al Stone al.stone@linaro.org + * Author: Graeme Gregory graeme.gregory@linaro.org + * Author: Hanjun Guo hanjun.guo@linaro.org + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation; + */ + +#ifndef _ASM_ACPI_H +#define _ASM_ACPI_H + +/* Basic configuration for ACPI */ +#ifdef CONFIG_ACPI +#define acpi_strict 1 /* No out-of-spec workarounds on ARM64 */ +extern int acpi_disabled; +extern int acpi_noirq; +extern int acpi_pci_disabled; + +static inline void disable_acpi(void) +{ + acpi_disabled = 1; + acpi_pci_disabled = 1; + acpi_noirq = 1; +} + +/* + * It's used from ACPI core in kdump to boot UP system with SMP kernel, + * with this check the ACPI core will not override the CPU index + * obtained from GICC with 0 and not print some error message as well. 
+ * Since MADT must provide at least one GICC structure for GIC
+ * initialization, CPU will be always available in MADT on ARM64.
+ */
+static inline bool acpi_has_cpu_in_madt(void)
+{
+	return true;
+}
+
+static inline void arch_fix_phys_package_id(int num, u32 slot) { }
+
+#endif /* CONFIG_ACPI */
+
+#endif /*_ASM_ACPI_H*/
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index eaa77ed..8bdc6bd 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -34,6 +34,7 @@ arm64-obj-$(CONFIG_KGDB)		+= kgdb.o
 arm64-obj-$(CONFIG_EFI)			+= efi.o efi-stub.o efi-entry.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
+arm64-obj-$(CONFIG_ACPI)		+= acpi.o

 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
new file mode 100644
index 0000000..7f67c01
--- /dev/null
+++ b/arch/arm64/kernel/acpi.c
@@ -0,0 +1,69 @@
+/*
+ * ARM64 Specific Low-Level ACPI Boot Support
+ *
+ * Copyright (C) 2013-2014, Linaro Ltd.
+ *	Author: Al Stone al.stone@linaro.org
+ *	Author: Graeme Gregory graeme.gregory@linaro.org
+ *	Author: Hanjun Guo hanjun.guo@linaro.org
+ *	Author: Tomasz Nowicki tomasz.nowicki@linaro.org
+ *	Author: Naresh Bhat naresh.bhat@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/acpi.h>
+#include <linux/bootmem.h>
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
+#include <linux/memblock.h>
+#include <linux/smp.h>
+
+int acpi_noirq;			/* skip ACPI IRQ initialization */
+int acpi_disabled;
+EXPORT_SYMBOL(acpi_disabled);
+
+int acpi_pci_disabled;		/* skip ACPI PCI scan and IRQ initialization */
+EXPORT_SYMBOL(acpi_pci_disabled);
+
+/*
+ * __acpi_map_table() will be called before page_init(), so early_ioremap()
+ * or early_memremap() should be called here for ACPI table mapping.
+ */
+char *__init __acpi_map_table(unsigned long phys, unsigned long size)
+{
+	if (!phys || !size)
+		return NULL;
+
+	return early_memremap(phys, size);
+}
+
+void __init __acpi_unmap_table(char *map, unsigned long size)
+{
+	if (!map || !size)
+		return;
+
+	early_memunmap(map, size);
+}
+
+/*
+ * acpi_boot_table_init() called from setup_arch(), always.
+ *	1. find RSDP and get its address, and then find XSDT
+ *	2. extract all tables and checksum them all
+ *
+ * We can parse ACPI boot-time tables such as MADT after
+ * this function is called.
+ */
+void __init acpi_boot_table_init(void)
+{
+	/* If acpi_disabled, bail out */
+	if (acpi_disabled)
+		return;
+
+	/* Initialize the ACPI boot-time table parser. */
+	if (acpi_table_init())
+		disable_acpi();
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 20fe293..726b019 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -17,6 +17,7 @@
  * along with this program. If not, see http://www.gnu.org/licenses/.
  */

+#include <linux/acpi.h>
 #include <linux/export.h>
 #include <linux/kernel.h>
 #include <linux/stddef.h>
@@ -398,6 +399,9 @@ void __init setup_arch(char **cmdline_p)
 	efi_init();
 	arm64_memblock_init();

+	/* Parse the ACPI tables for possible boot-time configuration */
+	acpi_boot_table_init();
+
 	paging_init();
 	request_standard_resources();
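For readers skimming the patch above, the control flow of acpi_boot_table_init() can be modelled in plain user-space C. This is only a sketch: every model_ name, the tables_valid flag and the init_calls counter are hypothetical stand-ins, not kernel symbols, and the real acpi_table_init() does far more than return a status.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's acpi_disabled global. */
static int model_acpi_disabled;

/* Assumption: models whether the firmware tables pass validation. */
static bool tables_valid = true;

/* Counts how often the (modelled) table parser was actually entered. */
static int init_calls;

static void model_disable_acpi(void)
{
	model_acpi_disabled = 1;
}

/* Returns 0 on success and non-zero on failure, like acpi_table_init(). */
static int model_acpi_table_init(void)
{
	init_calls++;
	return tables_valid ? 0 : -1;
}

/* Same shape as the patch: bail out early, disable ACPI on parse failure. */
static void model_acpi_boot_table_init(void)
{
	if (model_acpi_disabled)
		return;

	if (model_acpi_table_init())
		model_disable_acpi();
}
```

The point of the shape is that a table parsing failure is not fatal: ACPI is simply switched off and the rest of boot carries on.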
From: Graeme Gregory graeme.gregory@linaro.org
ACPI 5.1 does not currently support S states for ARM64 hardware, but ACPI code will call acpi_target_system_state() for device power management, so introduce sleep_arm.c to allow other drivers to function until S states are defined.
CC: Rafael J. Wysocki rjw@rjwysocki.net
Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com
Tested-by: Yijing Wang wangyijing@huawei.com
Tested-by: Mark Langsdorf mlangsdo@redhat.com
Tested-by: Jon Masters jcm@redhat.com
Tested-by: Timur Tabi timur@codeaurora.org
Signed-off-by: Graeme Gregory graeme.gregory@linaro.org
Signed-off-by: Tomasz Nowicki tomasz.nowicki@linaro.org
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
---
 drivers/acpi/Makefile    |  4 ++++
 drivers/acpi/sleep_arm.c | 28 ++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)
 create mode 100644 drivers/acpi/sleep_arm.c

diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index f74317c..bcec54e 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -23,7 +23,11 @@ acpi-y				+= nvs.o

 # Power management related files
 acpi-y				+= wakeup.o
+ifeq ($(ARCH), arm64)
+acpi-y				+= sleep_arm.o
+else # X86, IA64
 acpi-y				+= sleep.o
+endif
 acpi-y				+= device_pm.o
 acpi-$(CONFIG_ACPI_SLEEP)	+= proc.o

diff --git a/drivers/acpi/sleep_arm.c b/drivers/acpi/sleep_arm.c
new file mode 100644
index 0000000..54578ef
--- /dev/null
+++ b/drivers/acpi/sleep_arm.c
@@ -0,0 +1,28 @@
+/*
+ * ARM64 Specific Sleep Functionality
+ *
+ * Copyright (C) 2013-2014, Linaro Ltd.
+ *	Author: Graeme Gregory graeme.gregory@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/acpi.h>
+
+/*
+ * Currently the ACPI 5.1 standard does not define S states in a
+ * manner which is usable for ARM64. These two stubs are sufficient
+ * that system initialises and device PM works.
+ */
+u32 acpi_target_system_state(void)
+{
+	return ACPI_STATE_S0;
+}
+EXPORT_SYMBOL_GPL(acpi_target_system_state);
+
+int __init acpi_sleep_init(void)
+{
+	return -ENOSYS;
+}
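To illustrate why a stub that always reports S0 is enough for device PM callers, here is a small user-space model. The MODEL_STATE_* constants and both function names are hypothetical; the only thing taken from the patch is that the target system state is always the working state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model constants; S0 is the working state. */
enum { MODEL_STATE_S0 = 0, MODEL_STATE_S3 = 3 };

/* Mirrors the stub in the patch: the system never targets a sleep state. */
static uint32_t model_target_system_state(void)
{
	return MODEL_STATE_S0;
}

/*
 * Hypothetical caller: device PM code that only needs to know whether a
 * system-wide sleep transition is in progress before picking a device state.
 */
static bool model_system_sleep_in_progress(void)
{
	return model_target_system_state() != MODEL_STATE_S0;
}
```

With the stub in place, such a caller always takes the "system is running" path, which is the behaviour the commit message asks for until S states are defined for ARM64.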
On Monday, February 02, 2015 08:45:33 PM Hanjun Guo wrote:
Why does this need to be in drivers/acpi/ ?
On Mon, Feb 02, 2015 at 11:18:24PM +0100, Rafael J. Wysocki wrote:
Why does this need to be in drivers/acpi/ ?
Sorry, it doesn't; it got left behind when we moved some other stuff.
Graeme
CONFIG_ACPI depends on CONFIG_PCI on x86 and ia64. In the ARM64 server world most systems will have PCIe, but some may not; making CONFIG_ACPI depend on CONFIG_PCI on ARM64 as well will satisfy both.

In that case, we need some arch-dependent PCI functions to access the config space before the PCI root bridge is created, and pci_acpi_scan_root() to create the PCI root bus. So introduce some stub functions here to make the ACPI core compile, and revisit them later when PCI is implemented on ARM64.
CC: Liviu Dudau Liviu.Dudau@arm.com
CC: Catalin Marinas catalin.marinas@arm.com
CC: Will Deacon will.deacon@arm.com
Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com
Tested-by: Yijing Wang wangyijing@huawei.com
Tested-by: Mark Langsdorf mlangsdo@redhat.com
Tested-by: Jon Masters jcm@redhat.com
Tested-by: Timur Tabi timur@codeaurora.org
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
---
 arch/arm64/include/asm/pci.h |  6 ++++++
 arch/arm64/kernel/pci.c      | 25 +++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index 872ba93..fded096 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -24,6 +24,12 @@
  */
 #define PCI_DMA_BUS_IS_PHYS	(0)

+static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
+{
+	/* no legacy IRQ on arm64 */
+	return -ENODEV;
+}
+
 extern int isa_dma_bridge_buggy;

 #ifdef CONFIG_PCI
diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
index ce5836c..c17e7ea 100644
--- a/arch/arm64/kernel/pci.c
+++ b/arch/arm64/kernel/pci.c
@@ -10,6 +10,7 @@
  *
  */

+#include <linux/acpi.h>
 #include <linux/init.h>
 #include <linux/io.h>
 #include <linux/kernel.h>
@@ -68,3 +69,27 @@ void pci_bus_assign_domain_nr(struct pci_bus *bus, struct device *parent)
 	bus->domain_nr = domain;
 }
 #endif
+
+/*
+ * raw_pci_read/write - Platform-specific PCI config space access.
+ */
+int raw_pci_read(unsigned int domain, unsigned int bus,
+		  unsigned int devfn, int reg, int len, u32 *val)
+{
+	return -EINVAL;
+}
+
+int raw_pci_write(unsigned int domain, unsigned int bus,
+		  unsigned int devfn, int reg, int len, u32 val)
+{
+	return -EINVAL;
+}
+
+#ifdef CONFIG_ACPI
+/* Root bridge scanning */
+struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
+{
+	/* TODO: Should be revisited when implementing PCI on ACPI */
+	return NULL;
+}
+#endif
On Mon, Feb 02, 2015 at 12:45:34PM +0000, Hanjun Guo wrote:
You said you'll make these return -ENOSYS, which I think makes more sense.
On 2015-02-03 20:15, Catalin Marinas wrote:
You said you'll make these return -ENOSYS, which I think makes more sense.
I'm sorry, I missed that, my bad, I will fix that in next version.
Thanks Hanjun
On Tuesday, February 03, 2015 09:30:00 PM Hanjun Guo wrote:
I'm sorry, I missed that, my bad, I will fix that in next version.
Actually, -ENOSYS *specifically* means "not implemented system call". It should not be used for anything other than that. -ENXIO is what should be used instead.
On 2015-02-03 22:55, Rafael J. Wysocki wrote:
Actually, -ENOSYS *specifically* means "not implemented system call". It should not be used for anything other than that. -ENXIO is what should be used instead.
Thanks for the suggestion :)
Hanjun
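The outcome of the error code discussion in this sub-thread can be sketched as follows. This is a user-space model, not the revised patch: the model_ names are hypothetical, and the only thing taken from the thread is Rafael's point that -ENOSYS is reserved for unimplemented system calls and that -ENXIO should be returned instead.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stub in the shape of raw_pci_read(): config space access is
 * not available yet, so report "no such device or address" rather than
 * -ENOSYS (unimplemented system call) or -EINVAL (bad argument).
 */
static int model_raw_pci_read(unsigned int domain, unsigned int bus,
			      unsigned int devfn, int reg, int len,
			      uint32_t *val)
{
	(void)domain; (void)bus; (void)devfn;
	(void)reg; (void)len; (void)val;	/* the stub touches nothing */

	return -ENXIO;
}

/* Hypothetical write-side counterpart with the same convention. */
static int model_raw_pci_write(unsigned int domain, unsigned int bus,
			       unsigned int devfn, int reg, int len,
			       uint32_t val)
{
	(void)domain; (void)bus; (void)devfn;
	(void)reg; (void)len; (void)val;

	return -ENXIO;
}
```

A caller can then tell "the accessor exists but there is nothing to talk to" apart from a malformed request, which -EINVAL would have suggested.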
From: Al Stone al.stone@linaro.org
Introduce two early parameters, "off" and "force", for "acpi". acpi=off will be the default behavior for ARM64, so introduce acpi=force to enable ACPI on ARM64.

Disable ACPI before the early parameters are parsed, and enable it only when "acpi=force" is passed by people who want to use ACPI on ARM64. This ensures that DT is the preferred one if both ACPI tables and DT are provided at this moment.
CC: Catalin Marinas catalin.marinas@arm.com
CC: Will Deacon will.deacon@arm.com
CC: Rafael J. Wysocki rjw@rjwysocki.net
Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com
Tested-by: Yijing Wang wangyijing@huawei.com
Tested-by: Mark Langsdorf mlangsdo@redhat.com
Tested-by: Jon Masters jcm@redhat.com
Tested-by: Timur Tabi timur@codeaurora.org
Signed-off-by: Al Stone al.stone@linaro.org
Signed-off-by: Graeme Gregory graeme.gregory@linaro.org
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
---
 Documentation/kernel-parameters.txt |  3 ++-
 arch/arm64/include/asm/acpi.h       |  9 +++++++++
 arch/arm64/kernel/acpi.c            | 17 +++++++++++++++++
 arch/arm64/kernel/setup.c           |  8 ++++++++
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 176d4fe..d6a952e 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -165,7 +165,7 @@
 multipliers 'Kilo', 'Mega', and 'Giga', equalling 2^10, 2^20, and 2^30
 bytes respectively. Such letter suffixes can also be entirely omitted.

-	acpi=		[HW,ACPI,X86]
+	acpi=		[HW,ACPI,X86,ARM64]
 			Advanced Configuration and Power Interface
 			Format: { force | off | strict | noirq | rsdt }
 			force -- enable ACPI if default was off
@@ -175,6 +175,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 				strictly ACPI specification compliant.
 			rsdt -- prefer RSDT over (default) XSDT
 			copy_dsdt -- copy DSDT to memory
+			For ARM64, ONLY "acpi=off" or "acpi=force" are available

 			See also Documentation/power/runtime_pm.txt, pci=noacpi

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 8b837ab..496c33b 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -26,6 +26,13 @@ static inline void disable_acpi(void)
 	acpi_noirq = 1;
 }

+static inline void enable_acpi(void)
+{
+	acpi_disabled = 0;
+	acpi_pci_disabled = 0;
+	acpi_noirq = 0;
+}
+
 /*
  * It's used from ACPI core in kdump to boot UP system with SMP kernel,
  * with this check the ACPI core will not override the CPU index
@@ -40,6 +47,8 @@ static inline bool acpi_has_cpu_in_madt(void)

 static inline void arch_fix_phys_package_id(int num, u32 slot) { }

+#else
+static inline void disable_acpi(void) { }
 #endif /* CONFIG_ACPI */

 #endif /*_ASM_ACPI_H*/
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index 7f67c01..afe10b4 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -67,3 +67,20 @@ void __init acpi_boot_table_init(void)
 	if (acpi_table_init())
 		disable_acpi();
 }
+
+static int __init parse_acpi(char *arg)
+{
+	if (!arg)
+		return -EINVAL;
+
+	/* "acpi=off" disables both ACPI table parsing and interpreter */
+	if (strcmp(arg, "off") == 0)
+		disable_acpi();
+	else if (strcmp(arg, "force") == 0) /* force ACPI to be enabled */
+		enable_acpi();
+	else
+		return -EINVAL;	/* Core will print when we return error */
+
+	return 0;
+}
+early_param("acpi", parse_acpi);
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 726b019..fc4fb7b 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -62,6 +62,7 @@
 #include <asm/memblock.h>
 #include <asm/psci.h>
 #include <asm/efi.h>
+#include <asm/acpi.h>

 unsigned int processor_id;
 EXPORT_SYMBOL(processor_id);
@@ -388,6 +389,13 @@ void __init setup_arch(char **cmdline_p)
 	early_fixmap_init();
 	early_ioremap_init();

+	/*
+	 * Disable ACPI before early parameters parsed and
+	 * it will be enabled in parse_early_param() if
+	 * "acpi=force" is passed
+	 */
+	disable_acpi();
+
 	parse_early_param();

 	/*
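The "off by default, acpi=force to enable" behaviour of the patch above can be exercised outside the kernel with a small model. It is a sketch only: model_acpi_disabled and model_parse_acpi() are hypothetical names that copy the shape of parse_acpi(), and the initial value of 1 stands in for the disable_acpi() call that setup_arch() makes before parse_early_param().

```c
#include <errno.h>
#include <string.h>

/* Starts disabled, standing in for disable_acpi() before parameter parsing. */
static int model_acpi_disabled = 1;

/*
 * Same decision structure as parse_acpi() in the patch: only "off" and
 * "force" are accepted; anything else is rejected and leaves state alone.
 */
static int model_parse_acpi(const char *arg)
{
	if (!arg)
		return -EINVAL;

	if (strcmp(arg, "off") == 0)
		model_acpi_disabled = 1;
	else if (strcmp(arg, "force") == 0)
		model_acpi_disabled = 0;
	else
		return -EINVAL;

	return 0;
}
```

Values that x86 accepts, such as "noirq" or "rsdt", fall through to the error path here, matching the note added to kernel-parameters.txt that only "acpi=off" and "acpi=force" are available on ARM64.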
When a system supports both DT and ACPI but the firmware provides no dtb, we can use the linux,uefi-stub-generated-dtb property to let the kernel know that it can try the ACPI configuration data even if "acpi=force" is not passed in the early parameters.
CC: Mark Rutland mark.rutland@arm.com
CC: Jonathan Corbet corbet@lwn.net
CC: Catalin Marinas catalin.marinas@arm.com
CC: Will Deacon will.deacon@arm.com
CC: Leif Lindholm leif.lindholm@linaro.org
CC: Grant Likely grant.likely@linaro.org
CC: Matt Fleming matt.fleming@intel.com
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
---
 Documentation/arm/uefi.txt         |  3 +++
 arch/arm64/include/asm/acpi.h      |  1 +
 arch/arm64/kernel/setup.c          | 30 ++++++++++++++++++++++++++++++
 drivers/firmware/efi/libstub/fdt.c |  8 ++++++++
 4 files changed, 42 insertions(+)

diff --git a/Documentation/arm/uefi.txt b/Documentation/arm/uefi.txt
index d60030a..5f86eae 100644
--- a/Documentation/arm/uefi.txt
+++ b/Documentation/arm/uefi.txt
@@ -60,5 +60,8 @@ linux,uefi-mmap-desc-ver  | 32-bit | Version of the mmap descriptor format.
 --------------------------------------------------------------------------------
 linux,uefi-stub-kern-ver  | string | Copy of linux_banner from build.
 --------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb  | bool   | Indication for no DTB provided by
+                               |        | firmware.
+--------------------------------------------------------------------------------

 For verbose debug messages, specify 'uefi_debug' on the kernel command line.
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 496c33b..9fcf632 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -49,6 +49,7 @@ static inline void arch_fix_phys_package_id(int num, u32 slot) { }

 #else
 static inline void disable_acpi(void) { }
+static inline void enable_acpi(void) { }
 #endif /* CONFIG_ACPI */

 #endif /*_ASM_ACPI_H*/
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index fc4fb7b..510a681 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -371,6 +371,29 @@ static void __init request_standard_resources(void)
 	}
 }

+static int __init dt_scan_chosen(unsigned long node, const char *uname,
+				 int depth, void *data)
+{
+	const char *p;
+
+	if (depth != 1 || !data || (strcmp(uname, "chosen") != 0))
+		return 0;
+
+	p = of_get_flat_dt_prop(node, "linux,uefi-stub-generated-dtb", NULL);
+	*(bool *)data = p ? true : false;
+
+	return 1;
+}
+
+static bool __init is_uefi_stub_generated_dtb(void)
+{
+	bool flag = false;
+
+	of_scan_flat_dt(dt_scan_chosen, &flag);
+
+	return flag;
+}
+
 u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };

 void __init setup_arch(char **cmdline_p)
@@ -399,6 +422,13 @@ void __init setup_arch(char **cmdline_p)
 	parse_early_param();

 	/*
+	 * If no dtb provided by firmware, enable ACPI and give system a
+	 * chance to boot with ACPI configuration data
+	 */
+	if (is_uefi_stub_generated_dtb() && acpi_disabled)
+		enable_acpi();
+
+	/*
 	 * Unmask asynchronous aborts after bringing up possible earlycon.
 	 * (Report possible System Errors once we can report this occurred)
 	 */
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index c846a96..3777d50 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -154,6 +154,14 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 	if (status)
 		goto fdt_set_fail;

+	/* Add a property to show the dtb is generated by uefi stub */
+	if (!orig_fdt) {
+		status = fdt_setprop(fdt, node,
+				     "linux,uefi-stub-generated-dtb", NULL, 0);
+		if (status)
+			goto fdt_set_fail;
+	}
+
 	return EFI_SUCCESS;

 fdt_set_fail:
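Taken together with the early parameter handling, the patch above gives the following decision logic for whether the kernel ends up using ACPI. The sketch below is a user-space model under stated assumptions: the enum and model_use_acpi() are hypothetical, and the three steps mirror the order in setup_arch() (disable, parse "acpi=", then check for a stub-generated dtb).

```c
#include <stdbool.h>

/* Hypothetical encoding of what was passed on the command line. */
enum model_acpi_param {
	MODEL_ACPI_PARAM_NONE,	/* no "acpi=" given */
	MODEL_ACPI_PARAM_OFF,	/* "acpi=off" */
	MODEL_ACPI_PARAM_FORCE	/* "acpi=force" */
};

/* Returns true when the modelled kernel would go on to use ACPI. */
static bool model_use_acpi(enum model_acpi_param param, bool stub_generated_dtb)
{
	/* Step 1: disable_acpi() before parse_early_param(). */
	bool disabled = true;

	/* Step 2: the early parameter handler. */
	if (param == MODEL_ACPI_PARAM_FORCE)
		disabled = false;
	else if (param == MODEL_ACPI_PARAM_OFF)
		disabled = true;

	/* Step 3: the check added by this patch, copied as written. */
	if (stub_generated_dtb && disabled)
		disabled = false;

	return !disabled;
}
```

One consequence worth noting when reading the condition as posted: because step 3 only looks at acpi_disabled, an explicit "acpi=off" is indistinguishable from the default and is also overridden when the dtb was generated by the stub.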
On Mon, Feb 02, 2015 at 08:45:36PM +0800, Hanjun Guo wrote:
linux,uefi-stub-kern-ver | string | Copy of linux_banner from build.
+linux,uefi-stub-generated-dtb | bool | Indication for no DTB provided by
| | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
On Mon, Feb 02, 2015 at 01:40:33PM +0000, Leif Lindholm wrote:
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
Graeme
On Mon, Feb 02, 2015 at 01:50:52PM +0000, Graeme Gregory wrote:
On Mon, Feb 02, 2015 at 01:40:33PM +0000, Leif Lindholm wrote:
On Mon, Feb 02, 2015 at 08:45:36PM +0800, Hanjun Guo wrote:
When a system supports both DT and ACPI but the firmware provides no DTB, the kernel can use this linux,uefi-stub-generated-dtb property to learn that it may try ACPI configuration data even if "acpi=force" is not passed in the early parameters.
CC: Mark Rutland mark.rutland@arm.com
CC: Jonathan Corbet corbet@lwn.net
CC: Catalin Marinas catalin.marinas@arm.com
CC: Will Deacon will.deacon@arm.com
CC: Leif Lindholm leif.lindholm@linaro.org
CC: Grant Likely grant.likely@linaro.org
CC: Matt Fleming matt.fleming@intel.com
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
Documentation/arm/uefi.txt         |  3 +++
arch/arm64/include/asm/acpi.h      |  1 +
arch/arm64/kernel/setup.c          | 30 ++++++++++++++++++++++++++++++
drivers/firmware/efi/libstub/fdt.c |  8 ++++++++
4 files changed, 42 insertions(+)
diff --git a/Documentation/arm/uefi.txt b/Documentation/arm/uefi.txt
index d60030a..5f86eae 100644
--- a/Documentation/arm/uefi.txt
+++ b/Documentation/arm/uefi.txt
@@ -60,5 +60,8 @@
linux,uefi-mmap-desc-ver       | 32-bit | Version of the mmap descriptor format.
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
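The alternative described above could be sketched roughly as follows. This is only an illustration, not code from the patch set: struct guid, struct config_entry and both helper names are hypothetical, simplified stand-ins for the real UEFI configuration table types, and the GUID that identifies a firmware-provided DT is passed in by the caller rather than hard-coded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for a UEFI GUID and config table entry. */
struct guid {
	uint8_t bytes[16];
};

struct config_entry {
	struct guid guid;
	uint64_t table;		/* physical address of the table, 0 if unset */
};

/* Return true if a non-NULL table is registered under the GUID 'wanted'. */
bool config_table_present(const struct config_entry *tables, size_t count,
			  const struct guid *wanted)
{
	assert(wanted != NULL);

	for (size_t i = 0; i < count; i++) {
		if (memcmp(&tables[i].guid, wanted, sizeof(*wanted)) == 0)
			return tables[i].table != 0;
	}
	return false;
}

/*
 * Policy from the paragraph above: if firmware registered no DT in the
 * configuration table, assume the DTB we booted with was made by the stub.
 */
bool dtb_came_from_stub(const struct config_entry *tables, size_t count,
			const struct guid *dt_guid)
{
	return !config_table_present(tables, count, dt_guid);
}
```

A check like this can only run once the configuration table has been parsed, which matches the remark that detection would move to after efi_init().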
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
Thanks, Mark.
On 2 February 2015 at 16:32, Mark Rutland mark.rutland@arm.com wrote:
On Mon, Feb 02, 2015 at 01:50:52PM +0000, Graeme Gregory wrote:
On Mon, Feb 02, 2015 at 01:40:33PM +0000, Leif Lindholm wrote:
On Mon, Feb 02, 2015 at 08:45:36PM +0800, Hanjun Guo wrote:
When a system supports both DT and ACPI but the firmware provides no DTB, the kernel can use this linux,uefi-stub-generated-dtb property to learn that it may try ACPI configuration data even if "acpi=force" is not passed in the early parameters.
CC: Mark Rutland mark.rutland@arm.com
CC: Jonathan Corbet corbet@lwn.net
CC: Catalin Marinas catalin.marinas@arm.com
CC: Will Deacon will.deacon@arm.com
CC: Leif Lindholm leif.lindholm@linaro.org
CC: Grant Likely grant.likely@linaro.org
CC: Matt Fleming matt.fleming@intel.com
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
Documentation/arm/uefi.txt         |  3 +++
arch/arm64/include/asm/acpi.h      |  1 +
arch/arm64/kernel/setup.c          | 30 ++++++++++++++++++++++++++++++
drivers/firmware/efi/libstub/fdt.c |  8 ++++++++
4 files changed, 42 insertions(+)
diff --git a/Documentation/arm/uefi.txt b/Documentation/arm/uefi.txt
index d60030a..5f86eae 100644
--- a/Documentation/arm/uefi.txt
+++ b/Documentation/arm/uefi.txt
@@ -60,5 +60,8 @@
linux,uefi-mmap-desc-ver       | 32-bit | Version of the mmap descriptor format.
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
The most obvious scheme then is
linux,bare-dtb = "uefi-stub";
otherwise we generate a new binding for every component in the boot path.
Graeme
On 2015-02-06 18:34, G Gregory wrote: [...]
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
The most obvious scheme then is
linux,bare-dtb = "uefi-stub";
otherwise we generate a new binding for every component in the boot path.
Leif, Mark, any comments on this?
Thanks Hanjun
On 7 February 2015 at 03:36, Hanjun Guo hanjun.guo@linaro.org wrote:
On 2015-02-06 18:34, G Gregory wrote: [...]
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
The most obvious scheme then is
linux,bare-dtb = "uefi-stub";
otherwise we generate a new binding for every component in the boot path.
Leif, Mark, any comments on this?
As far as I remember, we did not finalize the decision to go with a stub generated property instead of some other means to infer that the device tree is not suitable for booting and ACPI should be preferred.
We will be discussing the 'stub<->kernel interface as a boot protocol' topic this week at Connect, so let's discuss it in that context before signing off on patches like these.
On 2015-02-07 13:03, Ard Biesheuvel wrote:
On 7 February 2015 at 03:36, Hanjun Guo hanjun.guo@linaro.org wrote:
On 2015-02-06 18:34, G Gregory wrote: [...]
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
The most obvious scheme then is
linux,bare-dtb = "uefi-stub";
otherwise we generate a new binding for every component in the boot path.
Leif, Mark, any comments on this?
As far as I remember, we did not finalize the decision to go with a stub generated property instead of some other means to infer that the device tree is not suitable for booting and ACPI should be preferred.
We will be discussing the 'stub<->kernel interface as a boot protocol' topic this week at Connect, so let's discuss it in that context before signing off on patches like these.
OK, see you guys in Hong Kong.
Thanks Hanjun
On Sat, Feb 07, 2015 at 05:03:44AM +0000, Ard Biesheuvel wrote:
On 7 February 2015 at 03:36, Hanjun Guo hanjun.guo@linaro.org wrote:
On 2015-02-06 18:34, G Gregory wrote: [...]
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
The most obvious scheme then is
linux,bare-dtb = "uefi-stub";
otherwise we generate a new binding for every component in the boot path.
Leif, Mark, any comments on this?
As far as I remember, we did not finalize the decision to go with a stub generated property instead of some other means to infer that the device tree is not suitable for booting and ACPI should be preferred.
We will be discussing the 'stub<->kernel interface as a boot protocol' topic this week at Connect, so let's discuss it in that context before signing off on patches like these.
As some of us (at least myself) aren't at Connect, it would be nice if those discussions could be at least mirrored on the mailing list. I have some concerns regarding how this is going to work long-term, and I'd like to make sure we don't get stuck with something that limits what we can do long-term.
Is there a session set aside for this, or is this a hallway track topic?
Thanks, Mark.
On 9 February 2015 at 19:46, Mark Rutland mark.rutland@arm.com wrote:
On Sat, Feb 07, 2015 at 05:03:44AM +0000, Ard Biesheuvel wrote:
On 7 February 2015 at 03:36, Hanjun Guo hanjun.guo@linaro.org wrote:
On 2015-02-06 18:34, G Gregory wrote: [...]
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
The most obvious scheme then is
linux,bare-dtb = "uefi-stub";
otherwise we generate a new binding for every component in the boot path.
Leif, Mark, any comments on this?
As far as I remember, we did not finalize the decision to go with a stub generated property instead of some other means to infer that the device tree is not suitable for booting and ACPI should be preferred.
We will be discussing the 'stub<->kernel interface as a boot protocol' topic this week at Connect, so let's discuss it in that context before signing off on patches like these.
As some of us (at least myself) aren't at Connect, it would be nice if those discussions could be at least mirrored on the mailing list. I have some concerns regarding how this is going to work long-term, and I'd like to make sure we don't get stuck with something that limits what we can do long-term.
Is there a session set aside for this, or is this a hallway track topic?
Hello all,
(added team-Xen to cc)
We had our meeting yesterday: allow me to summarize what we discussed, and we can proceed with the discussion on-list if desired.
Present: Grant Likely, Al Stone, Hanjun Guo, Leif Lindholm, Roy Franz, Ard Biesheuvel
Topic #1: booting the arm64 kernel with ACPI but no UEFI
We have identified Xen as the only use case: there is a need to boot dom0 using the host's ACPI tables but without allowing the dom0 kernel to interface directly with the UEFI firmware. There may be other valid use cases, though, so this use case should be addressed generically regardless.
First, it was proposed to allow the ACPI root pointer to be added to a /chosen node property, and the kernel would use this property instead of going through the UEFI tables. However, there is a similar case that could be made for SMBIOS: unlike x86, where there is a 'legacy' method to locate either table by scanning some special physical memory regions, the respective specifications only provide a single method to perform table discovery, which is through UEFI. This means that passing the ACPI root pointer to the kernel using a property in the /chosen node doesn't scale well, as we would need to do the same for SMBIOS at least, and potentially other tables in the future.
There are two other concerns related to passing the ACPI root pointer directly:
- the actual discovery occurs in core code, and we are reluctant to
  change it to accommodate arm64 specific behavior
- it would create separate paths through the early boot code which
  complicates testing and validation
So instead, we think it is reasonable to mandate a minimal subset of the UEFI environment to be present, either natively or emulated/mocked up by Xen, kexec etc.
- an EFI system table (and a /chosen/linux,uefi-system-table that
  points to it) containing at least
  * a fully populated header with version >= 2.0 and correct CRC
  * populated fw_vendor string
  * configuration table pointer and count, pointing to the ACPI and
    SMBIOS configuration tables with their respective GUIDs
  * NULL runtime services function table pointer
- an EFI format memory map (and the /chosen/linux,uefi-mmap-*
  properties that go with it) covering all of system RAM, with ACPI and
  SMBIOS reserved regions marked as appropriate
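To make the system table requirements above concrete, here is a rough sketch of the checks a consumer might apply. It is written under several assumptions: struct mock_systab is a hypothetical, cut-down layout (not the kernel's efi_system_table_t), all function names are made up for illustration, and the CRC is assumed to be the usual CRC-32 computed over the whole structure with the crc32 field zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYSTAB_SIGNATURE 0x5453595320494249ULL	/* "IBI SYST" */
#define REVISION_2_0     (2u << 16)		/* major in the upper 16 bits */

/* Hypothetical, simplified mock; NOT the real efi_system_table_t layout. */
struct mock_systab {
	uint64_t signature;
	uint32_t revision;
	uint32_t headersize;
	uint32_t crc32;
	uint32_t reserved;
	uint64_t fw_vendor;	/* address of the vendor string */
	uint64_t runtime;	/* runtime services pointer, expected to be 0 */
	uint64_t nr_tables;	/* number of configuration table entries */
	uint64_t tables;	/* address of the configuration table array */
};

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320. */
uint32_t crc32_calc(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= p[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* CRC of the table with its own crc32 field treated as zero. */
uint32_t systab_crc(const struct mock_systab *st)
{
	struct mock_systab copy = *st;

	copy.crc32 = 0;
	return crc32_calc(&copy, sizeof(copy));
}

/* Check the minimal subset listed above against the mock table. */
bool systab_is_minimal(const struct mock_systab *st)
{
	assert(st != NULL);

	if (st->signature != SYSTAB_SIGNATURE || st->revision < REVISION_2_0)
		return false;
	if (st->headersize != sizeof(*st) || st->crc32 != systab_crc(st))
		return false;
	if (st->fw_vendor == 0 || st->nr_tables == 0 || st->tables == 0)
		return false;
	return st->runtime == 0;	/* NULL runtime services pointer */
}
```

The sketch does not look inside the configuration table array or the memory map; a real check would also have to confirm that the ACPI and SMBIOS GUIDs are present.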
As this basically promotes the stub<->kernel interface to an external ABI, the current documentation about the /chosen node properties should also be promoted to a proper binding, with the above mandated minimal subset added as well.
There are some minimal changes required to the current kernel code to adhere to the above: primarily to deal with a NULL runtime services pointer, which is arguably an improvement anyway.
Topic #2: how to identify an 'empty' DTB
The proposed policy regarding whether DT or ACPI should be preferred if both methods are available hinges on being able to identify a DTB as containing a platform description or not. One suggested way of doing this is to make the stub add a /chosen node property that indicates that it didn't receive a DTB from the firmware, nor loaded one from the file system, but created an empty one from scratch.
Considering the previous topic, i.e., the promotion of the stub<->kernel interface to external ABI, we should not be frivolous about adding new properties, and adding a 'stub-generated-dtb' property should be avoided if there is a better way to deal with this. Also, e.g., when booting via GRUB, it may in fact be GRUB and not the stub that creates the DTB (when booting with an initrd, for instance) so GRUB would have to be modified as well. (If not, simply adding an initrd= property to the command line would result in the kernel preferring DT over ACPI all of a sudden, which surely, we all agree is undesirable behavior)
So instead, we propose to use a heuristic to decide whether a DTB should be considered empty or not: If /chosen is the only level 1 node in the tree, the DTB is empty, otherwise it is not.
This can be trivially implemented into the existing EFI early FDT discovery code, and does not require any other changes to the stub or GRUB.
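As a rough illustration of the heuristic, the sketch below walks the structure block of a flattened device tree and reports whether /chosen is the only node directly under the root. It is a standalone example rather than proposed kernel code: dtb_is_bare() is a hypothetical name, the input is assumed to be the structure block alone (no FDT header), and a tree with no level 1 nodes at all is treated as not matching. Kernel code would more likely use libfdt for the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Structure block tokens of the flattened device tree format. */
enum {
	FDT_BEGIN_NODE = 1,
	FDT_END_NODE = 2,
	FDT_PROP = 3,
	FDT_NOP = 4,
	FDT_END = 9
};

/* Read one big-endian 32-bit cell. */
static uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: true if "chosen" is the only level 1 node.
 * 'blob' points at the structure block only; malformed input yields false.
 */
bool dtb_is_bare(const uint8_t *blob, size_t size)
{
	size_t off = 0;
	int depth = 0;
	bool seen_chosen = false;

	assert(blob != NULL);

	while (off + 4 <= size) {
		uint32_t token = be32_at(blob + off);

		off += 4;
		switch (token) {
		case FDT_BEGIN_NODE: {
			const char *name = (const char *)(blob + off);
			const void *nul = memchr(name, 0, size - off);
			size_t len;

			if (nul == NULL)
				return false;	/* unterminated node name */
			len = (size_t)((const char *)nul - name);
			depth++;
			if (depth == 2) {	/* direct child of the root */
				if (strcmp(name, "chosen") != 0)
					return false;
				seen_chosen = true;
			}
			off += (len + 1 + 3) & ~(size_t)3;	/* name is padded to 4 bytes */
			break;
		}
		case FDT_END_NODE:
			if (depth == 0)
				return false;
			depth--;
			break;
		case FDT_PROP: {
			uint32_t len;

			if (off + 8 > size)
				return false;
			len = be32_at(blob + off);	/* skip len, nameoff and padded value */
			off += 8 + (((size_t)len + 3) & ~(size_t)3);
			break;
		}
		case FDT_NOP:
			break;
		case FDT_END:
			return depth == 0 && seen_chosen;
		default:
			return false;
		}
	}
	return false;
}
```

Because the check only looks at node names one level below the root, it gives the same answer whether the stub, GRUB, or some other agent built the tree, which is the property the heuristic is after.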
Could those affected by this please comment on whether it is feasible or not? Other comments/remarks are also highly appreciated, of course.
Regards, Ard.
On Wed, 11 Feb 2015, Ard Biesheuvel wrote:
On 9 February 2015 at 19:46, Mark Rutland mark.rutland@arm.com wrote:
On Sat, Feb 07, 2015 at 05:03:44AM +0000, Ard Biesheuvel wrote:
On 7 February 2015 at 03:36, Hanjun Guo hanjun.guo@linaro.org wrote:
On 2015-02-06 18:34, G Gregory wrote: [...]
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
The most obvious scheme then is
linux,bare-dtb = "uefi-stub";
otherwise we generate a new binding for every component in the boot path.
Leif, Mark, any comments on this?
As far as I remember, we did not finalize the decision to go with a stub generated property instead of some other means to infer that the device tree is not suitable for booting and ACPI should be preferred.
We will be discussing the 'stub<->kernel interface as a boot protocol' topic this week at Connect, so let's discuss it in that context before signing off on patches like these.
As some of us (at least myself) aren't at Connect, it would be nice if those discussions could be at least mirrored on the mailing list. I have some concerns regarding how this is going to work long-term, and I'd like to make sure we don't get stuck with something that limits what we can do long-term.
Is there a session set aside for this, or is this a hallway track topic?
Hello all,
(added team-Xen to cc)
We had our meeting yesterday: allow me to summarize what we discussed, and we can proceed with the discussion on-list if desired.
Present: Grant Likely, Al Stone, Hanjun Guo, Leif Lindholm, Roy Franz, Ard Biesheuvel
Topic #1: booting the arm64 kernel with ACPI but no UEFI
We have identified Xen as the only use case: there is a need to boot dom0 using the host's ACPI tables but without allowing the dom0 kernel to interface directly with the UEFI firmware. There may be other valid use cases, though, so this use case should be addressed generically regardless.
First, it was proposed to allow the ACPI root pointer to be added to a /chosen node property, and the kernel would use this property instead of going through the UEFI tables. However, there is a similar case that could be made for SMBIOS: unlike x86, where there is a 'legacy' method to locate either table by scanning some special physical memory regions, the respective specifications only provide a single method to perform table discovery, which is through UEFI. This means that passing the ACPI root pointer to the kernel using a property in the /chosen node doesn't scale well, as we would need to do the same for SMBIOS at least, and potentially other tables in the future.
TBH there are no other tables, and even if there were, this method would still scale linearly with the number of tables, which is not bad in my view.
There are two other concerns related to passing the ACPI root pointer directly:
- the actual discovery occurs in core code, and we are reluctant to
change it to accommodate arm64 specific behavior
- it would create separate paths through the early boot code which
complicates testing and validation
OK
So instead, we think it is reasonable to mandate a minimal subset of the UEFI environment to be present, either natively or emulated/mocked up by Xen, kexec etc.
No emulation please :-)
- an EFI system table (and a /chosen/linux,uefi-system-table that
  points to it) containing at least
  * a fully populated header with version >= 2.0 and correct CRC
  * populated fw_vendor string
  * configuration table pointer and count, pointing to the ACPI and
    SMBIOS configuration tables with their respective GUIDs
  * NULL runtime services function table pointer
- an EFI format memory map (and the /chosen/linux,uefi-mmap-*
  properties that go with it) covering all of system RAM, with ACPI and
  SMBIOS reserved regions marked as appropriate
Runtime services actually are going to be available via hypercalls, see drivers/xen/efi.c, but Boot Services are not going to be.
Given that Dom0 is not booted via EFI but as zImage, how are we going to pass the two EFI table pointers to Linux? Via Device Tree? It doesn't look like a great improvement to me.
Generating those two EFI tables shouldn't be a problem though.
As this basically promotes the stub<->kernel interface to an external ABI, the current documentation about the /chosen node properties should also be promoted to a proper binding, with the above mandated minimal subset added as well.
There are some minimal changes required to the current kernel code to adhere to the above: primarily to deal with a NULL runtime services pointer, which is arguably an improvement anyway.
This is not needed, if not for the first generation of patches
Topic #2: how to identify an 'empty' DTB
The proposed policy regarding whether DT or ACPI should be preferred if both methods are available hinges on being able to identify a DTB as containing a platform description or not. One suggested way of doing this is to make the stub add a /chosen node property that indicates that it didn't receive a DTB from the firmware, nor loaded one from the file system, but created an empty one from scratch.
Considering the previous topic, i.e., the promotion of the stub<->kernel interface to external ABI, we should not be frivolous about adding new properties, and adding a 'stub-generated-dtb' property should be avoided if there is a better way to deal with this. Also, e.g., when booting via GRUB, it may in fact be GRUB and not the stub that creates the DTB (when booting with an initrd, for instance) so GRUB would have to be modified as well. (If not, simply adding an initrd= property to the command line would result in the kernel preferring DT over ACPI all of a sudden, which surely, we all agree is undesirable behavior)
So instead, we propose to use a heuristic to decide whether a DTB should be considered empty or not: If /chosen is the only level 1 node in the tree, the DTB is empty, otherwise it is not.
This can be trivially implemented into the existing EFI early FDT discovery code, and does not require any other changes to the stub or GRUB.
Could those affected by this please comment on whether it is feasible or not? Other comments/remarks are also highly appreciated, of course.
Wouldn't it make sense to use the same interface between Xen and Dom0 and between stub and kernel?
On 11 February 2015 at 14:33, Stefano Stabellini stefano.stabellini@eu.citrix.com wrote:
On Wed, 11 Feb 2015, Ard Biesheuvel wrote:
On 9 February 2015 at 19:46, Mark Rutland mark.rutland@arm.com wrote:
On Sat, Feb 07, 2015 at 05:03:44AM +0000, Ard Biesheuvel wrote:
On 7 February 2015 at 03:36, Hanjun Guo hanjun.guo@linaro.org wrote:
On 2015-02-06 18:34, G Gregory wrote: [...]
--------------------------------------------------------------------------------
linux,uefi-stub-kern-ver       | string | Copy of linux_banner from build.
--------------------------------------------------------------------------------
+linux,uefi-stub-generated-dtb | bool   | Indication for no DTB provided by
+                              |        | firmware.
+--------------------------------------------------------------------------------
Apologies for the late bikeshedding, but the discussion on this topic previously was lively enough that I thought I'd let it die down a bit before seeing if I had anything to add.
That, and I just realised something: One alternative to this added DT entry is that we could treat the absence of a registered UEFI configuration table as the indication that no HW description was provided from firmware, since the stub does not call InstallConfigurationTable() on the DT it generates. This does move the ability to detect to after efi_init(), but this should be fine for ACPI-purposes.
That would not work as expected in the kexec/Xen use case though as they may genuinely boot with DT from an ACPI host without UEFI.
I'm a little concerned by this case. How do we intend to pass stuff from Xen to the kernel in this case? When we initially discussed the stub prior to merging, we weren't quite sure if ACPI without UEFI was entirely safe.
The linux,uefi-stub-kern-ver property was originally intended as a sanity-check feature to ensure nothing (including Xen) masqueraded as the stub, but for some reason the actual sanity check was never implemented.
If that is deemed undesirable, I would still prefer Catalin's suggested name ("linux,bare-dtb"), which describes the state rather than the route we took to get there.
I agree.
I guess this would be ok, though it would be nice to know which agent generated the DTB.
The most obvious scheme then is
linux,bare-dtb = "uefi-stub";
otherwise we generate a new binding for every component in the boot path.
Leif, Mark, any comments on this?
As far as I remember, we did not finalize the decision to go with a stub generated property instead of some other means to infer that the device tree is not suitable for booting and ACPI should be preferred.
We will be discussing the 'stub<->kernel interface as a boot protocol' topic this week at Connect, so let's discuss it in that context before signing off on patches like these.
As some of us (at least myself) aren't at Connect, it would be nice if those discussions could at least be mirrored on the mailing list. I have some concerns regarding how this is going to work long-term, and I'd like to make sure we don't get stuck with something that limits what we can do later.
Is there a session set aside for this, or is this a hallway track topic?
Hello all,
(added team-Xen to cc)
We had our meeting yesterday: allow me to summarize what we discussed, and we can proceed with the discussion on-list if desired.
Present: Grant Likely Al Stone Hanjun Guo Leif Lindholm Roy Franz Ard Biesheuvel
Topic #1: booting the arm64 kernel with ACPI but no UEFI
We have identified Xen as the only use case: there is a need to boot dom0 using the host's ACPI tables but without allowing the dom0 kernel to interface directly with the UEFI firmware. There may be other valid use cases, though, so this use case should be addressed generically regardless.
First, it was proposed to allow the ACPI root pointer to be added to a /chosen node property, and the kernel would use this property instead of going through the UEFI tables. However, there is a similar case that could be made for SMBIOS: unlike x86, where there is a 'legacy' method to locate either table by scanning some special physical memory regions, the respective specifications only provide a single method to perform table discovery, which is through UEFI. This means that passing the ACPI root pointer to the kernel using a property in the /chosen node doesn't scale well, as we would need to do the same for SMBIOS at least, and potentially other tables in the future.
TBH there are no other tables, and even if there were, this method would still scale linearly with the number of tables, which is not bad in my view.
Fair enough
There are two other concerns related to passing the ACPI root pointer directly:
- the actual discovery occurs in core code, and we are reluctant to
change it to accommodate arm64 specific behavior
- it would create separate paths through the early boot code which
complicates testing and validation
OK
So instead, we think it is reasonable to mandate a minimal subset of the UEFI environment to be present, either natively or emulated/mocked up by Xen, kexec etc.
No emulation please :-)
Well, perhaps 'emulated' is not the right term to use here. It is just about presenting a couple of data items in the same way that UEFI presents them
- an EFI system table (and a /chosen/linux,uefi-system-table that
  points to it) containing at least
  - a fully populated header with version >= 2.0 and correct CRC
  - populated fw_vendor string
  - configuration table pointer and count, pointing to the ACPI and
    SMBIOS configuration tables with their respective GUIDs
  - NULL runtime services function table pointer
- an EFI format memory map (and the /chosen/linux,uefi-mmap-*
  properties that go with it) covering all of system RAM, with ACPI
  and SMBIOS reserved regions marked as appropriate
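The minimal subset listed above could be checked roughly as follows. This is only a sketch: struct mock_efi_systab and mock_systab_is_usable() are hypothetical names invented for illustration, the firmware vendor is simplified to a C string, and the CRC check is left out. The only detail taken from UEFI itself is that the header revision carries the major version in its upper 16 bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mirror of the system table fields named above. */
struct mock_efi_systab {
	uint32_t revision;      /* (major << 16) | minor, as in the UEFI table header */
	const char *fw_vendor;  /* simplified to a C string for this sketch */
	const void *runtime;    /* runtime services table, may be NULL */
	size_t nr_tables;       /* configuration table count */
	const void *tables;     /* configuration table array (ACPI, SMBIOS) */
};

/* Returns true if the table satisfies the minimal subset (CRC check omitted). */
static bool mock_systab_is_usable(const struct mock_efi_systab *st)
{
	if (!st)
		return false;
	if ((st->revision >> 16) < 2)            /* header version must be >= 2.0 */
		return false;
	if (!st->fw_vendor || !st->fw_vendor[0]) /* fw_vendor must be populated */
		return false;
	if (st->nr_tables == 0 || !st->tables)   /* configuration tables are required */
		return false;
	/* A NULL runtime services pointer is explicitly allowed. */
	return true;
}
```

A table mocked up by Xen or kexec would pass this check with runtime set to NULL, as long as the header, vendor string and configuration tables are filled in.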
Runtime services actually are going to be available via hypercalls, see drivers/xen/efi.c, but Boot Services are not going to be.
I was aware of that, but it does mean that the early UEFI code should not try to install virtual remappings of the UEFI runtime memory regions, as all the machinery is either in the kernel or in the hypervisor. So even if the Xen guest code installs a Runtime Services dispatch table at some point, the existing code should still be defused.
In the native UEFI case, the stub calls ExitBootServices() so boot services are not part of the equation anyway. (They are never available when running in the kernel proper)
Given that Dom0 is not booted via EFI but as zImage, how are we going to pass the two EFI table pointers to Linux? Via Device Tree? It doesn't look like a great improvement to me.
The EFI system table and memory map pointers shall be passed to the kernel in the exact same way as the stub does: via the /chosen node. This is currently documented in Documentation/arm/uefi.txt but it should be promoted to a proper binding.
Generating those two EFI tables shouldn't be a problem though.
Good.
As this basically promotes the stub<->kernel interface to an external ABI, the current documentation about the /chosen node properties should also be promoted to a proper binding, with the above mandated minimal subset added as well.
There are some minimal changes required to the current kernel code to adhere to the above: primarily to deal with a NULL runtime services pointer, which is arguably an improvement anyway.
This is not needed, if not for the first generation of patches
It is needed: the UEFI code needs to understand that the runtime pointer in the EFI system table may be NULL, in which case no virtual remapping or installation of the runtime services should take place. Whether Xen ends up installing its own runtime services is a separate matter.
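The behaviour described here, treating a NULL runtime pointer as "do nothing", might look like the sketch below. All names (mock_systab, mock_setup_runtime(), the remapped flag) are hypothetical stand-ins and not the kernel's actual EFI code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime services function table. */
struct mock_runtime_services { int placeholder; };

struct mock_systab {
	const struct mock_runtime_services *runtime; /* may be NULL */
};

enum mock_rt_state { MOCK_RT_DISABLED, MOCK_RT_ENABLED };

/*
 * Decide whether runtime services get set up. *remapped stands in for the
 * virtual remapping of the UEFI runtime memory regions.
 */
static enum mock_rt_state mock_setup_runtime(const struct mock_systab *st,
					     bool *remapped)
{
	*remapped = false;
	if (!st || !st->runtime)
		return MOCK_RT_DISABLED; /* no remapping, nothing installed */

	*remapped = true;
	return MOCK_RT_ENABLED;
}
```

Whether a hypervisor later installs its own dispatch table is outside the scope of this sketch, matching the separation made in the reply above.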
Topic #2: how to identify an 'empty' DTB
The proposed policy regarding whether DT or ACPI should be preferred if both methods are available hinges on being able to identify a DTB as containing a platform description or not. One suggested way of doing this is to make the stub add a /chosen node property that indicates that it didn't receive a DTB from the firmware, nor loaded one from the file system, but created an empty one from scratch.
Considering the previous topic, i.e., the promotion of the stub<->kernel interface to external ABI, we should not be frivolous about adding new properties, and adding a 'stub-generated-dtb' property should be avoided if there is a better way to deal with this. Also, e.g., when booting via GRUB, it may in fact be GRUB and not the stub that creates the DTB (when booting with an initrd, for instance) so GRUB would have to be modified as well. (If not, simply adding an initrd= property to the command line would result in the kernel preferring DT over ACPI all of a sudden, which surely, we all agree is undesirable behavior)
So instead, we propose to use a heuristic to decide whether a DTB should be considered empty or not: If /chosen is the only level 1 node in the tree, the DTB is empty, otherwise it is not.
This can be trivially implemented into the existing EFI early FDT discovery code, and does not require any other changes to the stub or GRUB.
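As a rough illustration of the proposed heuristic, the sketch below takes the names of the level 1 nodes as a plain array instead of walking a real flattened device tree. mock_dtb_is_empty() is a hypothetical helper, and treating a tree with no level 1 nodes at all as "not empty" is an assumption of this sketch, not something decided in the discussion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * level1: names of the nodes directly below the root, n: how many there are.
 * The DTB counts as empty only if "chosen" is the single level 1 node.
 */
static bool mock_dtb_is_empty(const char *const *level1, size_t n)
{
	return n == 1 && level1[0] && strcmp(level1[0], "chosen") == 0;
}
```

A DTB created by the stub or by GRUB holds only /chosen and is classified as empty, while a firmware-provided DTB with nodes such as /cpus or /memory is not.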
Please, could those affected comment on whether this is feasible or not? Other comments/remarks are also highly appreciated, of course.
Wouldn't it make sense to use the same interface between Xen and Dom0 and between stub and kernel?
That is exactly the point: the stub communicates the EFI entry points (system table and memmap) via the device tree. If you are not booting via UEFI, there is no way you can execute the stub, so Xen needs to add those properties to the /chosen node directly, and make them point to data that the UEFI layer can understand.
On Wed, 11 Feb 2015, Ard Biesheuvel wrote:
Given that Dom0 is not booted via EFI but as zImage, how are we going to pass the two EFI table pointers to Linux? Via Device Tree? It doesn't look like a great improvement to me.
The EFI system table and memory map pointers shall be passed to the kernel in the exact same way as the stub does: via the /chosen node. This is currently documented in Documentation/arm/uefi.txt but it should be promoted to a proper binding.
Ah, right.
Wouldn't it make sense to use the same interface between Xen and Dom0 and between stub and kernel?
That is exactly the point: the stub communicates the EFI entry points (system table and memmap) via the device tree. If you are not booting via UEFI, there is no way you can execute the stub, so Xen needs to add those properties to the /chosen node directly, and make them point to data that the UEFI layer can understand.
OK, it looks like the right way of doing it to me.
The FADT Major.Minor version was introduced in ACPI 5.1; it is the same as the ACPI version.
ACPI 5.1 fixes some major gaps for ARM, such as the MADT updates for GIC and SMP init. Without those updates we cannot get the MPIDR for SMP init or the GICv2/3 related init information, so we can't boot arm64 ACPI properly with table versions predating 5.1.
If firmware provides ACPI tables with an ACPI version less than 5.1, the OS has no way to retrieve the configuration data that is necessary to init the SMP boot protocol and the GIC properly, so disable ACPI if we get an FADT table with a version less than 5.1.
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
---
 arch/arm64/kernel/acpi.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index afe10b4..b9f64ec 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -13,6 +13,8 @@
  * published by the Free Software Foundation.
  */
 
+#define pr_fmt(fmt) "ACPI: " fmt
+
 #include <linux/acpi.h>
 #include <linux/bootmem.h>
 #include <linux/cpumask.h>
@@ -49,10 +51,32 @@ void __init __acpi_unmap_table(char *map, unsigned long size)
 	early_memunmap(map, size);
 }
 
+static int __init acpi_parse_fadt(struct acpi_table_header *table)
+{
+	struct acpi_table_fadt *fadt = (struct acpi_table_fadt *)table;
+
+	/*
+	 * Revision in table header is the FADT Major revision, and there
+	 * is a minor revision of FADT which was introduced by ACPI 5.1,
+	 * we only deal with ACPI 5.1 or newer revision to get GIC and SMP
+	 * boot protocol configuration data, or we will disable ACPI.
+	 */
+	if (table->revision > 5 ||
+	    (table->revision == 5 && fadt->minor_revision >= 1))
+		return 0;
+
+	pr_warn("Unsupported FADT revision %d.%d, should be 5.1+, will disable ACPI\n",
+		table->revision, fadt->minor_revision);
+	disable_acpi();
+
+	return -EINVAL;
+}
+
 /*
  * acpi_boot_table_init() called from setup_arch(), always.
  *	1. find RSDP and get its address, and then find XSDT
  *	2. extract all tables and checksums them all
+ *	3. check ACPI FADT revision
  *
  * We can parse ACPI boot-time tables such as MADT after
  * this function is called.
@@ -64,8 +88,16 @@ void __init acpi_boot_table_init(void)
 		return;
 
 	/* Initialize the ACPI boot-time table parser. */
-	if (acpi_table_init())
+	if (acpi_table_init()) {
+		disable_acpi();
+		return;
+	}
+
+	if (acpi_table_parse(ACPI_SIG_FADT, acpi_parse_fadt)) {
+		/* disable ACPI if no FADT is found */
 		disable_acpi();
+		pr_err("Can't find FADT\n");
+	}
 }
 
 static int __init parse_acpi(char *arg)
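To make the accepted and rejected revisions concrete, the comparison used in acpi_parse_fadt() can be pulled out into a standalone predicate. This is a userspace sketch; mock_fadt_revision_ok() is a hypothetical name and is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * major is the revision from the FADT table header and minor is the FADT
 * minor revision, mirroring the two fields the patch compares.
 */
static bool mock_fadt_revision_ok(uint8_t major, uint8_t minor)
{
	return major > 5 || (major == 5 && minor >= 1);
}
```

With this predicate 5.0 and anything older is rejected, while 5.1 and 6.0 are accepted.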
On Mon, Feb 02, 2015 at 12:45:37PM +0000, Hanjun Guo wrote:
> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index afe10b4..b9f64ec 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -13,6 +13,8 @@
>   * published by the Free Software Foundation.
>   */
>  
> +#define pr_fmt(fmt) "ACPI: " fmt
> +
>  #include <linux/acpi.h>
>  #include <linux/bootmem.h>
>  #include <linux/cpumask.h>
> @@ -49,10 +51,32 @@ void __init __acpi_unmap_table(char *map, unsigned long size)
>  	early_memunmap(map, size);
>  }
>  
> +static int __init acpi_parse_fadt(struct acpi_table_header *table)
> +{
> +	struct acpi_table_fadt *fadt = (struct acpi_table_fadt *)table;
> +
> +	/*
> +	 * Revision in table header is the FADT Major revision, and there
> +	 * is a minor revision of FADT which was introduced by ACPI 5.1,
> +	 * we only deal with ACPI 5.1 or newer revision to get GIC and SMP
> +	 * boot protocol configuration data, or we will disable ACPI.
> +	 */
> +	if (table->revision > 5 ||
> +	    (table->revision == 5 && fadt->minor_revision >= 1))
> +		return 0;
> +
> +	pr_warn("Unsupported FADT revision %d.%d, should be 5.1+, will disable ACPI\n",
> +		table->revision, fadt->minor_revision);
> +	disable_acpi();
> +
> +	return -EINVAL;
> +}
> +
>  /*
>   * acpi_boot_table_init() called from setup_arch(), always.
>   *	1. find RSDP and get its address, and then find XSDT
>   *	2. extract all tables and checksums them all
> + *	3. check ACPI FADT revision
>   *
>   * We can parse ACPI boot-time tables such as MADT after
>   * this function is called.
> @@ -64,8 +88,16 @@ void __init acpi_boot_table_init(void)
>  		return;
>  
>  	/* Initialize the ACPI boot-time table parser. */
> -	if (acpi_table_init())
> +	if (acpi_table_init()) {
> +		disable_acpi();
> +		return;
> +	}
> +
> +	if (acpi_table_parse(ACPI_SIG_FADT, acpi_parse_fadt)) {
> +		/* disable ACPI if no FADT is found */
>  		disable_acpi();
> +		pr_err("Can't find FADT\n");
> +	}
>  }
It looks fine to call disable_acpi() here but a bit weird to call it again in acpi_parse_fadt(). I guess that's because acpi_table_parse() ignores the return value of the handler() call. I think it's better to fix the core code (can be an additional patch on top of this series).
On 2015-02-04 01:20, Catalin Marinas wrote:
On Mon, Feb 02, 2015 at 12:45:37PM +0000, Hanjun Guo wrote:
It looks fine to call disable_acpi() here but a bit weird to call it again in acpi_parse_fadt(). I guess that's because acpi_table_parse() ignores the return value of the handler() call. I think it's better to fix the core code (can be an additional patch on top of this series).
I checked all the code calling acpi_table_parse() and I found that there will be no functional change if we return the value of handler(), but I need Rafael's confirmation on it.
Thanks Hanjun
On Wed, Feb 04, 2015 at 09:38:25AM +0000, Hanjun Guo wrote:
On 2015-02-04 01:20, Catalin Marinas wrote:
On Mon, Feb 02, 2015 at 12:45:37PM +0000, Hanjun Guo wrote:
It looks fine to call disable_acpi() here but a bit weird to call it again in acpi_parse_fadt(). I guess that's because acpi_table_parse() ignores the return value of the handler() call. I think it's better to fix the core code (can be an additional patch on top of this series).
I checked all the code calling acpi_table_parse() and I found that there will be no functional change if we return the value of handler(), but I need Rafael's confirmation on it.
Are you sure? All calls to acpi_table_parse() that check the return value are affected. I guess that depends on what an error return from the handler means, from acpi_table_parse():
* Return 0 if table found, -errno if not.
So, if the table is found but parsing fails, then the acpi_table_parse() signature should be changed if the handler barfs with an error and that error is propagated. Still, I share Catalin's comment.
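The difference between the two behaviours under discussion can be sketched with a mock parser. Nothing below is the real ACPI core code: mock_parse_ignoring(), mock_parse_propagating() and the use of -ENODEV for a missing table are assumptions made purely for illustration.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*mock_handler_t)(const void *table);

/* Documented contract: 0 if the table is found, -errno if not. */
static int mock_parse_ignoring(const void *table, mock_handler_t handler)
{
	if (!table)
		return -ENODEV;
	handler(table);  /* the handler's result is dropped */
	return 0;
}

/* Alternative: the caller also sees a failure reported by the handler. */
static int mock_parse_propagating(const void *table, mock_handler_t handler)
{
	if (!table)
		return -ENODEV;
	return handler(table);
}

/* A handler that finds the table but rejects its contents. */
static int mock_failing_handler(const void *table)
{
	(void)table;
	return -EINVAL;
}
```

With the first variant a caller cannot tell "table found and accepted" from "table found but rejected", which is why the handler in the patch has to call disable_acpi() itself. With the second variant every caller that checks the return value starts seeing handler errors too.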
Have you thought about getting the FADT through:
acpi_get_table_with_size()
and check the revision there instead of going through acpi_table_parse() for that ?
I wonder if the revision information is not already available without needing to retrieve the FADT again.
On top of that, this patch should be squashed, I have a feeling that between patch 4 and 9, there is a window where ACPI versions predating 5.1 are ok on arm64, which is not the case. I do not think that's a bisectable issue, but keep this in mind please.
Thanks, Lorenzo
On 2015-02-04 21:06, Lorenzo Pieralisi wrote:
On Wed, Feb 04, 2015 at 09:38:25AM +0000, Hanjun Guo wrote:
On 2015-02-04 01:20, Catalin Marinas wrote:
On Mon, Feb 02, 2015 at 12:45:37PM +0000, Hanjun Guo wrote:
It looks fine to call disable_acpi() here but a bit weird to call it again in acpi_parse_fadt(). I guess that's because acpi_table_parse() ignores the return value of the handler() call. I think it's better to fix the core code (can be an additional patch on top of this series).
I checked all the code calling acpi_table_parse() and I found that there will be no functional change if we return the value of handler(), but I need Rafael's confirmation on it.
Are you sure ? All calls to acpi_table_parse() that checks the return value are affected. I guess that depends on what an error return from the handler means, from acpi_table_parse():
* Return 0 if table found, -errno if not.
Yes, you are right. What I meant by "no functional change" is that most handlers passed to acpi_table_parse() just return 0. I didn't describe it clearly, my bad.
In the ARM64 case, I find that we cannot disable ACPI just because the handler returns an error. For example, we return -EOPNOTSUPP when there is no PSCI support, but we can still go on and boot with cpu0 only.
So, if table is found but parsing fails that acpi_table_parse() signature should be changed if the handler barfs with an error and it is propagated. Still, I share Catalin's comment.
Sorry, I don't understand the last sentence, do you mean you agree with Catalin to return the result of handler()?
Thanks Hanjun
From: Graeme Gregory graeme.gregory@linaro.org
If the early boot methods of ACPI are happy that we have valid ACPI tables and acpi=force has been passed, then do not unflatten the device tree, effectively disabling further hardware probing from DT.
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
---
 arch/arm64/kernel/setup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 510a681..553967d 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -446,7 +446,8 @@ void __init setup_arch(char **cmdline_p)
 	efi_idmap_init();
 	early_ioremap_reset();
 
-	unflatten_device_tree();
+	if (acpi_disabled)
+		unflatten_device_tree();
 
 	psci_init();
From: Graeme Gregory graeme.gregory@linaro.org
There are two flags: PSCI_COMPLIANT and PSCI_USE_HVC. When set, the former signals to the OS that the firmware is PSCI compliant. The latter selects the appropriate conduit for PSCI calls by toggling between Hypervisor Calls (HVC) and Secure Monitor Calls (SMC).
The FADT table contains this information in ACPI 5.1. The FADT is parsed during ACPI table init and copied to struct acpi_gbl_FADT, so use the flags in struct acpi_gbl_FADT for PSCI init.
ACPI 5.1 doesn't support self-defined PSCI function IDs, which means that only PSCI 0.2+ is supported in ACPI.
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
---
 arch/arm64/include/asm/acpi.h | 14 ++++++++
 arch/arm64/include/asm/psci.h |  3 +-
 arch/arm64/kernel/psci.c      | 78 ++++++++++++++++++++++++++++++-------------
 arch/arm64/kernel/setup.c     |  8 +++--
 4 files changed, 75 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 9fcf632..1aea87c 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -19,6 +19,18 @@ extern int acpi_disabled;
 extern int acpi_noirq;
 extern int acpi_pci_disabled;
 
+/* 1 to indicate PSCI 0.2+ is implemented */
+static inline bool acpi_psci_present(void)
+{
+	return acpi_gbl_FADT.arm_boot_flags & ACPI_FADT_PSCI_COMPLIANT;
+}
+
+/* 1 to indicate HVC must be used instead of SMC as the PSCI conduit */
+static inline bool acpi_psci_use_hvc(void)
+{
+	return acpi_gbl_FADT.arm_boot_flags & ACPI_FADT_PSCI_USE_HVC;
+}
+
 static inline void disable_acpi(void)
 {
 	acpi_disabled = 1;
@@ -50,6 +62,8 @@ static inline void arch_fix_phys_package_id(int num, u32 slot) { }
 #else
 static inline void disable_acpi(void) { }
 static inline void enable_acpi(void) { }
+static inline bool acpi_psci_present(void) { return false; }
+static inline bool acpi_psci_use_hvc(void) { return false; }
 #endif /* CONFIG_ACPI */
 
 #endif /*_ASM_ACPI_H*/
diff --git a/arch/arm64/include/asm/psci.h b/arch/arm64/include/asm/psci.h
index e5312ea..2454bc5 100644
--- a/arch/arm64/include/asm/psci.h
+++ b/arch/arm64/include/asm/psci.h
@@ -14,6 +14,7 @@
 #ifndef __ASM_PSCI_H
 #define __ASM_PSCI_H
 
-int psci_init(void);
+int psci_dt_init(void);
+int psci_acpi_init(void);
 
 #endif /* __ASM_PSCI_H */
diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index f1dbca7..0ec0dc5 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -15,6 +15,7 @@
 
 #define pr_fmt(fmt) "psci: " fmt
 
+#include <linux/acpi.h>
 #include <linux/init.h>
 #include <linux/of.h>
 #include <linux/smp.h>
@@ -24,6 +25,7 @@
 #include <linux/slab.h>
 #include <uapi/linux/psci.h>
 
+#include <asm/acpi.h>
 #include <asm/compiler.h>
 #include <asm/cpu_ops.h>
 #include <asm/errno.h>
@@ -304,6 +306,33 @@ static void psci_sys_poweroff(void)
 	invoke_psci_fn(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
 }
 
+static void __init psci_0_2_set_functions(void)
+{
+	pr_info("Using standard PSCI v0.2 function IDs\n");
+	psci_function_id[PSCI_FN_CPU_SUSPEND] = PSCI_0_2_FN64_CPU_SUSPEND;
+	psci_ops.cpu_suspend = psci_cpu_suspend;
+
+	psci_function_id[PSCI_FN_CPU_OFF] = PSCI_0_2_FN_CPU_OFF;
+	psci_ops.cpu_off = psci_cpu_off;
+
+	psci_function_id[PSCI_FN_CPU_ON] = PSCI_0_2_FN64_CPU_ON;
+	psci_ops.cpu_on = psci_cpu_on;
+
+	psci_function_id[PSCI_FN_MIGRATE] = PSCI_0_2_FN64_MIGRATE;
+	psci_ops.migrate = psci_migrate;
+
+	psci_function_id[PSCI_FN_AFFINITY_INFO] = PSCI_0_2_FN64_AFFINITY_INFO;
+	psci_ops.affinity_info = psci_affinity_info;
+
+	psci_function_id[PSCI_FN_MIGRATE_INFO_TYPE] =
+		PSCI_0_2_FN_MIGRATE_INFO_TYPE;
+	psci_ops.migrate_info_type = psci_migrate_info_type;
+
+	arm_pm_restart = psci_sys_reset;
+
+	pm_power_off = psci_sys_poweroff;
+}
+
 /*
  * PSCI Function IDs for v0.2+ are well defined so use
  * standard values.
@@ -337,29 +366,7 @@ static int __init psci_0_2_init(struct device_node *np)
 		}
 	}
 
-	pr_info("Using standard PSCI v0.2 function IDs\n");
-	psci_function_id[PSCI_FN_CPU_SUSPEND] = PSCI_0_2_FN64_CPU_SUSPEND;
-	psci_ops.cpu_suspend = psci_cpu_suspend;
-
-	psci_function_id[PSCI_FN_CPU_OFF] = PSCI_0_2_FN_CPU_OFF;
-	psci_ops.cpu_off = psci_cpu_off;
-
-	psci_function_id[PSCI_FN_CPU_ON] = PSCI_0_2_FN64_CPU_ON;
-	psci_ops.cpu_on = psci_cpu_on;
-
-	psci_function_id[PSCI_FN_MIGRATE] = PSCI_0_2_FN64_MIGRATE;
-	psci_ops.migrate = psci_migrate;
-
-	psci_function_id[PSCI_FN_AFFINITY_INFO] = PSCI_0_2_FN64_AFFINITY_INFO;
-	psci_ops.affinity_info = psci_affinity_info;
-
-	psci_function_id[PSCI_FN_MIGRATE_INFO_TYPE] =
-		PSCI_0_2_FN_MIGRATE_INFO_TYPE;
-	psci_ops.migrate_info_type = psci_migrate_info_type;
-
-	arm_pm_restart = psci_sys_reset;
-
-	pm_power_off = psci_sys_poweroff;
+	psci_0_2_set_functions();
 
 out_put_node:
 	of_node_put(np);
@@ -412,7 +419,7 @@ static const struct of_device_id psci_of_match[] __initconst = {
 	{},
 };
 
-int __init psci_init(void)
+int __init psci_dt_init(void)
 {
 	struct device_node *np;
 	const struct of_device_id *matched_np;
@@ -427,6 +434,29 @@ int __init psci_init(void)
 	return init_fn(np);
 }
 
+/*
+ * We use PSCI 0.2+ when ACPI is deployed on ARM64 and it's
+ * explicitly clarified in SBBR
+ */
+int __init psci_acpi_init(void)
+{
+	if (!acpi_psci_present()) {
+		pr_info("is not implemented in ACPI.\n");
+		return -EOPNOTSUPP;
+	}
+
+	pr_info("probing for conduit method from ACPI.\n");
+
+	if (acpi_psci_use_hvc())
+		invoke_psci_fn = __invoke_psci_fn_hvc;
+	else
+		invoke_psci_fn = __invoke_psci_fn_smc;
+
+	psci_0_2_set_functions();
+
+	return 0;
+}
+
 #ifdef CONFIG_SMP
 
 static int __init cpu_psci_cpu_init(struct device_node *dn, unsigned int cpu)
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 553967d..43ae914 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -446,10 +446,12 @@ void __init setup_arch(char **cmdline_p)
 	efi_idmap_init();
 	early_ioremap_reset();
 
-	if (acpi_disabled)
+	if (acpi_disabled) {
 		unflatten_device_tree();
-
-	psci_init();
+		psci_dt_init();
+	} else {
+		psci_acpi_init();
+	}
 
 	cpu_read_bootcpu_ops();
 #ifdef CONFIG_SMP
On Mon, Feb 02, 2015 at 12:45:39PM +0000, Hanjun Guo wrote:
From: Graeme Gregory <graeme.gregory@linaro.org>
There are two flags: PSCI_COMPLIANT and PSCI_USE_HVC. When set, the former signals to the OS that the firmware is PSCI compliant. The latter selects the appropriate conduit for PSCI calls by toggling between Hypervisor Calls (HVC) and Secure Monitor Calls (SMC).
The FADT contains this information as of ACPI 5.1. The FADT is parsed during ACPI table init and copied to struct acpi_gbl_FADT, so use the flags in struct acpi_gbl_FADT for PSCI init.
So you do rely on a global FADT being available, if you use it for PSCI detection you can use it for ACPI revision detection too, right ?
Point is, either we should not use the global FADT table, or we use it consistently, or there is something I am unaware of that prevents you from using in some code paths and I would like to understand why.
Thanks, Lorenzo
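For readers following the thread, the flag handling described in the commit message can be sketched in isolation. This is a standalone illustration, not the patch's code: the macro names, the enum and psci_conduit_from_flags() are hypothetical, and the bit positions (bit 0 for PSCI_COMPLIANT, bit 1 for PSCI_USE_HVC) are my reading of the ARM boot flags in the ACPI 5.1 FADT.

```c
#include <stdint.h>

/* Assumed bit positions of the ARM boot flags in the ACPI 5.1 FADT. */
#define FADT_PSCI_COMPLIANT	(1u << 0)
#define FADT_PSCI_USE_HVC	(1u << 1)

/* Hypothetical result type, for illustration only. */
enum psci_conduit {
	PSCI_CONDUIT_NONE,	/* firmware does not advertise PSCI */
	PSCI_CONDUIT_SMC,	/* Secure Monitor Call */
	PSCI_CONDUIT_HVC,	/* Hypervisor Call */
};

/*
 * Pick the conduit from the boot flags: USE_HVC only matters
 * once PSCI_COMPLIANT is set.
 */
static enum psci_conduit psci_conduit_from_flags(uint16_t arm_boot_flags)
{
	if (!(arm_boot_flags & FADT_PSCI_COMPLIANT))
		return PSCI_CONDUIT_NONE;

	return (arm_boot_flags & FADT_PSCI_USE_HVC) ?
		PSCI_CONDUIT_HVC : PSCI_CONDUIT_SMC;
}
```

In the patch itself the same decision is split between acpi_psci_present() and acpi_psci_use_hvc(), both of which read acpi_gbl_FADT.arm_boot_flags.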
On 2015-02-05 00:43, Lorenzo Pieralisi wrote:
So you do rely on a global FADT being available, if you use it for PSCI detection you can use it for ACPI revision detection too, right ?
Yes, I think so.
Point is, either we should not use the global FADT table, or we use it consistently, or there is something I am unaware of that prevents you from using in some code paths and I would like to understand why.
The global FADT table is initialized when the tables are parsed from the RSDP in the ACPICA core, and it should work on ARM64 too.
Thanks Hanjun
On 02/04/2015 09:43 AM, Lorenzo Pieralisi wrote:
So you do rely on a global FADT being available, if you use it for PSCI detection you can use it for ACPI revision detection too, right ?
Point is, either we should not use the global FADT table, or we use it consistently, or there is something I am unaware of that prevents you from using in some code paths and I would like to understand why.
The FADT is a required table for arm64, as noted in the documentation and the SBBR. While unfortunately the spec does not say it is mandatory, even x86 systems are pretty useless without it. So yes, we rely on it being available, not only for the PSCI info, but other flags such as HW_REDUCED_ACPI.
I suppose it does not have to be globally scoped. However, the FADT is frequently used, especially on x86, so it makes sense to me from an efficiency standpoint to have a global reference to it.
I'm not sure I understand what is meant by using FADT for ACPI revision detection; there are fields in the FADT that provide a major and minor number for the FADT itself, but I don't believe there's any guarantee those will be the same as the level of the specification that is being supported by the kernel (chances are they will, but it's not mandatory).
I've probably just missed a part of a thread somewhere; could you point me to where the inconsistency lies? I'm just not understanding right this second....
Hi Al,
On Thu, Feb 05, 2015 at 05:11:31PM +0000, Al Stone wrote:
I'm not sure I understand what is meant by using FADT for ACPI revision detection; there are fields in the FADT that provide a major and minor number for the FADT itself, but I don't believe there's any guarantee those will be the same as the level of the specification that is being supported by the kernel (chances are they will, but it's not mandatory).
I've probably just missed a part of a thread somewhere; could you point me to where the inconsistency lies? I'm just not understanding right this second....
Yes, it is my fault: I was referring to another thread/patch (9), where you need to check the FADT revision to "validate it" (i.e. >= 5.1) for the arm64 kernel. What I am saying is: if the global FADT is there to parse PSCI info, it is there to check the FADT revision too. I do not necessarily see the need for calling acpi_table_parse() again to do it; the FADT revision check can be carried out the same way as for PSCI. That's all I wanted to say.
Thanks, Lorenzo
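The revision check being discussed (FADT >= 5.1) is small enough to show on its own. This is a hedged sketch with a hypothetical helper name. It takes the major and minor revision as plain integers instead of reading the real FADT structures, and mirrors the comparison used in acpi_parse_fadt() in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when a FADT with the given major/minor revision is 5.1 or later.
 * Any major revision above 5 passes regardless of the minor number.
 */
static bool fadt_is_5_1_or_later(uint8_t major, uint8_t minor)
{
	return major > 5 || (major == 5 && minor >= 1);
}
```

Whether the two values come from the global FADT copy or from the table handed to an acpi_table_parse() handler does not change the comparison; only where the data is fetched from differs.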
On 02/05/2015 10:49 AM, Lorenzo Pieralisi wrote:
Hi Al,
Howdy, Lorenzo.
Aha. I understand now. Another colleague was also trying to explain this to me and I think I just hadn't had enough coffee yet. The underlying ACPI code maps tables into the kernel in two phases; it may be that when the code in patch 9 is run, the global table is not yet available, while here it is; I don't recall off-hand.
I'll take a look at this and talk it over with Hanjun. If the global table is available, it would indeed make sense to be consistent.
Thanks for the explanation; that really helped me.
Hi Lorenzo, Al,
On 2015-02-06 03:03, Al Stone wrote:
I'll take a look at this and talk it over with Hanjun. If the global table is available, it would indeed make sense to be consistent.
I dug into the code and found that struct acpi_gbl_FADT is available with correct values only if the firmware presents a FADT.
In this patch set, acpi_table_init() is called before the FADT is parsed for the PSCI flags.
The call chain is:
acpi_table_init()
    acpi_initialize_tables()
        acpi_tb_parse_root_table()
In acpi_tb_parse_root_table():
    if (ACPI_SUCCESS(status) &&
        ACPI_COMPARE_NAME(&acpi_gbl_root_table_list.
                          tables[table_index].signature,
                          ACPI_SIG_FADT)) {
            acpi_tb_parse_fadt(table_index);
    }
acpi_tb_parse_fadt(table_index) then copies the FADT to the global struct acpi_gbl_FADT.
So it seems that we can use the global struct acpi_gbl_FADT directly to check the FADT revision, but it is only valid when the firmware presents a FADT, so a check for the FADT is still needed to cope with bad firmware that lacks one.
Why can the PSCI flags be used without checking that the FADT is available? Because we already disable ACPI if acpi_table_parse(ACPI_SIG_FADT, acpi_parse_fadt) fails (no FADT found), so the PSCI flags will not be used afterwards.
So I think we can keep the code as it is for now; I think it is the safest way to do it. Does that make sense?
Thanks Hanjun
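The point made above, that the global FADT copy is only meaningful when the root-table scan actually finds a FADT, can be modelled in a few lines. This is a simplified sketch, not ACPICA code: scan_root_table() and global_fadt_valid are hypothetical names, and the table list is reduced to an array of signature strings ("FACP" being the FADT signature).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the global FADT copy; only tracks whether it was filled in. */
static bool global_fadt_valid;

/*
 * Walk a list of table signatures, as a root-table scan would, and
 * populate the "global FADT" only when an entry carries the FADT
 * signature. Returns true if a FADT was seen.
 */
static bool scan_root_table(const char *const signatures[], size_t count)
{
	size_t i;

	global_fadt_valid = false;
	for (i = 0; i < count; i++) {
		if (strncmp(signatures[i], "FACP", 4) == 0)
			global_fadt_valid = true;	/* models the FADT copy */
	}

	return global_fadt_valid;
}
```

With firmware that omits the FADT, the scan completes without error but the global copy stays unset, which is why an explicit presence check is still needed before trusting any of its fields.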
On Fri, Feb 06, 2015 at 07:56:07AM +0000, Hanjun Guo wrote:
So I think we can keep the code as it is for now, and I think it is the safest way to do it, does it make sense?
Understood. Basically, given the current ACPI code, you have to call acpi_table_parse() to make sure the FADT is there, even if the handler passed to it is an empty function, and while at it you check the revision within that handler. It makes sense, but we end up having disable_acpi() scattered all over the place.
You can leave your code as it is, or we can check with Rafael whether acpi_table_parse() can be made to propagate the handler's return value. My fear is that acpi_table_parse() is not expected to return failure if the handler fails, only if the table is not found. I will have a look into this.
Thanks, Lorenzo
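To make the two behaviours in question concrete, here is a small standalone model. It is not the kernel's acpi_table_parse(): struct fake_fadt, the handler type and both parse_table_*() functions are hypothetical and exist only to contrast "fail only when the table is missing" with "also report a handler failure".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down FADT carrying only the revision fields. */
struct fake_fadt {
	unsigned char revision;		/* major revision */
	unsigned char minor_revision;
};

typedef int (*table_handler)(const struct fake_fadt *table);

/* Example handler: accept only FADT revision 5.1 or later. */
static int check_fadt_revision(const struct fake_fadt *fadt)
{
	if (fadt->revision > 5 ||
	    (fadt->revision == 5 && fadt->minor_revision >= 1))
		return 0;

	return -EINVAL;
}

/* Behaviour described above: fails only when the table is missing. */
static int parse_table_found_only(const struct fake_fadt *table,
				  table_handler handler)
{
	if (!table)
		return -ENODEV;

	handler(table);		/* handler result is dropped */
	return 0;
}

/* Possible alternative: a handler failure is reported as well. */
static int parse_table_propagate(const struct fake_fadt *table,
				 table_handler handler)
{
	if (!table)
		return -ENODEV;

	return handler(table);
}
```

With the first variant the handler has to record its verdict somewhere else, for example by calling disable_acpi() itself; with the second, the caller could make that decision in one place.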
When MADT is parsed, print GIC information to make the boot log look pretty:
ACPI: GICC (acpi_id[0x0000] address[00000000e112f000] MPIDR[0x0] enabled)
ACPI: GICC (acpi_id[0x0001] address[00000000e112f000] MPIDR[0x1] enabled)
...
ACPI: GICC (acpi_id[0x0201] address[00000000e112f000] MPIDR[0x201] enabled)
This information is very helpful when bringing up early systems, to see whether acpi_id and MPIDR match as the spec defines.
CC: Rafael J. Wysocki <rjw@rjwysocki.net>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
---
 drivers/acpi/tables.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 93b8152..42d314f 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -183,6 +183,49 @@ void acpi_table_print_madt_entry(struct acpi_subtable_header *header)
 		}
 		break;

+	case ACPI_MADT_TYPE_GENERIC_INTERRUPT:
+		{
+			struct acpi_madt_generic_interrupt *p =
+				(struct acpi_madt_generic_interrupt *)header;
+			pr_info("GICC (acpi_id[0x%04x] address[%p] MPIDR[0x%llx] %s)\n",
+				p->uid, (void *)(unsigned long)p->base_address,
+				p->arm_mpidr,
+				(p->flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");
+
+		}
+		break;
+
+	case ACPI_MADT_TYPE_GENERIC_DISTRIBUTOR:
+		{
+			struct acpi_madt_generic_distributor *p =
+				(struct acpi_madt_generic_distributor *)header;
+			pr_info("GIC Distributor (gic_id[0x%04x] address[%p] gsi_base[%d])\n",
+				p->gic_id,
+				(void *)(unsigned long)p->base_address,
+				p->global_irq_base);
+		}
+		break;
+
+	case ACPI_MADT_TYPE_GENERIC_MSI_FRAME:
+		{
+			struct acpi_madt_generic_msi_frame *p =
+				(struct acpi_madt_generic_msi_frame *)header;
+			pr_info("GIC MSI Frame (msi_fame_id[%d] address[%p])\n",
+				p->msi_frame_id,
+				(void *)(unsigned long)p->base_address);
+		}
+		break;
+
+	case ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR:
+		{
+			struct acpi_madt_generic_redistributor *p =
+				(struct acpi_madt_generic_redistributor *)header;
+			pr_info("GIC Redistributor (address[%p] region_size[0x%x])\n",
+				(void *)(unsigned long)p->base_address,
+				p->length);
+		}
+		break;
+
 	default:
 		pr_warn("Found unsupported MADT entry (type = 0x%x)\n",
 			header->type);
The MADT contains the MPIDR information that is essential for SMP initialization. Parse the GIC CPU interface structures to get the MPIDR value and map it to cpu_logical_map(), and add each enabled CPU with a valid MPIDR to cpu_possible_map.
ACPI 5.1 has only two explicit methods to boot secondary CPUs, PSCI and the Parking protocol. The Parking protocol is currently specified only for ARMv7, so make PSCI the only SMP boot protocol until the ACPI spec or the Parking protocol spec is updated.
Parking protocol patches for SMP boot will be sent upstream when the new version of the Parking protocol is ready.
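The mapping rules described above (boot CPU first, skip disabled or invalid entries, reject duplicate MPIDRs) can be sketched outside the kernel. This is an illustrative model only: struct cpu_map, MAX_CPUS, INVALID_MPIDR and both functions are hypothetical stand-ins for cpu_logical_map(), NR_CPUS, INVALID_HWID and the patch's acpi_map_gic_cpu_interface(), and the PSCI and cpu_ops handling is left out.

```c
#include <stdint.h>

#define MAX_CPUS	8		/* hypothetical stand-in for NR_CPUS */
#define INVALID_MPIDR	UINT64_MAX	/* hypothetical "no CPU here" marker */

struct cpu_map {
	uint64_t mpidr[MAX_CPUS];	/* logical id -> MPIDR */
	int count;			/* logical ids handed out so far */
};

/* Logical CPU 0 is reserved for the boot CPU, whose MPIDR is already known. */
static void cpu_map_init(struct cpu_map *map, uint64_t boot_mpidr)
{
	int i;

	for (i = 0; i < MAX_CPUS; i++)
		map->mpidr[i] = INVALID_MPIDR;
	map->mpidr[0] = boot_mpidr;
	map->count = 0;
}

/* Returns the logical id assigned to this entry, or -1 if it is skipped. */
static int cpu_map_add(struct cpu_map *map, uint64_t mpidr, int enabled)
{
	int i;

	if (mpidr == INVALID_MPIDR || !enabled || map->count >= MAX_CPUS)
		return -1;

	if (map->count == 0) {
		/* The first enabled entry must describe the boot CPU. */
		if (map->mpidr[0] != mpidr)
			return -1;
		map->count = 1;
		return 0;
	}

	/* Duplicate MPIDRs indicate a firmware bug; ignore the entry. */
	for (i = 0; i < map->count; i++)
		if (map->mpidr[i] == mpidr)
			return -1;

	map->mpidr[map->count] = mpidr;
	return map->count++;
}
```

A caller would invoke cpu_map_add() once per GICC entry while walking the MADT, passing the MPIDR masked to its affinity bits and the entry's enabled flag.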
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Mark Rutland <mark.rutland@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
---
 arch/arm64/include/asm/acpi.h    |   2 +
 arch/arm64/include/asm/cpu_ops.h |   1 +
 arch/arm64/include/asm/smp.h     |   5 +-
 arch/arm64/kernel/acpi.c         | 150 ++++++++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/cpu_ops.c      |   2 +-
 arch/arm64/kernel/setup.c        |   7 +-
 arch/arm64/kernel/smp.c          |   2 +-
 7 files changed, 161 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h index 1aea87c..8984aa5 100644 --- a/arch/arm64/include/asm/acpi.h +++ b/arch/arm64/include/asm/acpi.h @@ -58,12 +58,14 @@ static inline bool acpi_has_cpu_in_madt(void) }
static inline void arch_fix_phys_package_id(int num, u32 slot) { } +void __init acpi_init_cpus(void);
#else static inline void disable_acpi(void) { } static inline void enable_acpi(void) { } static inline bool acpi_psci_present(void) { return false; } static inline bool acpi_psci_use_hvc(void) { return false; } +static inline void acpi_init_cpus(void) { } #endif /* CONFIG_ACPI */
#endif /*_ASM_ACPI_H*/ diff --git a/arch/arm64/include/asm/cpu_ops.h b/arch/arm64/include/asm/cpu_ops.h index 6f8e2ef..5615970 100644 --- a/arch/arm64/include/asm/cpu_ops.h +++ b/arch/arm64/include/asm/cpu_ops.h @@ -66,5 +66,6 @@ struct cpu_operations { extern const struct cpu_operations *cpu_ops[NR_CPUS]; int __init cpu_read_ops(struct device_node *dn, int cpu); void __init cpu_read_bootcpu_ops(void); +const struct cpu_operations *cpu_get_ops(const char *name);
#endif /* ifndef __ASM_CPU_OPS_H */ diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h index 780f82c..bf22650 100644 --- a/arch/arm64/include/asm/smp.h +++ b/arch/arm64/include/asm/smp.h @@ -39,9 +39,10 @@ extern void show_ipi_list(struct seq_file *p, int prec); extern void handle_IPI(int ipinr, struct pt_regs *regs);
/* - * Setup the set of possible CPUs (via set_cpu_possible) + * Discover the set of possible CPUs and determine their + * SMP operations. */ -extern void smp_init_cpus(void); +extern void of_smp_init_cpus(void);
/* * Provide a function to raise an IPI cross call on CPUs in callmap. diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c index b9f64ec..f80caef 100644 --- a/arch/arm64/kernel/acpi.c +++ b/arch/arm64/kernel/acpi.c @@ -24,6 +24,10 @@ #include <linux/memblock.h> #include <linux/smp.h>
+#include <asm/cputype.h>
+#include <asm/cpu_ops.h>
+#include <asm/smp_plat.h>
+
 int acpi_noirq;		/* skip ACPI IRQ initialization */
 int acpi_disabled;
 EXPORT_SYMBOL(acpi_disabled);
@@ -31,6 +35,8 @@ EXPORT_SYMBOL(acpi_disabled);
 int acpi_pci_disabled;	/* skip ACPI PCI scan and IRQ initialization */
 EXPORT_SYMBOL(acpi_pci_disabled);

+static int enabled_cpus;	/* Processors (GICC) with enabled flag in MADT */
+
 /*
  * __acpi_map_table() will be called before page_init(), so early_ioremap()
  * or early_memremap() should be called here to for ACPI table mapping.
@@ -51,6 +57,134 @@ void __init __acpi_unmap_table(char *map, unsigned long size)
 	early_memunmap(map, size);
 }

+/**
+ * acpi_map_gic_cpu_interface - generates a logical cpu number
+ * and map to MPIDR represented by GICC structure
+ * @mpidr: CPU's hardware id to register, MPIDR represented in MADT
+ * @enabled: this cpu is enabled or not
+ *
+ * Returns the logical cpu number which maps to MPIDR
+ */
+static int __init acpi_map_gic_cpu_interface(u64 mpidr, u8 enabled)
+{
+	int cpu;
+
+	if (mpidr == INVALID_HWID) {
+		pr_info("Skip MADT cpu entry with invalid MPIDR\n");
+		return -EINVAL;
+	}
+
+	total_cpus++;
+	if (!enabled)
+		return -EINVAL;
+
+	if (enabled_cpus >= NR_CPUS) {
+		pr_warn("NR_CPUS limit of %d reached, Processor %d/0x%llx ignored.\n",
+			NR_CPUS, total_cpus, mpidr);
+		return -EINVAL;
+	}
+
+	/* No need to check duplicate MPIDRs for the first CPU */
+	if (enabled_cpus) {
+		/*
+		 * Duplicate MPIDRs are a recipe for disaster. Scan
+		 * all initialized entries and check for
+		 * duplicates. If any is found just ignore the CPU.
+		 */
+		for_each_possible_cpu(cpu) {
+			if (cpu_logical_map(cpu) == mpidr) {
+				pr_err("Firmware bug, duplicate CPU MPIDR: 0x%llx in MADT\n",
+				       mpidr);
+				return -EINVAL;
+			}
+		}
+
+		/* allocate a logical cpu id for the new comer */
+		cpu = cpumask_next_zero(-1, cpu_possible_mask);
+	} else {
+		/*
+		 * First GICC entry must be BSP as ACPI spec said
+		 * in section 5.2.12.15
+		 */
+		if (cpu_logical_map(0) != mpidr) {
+			pr_err("First GICC entry with MPIDR 0x%llx is not BSP\n",
+			       mpidr);
+			return -EINVAL;
+		}
+
+		/*
+		 * boot_cpu_init() already hold bit 0 in cpu_possible_mask
+		 * for BSP, no need to allocate again.
+		 */
+		cpu = 0;
+	}
+
+	if (!acpi_psci_present())
+		return -EOPNOTSUPP;
+
+	cpu_ops[cpu] = cpu_get_ops("psci");
+	/* CPU 0 was already initialized */
+	if (cpu) {
+		if (!cpu_ops[cpu])
+			return -EINVAL;
+
+		if (cpu_ops[cpu]->cpu_init(NULL, cpu))
+			return -EOPNOTSUPP;
+
+		/* map the logical cpu id to cpu MPIDR */
+		cpu_logical_map(cpu) = mpidr;
+
+		set_cpu_possible(cpu, true);
+	}
+
+	enabled_cpus++;
+	return cpu;
+}
+
+static int __init
+acpi_parse_gic_cpu_interface(struct acpi_subtable_header *header,
+				const unsigned long end)
+{
+	struct acpi_madt_generic_interrupt *processor;
+
+	processor = (struct acpi_madt_generic_interrupt *)header;
+
+	if (BAD_MADT_ENTRY(processor, end))
+		return -EINVAL;
+
+	acpi_table_print_madt_entry(header);
+
+	acpi_map_gic_cpu_interface(processor->arm_mpidr & MPIDR_HWID_BITMASK,
+		processor->flags & ACPI_MADT_ENABLED);
+
+	return 0;
+}
+
+/* Parse GIC cpu interface entries in MADT for SMP init */
+void __init acpi_init_cpus(void)
+{
+	int count;
+
+	/*
+	 * do a partial walk of MADT to determine how many CPUs
+	 * we have including disabled CPUs, and get information
+	 * we need for SMP init
+	 */
+	count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
+			acpi_parse_gic_cpu_interface, 0);
+
+	if (!count) {
+		pr_err("No GIC CPU interface entries present\n");
+		return;
+	} else if (count < 0) {
+		pr_err("Error parsing GIC CPU interface entry\n");
+		return;
+	}
+
+	/* Make boot-up look pretty */
+	pr_info("%d CPUs enabled, %d CPUs total\n", enabled_cpus, total_cpus);
+}
+
 static int __init acpi_parse_fadt(struct acpi_table_header *table)
 {
 	struct acpi_table_fadt *fadt = (struct acpi_table_fadt *)table;
@@ -62,8 +196,20 @@ static int __init acpi_parse_fadt(struct acpi_table_header *table)
 	 * boot protocol configuration data, or we will disable ACPI.
 	 */
 	if (table->revision > 5 ||
-	    (table->revision == 5 && fadt->minor_revision >= 1))
-		return 0;
+	    (table->revision == 5 && fadt->minor_revision >= 1)) {
+		/*
+		 * ACPI 5.1 only has two explicit methods to boot up SMP,
+		 * PSCI and Parking protocol, but the Parking protocol is
+		 * only specified for ARMv7 now, so make PSCI as the only
+		 * way for the SMP boot protocol before some updates for
+		 * the Parking protocol spec.
+		 */
+		if (acpi_psci_present())
+			return 0;
+
+		pr_warn("No PSCI support, will not bring up secondary CPUs\n");
+		return -EOPNOTSUPP;
+	}
 	pr_warn("Unsupported FADT revision %d.%d, should be 5.1+, will disable ACPI\n",
 		table->revision, fadt->minor_revision);
diff --git a/arch/arm64/kernel/cpu_ops.c b/arch/arm64/kernel/cpu_ops.c
index cce9524..fb8ff9b 100644
--- a/arch/arm64/kernel/cpu_ops.c
+++ b/arch/arm64/kernel/cpu_ops.c
@@ -35,7 +35,7 @@ static const struct cpu_operations *supported_cpu_ops[] __initconst = {
 	NULL,
 };

-static const struct cpu_operations * __init cpu_get_ops(const char *name)
+const struct cpu_operations * __init cpu_get_ops(const char *name)
 {
 	const struct cpu_operations **ops = supported_cpu_ops;

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 43ae914..1099ddc 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -449,13 +449,16 @@ void __init setup_arch(char **cmdline_p)
 	if (acpi_disabled) {
 		unflatten_device_tree();
 		psci_dt_init();
+		cpu_read_bootcpu_ops();
+#ifdef CONFIG_SMP
+		of_smp_init_cpus();
+#endif
 	} else {
 		psci_acpi_init();
+		acpi_init_cpus();
 	}

-	cpu_read_bootcpu_ops();
 #ifdef CONFIG_SMP
-	smp_init_cpus();
 	smp_build_mpidr_hash();
 #endif

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 7ae6ee0..5aaf5a4 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -323,7 +323,7 @@ void __init smp_prepare_boot_cpu(void)
  * cpu logical map array containing MPIDR values related to logical
  * cpus. Assumes that cpu_logical_map(0) has already been initialized.
  */
-void __init smp_init_cpus(void)
+void __init of_smp_init_cpus(void)
 {
 	struct device_node *dn = NULL;
 	unsigned int i, cpu = 1;
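The decision made in the acpi_parse_fadt() hunk of this patch can be read as a small standalone predicate. The following is only a sketch under stated assumptions: struct fadt_version, enum boot_check and check_boot_protocol() are hypothetical names that do not exist in the kernel, and the psci_present flag stands in for acpi_psci_present().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two FADT fields the patch inspects. */
struct fadt_version {
	unsigned char revision;
	unsigned char minor_revision;
};

/* Hypothetical result codes; the patch itself returns 0 or -EOPNOTSUPP. */
enum boot_check {
	BOOT_OK = 0,		/* ACPI 5.1+ and PSCI advertised */
	BOOT_ACPI_TOO_OLD,	/* FADT older than 5.1, ACPI gets disabled */
	BOOT_NO_PSCI,		/* 5.1+ but no PSCI, no secondary CPUs */
};

/* True when the table claims ACPI 5.1 or later. */
static bool fadt_is_5_1_or_later(const struct fadt_version *v)
{
	return v->revision > 5 ||
	       (v->revision == 5 && v->minor_revision >= 1);
}

/* Revision is checked first; PSCI only matters on a new enough table. */
static enum boot_check check_boot_protocol(const struct fadt_version *v,
					   bool psci_present)
{
	if (!fadt_is_5_1_or_later(v))
		return BOOT_ACPI_TOO_OLD;

	return psci_present ? BOOT_OK : BOOT_NO_PSCI;
}
```

With this shape, a 5.0 table is rejected before PSCI is even considered, which matches the order of the checks in the hunk.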
On Mon, Feb 02, 2015 at 12:45:41PM +0000, Hanjun Guo wrote:
MADT contains the information for MPIDR which is essential for SMP initialization, parse the GIC cpu interface structures to get the MPIDR value and map it to cpu_logical_map(), and add enabled cpu with valid MPIDR into cpu_possible_map.
ACPI 5.1 only has two explicit methods to boot up SMP, PSCI and Parking protocol, but the Parking protocol is only specified for ARMv7 now, so make PSCI as the only way for the SMP boot protocol before some updates for the ACPI spec or the Parking protocol spec.
Parking protocol patches for SMP boot will be sent to upstream when the new version of Parking protocol is ready.
CC: Lorenzo Pieralisi lorenzo.pieralisi@arm.com CC: Catalin Marinas catalin.marinas@arm.com CC: Will Deacon will.deacon@arm.com CC: Mark Rutland mark.rutland@arm.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Tomasz Nowicki tomasz.nowicki@linaro.org
arch/arm64/include/asm/acpi.h | 2 + arch/arm64/include/asm/cpu_ops.h | 1 + arch/arm64/include/asm/smp.h | 5 +- arch/arm64/kernel/acpi.c | 150 ++++++++++++++++++++++++++++++++++++++- arch/arm64/kernel/cpu_ops.c | 2 +- arch/arm64/kernel/setup.c | 7 +- arch/arm64/kernel/smp.c | 2 +- 7 files changed, 161 insertions(+), 8 deletions(-)
[...]
+/**
+ * acpi_map_gic_cpu_interface - generates a logical cpu number
+ * and map to MPIDR represented by GICC structure
+ * @mpidr: CPU's hardware id to register, MPIDR represented in MADT
+ * @enabled: this cpu is enabled or not
+ *
+ * Returns the logical cpu number which maps to MPIDR
+ */
+static int __init acpi_map_gic_cpu_interface(u64 mpidr, u8 enabled)
+{
int cpu;
if (mpidr == INVALID_HWID) {
pr_info("Skip MADT cpu entry with invalid MPIDR\n");
return -EINVAL;
}
total_cpus++;
if (!enabled)
return -EINVAL;
if (enabled_cpus >= NR_CPUS) {
pr_warn("NR_CPUS limit of %d reached, Processor %d/0x%llx ignored.\n",
NR_CPUS, total_cpus, mpidr);
return -EINVAL;
}
/* No need to check duplicate MPIDRs for the first CPU */
if (enabled_cpus) {
/*
* Duplicate MPIDRs are a recipe for disaster. Scan
* all initialized entries and check for
* duplicates. If any is found just ignore the CPU.
*/
for_each_possible_cpu(cpu) {
if (cpu_logical_map(cpu) == mpidr) {
pr_err("Firmware bug, duplicate CPU MPIDR: 0x%llx in MADT\n",
mpidr);
return -EINVAL;
}
}
/* allocate a logical cpu id for the new comer */
cpu = cpumask_next_zero(-1, cpu_possible_mask);
} else {
/*
* First GICC entry must be BSP as ACPI spec said
* in section 5.2.12.15
*/
if (cpu_logical_map(0) != mpidr) {
pr_err("First GICC entry with MPIDR 0x%llx is not BSP\n",
mpidr);
return -EINVAL;
}
/*
* boot_cpu_init() already hold bit 0 in cpu_possible_mask
* for BSP, no need to allocate again.
*/
cpu = 0;
}
If/when kexec comes, on systems where CPU0 can be hotplugged the next kernel might boot on an AP rather than the BSP. Is there a requirement Linux-side that CPU0 is the BSP, or is this just intended as a sanity check of the tables the FW provided?
if (!acpi_psci_present())
return -EOPNOTSUPP;
cpu_ops[cpu] = cpu_get_ops("psci");
/* CPU 0 was already initialized */
if (cpu) {
if (!cpu_ops[cpu])
return -EINVAL;
if (cpu_ops[cpu]->cpu_init(NULL, cpu))
return -EOPNOTSUPP;
/* map the logical cpu id to cpu MPIDR */
cpu_logical_map(cpu) = mpidr;
set_cpu_possible(cpu, true);
}
In the OF case we only set CPUs possible once we've scanned all the nodes, and only when the boot CPU was actually found in a table. We should keep the ACPI case consistent with that.
Can we not handle all of this in a later call once we've scanned all of the GICC structures?
[...]
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 43ae914..1099ddc 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -449,13 +449,16 @@ void __init setup_arch(char **cmdline_p) if (acpi_disabled) { unflatten_device_tree(); psci_dt_init();
cpu_read_bootcpu_ops();
+#ifdef CONFIG_SMP
of_smp_init_cpus();
+#endif
I was going to say that it would be a little nicer if we had empty stubs for functions in the !SMP case, rather than #ifdefs all over the place. Unfortunately it looks like the way asm/smp.h is handled is generally a mess, so this isn't so bad for now.
Thanks, Mark.
On 2015年02月03日 21:53, Mark Rutland wrote:
On Mon, Feb 02, 2015 at 12:45:41PM +0000, Hanjun Guo wrote:
MADT contains the information for MPIDR which is essential for SMP initialization, parse the GIC cpu interface structures to get the MPIDR value and map it to cpu_logical_map(), and add enabled cpu with valid MPIDR into cpu_possible_map.
ACPI 5.1 only has two explicit methods to boot up SMP, PSCI and Parking protocol, but the Parking protocol is only specified for ARMv7 now, so make PSCI as the only way for the SMP boot protocol before some updates for the ACPI spec or the Parking protocol spec.
Parking protocol patches for SMP boot will be sent to upstream when the new version of Parking protocol is ready.
CC: Lorenzo Pieralisi lorenzo.pieralisi@arm.com CC: Catalin Marinas catalin.marinas@arm.com CC: Will Deacon will.deacon@arm.com CC: Mark Rutland mark.rutland@arm.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Tomasz Nowicki tomasz.nowicki@linaro.org
arch/arm64/include/asm/acpi.h | 2 + arch/arm64/include/asm/cpu_ops.h | 1 + arch/arm64/include/asm/smp.h | 5 +- arch/arm64/kernel/acpi.c | 150 ++++++++++++++++++++++++++++++++++++++- arch/arm64/kernel/cpu_ops.c | 2 +- arch/arm64/kernel/setup.c | 7 +- arch/arm64/kernel/smp.c | 2 +- 7 files changed, 161 insertions(+), 8 deletions(-)
[...]
+/**
+ * acpi_map_gic_cpu_interface - generates a logical cpu number
+ * and map to MPIDR represented by GICC structure
+ * @mpidr: CPU's hardware id to register, MPIDR represented in MADT
+ * @enabled: this cpu is enabled or not
+ *
+ * Returns the logical cpu number which maps to MPIDR
+ */
+static int __init acpi_map_gic_cpu_interface(u64 mpidr, u8 enabled)
+{
int cpu;
if (mpidr == INVALID_HWID) {
pr_info("Skip MADT cpu entry with invalid MPIDR\n");
return -EINVAL;
}
total_cpus++;
if (!enabled)
return -EINVAL;
if (enabled_cpus >= NR_CPUS) {
pr_warn("NR_CPUS limit of %d reached, Processor %d/0x%llx ignored.\n",
NR_CPUS, total_cpus, mpidr);
return -EINVAL;
}
/* No need to check duplicate MPIDRs for the first CPU */
if (enabled_cpus) {
/*
* Duplicate MPIDRs are a recipe for disaster. Scan
* all initialized entries and check for
* duplicates. If any is found just ignore the CPU.
*/
for_each_possible_cpu(cpu) {
if (cpu_logical_map(cpu) == mpidr) {
pr_err("Firmware bug, duplicate CPU MPIDR: 0x%llx in MADT\n",
mpidr);
return -EINVAL;
}
}
/* allocate a logical cpu id for the new comer */
cpu = cpumask_next_zero(-1, cpu_possible_mask);
} else {
/*
* First GICC entry must be BSP as ACPI spec said
* in section 5.2.12.15
*/
if (cpu_logical_map(0) != mpidr) {
pr_err("First GICC entry with MPIDR 0x%llx is not BSP\n",
mpidr);
return -EINVAL;
}
/*
* boot_cpu_init() already hold bit 0 in cpu_possible_mask
* for BSP, no need to allocate again.
*/
cpu = 0;
}
If/when kexec comes, on systems where CPU0 can be hotplugged the next kernel might boot on an AP rather than the BSP.
So cpu_logical_map(0) will be the MPIDR of the AP that booted the kernel, and it will not equal the MPIDR provided in the first entry of the MADT, right?
It seems that DT SMP init will have the same problem; could you give me some guidance on how it is solved there?
Is there a requirement Linux-side that CPU0 is the BSP, or is this just intended as a sanity check of the tables the FW provided?
It is just a check of the table that the FW provided, so for this kexec case I think this code needs to be reworked.
On x86 there is no check that the first LAPIC entry is the BSP; I think we should remove the check for ARM64 too, if that makes sense.
if (!acpi_psci_present())
return -EOPNOTSUPP;
cpu_ops[cpu] = cpu_get_ops("psci");
/* CPU 0 was already initialized */
if (cpu) {
if (!cpu_ops[cpu])
return -EINVAL;
if (cpu_ops[cpu]->cpu_init(NULL, cpu))
return -EOPNOTSUPP;
/* map the logical cpu id to cpu MPIDR */
cpu_logical_map(cpu) = mpidr;
set_cpu_possible(cpu, true);
}
In the OF case we only set CPUs possible once we've scanned all the nodes, and only when the boot CPU was actually found in a table. We should keep the ACPI case consistent with that.
Can we not handle all of this in a later call once we've scanned all of the GICC structures?
We can. The code will be the same as the DT one: once all the structures are scanned, we can add the init code in acpi_init_cpus():
	for (i = 0; i < NR_CPUS; i++)
		if (cpu_logical_map(i) != INVALID_HWID)
			set_cpu_possible(i, true);
but I think there is no difference in the logic; maybe I missed something.
[...]
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 43ae914..1099ddc 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -449,13 +449,16 @@ void __init setup_arch(char **cmdline_p) if (acpi_disabled) { unflatten_device_tree(); psci_dt_init();
cpu_read_bootcpu_ops();
+#ifdef CONFIG_SMP
of_smp_init_cpus();
+#endif
I was going to say that it would be a little nicer if we had empty stubs for functions in the !SMP case, rather than #ifdefs all over the place. Unfortunately it looks like the way asm/smp.h is handled is generally a mess, so this isn't so bad for now.
Yes, the header asm/smp.h, which declares of_smp_init_cpus(), only compiles correctly with CONFIG_SMP.
Thanks Hanjun
On Wed, Feb 04, 2015 at 09:05:13AM +0000, Hanjun Guo wrote:
On 2015年02月03日 21:53, Mark Rutland wrote:
On Mon, Feb 02, 2015 at 12:45:41PM +0000, Hanjun Guo wrote:
MADT contains the information for MPIDR which is essential for SMP initialization, parse the GIC cpu interface structures to get the MPIDR value and map it to cpu_logical_map(), and add enabled cpu with valid MPIDR into cpu_possible_map.
ACPI 5.1 only has two explicit methods to boot up SMP, PSCI and Parking protocol, but the Parking protocol is only specified for ARMv7 now, so make PSCI as the only way for the SMP boot protocol before some updates for the ACPI spec or the Parking protocol spec.
Parking protocol patches for SMP boot will be sent to upstream when the new version of Parking protocol is ready.
CC: Lorenzo Pieralisi lorenzo.pieralisi@arm.com CC: Catalin Marinas catalin.marinas@arm.com CC: Will Deacon will.deacon@arm.com CC: Mark Rutland mark.rutland@arm.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Tomasz Nowicki tomasz.nowicki@linaro.org
arch/arm64/include/asm/acpi.h | 2 + arch/arm64/include/asm/cpu_ops.h | 1 + arch/arm64/include/asm/smp.h | 5 +- arch/arm64/kernel/acpi.c | 150 ++++++++++++++++++++++++++++++++++++++- arch/arm64/kernel/cpu_ops.c | 2 +- arch/arm64/kernel/setup.c | 7 +- arch/arm64/kernel/smp.c | 2 +- 7 files changed, 161 insertions(+), 8 deletions(-)
[...]
+/**
+ * acpi_map_gic_cpu_interface - generates a logical cpu number
+ * and map to MPIDR represented by GICC structure
+ * @mpidr: CPU's hardware id to register, MPIDR represented in MADT
+ * @enabled: this cpu is enabled or not
+ *
+ * Returns the logical cpu number which maps to MPIDR
+ */
+static int __init acpi_map_gic_cpu_interface(u64 mpidr, u8 enabled)
+{
int cpu;
if (mpidr == INVALID_HWID) {
pr_info("Skip MADT cpu entry with invalid MPIDR\n");
return -EINVAL;
}
total_cpus++;
if (!enabled)
return -EINVAL;
if (enabled_cpus >= NR_CPUS) {
pr_warn("NR_CPUS limit of %d reached, Processor %d/0x%llx ignored.\n",
NR_CPUS, total_cpus, mpidr);
return -EINVAL;
}
/* No need to check duplicate MPIDRs for the first CPU */
if (enabled_cpus) {
/*
* Duplicate MPIDRs are a recipe for disaster. Scan
* all initialized entries and check for
* duplicates. If any is found just ignore the CPU.
*/
for_each_possible_cpu(cpu) {
if (cpu_logical_map(cpu) == mpidr) {
pr_err("Firmware bug, duplicate CPU MPIDR: 0x%llx in MADT\n",
mpidr);
return -EINVAL;
}
}
/* allocate a logical cpu id for the new comer */
cpu = cpumask_next_zero(-1, cpu_possible_mask);
} else {
/*
* First GICC entry must be BSP as ACPI spec said
* in section 5.2.12.15
*/
if (cpu_logical_map(0) != mpidr) {
pr_err("First GICC entry with MPIDR 0x%llx is not BSP\n",
mpidr);
return -EINVAL;
}
/*
* boot_cpu_init() already hold bit 0 in cpu_possible_mask
* for BSP, no need to allocate again.
*/
cpu = 0;
}
If/when kexec comes, on systems where CPU0 can be hotplugged the next kernel might boot on an AP rather than the BSP.
So cpu_logical_map(0) will be the MPIDR of the AP that booted the kernel, and it will not equal the MPIDR provided in the first entry of the MADT, right?
Yes.
It seems that DT SMP init will have the same problem; could you give me some guidance on how it is solved there?
For DT we don't rely on the first entry we see in /cpus/ being CPU0 -- we loop over all entries and expect one of them to be CPU0. Is that what you're asking about, or have I misunderstood the question?
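The DT behaviour described here, where the boot CPU may appear at any position in the list, can be sketched in plain C as follows. This is an illustration only, not the kernel's of_smp_init_cpus(): MAX_CPUS, INVALID_HWID as defined here, and assign_logical_ids() are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS	8		/* stand-in for NR_CPUS */
#define INVALID_HWID	UINT64_MAX	/* stand-in for the kernel constant */

/*
 * Assign logical ids to the hardware ids listed in entries[].
 * The boot CPU always becomes logical CPU 0, wherever it appears;
 * every other entry gets 1, 2, ... in table order.
 *
 * Returns the number of logical CPUs assigned, or -1 if a duplicate
 * id is seen or the boot CPU is not listed at all.
 */
static int assign_logical_ids(const uint64_t *entries, size_t n,
			      uint64_t boot_hwid, uint64_t map[MAX_CPUS])
{
	size_t i, j;
	int next = 1;
	int boot_found = 0;

	for (i = 0; i < MAX_CPUS; i++)
		map[i] = INVALID_HWID;

	for (i = 0; i < n; i++) {
		uint64_t hwid = entries[i];

		/* reject duplicates against all earlier entries */
		for (j = 0; j < i; j++)
			if (entries[j] == hwid)
				return -1;

		if (hwid == boot_hwid) {
			boot_found = 1;
			map[0] = hwid;
			continue;
		}

		if (next >= MAX_CPUS)
			continue;	/* over the limit, ignore the entry */

		map[next++] = hwid;
	}

	return boot_found ? next : -1;
}
```

For example, with entries {0x100, 0x0, 0x1} and a boot CPU of 0x0, the boot CPU is found in the second slot and still ends up as logical CPU 0.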
Is there a requirement Linux-side that CPU0 is the BSP, or is this just intended as a sanity check of the tables the FW provided?
It is just a check of the table that the FW provided, so for this kexec case I think this code needs to be reworked.
On x86 there is no check that the first LAPIC entry is the BSP; I think we should remove the check for ARM64 too, if that makes sense.
Ok. It would be nice to know that there's no implicit assumption that ACPI makes about code executing on the BSP elsewhere; if so we may need to prevent CPU0 hotplug.
On x86 CPU0 hotplug is typically inhibited for suspend/resume and PIC-specific issues, and it's not clear to me if there are other requirements for CPU0 to stay online.
If the FW requires a particular CPU to stay online, then hopefully that will be reported through PSCI MIGRATE_INFO_UP_CPU, but we don't currently check that in the PSCI code.
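A check of the kind mentioned here could look roughly like the sketch below. It assumes the MIGRATE_INFO_TYPE return values as I read them in the PSCI 0.2 specification (0: uniprocessor Trusted OS that can migrate, 1: uniprocessor Trusted OS that cannot migrate, 2: not present or no migration required). The enum names and cpu_may_go_offline() are hypothetical, the two firmware results are passed in as plain values rather than obtained by real firmware calls, and the migrate-capable case, where the OS would have to issue MIGRATE first, is deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed MIGRATE_INFO_TYPE results; names are made up for this sketch. */
enum psci_tos_type {
	PSCI_TOS_UP_MIGRATE	= 0,
	PSCI_TOS_UP_NO_MIGRATE	= 1,
	PSCI_TOS_NOT_REQUIRED	= 2,
};

/*
 * tos_type:  value the firmware returned for MIGRATE_INFO_TYPE
 * tos_mpidr: value the firmware returned for MIGRATE_INFO_UP_CPU
 *            (only meaningful for a uniprocessor Trusted OS)
 * cpu_mpidr: MPIDR of the CPU we would like to take offline
 *
 * Only a pinned, non-migratable Trusted OS blocks hot-unplug here.
 */
static bool cpu_may_go_offline(int tos_type, uint64_t tos_mpidr,
			       uint64_t cpu_mpidr)
{
	if (tos_type != PSCI_TOS_UP_NO_MIGRATE)
		return true;

	return tos_mpidr != cpu_mpidr;
}
```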
if (!acpi_psci_present())
return -EOPNOTSUPP;
cpu_ops[cpu] = cpu_get_ops("psci");
/* CPU 0 was already initialized */
if (cpu) {
if (!cpu_ops[cpu])
return -EINVAL;
if (cpu_ops[cpu]->cpu_init(NULL, cpu))
return -EOPNOTSUPP;
/* map the logical cpu id to cpu MPIDR */
cpu_logical_map(cpu) = mpidr;
set_cpu_possible(cpu, true);
}
In the OF case we only set CPUs possible once we've scanned all the nodes, and only when the boot CPU was actually found in a table. We should keep the ACPI case consistent with that.
Can we not handle all of this in a later call once we've scanned all of the GICC structures?
We can. The code will be the same as the DT one: once all the structures are scanned, we can add the init code in acpi_init_cpus():
	for (i = 0; i < NR_CPUS; i++)
		if (cpu_logical_map(i) != INVALID_HWID)
			set_cpu_possible(i, true);
but I think there is no difference in the logic; maybe I missed something.
With the ACPI code above, we mark each CPU possible as we scan it. In the DT case, if we fail to find the current CPU in the DTB, we don't mark any other nodes as possible. So in the DT case you don't get SMP if the current CPU is not in the table provided by FW, but in the ACPI case you would (when the CPU0 == BSP test is removed).
I would prefer that we have a strong requirement that the current CPU is in the tables in the ACPI case. It safeguards against obviously wrong tables.
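The difference between the two policies can be made concrete with a small sketch that computes the resulting possible mask both ways. The function names and the 32-bit mask are assumptions chosen for illustration; neither function is kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Eager policy (the ACPI patch as posted, minus the CPU0 == BSP test):
 * every secondary entry is marked possible as soon as it is scanned.
 * Bit 0 is always set, because the running CPU is already possible.
 */
static uint32_t possible_mask_eager(const uint64_t *hwids, size_t n,
				    uint64_t boot_hwid)
{
	uint32_t mask = 1u;
	unsigned int next = 1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (hwids[i] == boot_hwid)
			continue;
		if (next >= 32)
			break;
		mask |= 1u << next++;
	}
	return mask;
}

/*
 * Deferred policy (the DT behaviour): secondaries are only committed
 * once the whole table has been scanned and the boot CPU was found.
 */
static uint32_t possible_mask_deferred(const uint64_t *hwids, size_t n,
				       uint64_t boot_hwid)
{
	uint32_t pending = 0;
	unsigned int next = 1;
	bool boot_found = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (hwids[i] == boot_hwid) {
			boot_found = true;
			continue;
		}
		if (next >= 32)
			break;
		pending |= 1u << next++;
	}
	return boot_found ? (1u | pending) : 1u;
}
```

For a table that lists 0x1, 0x2 and 0x3 while the kernel is running on 0x0, the eager variant still reports four possible CPUs, whereas the deferred variant falls back to the boot CPU alone.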
Thanks, Mark.
On 2015年02月04日 18:30, Mark Rutland wrote:
On Wed, Feb 04, 2015 at 09:05:13AM +0000, Hanjun Guo wrote:
On 2015年02月03日 21:53, Mark Rutland wrote:
On Mon, Feb 02, 2015 at 12:45:41PM +0000, Hanjun Guo wrote:
MADT contains the information for MPIDR which is essential for SMP initialization, parse the GIC cpu interface structures to get the MPIDR value and map it to cpu_logical_map(), and add enabled cpu with valid MPIDR into cpu_possible_map.
ACPI 5.1 only has two explicit methods to boot up SMP, PSCI and Parking protocol, but the Parking protocol is only specified for ARMv7 now, so make PSCI as the only way for the SMP boot protocol before some updates for the ACPI spec or the Parking protocol spec.
Parking protocol patches for SMP boot will be sent to upstream when the new version of Parking protocol is ready.
[...]
/* No need to check duplicate MPIDRs for the first CPU */
if (enabled_cpus) {
/*
* Duplicate MPIDRs are a recipe for disaster. Scan
* all initialized entries and check for
* duplicates. If any is found just ignore the CPU.
*/
for_each_possible_cpu(cpu) {
if (cpu_logical_map(cpu) == mpidr) {
pr_err("Firmware bug, duplicate CPU MPIDR: 0x%llx in MADT\n",
mpidr);
return -EINVAL;
}
}
/* allocate a logical cpu id for the new comer */
cpu = cpumask_next_zero(-1, cpu_possible_mask);
} else {
/*
* First GICC entry must be BSP as ACPI spec said
* in section 5.2.12.15
*/
if (cpu_logical_map(0) != mpidr) {
pr_err("First GICC entry with MPIDR 0x%llx is not BSP\n",
mpidr);
return -EINVAL;
}
/*
* boot_cpu_init() already hold bit 0 in cpu_possible_mask
* for BSP, no need to allocate again.
*/
cpu = 0;
}
If/when kexec comes, on systems where CPU0 can be hotplugged the next kernel might boot on an AP rather than the BSP.
So cpu_logical_map(0) will be the MPIDR of the AP that booted the kernel, and it will not equal the MPIDR provided in the first entry of the MADT, right?
Yes.
It seems that DT SMP init will have the same problem; could you give me some guidance on how it is solved there?
For DT we don't rely on the first entry we see in /cpus/ being CPU0 -- we loop over all entries and expect one of them to be CPU0. Is that what you're asking about, or have I misunderstood the question?
That's what I was asking, thanks for the explanation. I think I need to rework this code a little bit and modify the logic as well.
Is there a requirement Linux-side that CPU0 is the BSP, or is this just intended as a sanity check of the tables the FW provided?
It is just a check of the table that the FW provided, so for this kexec case I think this code needs to be reworked.
On x86 there is no check that the first LAPIC entry is the BSP; I think we should remove the check for ARM64 too, if that makes sense.
Ok. It would be nice to know that there's no implicit assumption that ACPI makes about code executing on the BSP elsewhere; if so we may need to prevent CPU0 hotplug.
On x86 CPU0 hotplug is typically inhibited for suspend/resume and PIC-specific issues, and it's not clear to me if there are other requirements for CPU0 to stay online.
If the FW requires a particular CPU to stay online, then hopefully that will be reported through PSCI MIGRATE_INFO_UP_CPU, but we don't currently check that in the PSCI code.
if (!acpi_psci_present())
return -EOPNOTSUPP;
cpu_ops[cpu] = cpu_get_ops("psci");
/* CPU 0 was already initialized */
if (cpu) {
if (!cpu_ops[cpu])
return -EINVAL;
if (cpu_ops[cpu]->cpu_init(NULL, cpu))
return -EOPNOTSUPP;
/* map the logical cpu id to cpu MPIDR */
cpu_logical_map(cpu) = mpidr;
set_cpu_possible(cpu, true);
}
In the OF case we only set CPUs possible once we've scanned all the nodes, and only when the boot CPU was actually found in a table. We should keep the ACPI case consistent with that.
Can we not handle all of this in a later call once we've scanned all of the GICC structures?
We can. The code will be the same as the DT one: once all the structures are scanned, we can add the init code in acpi_init_cpus():
	for (i = 0; i < NR_CPUS; i++)
		if (cpu_logical_map(i) != INVALID_HWID)
			set_cpu_possible(i, true);
but I think there is no difference in the logic; maybe I missed something.
With the ACPI code above, we mark each CPU possible as we scan it. In the DT case, if we fail to find the current CPU in the DTB, we don't mark any other nodes as possible. So in the DT case you don't get SMP if the current CPU is not in the table provided by FW, but in the ACPI case you would (when the CPU0 == BSP test is removed).
I would prefer that we have a strong requirement that the current CPU is in the tables in the ACPI case. It safeguards against obviously wrong tables.
OK, that makes sense to me too; I will update the code.
Thanks Hanjun
Introduce a new function map_gicc_mpidr() to allow MPIDRs to be obtained from the GICC Structure introduced by ACPI 5.1.
MPIDR is the CPU hardware ID, like the local APIC ID on x86, so we use the MPIDR rather than the GIC CPU interface ID to identify CPUs.
A further step would be to typedef a phys_id_t in arch code (with an appropriate size and a corresponding invalid value, say ~0) and use that instead of an int in drivers/acpi/processor_core.c to store phys_id; MPIDR packing would then be unnecessary.
CC: Rafael J. Wysocki rjw@rjwysocki.net CC: Catalin Marinas catalin.marinas@arm.com CC: Will Deacon will.deacon@arm.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org --- arch/arm64/include/asm/acpi.h | 30 ++++++++++++++++++++++++++++++ drivers/acpi/processor_core.c | 37 +++++++++++++++++++++++++++++++++++++ 2 files changed, 67 insertions(+)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 8984aa5..7e825b9 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -12,6 +12,8 @@
 #ifndef _ASM_ACPI_H
 #define _ASM_ACPI_H

+#include <asm/smp_plat.h>
+
 /* Basic configuration for ACPI */
 #ifdef CONFIG_ACPI
 #define acpi_strict 1	/* No out-of-spec workarounds on ARM64 */
@@ -45,6 +47,34 @@ static inline void enable_acpi(void)
 	acpi_noirq = 0;
 }

+/* MPIDR value provided in GICC structure is 64 bits, but the
+ * existing phys_id (CPU hardware ID) using in acpi processor
+ * driver is 32-bit, to conform to the same datatype we need
+ * to repack the GICC structure MPIDR.
+ *
+ * bits other than following 32 bits are defined as 0, so it
+ * will be no information lost after repacked.
+ *
+ * Bits [0:7] Aff0;
+ * Bits [8:15] Aff1;
+ * Bits [16:23] Aff2;
+ * Bits [32:39] Aff3;
+ */
+static inline u32 pack_mpidr(u64 mpidr)
+{
+	return (u32) ((mpidr & 0xff00000000) >> 8) | mpidr;
+}
+
+/*
+ * The ACPI processor driver for ACPI core code needs this macro
+ * to find out this cpu was already mapped (mapping from CPU hardware
+ * ID to CPU logical ID) or not.
+ *
+ * cpu_logical_map(cpu) is the mapping of MPIDR and the logical cpu,
+ * and MPIDR is the cpu hardware ID we needed to pack.
+ */
+#define cpu_physical_id(cpu) pack_mpidr(cpu_logical_map(cpu))
+
 /*
  * It's used from ACPI core in kdump to boot UP system with SMP kernel,
  * with this check the ACPI core will not override the CPU index
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 02e4839..5ac12e4 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -64,6 +64,38 @@ static int map_lsapic_id(struct acpi_subtable_header *entry,
 	return 0;
 }

+/*
+ * On ARM platform, MPIDR value is the hardware ID as apic ID
+ * on Intel platforms
+ */
+static int map_gicc_mpidr(struct acpi_subtable_header *entry,
+		int device_declaration, u32 acpi_id, int *mpidr)
+{
+	struct acpi_madt_generic_interrupt *gicc =
+	    container_of(entry, struct acpi_madt_generic_interrupt, header);
+
+	if (!(gicc->flags & ACPI_MADT_ENABLED))
+		return -ENODEV;
+
+	/* In the GIC interrupt model, logical processors are
+	 * required to have a Processor Device object in the DSDT,
+	 * so we should check device_declaration here
+	 */
+	if (device_declaration && (gicc->uid == acpi_id)) {
+		/*
+		 * bits other than [0:7] Aff0, [8:15] Aff1, [16:23] Aff2 and
+		 * [32:39] Aff3 must be 0 which is defined in ACPI 5.1, so pack
+		 * the Affx fields into a single 32 bit identifier to accommodate
+		 * the acpi processor drivers.
+		 */
+		*mpidr = ((gicc->arm_mpidr & 0xff00000000) >> 8)
+			 | gicc->arm_mpidr;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
 static int map_madt_entry(int type, u32 acpi_id)
 {
 	unsigned long madt_end, entry;
@@ -99,6 +131,9 @@ static int map_madt_entry(int type, u32 acpi_id)
 		} else if (header->type == ACPI_MADT_TYPE_LOCAL_SAPIC) {
 			if (!map_lsapic_id(header, type, acpi_id, &phys_id))
 				break;
+		} else if (header->type == ACPI_MADT_TYPE_GENERIC_INTERRUPT) {
+			if (!map_gicc_mpidr(header, type, acpi_id, &phys_id))
+				break;
 		}
 		entry += header->length;
 	}
@@ -131,6 +166,8 @@ static int map_mat_entry(acpi_handle handle, int type, u32 acpi_id)
 		map_lsapic_id(header, type, acpi_id, &phys_id);
 	else if (header->type == ACPI_MADT_TYPE_LOCAL_X2APIC)
 		map_x2apic_id(header, type, acpi_id, &phys_id);
+	else if (header->type == ACPI_MADT_TYPE_GENERIC_INTERRUPT)
+		map_gicc_mpidr(header, type, acpi_id, &phys_id);

 exit:
 	kfree(buffer.pointer);
On Mon, Feb 02, 2015 at 12:45:42PM +0000, Hanjun Guo wrote:
Introduce a new function map_gicc_mpidr() to allow MPIDRs to be obtained from the GICC Structure introduced by ACPI 5.1.
MPIDR is the CPU hardware ID, like the local APIC ID on x86, so we use the MPIDR rather than the GIC CPU interface ID to identify CPUs.
A further step would be to typedef a phys_id_t in arch code (with an appropriate size and a corresponding invalid value, say ~0) and use that instead of an int in drivers/acpi/processor_core.c to store phys_id; MPIDR packing would then be unnecessary.
I don't understand why we don't fix this now, and I'm very worried that this patch leaves much potential for FW bugs due to potential Linux bugs.
Having a function called cpu_physical_id which _does not_ return a physical ID makes no sense to me. Any time we really need a physical ID, we're still going to have to unpack it (in an architecture-specific manner).
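To make the pack/unpack burden concrete, here is a user-space sketch of the round trip. It assumes the layout given in the patch comment (Aff0 to Aff2 in bits [23:0], Aff3 in bits [39:32]). Unlike the patch, this pack_mpidr() masks the low 24 bits explicitly instead of relying on truncation, and unpack_mpidr() is a hypothetical helper that the patch does not provide.

```c
#include <stdint.h>

/* Fold Aff3 (bits [39:32]) down into bits [31:24] of a 32-bit value. */
static inline uint32_t pack_mpidr(uint64_t mpidr)
{
	return (uint32_t)(((mpidr & 0xff00000000ULL) >> 8) |
			  (mpidr & 0x00ffffffULL));
}

/* Inverse operation: move bits [31:24] back up to bits [39:32]. */
static inline uint64_t unpack_mpidr(uint32_t packed)
{
	return ((uint64_t)(packed & 0xff000000u) << 8) |
	       (packed & 0x00ffffffu);
}
```

Every consumer that needs the real hardware ID (for instance to compare against cpu_logical_map()) would have to remember to call the inverse, which is the class of mistake the review is warning about.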
I am very worried that we're either going to miss some packing/unpacking, or FW tables are going to end up being written incorrectly to match (broken) assumptions that this causes Linux to make.
If we don't actually need the physical ID, but just an ID that fits in an int, then we have the logical ID which we can use instead. Having an intermediary ID that is neither logical nor physical is pointless.
While I appreciate that changing the core to allow for an architecture-defined physical ID type is not necessarily a trivial or enjoyable exercise, I believe that it is necessary for the health of ACPI on ARM.
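One possible shape for such an architecture-defined type is sketched below. The name phys_id_t comes from the commit message; PHYS_ID_INVALID and find_logical_cpu() are assumptions made for this example and are not taken from the ACPI core.

```c
#include <stdint.h>

/* On arm64 the full 64-bit MPIDR would fit without any repacking. */
typedef uint64_t phys_id_t;

/* Invalid value of ~0, as suggested in the commit message. */
#define PHYS_ID_INVALID	((phys_id_t)~0ULL)

/*
 * Look up the logical CPU whose hardware id matches id.
 * map[] plays the role of cpu_logical_map(); returns -1 if not found.
 */
static int find_logical_cpu(const phys_id_t *map, int nr_cpus, phys_id_t id)
{
	int i;

	if (id == PHYS_ID_INVALID)
		return -1;

	for (i = 0; i < nr_cpus; i++)
		if (map[i] == id)
			return i;

	return -1;
}
```

An architecture with 32-bit hardware ids could define the same two names with a narrower type, so common code would never need to know the width.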
The points above have been brought up repeatedly. Please do not sweep them under the rug this time. Please fix this properly.
Thanks, Mark.
CC: Rafael J. Wysocki rjw@rjwysocki.net CC: Catalin Marinas catalin.marinas@arm.com CC: Will Deacon will.deacon@arm.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
arch/arm64/include/asm/acpi.h | 30 ++++++++++++++++++++++++++++++ drivers/acpi/processor_core.c | 37 +++++++++++++++++++++++++++++++++++++ 2 files changed, 67 insertions(+)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 8984aa5..7e825b9 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -12,6 +12,8 @@
 #ifndef _ASM_ACPI_H
 #define _ASM_ACPI_H

+#include <asm/smp_plat.h>
+
 /* Basic configuration for ACPI */
 #ifdef CONFIG_ACPI
 #define acpi_strict 1	/* No out-of-spec workarounds on ARM64 */
@@ -45,6 +47,34 @@ static inline void enable_acpi(void)
 	acpi_noirq = 0;
 }

+/* MPIDR value provided in GICC structure is 64 bits, but the
+ * existing phys_id (CPU hardware ID) using in acpi processor
+ * driver is 32-bit, to conform to the same datatype we need
+ * to repack the GICC structure MPIDR.
+ *
+ * bits other than following 32 bits are defined as 0, so it
+ * will be no information lost after repacked.
+ *
+ * Bits [0:7] Aff0;
+ * Bits [8:15] Aff1;
+ * Bits [16:23] Aff2;
+ * Bits [32:39] Aff3;
+ */
+static inline u32 pack_mpidr(u64 mpidr)
+{
+	return (u32) ((mpidr & 0xff00000000) >> 8) | mpidr;
+}
+
+/*
+ * The ACPI processor driver for ACPI core code needs this macro
+ * to find out this cpu was already mapped (mapping from CPU hardware
+ * ID to CPU logical ID) or not.
+ *
+ * cpu_logical_map(cpu) is the mapping of MPIDR and the logical cpu,
+ * and MPIDR is the cpu hardware ID we needed to pack.
+ */
+#define cpu_physical_id(cpu) pack_mpidr(cpu_logical_map(cpu))
+
 /*
  * It's used from ACPI core in kdump to boot UP system with SMP kernel,
  * with this check the ACPI core will not override the CPU index
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 02e4839..5ac12e4 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -64,6 +64,38 @@ static int map_lsapic_id(struct acpi_subtable_header *entry,
 	return 0;
 }

+/*
+ * On ARM platform, MPIDR value is the hardware ID as apic ID
+ * on Intel platforms
+ */
+static int map_gicc_mpidr(struct acpi_subtable_header *entry,
+		int device_declaration, u32 acpi_id, int *mpidr)
+{
+	struct acpi_madt_generic_interrupt *gicc =
+	    container_of(entry, struct acpi_madt_generic_interrupt, header);
+
+	if (!(gicc->flags & ACPI_MADT_ENABLED))
+		return -ENODEV;
+
+	/* In the GIC interrupt model, logical processors are
+	 * required to have a Processor Device object in the DSDT,
+	 * so we should check device_declaration here
+	 */
+	if (device_declaration && (gicc->uid == acpi_id)) {
+		/*
+		 * bits other than [0:7] Aff0, [8:15] Aff1, [16:23] Aff2 and
+		 * [32:39] Aff3 must be 0 which is defined in ACPI 5.1, so pack
+		 * the Affx fields into a single 32 bit identifier to accommodate
+		 * the acpi processor drivers.
+		 */
+		*mpidr = ((gicc->arm_mpidr & 0xff00000000) >> 8)
+			 | gicc->arm_mpidr;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
 static int map_madt_entry(int type, u32 acpi_id)
 {
 	unsigned long madt_end, entry;
@@ -99,6 +131,9 @@ static int map_madt_entry(int type, u32 acpi_id)
 		} else if (header->type == ACPI_MADT_TYPE_LOCAL_SAPIC) {
 			if (!map_lsapic_id(header, type, acpi_id, &phys_id))
 				break;
+		} else if (header->type == ACPI_MADT_TYPE_GENERIC_INTERRUPT) {
+			if (!map_gicc_mpidr(header, type, acpi_id, &phys_id))
+				break;
 		}
 		entry += header->length;
 	}
@@ -131,6 +166,8 @@ static int map_mat_entry(acpi_handle handle, int type, u32 acpi_id)
 		map_lsapic_id(header, type, acpi_id, &phys_id);
 	else if (header->type == ACPI_MADT_TYPE_LOCAL_X2APIC)
 		map_x2apic_id(header, type, acpi_id, &phys_id);
+	else if (header->type == ACPI_MADT_TYPE_GENERIC_INTERRUPT)
+		map_gicc_mpidr(header, type, acpi_id, &phys_id);

 exit:
 	kfree(buffer.pointer);
--
1.9.1
On Tue, Feb 03, 2015 at 02:17:49PM +0000, Mark Rutland wrote:
On Mon, Feb 02, 2015 at 12:45:42PM +0000, Hanjun Guo wrote:
Introduce a new function map_gicc_mpidr() to allow MPIDRs to be obtained from the GICC Structure introduced by ACPI 5.1.
MPIDR is the CPU hardware ID, as the local APIC ID is on x86 platforms, so we use the MPIDR rather than the GIC CPU interface ID to identify CPUs.
A further step would be to typedef a phys_id_t in arch code (with an appropriate size and a corresponding invalid value, say ~0) and use that instead of an int in drivers/acpi/processor_core.c to store phys_id; MPIDR packing would then be unnecessary.
I don't understand why we don't fix this now, and I'm very worried that this patch leaves much potential for FW bugs due to potential Linux bugs.
Having a function called cpu_physical_id which _does not_ return a physical ID makes no sense to me. Any time we really need a physical ID, we're still going to have to unpack it (in an architecture-specific manner).
Do you mean something like this? Only briefly tested on Juno and I may have missed other calls:
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index ea4d2b35c57b..4fafd62b1b86 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -49,33 +49,12 @@ static inline void enable_acpi(void)
 	acpi_noirq = 0;
 }

-/* MPIDR value provided in GICC structure is 64 bits, but the
- * existing phys_id (CPU hardware ID) using in acpi processor
- * driver is 32-bit, to conform to the same datatype we need
- * to repack the GICC structure MPIDR.
- *
- * bits other than following 32 bits are defined as 0, so it
- * will be no information lost after repacked.
- *
- * Bits [0:7] Aff0;
- * Bits [8:15] Aff1;
- * Bits [16:23] Aff2;
- * Bits [32:39] Aff3;
- */
-static inline u32 pack_mpidr(u64 mpidr)
-{
-	return (u32) ((mpidr & 0xff00000000) >> 8) | mpidr;
-}
-
 /*
  * The ACPI processor driver for ACPI core code needs this macro
  * to find out this cpu was already mapped (mapping from CPU hardware
  * ID to CPU logical ID) or not.
- *
- * cpu_logical_map(cpu) is the mapping of MPIDR and the logical cpu,
- * and MPIDR is the cpu hardware ID we needed to pack.
  */
-#define cpu_physical_id(cpu) pack_mpidr(cpu_logical_map(cpu))
+#define cpu_physical_id(cpu) cpu_logical_map(cpu)

 /*
  * It's used from ACPI core in kdump to boot UP system with SMP kernel,
diff --git a/arch/arm64/include/asm/smp_plat.h b/arch/arm64/include/asm/smp_plat.h
index 59e282311b58..a492276e008d 100644
--- a/arch/arm64/include/asm/smp_plat.h
+++ b/arch/arm64/include/asm/smp_plat.h
@@ -40,4 +40,6 @@ static inline u32 mpidr_hash_size(void)
 extern u64 __cpu_logical_map[NR_CPUS];
 #define cpu_logical_map(cpu) __cpu_logical_map[cpu]

+typedef u64 cpuid_t;
+
 #endif /* __ASM_SMP_PLAT_H */
diff --git a/arch/ia64/include/asm/smp.h b/arch/ia64/include/asm/smp.h
index fea21e986022..251c6af899d6 100644
--- a/arch/ia64/include/asm/smp.h
+++ b/arch/ia64/include/asm/smp.h
@@ -41,6 +41,8 @@ ia64_get_lid (void)

 #define hard_smp_processor_id() ia64_get_lid()

+typedef int cpuid_t;
+
 #ifdef CONFIG_SMP

 #define XTP_OFFSET 0x1e0008
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 8cd1cc3bc835..638f7562ba99 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -83,6 +83,8 @@ struct smp_ops {
 /* Globals due to paravirt */
 extern void set_cpu_sibling_map(int cpu);

+typedef int cpuid_t;
+
 #ifdef CONFIG_SMP
 #ifndef CONFIG_PARAVIRT
 #define startup_ipi_hook(phys_apicid, start_eip, start_esp) do { } while (0)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 1020b1b53a17..3e1a6b3ee3b8 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -215,7 +215,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	union acpi_object object = { 0 };
 	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
 	struct acpi_processor *pr = acpi_driver_data(device);
-	int phys_id, cpu_index, device_declaration = 0;
+	cpuid_t phys_id;
+	int cpu_index, device_declaration = 0;
 	acpi_status status = AE_OK;
 	static int cpu0_initialized;
 	unsigned long long value;
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 5ac12e40fedd..a2735c315ef7 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -13,7 +13,7 @@ ACPI_MODULE_NAME("processor_core");

 static int map_lapic_id(struct acpi_subtable_header *entry,
-		 u32 acpi_id, int *apic_id)
+		 u32 acpi_id, cpuid_t *apic_id)
 {
 	struct acpi_madt_local_apic *lapic =
 		container_of(entry, struct acpi_madt_local_apic, header);
@@ -29,7 +29,7 @@ static int map_lapic_id(struct acpi_subtable_header *entry,
 }

 static int map_x2apic_id(struct acpi_subtable_header *entry,
-		int device_declaration, u32 acpi_id, int *apic_id)
+		int device_declaration, u32 acpi_id, cpuid_t *apic_id)
 {
 	struct acpi_madt_local_x2apic *apic =
 		container_of(entry, struct acpi_madt_local_x2apic, header);
@@ -46,7 +46,7 @@ static int map_x2apic_id(struct acpi_subtable_header *entry,
 }

 static int map_lsapic_id(struct acpi_subtable_header *entry,
-		int device_declaration, u32 acpi_id, int *apic_id)
+		int device_declaration, u32 acpi_id, cpuid_t *apic_id)
 {
 	struct acpi_madt_local_sapic *lsapic =
 		container_of(entry, struct acpi_madt_local_sapic, header);
@@ -69,7 +69,7 @@ static int map_lsapic_id(struct acpi_subtable_header *entry,
  * on Intel platforms
  */
 static int map_gicc_mpidr(struct acpi_subtable_header *entry,
-		int device_declaration, u32 acpi_id, int *mpidr)
+		int device_declaration, u32 acpi_id, cpuid_t *mpidr)
 {
 	struct acpi_madt_generic_interrupt *gicc =
 	    container_of(entry, struct acpi_madt_generic_interrupt, header);
@@ -84,24 +84,21 @@ static int map_gicc_mpidr(struct acpi_subtable_header *entry,
 	if (device_declaration && (gicc->uid == acpi_id)) {
 		/*
 		 * bits other than [0:7] Aff0, [8:15] Aff1, [16:23] Aff2 and
-		 * [32:39] Aff3 must be 0 which is defined in ACPI 5.1, so pack
-		 * the Affx fields into a single 32 bit identifier to accommodate
-		 * the acpi processor drivers.
+		 * [32:39] Aff3 must be 0 which is defined in ACPI 5.1
 		 */
-		*mpidr = ((gicc->arm_mpidr & 0xff00000000) >> 8)
-			 | gicc->arm_mpidr;
+		*mpidr = gicc->arm_mpidr & 0xff00ffffffUL;
 		return 0;
 	}

 	return -EINVAL;
 }

-static int map_madt_entry(int type, u32 acpi_id)
+static cpuid_t map_madt_entry(int type, u32 acpi_id)
 {
 	unsigned long madt_end, entry;
 	static struct acpi_table_madt *madt;
 	static int read_madt;
-	int phys_id = -1;	/* CPU hardware ID */
+	cpuid_t phys_id = -1;	/* CPU hardware ID */

 	if (!read_madt) {
 		if (ACPI_FAILURE(acpi_get_table(ACPI_SIG_MADT, 0,
@@ -145,7 +142,7 @@ static int map_mat_entry(acpi_handle handle, int type, u32 acpi_id)
 	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 	union acpi_object *obj;
 	struct acpi_subtable_header *header;
-	int phys_id = -1;
+	cpuid_t phys_id = -1;

 	if (ACPI_FAILURE(acpi_evaluate_object(handle, "_MAT", NULL, &buffer)))
 		goto exit;
@@ -174,7 +171,7 @@ exit:
 	return phys_id;
 }

-int acpi_get_phys_id(acpi_handle handle, int type, u32 acpi_id)
+cpuid_t acpi_get_phys_id(acpi_handle handle, int type, u32 acpi_id)
 {
 	int phys_id;

@@ -185,7 +182,7 @@ int acpi_get_phys_id(acpi_handle handle, int type, u32 acpi_id)
 	return phys_id;
 }

-int acpi_map_cpuid(int phys_id, u32 acpi_id)
+int acpi_map_cpuid(cpuid_t phys_id, u32 acpi_id)
 {
 #ifdef CONFIG_SMP
 	int i;
@@ -213,9 +210,9 @@ int acpi_map_cpuid(int phys_id, u32 acpi_id)
 	 * Return -1 for other CPU's handle.
 	 */
 	if (nr_cpu_ids <= 1 && acpi_id == 0)
-		return acpi_id;
+		return 0;
 	else
-		return phys_id;
+		return -1;
 }

 #ifdef CONFIG_SMP
@@ -233,7 +230,7 @@ int acpi_map_cpuid(int phys_id, u32 acpi_id)

 int acpi_get_cpuid(acpi_handle handle, int type, u32 acpi_id)
 {
-	int phys_id;
+	cpuid_t phys_id;

 	phys_id = acpi_get_phys_id(handle, type, acpi_id);

diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index b95dc32a6e6b..30ebdbf0d961 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -196,7 +196,7 @@ struct acpi_processor_flags {
 struct acpi_processor {
 	acpi_handle handle;
 	u32 acpi_id;
-	u32 phys_id;	/* CPU hardware ID such as APIC ID for x86 */
+	cpuid_t phys_id;	/* CPU hardware ID such as APIC ID for x86 */
 	u32 id;		/* CPU logical ID allocated by OS */
 	u32 pblk;
 	int performance_platform_limit;
@@ -310,8 +310,8 @@ static inline int acpi_processor_get_bios_limit(int cpu, unsigned int *limit)
 #endif /* CONFIG_CPU_FREQ */

 /* in processor_core.c */
-int acpi_get_phys_id(acpi_handle, int type, u32 acpi_id);
-int acpi_map_cpuid(int phys_id, u32 acpi_id);
+cpuid_t acpi_get_phys_id(acpi_handle, int type, u32 acpi_id);
+int acpi_map_cpuid(cpuid_t phys_id, u32 acpi_id);
 int acpi_get_cpuid(acpi_handle, int type, u32 acpi_id);

 /* in processor_pdc.c */
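The approach in the draft above can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: the type name phys_cpuid_demo_t, the sentinel PHYS_CPUID_DEMO_INVALID, the struct gicc_demo layout and both helper functions are hypothetical and are not taken from the patch; only the 0xff00ffffff mask comes from it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit physical CPU ID type, as an arm64 port might define it. */
typedef uint64_t phys_cpuid_demo_t;

/* Hypothetical "not found" value expressed in the ID type itself. */
#define PHYS_CPUID_DEMO_INVALID ((phys_cpuid_demo_t)-1)

/* Aff3 | Aff2 | Aff1 | Aff0, the mask used in the draft patch. */
#define GICC_MPIDR_DEMO_MASK 0xff00ffffffULL

/* Simplified stand-in for a MADT GICC entry. */
struct gicc_demo {
	uint32_t uid;		/* ACPI processor UID */
	uint32_t enabled;	/* non-zero if the entry is usable */
	uint64_t arm_mpidr;	/* MPIDR as reported by firmware */
};

/* Return the unpacked hardware ID for acpi_id, or the invalid sentinel. */
static phys_cpuid_demo_t lookup_phys_id(const struct gicc_demo *tbl,
					size_t n, uint32_t acpi_id)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].enabled && tbl[i].uid == acpi_id)
			return tbl[i].arm_mpidr & GICC_MPIDR_DEMO_MASK;
	return PHYS_CPUID_DEMO_INVALID;
}

/* Map a hardware ID to a logical CPU index, or -1 if it is not present. */
static int map_cpuid(const phys_cpuid_demo_t *logical_map, int nr_cpus,
		     phys_cpuid_demo_t phys_id)
{
	if (phys_id == PHYS_CPUID_DEMO_INVALID)
		return -1;
	for (int i = 0; i < nr_cpus; i++)
		if (logical_map[i] == phys_id)
			return i;
	return -1;
}
```

The point of the sketch is that the not-found value is defined in the ID type, so callers compare against it rather than testing for a negative int, and no packing step is needed anywhere.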
On 2015-02-04 04:09, Catalin Marinas wrote:
On Tue, Feb 03, 2015 at 02:17:49PM +0000, Mark Rutland wrote:
On Mon, Feb 02, 2015 at 12:45:42PM +0000, Hanjun Guo wrote:
Do you mean something like this? Only briefly tested on Juno and I may have missed other calls:
Thanks, I think it is Mark's suggestion (and also Lorenzo's)
+typedef u64 cpuid_t;
I think cpuid_t is a little confusing because people may read it as the logical CPU ID, whereas what it is meant to hold is the physical CPU ID, so how about:
typedef u64 phys_id_t; ?
Thanks Hanjun
On Wed, Feb 04, 2015 at 09:48:05AM +0000, Hanjun Guo wrote:
On 2015-02-04 04:09, Catalin Marinas wrote:
+typedef u64 cpuid_t;
I think cpuid_t is a little confusing because people may read it as the logical CPU ID, whereas what it is meant to hold is the physical CPU ID, so how about:
typedef u64 phys_id_t; ?
I would keep "cpu" somewhere in the name as "phys" is too generic, maybe phys_cpuid_t.
On 2015-02-04 19:21, Catalin Marinas wrote:
On Wed, Feb 04, 2015 at 09:48:05AM +0000, Hanjun Guo wrote:
On 2015-02-04 04:09, Catalin Marinas wrote:
+typedef u64 cpuid_t;
I think cpuid_t is a little confusing because people may read it as the logical CPU ID, whereas what it is meant to hold is the physical CPU ID, so how about:
typedef u64 phys_id_t; ?
I would keep "cpu" somewhere in the name as "phys" is too generic, maybe phys_cpuid_t.
This is pretty fine to me too :)
x86 and IA64 use 32 bit cpu phys_id (apic_id) everywhere in the arch code, but I think I don't need to touch them in this patch.
Thanks Hanjun
On Thu, Feb 05, 2015 at 09:27:15AM +0000, Hanjun Guo wrote:
On 2015-02-04 19:21, Catalin Marinas wrote:
On Wed, Feb 04, 2015 at 09:48:05AM +0000, Hanjun Guo wrote:
On 2015-02-04 04:09, Catalin Marinas wrote:
+typedef u64 cpuid_t;
I think cpuid_t is a little confusing because people may read it as the logical CPU ID, whereas what it is meant to hold is the physical CPU ID, so how about:
typedef u64 phys_id_t; ?
I would keep "cpu" somewhere in the name as "phys" is too generic, maybe phys_cpuid_t.
This is pretty fine to me too :)
x86 and IA64 use 32 bit cpu phys_id (apic_id) everywhere in the arch code, but I think I don't need to touch them in this patch.
No, as long as you define phys_cpuid_t to be 32-bit (I guess an int) on these architectures.
On Mon, Feb 02, 2015 at 12:45:42PM +0000, Hanjun Guo wrote:
+static inline u32 pack_mpidr(u64 mpidr)
+{
+	return (u32) ((mpidr & 0xff00000000) >> 8) | mpidr;
+}
I'm a bit puzzled by this packing:
- Bit 31 of the MPIDR is RES1. Do we need to mask it out first? - How does this work for uniprocessor systems where bit 30 is set? - Similarly for mythical multi-threaded implementations with bit 24 set.
Will
On Mon, Feb 09, 2015 at 06:55:12AM +0000, Will Deacon wrote:
On Mon, Feb 02, 2015 at 12:45:42PM +0000, Hanjun Guo wrote:
+static inline u32 pack_mpidr(u64 mpidr)
+{
+	return (u32) ((mpidr & 0xff00000000) >> 8) | mpidr;
+}
I'm a bit puzzled by this packing:
- Bit 31 of the MPIDR is RES1. Do we need to mask it out first?
- How does this work for uniprocessor systems where bit 30 is set?
I asked about this on a previous version of the patches but the comment was only clarified in the map_gicc_mpidr() function (which duplicates the packing here). This is not the real MPIDR but the one passed from ACPI tables, so bits 24-31 are 0.
- Similarly for mythical multi-threaded implementations with bit 24 set.
Anyway, as I posted here:
http://article.gmane.org/gmane.linux.acpi.devel/73422
I think this function should go. I don't see the point of MPIDR packing just because we can't use a proper 64-bit type here.
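The bit 31 question and the answer to it can be illustrated with a small standalone sketch. It assumes, as stated above, that the ACPI table value has bits 24-31 clear, and shows what would happen if a raw register value with bit 31 set were packed without masking. The helper names and the mask macro are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Aff3 | Aff2 | Aff1 | Aff0; everything else is dropped. */
#define DEMO_HWID_MASK 0xff00ffffffULL

/* Reduce a raw MPIDR register value to the affinity fields only. */
static inline uint64_t demo_mpidr_to_hwid(uint64_t mpidr)
{
	return mpidr & DEMO_HWID_MASK;
}

/*
 * Same folding as in the posted patch: Aff3 moves from bits [39:32]
 * to bits [31:24]. It is only lossless when bits [31:24] of the
 * input are already zero.
 */
static inline uint32_t demo_pack(uint64_t hwid)
{
	return (uint32_t)(((hwid & 0xff00000000ULL) >> 8) | hwid);
}
```

With these helpers, a raw value of 0x80000001 (bit 31 set, Aff0 = 1) and a table value of 0x8000000001 (Aff3 = 0x80, Aff0 = 1) pack to the same 32-bit result, so the packing is only safe on input that has already been masked. Keeping the ID in a 64-bit type avoids the question altogether.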
Introduce ACPI_IRQ_MODEL_GIC which is needed for ARM64 as GIC is used, and then register device's gsi with the core IRQ subsystem.
acpi_register_gsi() is similar to the DT-based irq_of_parse_and_map(). Since a GSI is unique in the system, the hwirq number is used directly for the mapping.
We are going to implement stacked domains when GICv2m, GICv3 and ITS support is added.
CC: Marc Zyngier <marc.zyngier@arm.com>
Originally-by: Amit Daniel Kachhap <amit.daniel@samsung.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
---
 arch/arm64/kernel/acpi.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/acpi/bus.c       |  3 ++
 include/linux/acpi.h     |  1 +
 3 files changed, 77 insertions(+)

diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index f80caef..f86a982 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -38,6 +38,12 @@ EXPORT_SYMBOL(acpi_pci_disabled);
 static int enabled_cpus;	/* Processors (GICC) with enabled flag in MADT */

 /*
+ * Since we're on ARM, the default interrupt routing model
+ * clearly has to be GIC.
+ */
+enum acpi_irq_model_id acpi_irq_model = ACPI_IRQ_MODEL_GIC;
+
+/*
  * __acpi_map_table() will be called before page_init(), so early_ioremap()
  * or early_memremap() should be called here to for ACPI table mapping.
  */
@@ -185,6 +191,73 @@ void __init acpi_init_cpus(void)
 	pr_info("%d CPUs enabled, %d CPUs total\n", enabled_cpus, total_cpus);
 }

+int acpi_gsi_to_irq(u32 gsi, unsigned int *irq)
+{
+	*irq = irq_find_mapping(NULL, gsi);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(acpi_gsi_to_irq);
+
+/*
+ * success: return IRQ number (>0)
+ * failure: return =< 0
+ */
+int acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity)
+{
+	unsigned int irq;
+	unsigned int irq_type;
+
+	/*
+	 * ACPI have no bindings to indicate SPI or PPI, so we
+	 * use different mappings from DT in ACPI.
+	 *
+	 * For FDT
+	 * PPI interrupt: in the range [0, 15];
+	 * SPI interrupt: in the range [0, 987];
+	 *
+	 * For ACPI, GSI should be unique so using
+	 * the hwirq directly for the mapping:
+	 * PPI interrupt: in the range [16, 31];
+	 * SPI interrupt: in the range [32, 1019];
+	 */
+
+	if (trigger == ACPI_EDGE_SENSITIVE &&
+				polarity == ACPI_ACTIVE_LOW)
+		irq_type = IRQ_TYPE_EDGE_FALLING;
+	else if (trigger == ACPI_EDGE_SENSITIVE &&
+				polarity == ACPI_ACTIVE_HIGH)
+		irq_type = IRQ_TYPE_EDGE_RISING;
+	else if (trigger == ACPI_LEVEL_SENSITIVE &&
+				polarity == ACPI_ACTIVE_LOW)
+		irq_type = IRQ_TYPE_LEVEL_LOW;
+	else if (trigger == ACPI_LEVEL_SENSITIVE &&
+				polarity == ACPI_ACTIVE_HIGH)
+		irq_type = IRQ_TYPE_LEVEL_HIGH;
+	else
+		irq_type = IRQ_TYPE_NONE;
+
+	/*
+	 * Since only one GIC is supported in ACPI 5.0, we can
+	 * create mapping refer to the default domain
+	 */
+	irq = irq_create_mapping(NULL, gsi);
+	if (!irq)
+		return irq;
+
+	/* Set irq type if specified and different than the current one */
+	if (irq_type != IRQ_TYPE_NONE &&
+		irq_type != irq_get_trigger_type(irq))
+		irq_set_irq_type(irq, irq_type);
+	return irq;
+}
+EXPORT_SYMBOL_GPL(acpi_register_gsi);
+
+void acpi_unregister_gsi(u32 gsi)
+{
+}
+EXPORT_SYMBOL_GPL(acpi_unregister_gsi);
+
 static int __init acpi_parse_fadt(struct acpi_table_header *table)
 {
 	struct acpi_table_fadt *fadt = (struct acpi_table_fadt *)table;
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 8b67bd0..c412fdb 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -448,6 +448,9 @@ static int __init acpi_bus_init_irq(void)
 	case ACPI_IRQ_MODEL_IOSAPIC:
 		message = "IOSAPIC";
 		break;
+	case ACPI_IRQ_MODEL_GIC:
+		message = "GIC";
+		break;
 	case ACPI_IRQ_MODEL_PLATFORM:
 		message = "platform specific model";
 		break;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index d459cd1..87f365e 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -72,6 +72,7 @@ enum acpi_irq_model_id {
 	ACPI_IRQ_MODEL_IOAPIC,
 	ACPI_IRQ_MODEL_IOSAPIC,
 	ACPI_IRQ_MODEL_PLATFORM,
+	ACPI_IRQ_MODEL_GIC,
 	ACPI_IRQ_MODEL_COUNT
 };
On Mon, Feb 02, 2015 at 12:45:43PM +0000, Hanjun Guo wrote:
Does this code *have* to sit under arch/arm64? I can't see anything architecture-specific about it and the bulk of the functions map directly onto irq domain callbacks. I know that the answer is probably "we can fix that in the future", but it doesn't seem like a huge amount of effort to get the right abstractions in place from the beginning so that we don't have to churn this stuff later on.
Will
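The range comment in acpi_register_gsi() can be made concrete with a small user-space sketch. It assumes the GIC numbering that the comment describes (PPIs at hwirq 16-31, SPIs at hwirq 32-1019, with 0-15 left for software-generated interrupts). The names gic_hwirq_class, gic_classify_hwirq() and fdt_to_hwirq() are hypothetical and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of a GIC hardware interrupt number. */
enum gic_hwirq_class {
	GIC_HWIRQ_SGI,		/* 0..15: software-generated interrupts */
	GIC_HWIRQ_PPI,		/* 16..31: private per-CPU interrupts */
	GIC_HWIRQ_SPI,		/* 32..1019: shared peripheral interrupts */
	GIC_HWIRQ_INVALID,	/* 1020 and above: not usable as a device IRQ */
};

/* With ACPI, the GSI is used as the hwirq directly, so it can be classified as-is. */
static enum gic_hwirq_class gic_classify_hwirq(uint32_t hwirq)
{
	if (hwirq < 16)
		return GIC_HWIRQ_SGI;
	if (hwirq < 32)
		return GIC_HWIRQ_PPI;
	if (hwirq < 1020)
		return GIC_HWIRQ_SPI;
	return GIC_HWIRQ_INVALID;
}

/*
 * With FDT, the interrupt specifier carries the kind (PPI or SPI) plus an
 * index relative to the start of that range, so the hwirq is index + offset.
 */
static uint32_t fdt_to_hwirq(bool is_ppi, uint32_t index)
{
	return index + (is_ppi ? 16 : 32);
}
```

Under these assumptions, FDT SPI 3 and ACPI GSI 35 would name the same hardware interrupt, which is why the ACPI path can pass the GSI straight to irq_create_mapping().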
On 2015-02-09 14:34, Will Deacon wrote:
Does this code *have* to sit under arch/arm64? I can't see anything architecture-specific about it and the bulk of the functions map directly onto irq domain callbacks. I know that the answer is probably "we can fix that in the future", but it doesn't seem like a huge amount of effort to get the right abstractions in place from the beginning so that we don't have to churn this stuff later on.
Do you mean moving acpi_register_gsi()/acpi_unregister_gsi() to an irqdomain-related file?
Since x86 and IA64 have their own arch-specific acpi_register_gsi()/acpi_unregister_gsi(), we would get compile errors on x86 and IA64 platforms.
Thanks Hanjun
On Mon, Feb 09, 2015 at 06:53:31AM +0000, Hanjun Guo wrote:
Do you mean moving acpi_register_gsi()/acpi_unregister_gsi() to an irqdomain-related file?
Since x86 and IA64 have their own arch-specific acpi_register_gsi()/acpi_unregister_gsi(), we would get compile errors on x86 and IA64 platforms.
Right, but nobody builds a single kernel image supporting x86 and arm, so this doesn't sound impossible to fix.
The code here basically consists of:
- Definition of acpi_irq_model. That can stay here for now.
- Empty stub for acpi_unregister_gsi -- should be in core code
- acpi_gsi_to_irq -- maps directly to irq_find_mapping, core code.
- Code to translate an ACPI interrupt type to a Linux IRQ subsystem type
- Instantiation of an irq mapping
None of that has anything to do with the arm64 architecture. If we have to make some small changes to core code to accommodate a non-x86 architecture, then I think we should at least consider that first.
Will
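To illustrate the point about the translation step, here is a minimal user-space sketch of the trigger/polarity mapping written as a pure function with no architecture dependency. The SKETCH_* enumerators are stand-ins invented for this example; their numeric values are arbitrary and do not claim to match the kernel's ACPI_* or IRQ_TYPE_* definitions.

```c
/* Stand-in ACPI trigger and polarity values (illustrative only). */
enum sketch_acpi_trigger {
	SKETCH_ACPI_LEVEL_SENSITIVE,
	SKETCH_ACPI_EDGE_SENSITIVE,
};

enum sketch_acpi_polarity {
	SKETCH_ACPI_ACTIVE_HIGH,
	SKETCH_ACPI_ACTIVE_LOW,
};

/* Stand-in IRQ trigger types (illustrative only). */
enum sketch_irq_type {
	SKETCH_IRQ_TYPE_NONE,
	SKETCH_IRQ_TYPE_EDGE_RISING,
	SKETCH_IRQ_TYPE_EDGE_FALLING,
	SKETCH_IRQ_TYPE_LEVEL_HIGH,
	SKETCH_IRQ_TYPE_LEVEL_LOW,
};

/*
 * Same decision table as the if/else chain in the patch: the result depends
 * only on the two ACPI inputs, never on the interrupt controller in use.
 * Any unrecognised combination falls back to "none".
 */
static enum sketch_irq_type sketch_acpi_to_irq_type(int trigger, int polarity)
{
	switch (trigger) {
	case SKETCH_ACPI_EDGE_SENSITIVE:
		if (polarity == SKETCH_ACPI_ACTIVE_LOW)
			return SKETCH_IRQ_TYPE_EDGE_FALLING;
		if (polarity == SKETCH_ACPI_ACTIVE_HIGH)
			return SKETCH_IRQ_TYPE_EDGE_RISING;
		break;
	case SKETCH_ACPI_LEVEL_SENSITIVE:
		if (polarity == SKETCH_ACPI_ACTIVE_LOW)
			return SKETCH_IRQ_TYPE_LEVEL_LOW;
		if (polarity == SKETCH_ACPI_ACTIVE_HIGH)
			return SKETCH_IRQ_TYPE_LEVEL_HIGH;
		break;
	}
	return SKETCH_IRQ_TYPE_NONE;
}
```

A helper of this shape could presumably live next to other generic ACPI or irqdomain code, since nothing in it refers to arm64 or to the GIC.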
From: Tomasz Nowicki tomasz.nowicki@linaro.org
With ACPI, the kernel uses the MADT for GIC initialization. It needs to parse the GIC-related subtables, collect the CPU interface and distributor addresses, and call the driver initialization function (which is agnostic to the hardware description method). FDT initializes GICv1/2 in a similar way.
NOTE: This commit allows basic GICv1/2 functionality to be initialized. While a simple GICv2 init call is used for now, any further GIC features require generic infrastructure for proper ACPI irqchip initialization. That mechanism, and stacked irqdomains to support the GICv2 MSI/virtualization extensions, GICv3/4 and its ITS, are considered next steps.
CC: Jason Cooper <jason@lakedaemon.net>
CC: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
---
 arch/arm64/include/asm/acpi.h        |   2 +
 arch/arm64/kernel/acpi.c             |  25 +++++++++
 drivers/irqchip/irq-gic.c            | 102 +++++++++++++++++++++++++++++++++++
 drivers/irqchip/irqchip.c            |   3 ++
 include/linux/acpi.h                 |  15 ++++++
 include/linux/irqchip/arm-gic-acpi.h |  31 +++++++++++
 6 files changed, 178 insertions(+)
 create mode 100644 include/linux/irqchip/arm-gic-acpi.h

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 7e825b9..ea4d2b3 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -12,6 +12,8 @@
 #ifndef _ASM_ACPI_H
 #define _ASM_ACPI_H

+#include <linux/irqchip/arm-gic-acpi.h>
+
 #include <asm/smp_plat.h>

 /* Basic configuration for ACPI */
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index f86a982..437315e 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -319,6 +319,31 @@ void __init acpi_boot_table_init(void)
 	}
 }

+void __init acpi_gic_init(void)
+{
+	struct acpi_table_header *table;
+	acpi_status status;
+	acpi_size tbl_size;
+	int err;
+
+	if (acpi_disabled)
+		return;
+
+	status = acpi_get_table_with_size(ACPI_SIG_MADT, 0, &table, &tbl_size);
+	if (ACPI_FAILURE(status)) {
+		const char *msg = acpi_format_exception(status);
+
+		pr_err("Failed to get MADT table, %s\n", msg);
+		return;
+	}
+
+	err = gic_v2_acpi_init(table);
+	if (err)
+		pr_err("Failed to initialize GIC IRQ controller");
+
+	early_acpi_os_unmap_memory((char *)table, tbl_size);
+}
+
 static int __init parse_acpi(char *arg)
 {
 	if (!arg)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d617ee5..7f874d6 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -33,12 +33,14 @@
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
+#include <linux/acpi.h>
 #include <linux/irqdomain.h>
 #include <linux/interrupt.h>
 #include <linux/percpu.h>
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/irqchip/arm-gic-acpi.h>

 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -1083,3 +1085,103 @@ IRQCHIP_DECLARE(msm_8660_qgic, "qcom,msm-8660-qgic", gic_of_init);
 IRQCHIP_DECLARE(msm_qgic2, "qcom,msm-qgic2", gic_of_init);

 #endif
+
+#ifdef CONFIG_ACPI
+static phys_addr_t dist_phy_base, cpu_phy_base;
+static int cpu_base_assigned;
+
+static int __init
+gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
+			const unsigned long end)
+{
+	struct acpi_madt_generic_interrupt *processor;
+	phys_addr_t gic_cpu_base;
+
+	processor = (struct acpi_madt_generic_interrupt *)header;
+
+	if (BAD_MADT_ENTRY(processor, end))
+		return -EINVAL;
+
+	/*
+	 * There is no support for non-banked GICv1/2 register in ACPI spec.
+	 * All CPU interface addresses have to be the same.
+	 */
+	gic_cpu_base = processor->base_address;
+	if (cpu_base_assigned && gic_cpu_base != cpu_phy_base)
+		return -EINVAL;
+
+	cpu_phy_base = gic_cpu_base;
+	cpu_base_assigned = 1;
+	return 0;
+}
+
+static int __init
+gic_acpi_parse_madt_distributor(struct acpi_subtable_header *header,
+				const unsigned long end)
+{
+	struct acpi_madt_generic_distributor *dist;
+
+	dist = (struct acpi_madt_generic_distributor *)header;
+
+	if (BAD_MADT_ENTRY(dist, end))
+		return -EINVAL;
+
+	dist_phy_base = dist->base_address;
+	return 0;
+}
+
+int __init
+gic_v2_acpi_init(struct acpi_table_header *table)
+{
+	void __iomem *cpu_base, *dist_base;
+	int count;
+
+	/* Collect CPU base addresses */
+	count = acpi_parse_entries(ACPI_SIG_MADT,
+				   sizeof(struct acpi_table_madt),
+				   gic_acpi_parse_madt_cpu, table,
+				   ACPI_MADT_TYPE_GENERIC_INTERRUPT, 0);
+	if (count <= 0) {
+		pr_err("No valid GICC entries exist\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Find distributor base address. We expect one distributor entry since
+	 * ACPI 5.1 spec neither support multi-GIC instances nor GIC cascade.
+	 */
+	count = acpi_parse_entries(ACPI_SIG_MADT,
+				   sizeof(struct acpi_table_madt),
+				   gic_acpi_parse_madt_distributor, table,
+				   ACPI_MADT_TYPE_GENERIC_DISTRIBUTOR, 0);
+	if (count <= 0) {
+		pr_err("No valid GICD entries exist\n");
+		return -EINVAL;
+	} else if (count > 1) {
+		pr_err("More than one GICD entry detected\n");
+		return -EINVAL;
+	}
+
+	cpu_base = ioremap(cpu_phy_base, ACPI_GIC_CPU_IF_MEM_SIZE);
+	if (!cpu_base) {
+		pr_err("Unable to map GICC registers\n");
+		return -ENOMEM;
+	}
+
+	dist_base = ioremap(dist_phy_base, ACPI_GICV2_DIST_MEM_SIZE);
+	if (!dist_base) {
+		pr_err("Unable to map GICD registers\n");
+		iounmap(cpu_base);
+		return -ENOMEM;
+	}
+
+	/*
+	 * Initialize zero GIC instance (no multi-GIC support). Also, set GIC
+	 * as default IRQ domain to allow for GSI registration and GSI to IRQ
+	 * number translation (see acpi_register_gsi() and acpi_gsi_to_irq()).
+	 */
+	gic_init_bases(0, -1, dist_base, cpu_base, 0, NULL);
+	irq_set_default_host(gic_data[0].domain);
+	return 0;
+}
+#endif
diff --git a/drivers/irqchip/irqchip.c b/drivers/irqchip/irqchip.c
index 0fe2f71..5855240 100644
--- a/drivers/irqchip/irqchip.c
+++ b/drivers/irqchip/irqchip.c
@@ -8,6 +8,7 @@
  * warranty of any kind, whether express or implied.
  */

+#include <linux/acpi.h>
 #include <linux/init.h>
 #include <linux/of_irq.h>
 #include <linux/irqchip.h>
@@ -26,4 +27,6 @@ extern struct of_device_id __irqchip_of_table[];
 void __init irqchip_init(void)
 {
 	of_irq_init(__irqchip_of_table);
+
+	acpi_irq_init();
 }
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 87f365e..536991b 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -162,6 +162,16 @@ extern u32 acpi_irq_not_handled;
 extern int sbf_port;
 extern unsigned long acpi_realmode_flags;

+static inline void acpi_irq_init(void)
+{
+	/*
+	 * Hardcode ACPI IRQ chip initialization to GICv2 for now.
+	 * Proper irqchip infrastructure will be implemented along with
+	 * incoming GICv2m|GICv3|ITS bits.
+	 */
+	acpi_gic_init();
+}
+
 int acpi_register_gsi (struct device *dev, u32 gsi, int triggering, int polarity);
 int acpi_gsi_to_irq (u32 gsi, unsigned int *irq);
 int acpi_isa_irq_to_gsi (unsigned isa_irq, u32 *gsi);
@@ -508,6 +518,11 @@ static inline int acpi_table_parse(char *id,
 	return -ENODEV;
 }

+static inline void acpi_irq_init(void)
+{
+	return;
+}
+
 static inline int acpi_nvs_register(__u64 start, __u64 size)
 {
 	return 0;
diff --git a/include/linux/irqchip/arm-gic-acpi.h b/include/linux/irqchip/arm-gic-acpi.h
new file mode 100644
index 0000000..ad5b577
--- /dev/null
+++ b/include/linux/irqchip/arm-gic-acpi.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright (C) 2014, Linaro Ltd.
+ *	Author: Tomasz Nowicki <tomasz.nowicki@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef ARM_GIC_ACPI_H_
+#define ARM_GIC_ACPI_H_
+
+#ifdef CONFIG_ACPI
+
+/*
+ * Hard code here, we can not get memory size from MADT (but FDT does),
+ * Actually no need to do that, because this size can be inferred
+ * from GIC spec.
+ */
+#define ACPI_GICV2_DIST_MEM_SIZE	(SZ_4K)
+#define ACPI_GIC_CPU_IF_MEM_SIZE	(SZ_8K)
+
+struct acpi_table_header;
+
+void acpi_gic_init(void);
+int gic_v2_acpi_init(struct acpi_table_header *table);
+#else
+static inline void acpi_gic_init(void) { }
+#endif
+
+#endif /* ARM_GIC_ACPI_H_ */
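The MADT handling described in the commit message (walk the subtables, require every GICC entry to report the same CPU interface base, and accept exactly one GICD entry) can be sketched in user space as follows. This is only an illustration under simplifying assumptions: struct sketch_entry is a made-up 16-byte layout rather than the real ACPI structures, the SKETCH_TYPE_* values are stand-ins, and sketch_put_entry() and sketch_collect_gic_bases() are hypothetical helpers that do not exist in the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in subtable type codes (illustrative). */
enum { SKETCH_TYPE_GICC = 11, SKETCH_TYPE_GICD = 12 };

/* Simplified subtable: a type/length header followed by one base address. */
struct sketch_entry {
	uint8_t type;
	uint8_t length;		/* whole entry, header included */
	uint8_t pad[6];
	uint64_t base_address;
};

struct sketch_gic_bases {
	uint64_t cpu_base;
	uint64_t dist_base;
};

/* Test helper: append one entry to buf at offset off, return the new offset. */
static size_t sketch_put_entry(uint8_t *buf, size_t off, uint8_t type,
			       uint64_t base)
{
	struct sketch_entry e;

	memset(&e, 0, sizeof(e));
	e.type = type;
	e.length = (uint8_t)sizeof(e);
	e.base_address = base;
	memcpy(buf + off, &e, sizeof(e));
	return off + sizeof(e);
}

/* Returns 0 on success, -1 on a malformed or unsupported table. */
static int sketch_collect_gic_bases(const uint8_t *buf, size_t len,
				    struct sketch_gic_bases *out)
{
	size_t off = 0;
	int gicc_seen = 0, gicd_seen = 0;

	while (off + 2 <= len) {
		uint8_t type = buf[off];
		uint8_t elen = buf[off + 1];
		struct sketch_entry e;

		/* An entry must cover its own header and fit in the table. */
		if (elen < 2 || off + elen > len)
			return -1;

		if (type == SKETCH_TYPE_GICC || type == SKETCH_TYPE_GICD) {
			if (elen < sizeof(e))
				return -1;
			memcpy(&e, buf + off, sizeof(e));

			if (type == SKETCH_TYPE_GICC) {
				/* All CPU interfaces must share one base. */
				if (gicc_seen && e.base_address != out->cpu_base)
					return -1;
				out->cpu_base = e.base_address;
				gicc_seen = 1;
			} else {
				/* More than one distributor is rejected. */
				if (gicd_seen)
					return -1;
				out->dist_base = e.base_address;
				gicd_seen = 1;
			}
		}
		off += elen;	/* unknown types are skipped by length */
	}
	return (gicc_seen && gicd_seen) ? 0 : -1;
}
```

The patch itself makes two separate passes with acpi_parse_entries(), one per subtable type; the single pass here is just a compact way to show the same acceptance rules.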
On Monday, February 02, 2015 08:45:44 PM Hanjun Guo wrote:
From: Tomasz Nowicki tomasz.nowicki@linaro.org
With ACPI, the kernel uses the MADT for GIC initialization. It needs to parse the GIC-related subtables, collect the CPU interface and distributor addresses, and call the driver initialization function (which is agnostic to the hardware description method). FDT initializes GICv1/2 in a similar way.
NOTE: This commit allows basic GICv1/2 functionality to be initialized. While a simple GICv2 init call is used for now, any further GIC features require generic infrastructure for proper ACPI irqchip initialization. That mechanism, and stacked irqdomains to support the GICv2 MSI/virtualization extensions, GICv3/4 and its ITS, are considered next steps.
CC: Jason Cooper <jason@lakedaemon.net>
CC: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
An ACK from Thomas Gleixner is absolutely necessary for anything touching drivers/irqchip/.
On 02.02.2015 13:45, Hanjun Guo wrote:
From: Tomasz Nowicki tomasz.nowicki@linaro.org
With ACPI, the kernel uses the MADT for GIC initialization. It needs to parse the GIC-related subtables, collect the CPU interface and distributor addresses, and call the driver initialization function (which is agnostic to the hardware description method). FDT initializes GICv1/2 in a similar way.
NOTE: This commit allows basic GICv1/2 functionality to be initialized. While a simple GICv2 init call is used for now, any further GIC features require generic infrastructure for proper ACPI irqchip initialization. That mechanism, and stacked irqdomains to support the GICv2 MSI/virtualization extensions, GICv3/4 and its ITS, are considered next steps.
CC: Jason Cooper <jason@lakedaemon.net>
CC: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
---
 arch/arm64/include/asm/acpi.h        |   2 +
 arch/arm64/kernel/acpi.c             |  25 +++++++++
 drivers/irqchip/irq-gic.c            | 102 +++++++++++++++++++++++++++++++++++
 drivers/irqchip/irqchip.c            |   3 ++
 include/linux/acpi.h                 |  15 ++++++
 include/linux/irqchip/arm-gic-acpi.h |  31 +++++++++++
 6 files changed, 178 insertions(+)
 create mode 100644 include/linux/irqchip/arm-gic-acpi.h

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 7e825b9..ea4d2b3 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -12,6 +12,8 @@
 #ifndef _ASM_ACPI_H
 #define _ASM_ACPI_H

+#include <linux/irqchip/arm-gic-acpi.h>
+
 #include <asm/smp_plat.h>

 /* Basic configuration for ACPI */
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index f86a982..437315e 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -319,6 +319,31 @@ void __init acpi_boot_table_init(void)
 	}
 }

+void __init acpi_gic_init(void)
+{
+	struct acpi_table_header *table;
+	acpi_status status;
+	acpi_size tbl_size;
+	int err;
+
+	if (acpi_disabled)
+		return;
+
+	status = acpi_get_table_with_size(ACPI_SIG_MADT, 0, &table, &tbl_size);
+	if (ACPI_FAILURE(status)) {
+		const char *msg = acpi_format_exception(status);
+
+		pr_err("Failed to get MADT table, %s\n", msg);
+		return;
+	}
+
+	err = gic_v2_acpi_init(table);
+	if (err)
+		pr_err("Failed to initialize GIC IRQ controller");
+
+	early_acpi_os_unmap_memory((char *)table, tbl_size);
+}
+
 static int __init parse_acpi(char *arg)
 {
 	if (!arg)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d617ee5..7f874d6 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -33,12 +33,14 @@
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
+#include <linux/acpi.h>
 #include <linux/irqdomain.h>
 #include <linux/interrupt.h>
 #include <linux/percpu.h>
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/irqchip/arm-gic-acpi.h>

 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -1083,3 +1085,103 @@ IRQCHIP_DECLARE(msm_8660_qgic, "qcom,msm-8660-qgic", gic_of_init);
 IRQCHIP_DECLARE(msm_qgic2, "qcom,msm-qgic2", gic_of_init);

 #endif
+
+#ifdef CONFIG_ACPI
+static phys_addr_t dist_phy_base, cpu_phy_base;
+static int cpu_base_assigned;
+
+static int __init
+gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
+			const unsigned long end)
+{
+	struct acpi_madt_generic_interrupt *processor;
+	phys_addr_t gic_cpu_base;
+
+	processor = (struct acpi_madt_generic_interrupt *)header;
+
+	if (BAD_MADT_ENTRY(processor, end))
+		return -EINVAL;
+
+	/*
+	 * There is no support for non-banked GICv1/2 register in ACPI spec.
+	 * All CPU interface addresses have to be the same.
+	 */
+	gic_cpu_base = processor->base_address;
+	if (cpu_base_assigned && gic_cpu_base != cpu_phy_base)
+		return -EINVAL;
+
+	cpu_phy_base = gic_cpu_base;
+	cpu_base_assigned = 1;
+	return 0;
+}
+
+static int __init
+gic_acpi_parse_madt_distributor(struct acpi_subtable_header *header,
+				const unsigned long end)
+{
+	struct acpi_madt_generic_distributor *dist;
+
+	dist = (struct acpi_madt_generic_distributor *)header;
+
+	if (BAD_MADT_ENTRY(dist, end))
+		return -EINVAL;
+
+	dist_phy_base = dist->base_address;
+	return 0;
+}
+
+int __init
+gic_v2_acpi_init(struct acpi_table_header *table)
+{
+	void __iomem *cpu_base, *dist_base;
+	int count;
+
+	/* Collect CPU base addresses */
+	count = acpi_parse_entries(ACPI_SIG_MADT,
+				   sizeof(struct acpi_table_madt),
+				   gic_acpi_parse_madt_cpu, table,
+				   ACPI_MADT_TYPE_GENERIC_INTERRUPT, 0);
+	if (count <= 0) {
+		pr_err("No valid GICC entries exist\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Find distributor base address. We expect one distributor entry since
+	 * ACPI 5.1 spec neither support multi-GIC instances nor GIC cascade.
+	 */
+	count = acpi_parse_entries(ACPI_SIG_MADT,
+				   sizeof(struct acpi_table_madt),
+				   gic_acpi_parse_madt_distributor, table,
+				   ACPI_MADT_TYPE_GENERIC_DISTRIBUTOR, 0);
+	if (count <= 0) {
+		pr_err("No valid GICD entries exist\n");
+		return -EINVAL;
+	} else if (count > 1) {
+		pr_err("More than one GICD entry detected\n");
+		return -EINVAL;
+	}
+
+	cpu_base = ioremap(cpu_phy_base, ACPI_GIC_CPU_IF_MEM_SIZE);
+	if (!cpu_base) {
+		pr_err("Unable to map GICC registers\n");
+		return -ENOMEM;
+	}
+
+	dist_base = ioremap(dist_phy_base, ACPI_GICV2_DIST_MEM_SIZE);
+	if (!dist_base) {
+		pr_err("Unable to map GICD registers\n");
+		iounmap(cpu_base);
+		return -ENOMEM;
+	}
+
+	/*
+	 * Initialize zero GIC instance (no multi-GIC support). Also, set GIC
+	 * as default IRQ domain to allow for GSI registration and GSI to IRQ
+	 * number translation (see acpi_register_gsi() and acpi_gsi_to_irq()).
+	 */
+	gic_init_bases(0, -1, dist_base, cpu_base, 0, NULL);
+	irq_set_default_host(gic_data[0].domain);
+	return 0;
+}
+#endif
diff --git a/drivers/irqchip/irqchip.c b/drivers/irqchip/irqchip.c
index 0fe2f71..5855240 100644
--- a/drivers/irqchip/irqchip.c
+++ b/drivers/irqchip/irqchip.c
@@ -8,6 +8,7 @@
  * warranty of any kind, whether express or implied.
  */

+#include <linux/acpi.h>
 #include <linux/init.h>
 #include <linux/of_irq.h>
 #include <linux/irqchip.h>
@@ -26,4 +27,6 @@ extern struct of_device_id __irqchip_of_table[];
 void __init irqchip_init(void)
 {
 	of_irq_init(__irqchip_of_table);
- acpi_irq_init(); }
diff --git a/include/linux/acpi.h b/include/linux/acpi.h index 87f365e..536991b 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -162,6 +162,16 @@ extern u32 acpi_irq_not_handled; extern int sbf_port; extern unsigned long acpi_realmode_flags;
+static inline void acpi_irq_init(void) +{
- /*
* Hardcode ACPI IRQ chip initialization to GICv2 for now.
* Proper irqchip infrastructure will be implemented along with
* incoming GICv2m|GICv3|ITS bits.
*/
- acpi_gic_init();
+}
- int acpi_register_gsi (struct device *dev, u32 gsi, int triggering, int polarity); int acpi_gsi_to_irq (u32 gsi, unsigned int *irq); int acpi_isa_irq_to_gsi (unsigned isa_irq, u32 *gsi);
@@ -508,6 +518,11 @@ static inline int acpi_table_parse(char *id, return -ENODEV; }
+static inline void acpi_irq_init(void) +{
- return;
+}
- static inline int acpi_nvs_register(__u64 start, __u64 size) { return 0;
I just realized this will not work for the !CONFIG_ARM64 case. Instead, it should be:
@@ -564,6 +549,23 @@ static inline int acpi_device_modalias(struct device *dev,
#endif /* !CONFIG_ACPI */
+#if defined(CONFIG_ACPI) && defined(CONFIG_ARM64) +static inline void acpi_irq_init(void) +{ + /* + * Hardcode ACPI IRQ chip initialization to GICv2 for now. + * Proper irqchip infrastructure will be implemented along with + * incoming GICv2m|GICv3|ITS bits. + */ + acpi_gic_init(); +} +#else +static inline void acpi_irq_init(void) +{ + return; +} +#endif + #ifdef CONFIG_ACPI void acpi_os_set_prepare_sleep(int (*func)(u8 sleep_state, u32 pm1a_ctrl, u32 pm1b_ctrl));
Regards, Tomasz
Using the information presented by GTDT (Generic Timer Description Table) to initialize the arch timer (not memory-mapped).
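As an illustration of the flag handling that this patch performs on the GTDT entries, here is a minimal user-space sketch. The bit positions are an assumption meant to mirror the ACPICA flag definitions (ACPI_GTDT_INTERRUPT_MODE, ACPI_GTDT_INTERRUPT_POLARITY, ACPI_GTDT_ALWAYS_ON); the GTDT_* macros, the enum names and decode_gtdt_flags() are hypothetical and are not part of the patch.

```c
#include <assert.h>   /* only needed by callers that check the result */
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout, mirroring the ACPICA GTDT flag definitions. */
#define GTDT_INTERRUPT_MODE     (1u << 0)  /* set: edge, clear: level */
#define GTDT_INTERRUPT_POLARITY (1u << 1)  /* set: active low, clear: active high */
#define GTDT_ALWAYS_ON          (1u << 2)  /* set: timer survives low-power states */

enum irq_trigger  { IRQ_LEVEL_SENSITIVE, IRQ_EDGE_SENSITIVE };
enum irq_polarity { IRQ_ACTIVE_HIGH, IRQ_ACTIVE_LOW };

/* Hypothetical container for what the patch derives from one flags word. */
struct gtdt_timer_cfg {
	enum irq_trigger trigger;
	enum irq_polarity polarity;
	bool c3stop;    /* true when the timer may stop in deep idle states */
};

/* Decode one GTDT flags word the same way the patch does for each PPI. */
static struct gtdt_timer_cfg decode_gtdt_flags(uint32_t flags)
{
	struct gtdt_timer_cfg cfg;

	cfg.trigger = (flags & GTDT_INTERRUPT_MODE) ? IRQ_EDGE_SENSITIVE
						     : IRQ_LEVEL_SENSITIVE;
	cfg.polarity = (flags & GTDT_INTERRUPT_POLARITY) ? IRQ_ACTIVE_LOW
							  : IRQ_ACTIVE_HIGH;
	/* Mirrors arch_timer_c3stop = !(flags & ACPI_GTDT_ALWAYS_ON). */
	cfg.c3stop = !(flags & GTDT_ALWAYS_ON);
	return cfg;
}
```

In the patch itself the trigger and polarity values are handed to acpi_register_gsi(), and the always-on bit is read from the non-secure EL1 flags only.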
CC: Daniel Lezcano daniel.lezcano@linaro.org Originally-by: Amit Daniel Kachhap amit.daniel@samsung.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org --- arch/arm64/kernel/time.c | 7 ++ drivers/clocksource/arm_arch_timer.c | 132 ++++++++++++++++++++++++++++------- include/linux/clocksource.h | 6 ++ 3 files changed, 118 insertions(+), 27 deletions(-)
diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c index 1a7125c..42f9195 100644 --- a/arch/arm64/kernel/time.c +++ b/arch/arm64/kernel/time.c @@ -35,6 +35,7 @@ #include <linux/delay.h> #include <linux/clocksource.h> #include <linux/clk-provider.h> +#include <linux/acpi.h>
#include <clocksource/arm_arch_timer.h>
@@ -72,6 +73,12 @@ void __init time_init(void)
tick_setup_hrtimer_broadcast();
+ /* + * Since ACPI or FDT will only one be available in the system, + * we can use acpi_generic_timer_init() here safely + */ + acpi_generic_timer_init(); + arch_timer_rate = arch_timer_get_rate(); if (!arch_timer_rate) panic("Unable to initialise architected timer.\n"); diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index 095c177..407aa63 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -21,6 +21,7 @@ #include <linux/io.h> #include <linux/slab.h> #include <linux/sched_clock.h> +#include <linux/acpi.h>
#include <asm/arch_timer.h> #include <asm/virt.h> @@ -370,8 +371,12 @@ arch_timer_detect_rate(void __iomem *cntbase, struct device_node *np) if (arch_timer_rate) return;
- /* Try to determine the frequency from the device tree or CNTFRQ */ - if (of_property_read_u32(np, "clock-frequency", &arch_timer_rate)) { + /* + * Try to determine the frequency from the device tree or CNTFRQ, + * if ACPI is enabled, get the frequency from CNTFRQ ONLY. + */ + if (!acpi_disabled || + of_property_read_u32(np, "clock-frequency", &arch_timer_rate)) { if (cntbase) arch_timer_rate = readl_relaxed(cntbase + CNTFRQ); else @@ -690,28 +695,8 @@ static void __init arch_timer_common_init(void) arch_timer_arch_init(); }
-static void __init arch_timer_init(struct device_node *np) +static void __init arch_timer_init(void) { - int i; - - if (arch_timers_present & ARCH_CP15_TIMER) { - pr_warn("arch_timer: multiple nodes in dt, skipping\n"); - return; - } - - arch_timers_present |= ARCH_CP15_TIMER; - for (i = PHYS_SECURE_PPI; i < MAX_TIMER_PPI; i++) - arch_timer_ppi[i] = irq_of_parse_and_map(np, i); - arch_timer_detect_rate(NULL, np); - - /* - * If we cannot rely on firmware initializing the timer registers then - * we should use the physical timers instead. - */ - if (IS_ENABLED(CONFIG_ARM) && - of_property_read_bool(np, "arm,cpu-registers-not-fw-configured")) - arch_timer_use_virtual = false; - /* * If HYP mode is available, we know that the physical timer * has been configured to be accessible from PL1. Use it, so @@ -730,13 +715,39 @@ static void __init arch_timer_init(struct device_node *np) } }
- arch_timer_c3stop = !of_property_read_bool(np, "always-on"); - arch_timer_register(); arch_timer_common_init(); } -CLOCKSOURCE_OF_DECLARE(armv7_arch_timer, "arm,armv7-timer", arch_timer_init); -CLOCKSOURCE_OF_DECLARE(armv8_arch_timer, "arm,armv8-timer", arch_timer_init); + +static void __init arch_timer_of_init(struct device_node *np) +{ + int i; + + if (arch_timers_present & ARCH_CP15_TIMER) { + pr_warn("arch_timer: multiple nodes in dt, skipping\n"); + return; + } + + arch_timers_present |= ARCH_CP15_TIMER; + for (i = PHYS_SECURE_PPI; i < MAX_TIMER_PPI; i++) + arch_timer_ppi[i] = irq_of_parse_and_map(np, i); + + arch_timer_detect_rate(NULL, np); + + arch_timer_c3stop = !of_property_read_bool(np, "always-on"); + + /* + * If we cannot rely on firmware initializing the timer registers then + * we should use the physical timers instead. + */ + if (IS_ENABLED(CONFIG_ARM) && + of_property_read_bool(np, "arm,cpu-registers-not-fw-configured")) + arch_timer_use_virtual = false; + + arch_timer_init(); +} +CLOCKSOURCE_OF_DECLARE(armv7_arch_timer, "arm,armv7-timer", arch_timer_of_init); +CLOCKSOURCE_OF_DECLARE(armv8_arch_timer, "arm,armv8-timer", arch_timer_of_init);
static void __init arch_timer_mem_init(struct device_node *np) { @@ -803,3 +814,70 @@ static void __init arch_timer_mem_init(struct device_node *np) } CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem", arch_timer_mem_init); + +#ifdef CONFIG_ACPI +static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags) +{ + int trigger, polarity; + + if (!interrupt) + return 0; + + trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE + : ACPI_LEVEL_SENSITIVE; + + polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW + : ACPI_ACTIVE_HIGH; + + return acpi_register_gsi(NULL, interrupt, trigger, polarity); +} + +/* Initialize per-processor generic timer */ +static int __init arch_timer_acpi_init(struct acpi_table_header *table) +{ + struct acpi_table_gtdt *gtdt; + + if (arch_timers_present & ARCH_CP15_TIMER) { + pr_warn("arch_timer: already initialized, skipping\n"); + return -EINVAL; + } + + gtdt = container_of(table, struct acpi_table_gtdt, header); + + arch_timers_present |= ARCH_CP15_TIMER; + + arch_timer_ppi[PHYS_SECURE_PPI] = + map_generic_timer_interrupt(gtdt->secure_el1_interrupt, + gtdt->secure_el1_flags); + + arch_timer_ppi[PHYS_NONSECURE_PPI] = + map_generic_timer_interrupt(gtdt->non_secure_el1_interrupt, + gtdt->non_secure_el1_flags); + + arch_timer_ppi[VIRT_PPI] = + map_generic_timer_interrupt(gtdt->virtual_timer_interrupt, + gtdt->virtual_timer_flags); + + arch_timer_ppi[HYP_PPI] = + map_generic_timer_interrupt(gtdt->non_secure_el2_interrupt, + gtdt->non_secure_el2_flags); + + /* Get the frequency from CNTFRQ */ + arch_timer_detect_rate(NULL, NULL); + + /* Always-on capability */ + arch_timer_c3stop = !(gtdt->non_secure_el1_flags & ACPI_GTDT_ALWAYS_ON); + + arch_timer_init(); + return 0; +} + +/* Initialize all the generic timers presented in GTDT */ +void __init acpi_generic_timer_init(void) +{ + if (acpi_disabled) + return; + + acpi_table_parse(ACPI_SIG_GTDT, arch_timer_acpi_init); +} +#endif diff --git 
a/include/linux/clocksource.h b/include/linux/clocksource.h index abcafaa..af6155a 100644 --- a/include/linux/clocksource.h +++ b/include/linux/clocksource.h @@ -346,4 +346,10 @@ extern void clocksource_of_init(void); static inline void clocksource_of_init(void) {} #endif
+#ifdef CONFIG_ACPI +void acpi_generic_timer_init(void); +#else +static inline void acpi_generic_timer_init(void) { } +#endif + #endif /* _LINUX_CLOCKSOURCE_H */
On Monday, February 02, 2015 08:45:45 PM Hanjun Guo wrote:
Using the information presented by GTDT (Generic Timer Description Table) to initialize the arch timer (not memory-mapped).
CC: Daniel Lezcano daniel.lezcano@linaro.org Originally-by: Amit Daniel Kachhap amit.daniel@samsung.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
arch/arm64/kernel/time.c | 7 ++ drivers/clocksource/arm_arch_timer.c | 132 ++++++++++++++++++++++++++++-------
An ACK from Thomas Gleixner is necessary for that.
include/linux/clocksource.h | 6 ++ 3 files changed, 118 insertions(+), 27 deletions(-)
diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c index 1a7125c..42f9195 100644 --- a/arch/arm64/kernel/time.c +++ b/arch/arm64/kernel/time.c @@ -35,6 +35,7 @@ #include <linux/delay.h> #include <linux/clocksource.h> #include <linux/clk-provider.h> +#include <linux/acpi.h> #include <clocksource/arm_arch_timer.h> @@ -72,6 +73,12 @@ void __init time_init(void) tick_setup_hrtimer_broadcast();
- /*
* Since ACPI or FDT will only one be available in the system,
* we can use acpi_generic_timer_init() here safely
*/
- acpi_generic_timer_init();
- arch_timer_rate = arch_timer_get_rate(); if (!arch_timer_rate) panic("Unable to initialise architected timer.\n");
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index 095c177..407aa63 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -21,6 +21,7 @@ #include <linux/io.h> #include <linux/slab.h> #include <linux/sched_clock.h> +#include <linux/acpi.h> #include <asm/arch_timer.h> #include <asm/virt.h> @@ -370,8 +371,12 @@ arch_timer_detect_rate(void __iomem *cntbase, struct device_node *np) if (arch_timer_rate) return;
- /* Try to determine the frequency from the device tree or CNTFRQ */
- if (of_property_read_u32(np, "clock-frequency", &arch_timer_rate)) {
- /*
* Try to determine the frequency from the device tree or CNTFRQ,
* if ACPI is enabled, get the frequency from CNTFRQ ONLY.
*/
- if (!acpi_disabled ||
of_property_read_u32(np, "clock-frequency", &arch_timer_rate)) {
if (cntbase) arch_timer_rate = readl_relaxed(cntbase + CNTFRQ); else
@@ -690,28 +695,8 @@ static void __init arch_timer_common_init(void) arch_timer_arch_init(); } -static void __init arch_timer_init(struct device_node *np) +static void __init arch_timer_init(void) {
- int i;
- if (arch_timers_present & ARCH_CP15_TIMER) {
pr_warn("arch_timer: multiple nodes in dt, skipping\n");
return;
- }
- arch_timers_present |= ARCH_CP15_TIMER;
- for (i = PHYS_SECURE_PPI; i < MAX_TIMER_PPI; i++)
arch_timer_ppi[i] = irq_of_parse_and_map(np, i);
- arch_timer_detect_rate(NULL, np);
- /*
* If we cannot rely on firmware initializing the timer registers then
* we should use the physical timers instead.
*/
- if (IS_ENABLED(CONFIG_ARM) &&
of_property_read_bool(np, "arm,cpu-registers-not-fw-configured"))
arch_timer_use_virtual = false;
- /*
- If HYP mode is available, we know that the physical timer
- has been configured to be accessible from PL1. Use it, so
@@ -730,13 +715,39 @@ static void __init arch_timer_init(struct device_node *np) } }
- arch_timer_c3stop = !of_property_read_bool(np, "always-on");
- arch_timer_register(); arch_timer_common_init();
} -CLOCKSOURCE_OF_DECLARE(armv7_arch_timer, "arm,armv7-timer", arch_timer_init); -CLOCKSOURCE_OF_DECLARE(armv8_arch_timer, "arm,armv8-timer", arch_timer_init);
+static void __init arch_timer_of_init(struct device_node *np) +{
- int i;
- if (arch_timers_present & ARCH_CP15_TIMER) {
pr_warn("arch_timer: multiple nodes in dt, skipping\n");
return;
- }
- arch_timers_present |= ARCH_CP15_TIMER;
- for (i = PHYS_SECURE_PPI; i < MAX_TIMER_PPI; i++)
arch_timer_ppi[i] = irq_of_parse_and_map(np, i);
- arch_timer_detect_rate(NULL, np);
- arch_timer_c3stop = !of_property_read_bool(np, "always-on");
- /*
* If we cannot rely on firmware initializing the timer registers then
* we should use the physical timers instead.
*/
- if (IS_ENABLED(CONFIG_ARM) &&
of_property_read_bool(np, "arm,cpu-registers-not-fw-configured"))
arch_timer_use_virtual = false;
- arch_timer_init();
+} +CLOCKSOURCE_OF_DECLARE(armv7_arch_timer, "arm,armv7-timer", arch_timer_of_init); +CLOCKSOURCE_OF_DECLARE(armv8_arch_timer, "arm,armv8-timer", arch_timer_of_init); static void __init arch_timer_mem_init(struct device_node *np) { @@ -803,3 +814,70 @@ static void __init arch_timer_mem_init(struct device_node *np) } CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem", arch_timer_mem_init);
+#ifdef CONFIG_ACPI +static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags) +{
- int trigger, polarity;
- if (!interrupt)
return 0;
- trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
: ACPI_LEVEL_SENSITIVE;
- polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
: ACPI_ACTIVE_HIGH;
- return acpi_register_gsi(NULL, interrupt, trigger, polarity);
+}
+/* Initialize per-processor generic timer */ +static int __init arch_timer_acpi_init(struct acpi_table_header *table) +{
- struct acpi_table_gtdt *gtdt;
- if (arch_timers_present & ARCH_CP15_TIMER) {
pr_warn("arch_timer: already initialized, skipping\n");
return -EINVAL;
- }
- gtdt = container_of(table, struct acpi_table_gtdt, header);
- arch_timers_present |= ARCH_CP15_TIMER;
- arch_timer_ppi[PHYS_SECURE_PPI] =
map_generic_timer_interrupt(gtdt->secure_el1_interrupt,
gtdt->secure_el1_flags);
- arch_timer_ppi[PHYS_NONSECURE_PPI] =
map_generic_timer_interrupt(gtdt->non_secure_el1_interrupt,
gtdt->non_secure_el1_flags);
- arch_timer_ppi[VIRT_PPI] =
map_generic_timer_interrupt(gtdt->virtual_timer_interrupt,
gtdt->virtual_timer_flags);
- arch_timer_ppi[HYP_PPI] =
map_generic_timer_interrupt(gtdt->non_secure_el2_interrupt,
gtdt->non_secure_el2_flags);
- /* Get the frequency from CNTFRQ */
- arch_timer_detect_rate(NULL, NULL);
- /* Always-on capability */
- arch_timer_c3stop = !(gtdt->non_secure_el1_flags & ACPI_GTDT_ALWAYS_ON);
- arch_timer_init();
- return 0;
+}
+/* Initialize all the generic timers presented in GTDT */ +void __init acpi_generic_timer_init(void) +{
- if (acpi_disabled)
return;
- acpi_table_parse(ACPI_SIG_GTDT, arch_timer_acpi_init);
+} +#endif diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h index abcafaa..af6155a 100644 --- a/include/linux/clocksource.h +++ b/include/linux/clocksource.h @@ -346,4 +346,10 @@ extern void clocksource_of_init(void); static inline void clocksource_of_init(void) {} #endif +#ifdef CONFIG_ACPI +void acpi_generic_timer_init(void); +#else +static inline void acpi_generic_timer_init(void) { } +#endif
#endif /* _LINUX_CLOCKSOURCE_H */
On 2015年02月03日 06:23, Rafael J. Wysocki wrote:
On Monday, February 02, 2015 08:45:45 PM Hanjun Guo wrote:
Using the information presented by GTDT (Generic Timer Description Table) to initialize the arch timer (not memory-mapped).
CC: Daniel Lezcano daniel.lezcano@linaro.org Originally-by: Amit Daniel Kachhap amit.daniel@samsung.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
arch/arm64/kernel/time.c | 7 ++ drivers/clocksource/arm_arch_timer.c | 132 ++++++++++++++++++++++++++++-------
An ACK from Thomas Gleixner is necessary for that.
I will CC Thomas in the change log of the next version for both patches (16 and 17).
Thanks Hanjun
On Mon, Feb 02, 2015 at 12:45:45PM +0000, Hanjun Guo wrote:
Using the information presented by GTDT (Generic Timer Description Table) to initialize the arch timer (not memory-mapped).
Why are you not initializing the memory-mapped timer?
CC: Daniel Lezcano daniel.lezcano@linaro.org Originally-by: Amit Daniel Kachhap amit.daniel@samsung.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
arch/arm64/kernel/time.c | 7 ++ drivers/clocksource/arm_arch_timer.c | 132 ++++++++++++++++++++++++++++------- include/linux/clocksource.h | 6 ++ 3 files changed, 118 insertions(+), 27 deletions(-)
diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c index 1a7125c..42f9195 100644 --- a/arch/arm64/kernel/time.c +++ b/arch/arm64/kernel/time.c @@ -35,6 +35,7 @@ #include <linux/delay.h> #include <linux/clocksource.h> #include <linux/clk-provider.h> +#include <linux/acpi.h> #include <clocksource/arm_arch_timer.h> @@ -72,6 +73,12 @@ void __init time_init(void) tick_setup_hrtimer_broadcast();
- /*
* Since ACPI or FDT will only one be available in the system,
* we can use acpi_generic_timer_init() here safely
*/
- acpi_generic_timer_init();
- arch_timer_rate = arch_timer_get_rate(); if (!arch_timer_rate) panic("Unable to initialise architected timer.\n");
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index 095c177..407aa63 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -21,6 +21,7 @@ #include <linux/io.h> #include <linux/slab.h> #include <linux/sched_clock.h> +#include <linux/acpi.h> #include <asm/arch_timer.h> #include <asm/virt.h> @@ -370,8 +371,12 @@ arch_timer_detect_rate(void __iomem *cntbase, struct device_node *np) if (arch_timer_rate) return;
- /* Try to determine the frequency from the device tree or CNTFRQ */
- if (of_property_read_u32(np, "clock-frequency", &arch_timer_rate)) {
- /*
* Try to determine the frequency from the device tree or CNTFRQ,
* if ACPI is enabled, get the frequency from CNTFRQ ONLY.
*/
- if (!acpi_disabled ||
of_property_read_u32(np, "clock-frequency", &arch_timer_rate)) {
This is becoming a mess. cntbase tells you that it is a memory-mapped timer, the node pointer tells you that you are probing through DT, and to top it all off, acpi_disabled detects whether you are probing in ACPI or DT mode.
I think this function should be simplified. This driver is also pending a refactoring to split the arch timer from the memory-mapped one, so I think you had better wait for that work to make things simpler.
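To make the three inputs discussed here easier to follow, the frequency selection in the patched arch_timer_detect_rate() can be sketched in user space as below. This is only an illustration under stated assumptions: detect_timer_rate() and its parameters are hypothetical, a NULL pointer stands in for a missing "clock-frequency" property, and the cntfrq argument stands in for whichever CNTFRQ read (memory-mapped or system register) the driver would perform.

```c
#include <assert.h>   /* only needed by callers that check the result */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the rate selection:
 *  - with ACPI enabled, only CNTFRQ is consulted;
 *  - with DT, the "clock-frequency" property wins when present,
 *    and CNTFRQ is the fallback.
 */
static uint32_t detect_timer_rate(bool acpi_enabled,
				  const uint32_t *dt_clock_frequency,
				  uint32_t cntfrq)
{
	if (!acpi_enabled && dt_clock_frequency != NULL)
		return *dt_clock_frequency;

	return cntfrq;
}
```

Passing the firmware mode in explicitly, instead of testing a global inside the helper, is one possible way to keep the DT and ACPI decisions visible at the call site.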
[...]
+/* Initialize all the generic timers presented in GTDT */ +void __init acpi_generic_timer_init(void) +{
- if (acpi_disabled)
return;
acpi_disabled is used again here. I am repeating myself, but this is going to be hard to track. You should try to organize the code along these lines:
    if (acpi_disabled)
        timer_dt_probe();
    else
        timer_acpi_probe();
Mixing the code paths is getting unwieldy; see my reasoning above.
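The organization suggested here could look roughly like the following user-space sketch. The names timer_dt_probe() and timer_acpi_probe() come from the suggestion above; their bodies, the probe_path enum, the last_probe variable and the timer_probe() wrapper are hypothetical, and acpi_disabled is passed as a parameter here although it is a global in the kernel.

```c
#include <assert.h>   /* only needed by callers that check the result */
#include <stdbool.h>

/* Hypothetical record of which firmware path ran, for illustration only. */
enum probe_path { PROBE_NONE, PROBE_DT, PROBE_ACPI };

static enum probe_path last_probe = PROBE_NONE;

/* Stand-in for the device-tree probe (the clocksource OF path). */
static void timer_dt_probe(void)
{
	last_probe = PROBE_DT;
}

/* Stand-in for the ACPI probe (GTDT parsing). */
static void timer_acpi_probe(void)
{
	last_probe = PROBE_ACPI;
}

/* Single decision point: the rest of the driver never tests acpi_disabled. */
static void timer_probe(bool acpi_disabled)
{
	if (acpi_disabled)
		timer_dt_probe();
	else
		timer_acpi_probe();
}
```

With one branch at the top, neither probe function would need its own acpi_disabled check.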
- acpi_table_parse(ACPI_SIG_GTDT, arch_timer_acpi_init);
+} +#endif diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h index abcafaa..af6155a 100644 --- a/include/linux/clocksource.h +++ b/include/linux/clocksource.h @@ -346,4 +346,10 @@ extern void clocksource_of_init(void); static inline void clocksource_of_init(void) {} #endif +#ifdef CONFIG_ACPI +void acpi_generic_timer_init(void); +#else +static inline void acpi_generic_timer_init(void) { } +#endif
That's not nice: it is a generic header, so arch-specific stuff should be avoided. I think you should have a generic ACPI layer similar to clocksource_of_init(), and probe from there when matching the respective timers.
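One possible shape for such a generic layer is a table that maps ACPI table signatures to timer init functions, walked once at boot. The sketch below is a user-space model under stated assumptions: acpi_timer_entry, acpi_timer_table, gtdt_timer_init() and clocksource_acpi_init_sketch() are hypothetical names, and the list of present tables is passed in as plain strings instead of being discovered through the ACPI core.

```c
#include <assert.h>   /* only needed by callers that check the result */
#include <stddef.h>
#include <string.h>

typedef int (*timer_init_fn)(const char *table);

/* Hypothetical match entry: 4-character table signature plus init hook. */
struct acpi_timer_entry {
	const char *signature;
	timer_init_fn init;
};

static int gtdt_calls;

/* Stand-in for a driver init function such as the GTDT parser. */
static int gtdt_timer_init(const char *table)
{
	(void)table;
	gtdt_calls++;
	return 0;
}

/* Drivers would register here instead of touching a generic header. */
static const struct acpi_timer_entry acpi_timer_table[] = {
	{ "GTDT", gtdt_timer_init },
};

/* Walk the present tables and run every matching init hook. */
static int clocksource_acpi_init_sketch(const char *const *present, size_t n)
{
	size_t n_entries = sizeof(acpi_timer_table) / sizeof(acpi_timer_table[0]);
	int initialised = 0;

	for (size_t i = 0; i < n; i++) {
		for (size_t j = 0; j < n_entries; j++) {
			const struct acpi_timer_entry *e = &acpi_timer_table[j];

			if (strncmp(present[i], e->signature, 4) == 0 &&
			    e->init(present[i]) == 0)
				initialised++;
		}
	}
	return initialised;
}
```

The return value counts the timers that initialised successfully, so a caller could decide whether to fall back or panic.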
Lorenzo
Hi Lorenzo,
On 2015年02月05日 02:59, Lorenzo Pieralisi wrote:
On Mon, Feb 02, 2015 at 12:45:45PM +0000, Hanjun Guo wrote:
Using the information presented by GTDT (Generic Timer Description Table) to initialize the arch timer (not memory-mapped).
Why are you not initializing the memory-mapped timer?
We left it for later work because it is not needed to boot the ARM64 hardware available now, and we have no hardware to test it on, unless I missed some platforms.
CC: Daniel Lezcano daniel.lezcano@linaro.org Originally-by: Amit Daniel Kachhap amit.daniel@samsung.com Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
arch/arm64/kernel/time.c | 7 ++ drivers/clocksource/arm_arch_timer.c | 132 ++++++++++++++++++++++++++++------- include/linux/clocksource.h | 6 ++ 3 files changed, 118 insertions(+), 27 deletions(-)
diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c index 1a7125c..42f9195 100644 --- a/arch/arm64/kernel/time.c +++ b/arch/arm64/kernel/time.c @@ -35,6 +35,7 @@ #include <linux/delay.h> #include <linux/clocksource.h> #include <linux/clk-provider.h> +#include <linux/acpi.h>
#include <clocksource/arm_arch_timer.h>
@@ -72,6 +73,12 @@ void __init time_init(void)
tick_setup_hrtimer_broadcast();
- /*
* Since ACPI or FDT will only one be available in the system,
* we can use acpi_generic_timer_init() here safely
*/
- acpi_generic_timer_init();
- arch_timer_rate = arch_timer_get_rate(); if (!arch_timer_rate) panic("Unable to initialise architected timer.\n");
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index 095c177..407aa63 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -21,6 +21,7 @@ #include <linux/io.h> #include <linux/slab.h> #include <linux/sched_clock.h> +#include <linux/acpi.h>
#include <asm/arch_timer.h> #include <asm/virt.h> @@ -370,8 +371,12 @@ arch_timer_detect_rate(void __iomem *cntbase, struct device_node *np) if (arch_timer_rate) return;
- /* Try to determine the frequency from the device tree or CNTFRQ */
- if (of_property_read_u32(np, "clock-frequency", &arch_timer_rate)) {
- /*
* Try to determine the frequency from the device tree or CNTFRQ,
* if ACPI is enabled, get the frequency from CNTFRQ ONLY.
*/
- if (!acpi_disabled ||
of_property_read_u32(np, "clock-frequency", &arch_timer_rate)) {
This is becoming a mess. cntbase tells you that it is a memory-mapped timer, the node pointer tells you that you are probing through DT, and to top it all off, acpi_disabled detects whether you are probing in ACPI or DT mode.
I think this function should be simplified. This driver is also pending a refactoring to split the arch timer from the memory-mapped one, so I think you had better wait for that work to make things simpler.
Is anyone working on this now?
[...]
+/* Initialize all the generic timers presented in GTDT */ +void __init acpi_generic_timer_init(void) +{
- if (acpi_disabled)
return;
acpi_disabled is used again here. I am repeating myself, but this is going to be hard to track. You should try to organize the code along these lines:
    if (acpi_disabled)
        timer_dt_probe();
    else
        timer_acpi_probe();
Mixing the code paths is getting unwieldy; see my reasoning above.
Olof is unhappy with such an approach. I think the if (acpi_disabled) check is self-contained: because we only get either DT or ACPI in the system, we can call this function from time_init() without any further if (acpi_disabled) checks.
- acpi_table_parse(ACPI_SIG_GTDT, arch_timer_acpi_init);
+} +#endif diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h index abcafaa..af6155a 100644 --- a/include/linux/clocksource.h +++ b/include/linux/clocksource.h @@ -346,4 +346,10 @@ extern void clocksource_of_init(void); static inline void clocksource_of_init(void) {} #endif
+#ifdef CONFIG_ACPI +void acpi_generic_timer_init(void); +#else +static inline void acpi_generic_timer_init(void) { } +#endif
That's not nice: it is a generic header, so arch-specific stuff should be avoided. I think you should have a generic ACPI layer similar to clocksource_of_init(), and probe from there when matching the respective timers.
I think I'm OK with it, but do we really need to introduce a heavy framework just to init the ARM arch timer (memory-mapped or not)?
Thanks Hanjun
From: Al Stone al.stone@linaro.org
ACPI reduced hardware mode is disabled by default, but ARM64 can only run properly in ACPI hardware reduced mode, so select ACPI_REDUCED_HARDWARE_ONLY if ACPI is enabled on ARM64.
CC: Catalin Marinas catalin.marinas@arm.com CC: Will Deacon will.deacon@arm.com Reviewed-by: Grant Likely grant.likely@linaro.org Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Al Stone al.stone@linaro.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index b1f9a20..c19ae5d 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1,5 +1,6 @@ config ARM64 def_bool y + select ACPI_REDUCED_HARDWARE_ONLY if ACPI select ARCH_BINFMT_ELF_RANDOMIZE_PIE select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE select ARCH_HAS_GCOV_PROFILE_ALL
From: Graeme Gregory graeme.gregory@linaro.org
Add Kconfigs to build ACPI on ARM64, and make ACPI available on ARM64.
acpi_idle driver is x86/IA64 dependent now, so make CONFIG_ACPI_PROCESSOR depend on X86 || IA64, and implement it on ARM64 in the future.
CC: Rafael J. Wysocki rjw@rjwysocki.net CC: Catalin Marinas catalin.marinas@arm.com CC: Will Deacon will.deacon@arm.com Reviewed-by: Grant Likely grant.likely@linaro.org Tested-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Tested-by: Yijing Wang wangyijing@huawei.com Tested-by: Mark Langsdorf mlangsdo@redhat.com Tested-by: Jon Masters jcm@redhat.com Tested-by: Timur Tabi timur@codeaurora.org Signed-off-by: Graeme Gregory graeme.gregory@linaro.org Signed-off-by: Al Stone al.stone@linaro.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org --- arch/arm64/Kconfig | 2 ++ drivers/acpi/Kconfig | 3 ++- 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index c19ae5d..915aa16 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -647,6 +647,8 @@ source "drivers/Kconfig"
source "drivers/firmware/Kconfig"
+source "drivers/acpi/Kconfig" + source "fs/Kconfig"
source "arch/arm64/kvm/Kconfig" diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index 8951cef..fd19ad6 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -5,7 +5,7 @@ menuconfig ACPI bool "ACPI (Advanced Configuration and Power Interface) Support" depends on !IA64_HP_SIM - depends on IA64 || X86 + depends on IA64 || X86 || (ARM64 && EXPERT) depends on PCI select PNP default y @@ -163,6 +163,7 @@ config ACPI_PROCESSOR tristate "Processor" select THERMAL select CPU_IDLE + depends on X86 || IA64 default y help This driver installs ACPI as the idle handler for Linux and uses
From: Graeme Gregory graeme.gregory@linaro.org
Add documentation for the guidelines of how to use ACPI on ARM64.
Reviewed-by: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Reviewed-by: Yi Li phoenix.liyi@huawei.com Reviewed-by: Mark Langsdorf mlangsdo@redhat.com Reviewed-by: Ashwin Chaugule ashwinc@codeaurora.org Signed-off-by: Graeme Gregory graeme.gregory@linaro.org Signed-off-by: Al Stone al.stone@linaro.org Signed-off-by: Hanjun Guo hanjun.guo@linaro.org --- Documentation/arm64/arm-acpi.txt | 506 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 506 insertions(+) create mode 100644 Documentation/arm64/arm-acpi.txt
diff --git a/Documentation/arm64/arm-acpi.txt b/Documentation/arm64/arm-acpi.txt
new file mode 100644
index 0000000..275524e
--- /dev/null
+++ b/Documentation/arm64/arm-acpi.txt
@@ -0,0 +1,506 @@
+ACPI on ARMv8 Servers
+---------------------
+ACPI can be used for ARMv8 general purpose servers designed to follow
+the ARM SBSA (Server Base System Architecture) [0] and SBBR (Server
+Base Boot Requirements) [1] specifications. Please note that the SBBR
+can be retrieved simply by visiting [1], but the SBSA is currently only
+available to those with an ARM login due to ARM IP licensing concerns.
+
+The ARMv8 kernel implements the reduced hardware model of ACPI version
+5.1 or later. Links to the specification and all external documents
+it refers to are managed by the UEFI Forum. The specification is
+available at http://www.uefi.org/specifications and documents referenced
+by the specification can be found via http://www.uefi.org/acpi.
+
+If an ARMv8 system does not meet the requirements of the SBSA and SBBR,
+or cannot be described using the mechanisms defined in the required ACPI
+specifications, then ACPI may not be a good fit for the hardware.
+
+While the documents mentioned above set out the requirements for building
+industry-standard ARMv8 servers, they also apply to more than one operating
+system. The purpose of this document is to describe the interaction between
+ACPI and Linux only, on an ARMv8 system -- that is, what Linux expects of
+ACPI and what ACPI can expect of Linux.
+
+
+Why ACPI on ARM?
+----------------
+Before examining the details of the interface between ACPI and Linux, it is
+useful to understand why ACPI is being used. Several technologies already
+exist in Linux for describing non-enumerable hardware, after all. In this
+section we summarize a blog post [2] from Grant Likely that outlines the
+reasoning behind ACPI on ARMv8 servers. Actually, we snitch a good portion
+of the summary text almost directly, to be honest.
+
+The short form of the rationale for ACPI on ARM is:
+
+-- ACPI’s bytecode (AML) allows the platform to encode hardware behavior,
+   while DT explicitly does not support this. For hardware vendors, being
+   able to encode behavior is a key tool used in supporting operating
+   system releases on new hardware.
+
+-- ACPI’s OSPM defines a power management model that constrains what the
+   platform is allowed to do into a specific model, while still providing
+   flexibility in hardware design.
+
+-- In the enterprise server environment, ACPI has established bindings (such
+   as for RAS) which are currently used in production systems. DT does not.
+   Such bindings could be defined in DT at some point, but doing so means ARM
+   and x86 would end up using completely different code paths in both firmware
+   and the kernel.
+
+-- Choosing a single interface to describe the abstraction between a platform
+   and an OS is important. Hardware vendors would not be required to implement
+   both DT and ACPI if they want to support multiple operating systems. And,
+   agreeing on a single interface instead of being fragmented into per OS
+   interfaces makes for better interoperability overall.
+
+-- The new ACPI governance process works well and Linux is now at the same
+   table as hardware vendors and other OS vendors. In fact, there is no
+   longer any reason to feel that ACPI only belongs to Windows or that
+   Linux is in any way secondary to Microsoft in this arena. The move of
+   ACPI governance into the UEFI forum has significantly opened up the
+   specification development process, and currently, a large portion of the
+   changes being made to ACPI is being driven by Linux.
+
+Key to the use of ACPI is the support model. For servers in general, the
+responsibility for hardware behaviour cannot solely be the domain of the
+kernel, but rather must be split between the platform and the kernel, in
+order to allow for orderly change over time. ACPI frees the OS from needing
+to understand all the minute details of the hardware so that the OS doesn’t
+need to be ported to each and every device individually. It allows the
+hardware vendors to take responsibility for power management behaviour without
+depending on an OS release cycle which is not under their control.
+
+ACPI is also important because hardware and OS vendors have already worked
+out the mechanisms for supporting a general purpose computing ecosystem. The
+infrastructure is in place, the bindings are in place, and the processes are
+in place. DT does exactly what Linux needs it to when working with vertically
+integrated devices, but there are no good processes for supporting what the
+server vendors need. Linux could potentially get there with DT, but doing so
+really just duplicates something that already works. ACPI already does what
+the hardware vendors need, Microsoft won’t collaborate on DT, and hardware
+vendors would still end up providing two completely separate firmware
+interfaces -- one for Linux and one for Windows.
+
+
+Kernel Compatibility
+--------------------
+One of the primary motivations for ACPI is standardization, and using that
+to provide backward compatibility for Linux kernels. In the server market,
+software and hardware are often used for long periods. ACPI allows the
+kernel and firmware to agree on a consistent abstraction that can be
+maintained over time, even as hardware or software change. As long as the
+abstraction is supported, systems can be updated without necessarily having
+to replace the kernel.
+
+When a Linux driver or subsystem is first implemented using ACPI, it by
+definition ends up requiring a specific version of the ACPI specification
+-- its baseline. ACPI firmware must continue to work, even though it may
+not be optimal, with the earliest kernel version that first provides support
+for that baseline version of ACPI. There may be a need for additional drivers,
+but adding new functionality (e.g., CPU power management) should not break
+older kernel versions. Further, ACPI firmware must also work with the most
+recent version of the kernel.
+
+
+Relationship with Device Tree
+-----------------------------
+ACPI support in drivers and subsystems for ARMv8 should never be mutually
+exclusive with DT support at compile time.
+
+At boot time the kernel will only use one description method depending on
+parameters passed from the bootloader (including kernel bootargs).
+
+Regardless of whether DT or ACPI is used, the kernel must always be capable
+of booting with either scheme (in kernels with both schemes enabled at compile
+time).
+
+
+Booting using ACPI tables
+-------------------------
+The only defined method for passing ACPI tables to the kernel on ARMv8
+is via the UEFI system configuration table. Just so it is explicit, this
+means that ACPI is only supported on platforms that boot via UEFI.
+
+When an ARMv8 system boots, it can either have DT information, ACPI tables,
+or in some very unusual cases, both. If no command line parameters are used,
+the kernel will try to use DT for device enumeration; if there is no DT
+present, the kernel will try to use ACPI tables, but only if they are present.
+If neither is available, the kernel will not boot. If acpi=force is used
+on the command line, the kernel will attempt to use ACPI tables first, but
+fall back to DT if there are no ACPI tables present. The basic idea is that
+the kernel will not fail to boot unless it absolutely has no other choice.
+
+Processing of ACPI tables may be disabled by passing acpi=off on the kernel
+command line; this is the default behavior.
+
+In order for the kernel to load and use ACPI tables, the UEFI implementation
+MUST set the ACPI_20_TABLE_GUID to point to the RSDP table (the table with
+the ACPI signature "RSD PTR "). If this pointer is incorrect and acpi=force
+is used, the kernel will disable ACPI and try to use DT to boot instead; the
+kernel has, in effect, determined that ACPI tables are not present at that
+point.
+
+If the pointer to the RSDP table is correct, the table will be mapped into
+the kernel by the ACPI core, using the address provided by UEFI.
+
+The ACPI core will then locate and map in all other ACPI tables provided by
+using the addresses in the RSDP table to find the XSDT (eXtended System
+Description Table). The XSDT in turn provides the addresses to all other
+ACPI tables provided by the system firmware; the ACPI core will then traverse
+this table and map in the tables listed.
+
+The ACPI core will ignore any provided RSDT (Root System Description Table).
+RSDTs have been deprecated and are ignored on arm64 since they only allow
+for 32-bit addresses.
+
+Further, the ACPI core will only use the 64-bit address fields in the FADT
+(Fixed ACPI Description Table). Any 32-bit address fields in the FADT will
+be ignored on arm64.
+
+Hardware reduced mode (see Section 4.1 of the ACPI 5.1 specification) will
+be enforced by the ACPI core on arm64. Doing so allows the ACPI core to
+run less complex code since it no longer has to provide support for legacy
+hardware from other architectures. Any fields that are not to be used for
+hardware reduced mode must be set to zero.
+
+For the ACPI core to operate properly, and in turn provide the information
+the kernel needs to configure devices, it expects to find the following
+tables (all section numbers refer to the ACPI 5.1 specification):
+
+    -- RSDP (Root System Description Pointer), section 5.2.5
+
+    -- XSDT (eXtended System Description Table), section 5.2.8
+
+    -- FADT (Fixed ACPI Description Table), section 5.2.9
+
+    -- DSDT (Differentiated System Description Table), section
+       5.2.11.1
+
+    -- MADT (Multiple APIC Description Table), section 5.2.12
+
+    -- GTDT (Generic Timer Description Table), section 5.2.24
+
+    -- If PCI is supported, the MCFG (Memory mapped ConFiGuration
+       Table), section 5.2.6, specifically Table 5-31.
+
+If the above tables are not all present, the kernel may or may not be
+able to boot properly since it may not be able to configure all of the
+devices available.
+
+
+ACPI Detection
+--------------
+Drivers should determine their probe() type by checking for a null
+value for ACPI_HANDLE, or checking .of_node, or other information in
+the device structure. This is detailed further in the "Driver
+Recommendations" section.
+
+In non-driver code, if the presence of ACPI needs to be detected at
+runtime, then check the value of acpi_disabled. If CONFIG_ACPI is not
+set, acpi_disabled will always be 1.
+
+
+Device Enumeration
+------------------
+Device descriptions in ACPI should use standard recognized ACPI interfaces.
+These may contain less information than is typically provided via a Device
+Tree description for the same device. This is also one of the reasons that
+ACPI can be useful -- the driver takes into account that it may have less
+detailed information about the device and uses sensible defaults instead.
+If done properly in the driver, the hardware can change and improve over
+time without the driver having to change at all.
+
+Clocks provide an excellent example. In DT, clocks need to be specified
+and the drivers need to take them into account. In ACPI, the assumption
+is that UEFI will leave the device in a reasonable default state, including
+any clock settings. If for some reason the driver needs to change a clock
+value, this can be done in an ACPI method; all the driver needs to do is
+invoke the method and not concern itself with what the method needs to do
+to change the clock. Changing the hardware can then take place over time
+by changing what the ACPI method does, and not the driver.
+
+In DT, the parameters needed by the driver to set up clocks as in the example
+above are known as "bindings"; in ACPI, these are known as "Device Properties"
+and provided to a driver via the _DSD object.
+
+ACPI tables are described with a formal language called ASL, the ACPI
+Source Language (section 19 of the specification). This means that there
+are always multiple ways to describe the same thing -- including device
+properties. For example, device properties could use an ASL construct
+that looks like this: Name(KEY0, "value0"). An ACPI device driver would
+then retrieve the value of the property by evaluating the KEY0 object.
+However, using Name() this way has multiple problems: (1) ACPI limits
+names ("KEY0") to four characters unlike DT; (2) there is no industry
+wide registry that maintains a list of names, minimizing re-use; (3)
+there is also no registry for the definition of property values ("value0"),
+again making re-use difficult; and (4) how does one maintain backward
+compatibility as new hardware comes out? The _DSD method was created
+to solve precisely these sorts of problems; Linux drivers should ALWAYS
+use the _DSD method for device properties and nothing else.
+
+The _DSM object (ACPI Section 9.14.1) could also be used for conveying
+device properties to a driver. Linux drivers should only expect it to
+be used if _DSD cannot represent the data required, and there is no way
+to create a new UUID for the _DSD object. Note that there is even less
+regulation of the use of _DSM than there is of _DSD. Drivers that depend
+on the contents of _DSM objects will be more difficult to maintain over
+time because of this; as of this writing, the use of _DSM is the cause
+of quite a few firmware problems and is not recommended.
+
+Drivers should look for device properties in the _DSD object ONLY; the _DSD
+object is described in the ACPI specification section 6.2.5, but this only
+describes how to define the structure of an object returned via _DSD, and
+how specific data structures are defined by specific UUIDs. Linux should
+only use the _DSD Device Properties UUID [5]:
+
+   -- UUID: daffd814-6eba-4d8c-8a91-bc9bbf4aa301
+
+   -- http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUI...
+
+The UEFI Forum provides a mechanism for registering device properties [4]
+so that they may be used across all operating systems supporting ACPI.
+Device properties that have not been registered with the UEFI Forum should
+not be used.
+
+Before creating new device properties, check to be sure that they have not
+been defined before and either registered in the Linux kernel documentation
+as DT bindings, or the UEFI Forum as device properties. While we do not want
+to simply move all DT bindings into ACPI device properties, we can learn from
+what has been previously defined.
+
+If it is necessary to define a new device property, or if it makes sense to
+synthesize the definition of a binding so it can be used in any firmware,
+both DT bindings and ACPI device properties for device drivers have review
+processes. Use them both. When the driver itself is submitted for review
+to the Linux mailing lists, the device property definitions needed must be
+submitted at the same time. A driver that supports ACPI and uses device
+properties will not be considered complete without their definitions. Once
+the device property has been accepted by the Linux community, it must be
+registered with the UEFI Forum [4], which will review it again for consistency
+within the registry. This may require iteration. The UEFI Forum, though,
+will always be the canonical site for device property definitions.
+
+It may make sense to provide notice to the UEFI Forum that there is the
+intent to register a previously unused device property name as a means of
+reserving the name for later use. Other operating system vendors will
+also be submitting registration requests and this may help smooth the
+process.
+
+Once registration and review have been completed, the kernel provides an
+interface for looking up device properties in a manner independent of
+whether DT or ACPI is being used. This API should be used [6]; it can
+eliminate some duplication of code paths in driver probing functions and
+discourage divergence between DT bindings and ACPI device properties.
+
+
+Programmable Power Control Resources
+------------------------------------
+Programmable power control resources include such resources as voltage/current
+providers (regulators) and clock sources.
+
+With ACPI, the kernel clock and regulator framework is not expected to be used
+at all.
+
+The kernel assumes that power control of these resources is represented with
+Power Resource Objects (ACPI section 7.1). The ACPI core will then handle
+correctly enabling and disabling resources as they are needed. In order to
+get that to work, ACPI assumes each device has defined D-states and that these
+can be controlled through the optional ACPI methods _PS0, _PS1, _PS2, and _PS3;
+in ACPI, _PS0 is the method to invoke to turn a device full on, and _PS3 is for
+turning a device full off.
+
+There are two options for using those Power Resources. They can:
+
+   -- be managed in a _PSx method which gets called on entry to power
+      state Dx.
+
+   -- be declared separately as power resources with their own _ON and _OFF
+      methods. They are then tied back to D-states for a particular device
+      via _PRx which specifies which power resources a device needs to be on
+      while in Dx. Kernel then tracks number of devices using a power resource
+      and calls _ON/_OFF as needed.
+
+The kernel ACPI code will also assume that the _PSx methods follow the normal
+ACPI rules for such methods:
+
+   -- If either _PS0 or _PS3 is implemented, then the other method must also
+      be implemented.
+
+   -- If a device requires usage or setup of a power resource when on, the ASL
+      should organize that it is allocated/enabled using the _PS0 method.
+
+   -- Resources allocated or enabled in the _PS0 method should be disabled
+      or de-allocated in the _PS3 method.
+
+   -- Firmware will leave the resources in a reasonable state before handing
+      over control to the kernel.
+
+Such code in _PSx methods will of course be very platform specific. But,
+this allows the driver to abstract out the interface for operating the device
+and avoid having to read special non-standard values from ACPI tables. Further,
+abstracting the use of these resources allows the hardware to change over time
+without requiring updates to the driver.
+
+
+Clocks
+------
+ACPI makes the assumption that clocks are initialized by the firmware --
+UEFI, in this case -- to some working value before control is handed over
+to the kernel. This has implications for devices such as UARTs, or SoC-driven
+LCD displays, for example.
+
+When the kernel boots, the clocks are assumed to be set to reasonable
+working values. If for some reason the frequency needs to change -- e.g.,
+throttling for power management -- the device driver should expect that
+process to be abstracted out into some ACPI method that can be invoked
+(please see the ACPI specification for further recommendations on standard
+methods to be expected). The only exceptions to this are CPU clocks where
+CPPC provides a much richer interface than ACPI methods. If the clocks
+are not set, there is no direct way for Linux to control them.
+
+If an SoC vendor wants to provide fine-grained control of the system clocks,
+they could do so by providing ACPI methods that could be invoked by Linux
+drivers. However, this is NOT recommended and Linux drivers should NOT use
+such methods, even if they are provided. Such methods are not currently
+standardized in the ACPI specification, and using them could tie a kernel
+to a very specific SoC, or tie an SoC to a very specific version of the
+kernel, both of which we are trying to avoid.
+
+
+Driver Recommendations
+----------------------
+DO NOT remove any DT handling when adding ACPI support for a driver. The
+same device may be used on many different systems.
+
+DO try to structure the driver so that it is data-driven. That is, set up
+a struct containing internal per-device state based on defaults and whatever
+else must be discovered by the driver probe function. Then, have the rest
+of the driver operate off of the contents of that struct. Doing so should
+allow most divergence between ACPI and DT functionality to be kept local to
+the probe function instead of being scattered throughout the driver. For
+example:
+
+static int device_probe_dt(struct platform_device *pdev)
+{
+	/* DT specific functionality */
+	...
+}
+
+static int device_probe_acpi(struct platform_device *pdev)
+{
+	/* ACPI specific functionality */
+	...
+}
+
+static int device_probe(struct platform_device *pdev)
+{
+	...
+	struct device_node *node = pdev->dev.of_node;
+	...
+
+	if (node)
+		ret = device_probe_dt(pdev);
+	else if (ACPI_HANDLE(&pdev->dev))
+		ret = device_probe_acpi(pdev);
+	else
+		/* other initialization */
+		...
+	/* Continue with any generic probe operations */
+	...
+}
+
+DO keep the MODULE_DEVICE_TABLE entries together in the driver to make it
+clear the different names the driver is probed for, both from DT and from
+ACPI:
+
+static struct of_device_id virtio_mmio_match[] = {
+	{ .compatible = "virtio,mmio", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, virtio_mmio_match);
+
+static const struct acpi_device_id virtio_mmio_acpi_match[] = {
+	{ "LNRO0005", },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, virtio_mmio_acpi_match);
+
+
+ASWG
+----
+The ACPI specification changes regularly. During the year 2014, for instance,
+version 5.1 was released and version 6.0 substantially completed, with most of
+the changes being driven by ARM-specific requirements. Proposed changes are
+presented and discussed in the ASWG (ACPI Specification Working Group) which
+is a part of the UEFI Forum.
+
+Participation in this group is open to all UEFI members. Please see
+http://www.uefi.org/workinggroup for details on group membership.
+
+It is the intent of the ARMv8 ACPI kernel code to follow the ACPI specification
+as closely as possible, and to only implement functionality that complies with
+the released standards from UEFI ASWG. As a practical matter, there will be
+vendors that provide bad ACPI tables or violate the standards in some way.
+If this is because of errors, quirks and fixups may be necessary, but will
+be avoided if possible. If there are features missing from ACPI that preclude
+it from being used on a platform, ECRs (Engineering Change Requests) should be
+submitted to ASWG and go through the normal approval process; for those that
+are not UEFI members, many other members of the Linux community are and would
+likely be willing to assist in submitting ECRs.
+
+
+Linux Code
+----------
+Individual items specific to Linux on ARM, contained in the Linux
+source code, are in the list that follows:
+
+ACPI_OS_NAME           This macro defines the string to be returned when
+                       an ACPI method invokes the _OS method. On ARM64
+                       systems, this macro will be "Linux" by default.
+                       The command line parameter acpi_os=<string>
+                       can be used to set it to some other value. The
+                       default value for other architectures is "Microsoft
+                       Windows NT", for example.
+
+ACPI Objects
+------------
+Detailed expectations for ACPI tables and objects are listed in the file
+Documentation/arm64/acpi_object_usage.txt.
+
+
+References
+----------
+[0] http://silver.arm.com -- document ARM-DEN-0029, or newer
+    "Server Base System Architecture", version 2.3, dated 27 Mar 2014
+
+[1] http://infocenter.arm.com/help/topic/com.arm.doc.den0044a/Server_Base_Boot_R...
+    Document ARM-DEN-0044A, or newer: "Server Base Boot Requirements, System
+    Software on ARM Platforms", dated 16 Aug 2014
+
+[2] http://www.secretlab.ca/archives/151, 10 Jan 2015, Copyright (c) 2015,
+    Linaro Ltd., written by Grant Likely. A copy of the verbatim text (apart
+    from formatting) is also in Documentation/arm64/why_use_acpi.txt.
+
+[3] AMD ACPI for Seattle platform documentation:
+    http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Seattle_ACPI_...
+
+[4] http://www.uefi.org/acpi -- please see the link for the "ACPI _DSD Device
+    Property Registry Instructions"
+
+[5] http://www.uefi.org/acpi -- please see the link for the "_DSD (Device
+    Specific Data) Implementation Guide"
+
+[6] Kernel code for the unified device property interface can be found in
+    include/linux/property.h and drivers/base/property.c.
+
+
+Authors
+-------
+Al Stone al.stone@linaro.org
+Graeme Gregory graeme.gregory@linaro.org
+Hanjun Guo hanjun.guo@linaro.org
+
+Grant Likely grant.likely@linaro.org, for the "Why ACPI on ARM?" section
+
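The fallback order described in the "Booting using ACPI tables" section of the patch above can be summarized in a few lines of plain C. What follows is only a userspace sketch of that paragraph, not kernel code: the enum names and choose_hw_desc() are hypothetical and do not appear anywhere in this series, and acpi=off is modelled on the assumption that it means "never consult ACPI tables".

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these are kernel symbols. */
enum acpi_param { ACPI_PARAM_NONE, ACPI_PARAM_FORCE, ACPI_PARAM_OFF };
enum hw_desc { HW_DESC_NONE, HW_DESC_DT, HW_DESC_ACPI };

/*
 * Pick the hardware description to boot with, following the prose in
 * "Booting using ACPI tables". HW_DESC_NONE means the kernel cannot boot.
 */
enum hw_desc choose_hw_desc(bool have_dt, bool have_acpi, enum acpi_param param)
{
	switch (param) {
	case ACPI_PARAM_FORCE:
		/* acpi=force: try ACPI first, fall back to DT. */
		if (have_acpi)
			return HW_DESC_ACPI;
		return have_dt ? HW_DESC_DT : HW_DESC_NONE;
	case ACPI_PARAM_OFF:
		/* Assumption: acpi=off means ACPI tables are never used. */
		return have_dt ? HW_DESC_DT : HW_DESC_NONE;
	case ACPI_PARAM_NONE:
	default:
		/* No parameter: DT first, ACPI only if no DT is present. */
		if (have_dt)
			return HW_DESC_DT;
		return have_acpi ? HW_DESC_ACPI : HW_DESC_NONE;
	}
}
```

With both a DT and ACPI tables present, choose_hw_desc(true, true, ACPI_PARAM_NONE) selects DT and choose_hw_desc(true, true, ACPI_PARAM_FORCE) selects ACPI; only when neither description exists does the function report that there is nothing to boot from.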
On 02/02/2015 06:45 AM, Hanjun Guo wrote:
From: Graeme Gregory graeme.gregory@linaro.org
Add documentation for the guidelines of how to use ACPI on ARM64.
This patch has a slight corruption to it:
In-Reply-To: 1422881149-8177-1-git-send-email-hanjun.guo@linaro.org
References: 1422881149-8177-1-git-send-email-hanjun.guo@linaro.org
MIME-Version: 1.0
Content-Type: text/plain; charset=n
                                  ^
git-am chokes on this:
fatal: cannot convert from n to UTF-8
Replacing the 'n' with '"us-ascii"' fixes it.
Patch #21 has the same problem, although it also has a trailing whitespace problem.
On 2015-02-03 03:01, Timur Tabi wrote:
Hi Timur, I will fix that in the next version.
Thanks Hanjun
From: Al Stone al.stone@linaro.org
Two more documentation files are also being added: (1) A verbatim copy of the "Why ACPI on ARM?" blog posting by Grant Likely, which is also summarized in arm-acpi.txt, and
(2) A section by section review of the ACPI spec (acpi_object_usage.txt) to note recommendations and prohibitions on the use of the numerous ACPI tables and objects. This sets out the current expectations of the firmware by Linux very explicitly (or as explicitly as I can, for now).
CC: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com
CC: Yi Li phoenix.liyi@huawei.com
CC: Mark Langsdorf mlangsdo@redhat.com
CC: Ashwin Chaugule ashwinc@codeaurora.org
Signed-off-by: Al Stone al.stone@linaro.org
Signed-off-by: Hanjun Guo hanjun.guo@linaro.org
---
 Documentation/arm64/acpi_object_usage.txt | 592 ++++++++++++++++++++++++++++++
 Documentation/arm64/why_use_acpi.txt      | 231 ++++++++++++
 2 files changed, 823 insertions(+)
 create mode 100644 Documentation/arm64/acpi_object_usage.txt
 create mode 100644 Documentation/arm64/why_use_acpi.txt
diff --git a/Documentation/arm64/acpi_object_usage.txt b/Documentation/arm64/acpi_object_usage.txt new file mode 100644 index 0000000..2c4f733 --- /dev/null +++ b/Documentation/arm64/acpi_object_usage.txt @@ -0,0 +1,592 @@ +ACPI Tables +----------- +The expectations of individual ACPI tables are discussed in the list that +follows. + +If a section number is used, it refers to a section number in the ACPI +specification where the object is defined. If "Signature Reserved" is used, +the table signature (the first four bytes of the table) is the only portion +of the table recognized by the specification, and the actual table is defined +outside of the UEFI Forum (see Section 5.2.6 of the specification). + +For ACPI on arm64, tables also fall into the following categories: + + -- Required: DSDT, FADT, GTDT, MADT, MCFG, RSDP, SPCR, XSDT + + -- Recommended: BERT, EINJ, ERST, HEST, SSDT + + -- Optional: BGRT, CPEP, CSRT, DRTM, ECDT, FACS, FPDT, MCHI, MPST, + MSCT, RASF, SBST, SLIT, SPMI, SRAT, TCPA, TPM2, UEFI + + -- Not supported: BOOT, DBG2, DBGP, DMAR, ETDT, HPET, IBFT, IVRS, + LPIT, MSDM, RSDT, SLIC, WAET, WDAT, WDRT, WPBT + + +Table Usage for ARMv8 Linux +----- ---------------------------------------------------------------- +BERT Section 18.3 (signature == "BERT") + == Boot Error Record Table == + Must be supplied if RAS support is provided by the platform. It + is recommended this table be supplied. + +BOOT Signature Reserved (signature == "BOOT") + == simple BOOT flag table == + Microsoft only table, will not be supported. + +BGRT Section 5.2.22 (signature == "BGRT") + == Boot Graphics Resource Table == + Optional, not currently supported, with no real use-case for an + ARM server. + +CPEP Section 5.2.18 (signature == "CPEP") + == Corrected Platform Error Polling table == + Optional, not currently supported, and not recommended until such + time as ARM-compatible hardware is available, and the specification + suitably modified. 
+ +CSRT Signature Reserved (signature == "CSRT") + == Core System Resources Table == + Optional, not currently supported. + +DBG2 Signature Reserved (signature == "DBG2") + == DeBuG port table 2 == + Microsoft only table, will not be supported. + +DBGP Signature Reserved (signature == "DBGP") + == DeBuG Port table == + Microsoft only table, will not be supported. + +DSDT Section 5.2.11.1 (signature == "DSDT") + == Differentiated System Description Table == + A DSDT is required; see also SSDT. + + ACPI tables contain only one DSDT but can contain one or more SSDTs, + which are optional. Each SSDT can only add to the ACPI namespace, + but cannot modify or replace anything in the DSDT. + +DMAR Signature Reserved (signature == "DMAR") + == DMA Remapping table == + x86 only table, will not be supported. + +DRTM Signature Reserved (signature == "DRTM") + == Dynamic Root of Trust for Measurement table == + Optional, not currently supported. + +ECDT Section 5.2.16 (signature == "ECDT") + == Embedded Controller Description Table == + Optional, not currently supported, but could be used on ARM if and + only if one uses the GPE_BIT field to represent an IRQ number, since + there are no GPE blocks defined in hardware reduced mode. This would + need to be modified in the ACPI specification. + +EINJ Section 18.6 (signature == "EINJ") + == Error Injection table == + This table is very useful for testing platform response to error + conditions; it allows one to inject an error into the system as + if it had actually occurred. However, this table should not be + shipped with a production system; it should be dynamically loaded + and executed with the ACPICA tools only during testing. + +ERST Section 18.5 (signature == "ERST") + == Error Record Serialization Table == + Must be supplied if RAS support is provided by the platform. It + is recommended this table be supplied. 
+ +ETDT Signature Reserved (signature == "ETDT") + == Event Timer Description Table == + Obsolete table, will not be supported. + +FACS Section 5.2.10 (signature == "FACS") + == Firmware ACPI Control Structure == + It is unlikely that this table will be terribly useful. If it is + provided, the Global Lock will NOT be used since it is not part of + the hardware reduced profile, and only 64-bit address fields will + be considered valid. + +FADT Section 5.2.9 (signature == "FACP") + == Fixed ACPI Description Table == + Required for arm64. + + The HW_REDUCED_ACPI flag must be set. All of the fields that are + to be ignored when HW_REDUCED_ACPI is set are expected to be set to + zero. + + If an FACS table is provided, the X_FIRMWARE_CTRL field is to be + used, not FIRMWARE_CTRL. + + If PSCI is used (as is recommended), make sure that ARM_BOOT_ARCH is + filled in properly -- that the PSCI_COMPLIANT flag is set and that + PSCI_USE_HVC is set or unset as needed (see table 5-37). + + For the DSDT that is also required, the X_DSDT field is to be used, + not the DSDT field. + +FPDT Section 5.2.23 (signature == "FPDT") + == Firmware Performance Data Table == + Optional, not currently supported. + +GTDT Section 5.2.24 (signature == "GTDT") + == Generic Timer Description Table == + Required for arm64. + +HEST Section 18.3.2 (signature == "HEST") + == Hardware Error Source Table == + Until further error source types are defined, use only types 6 (AER + Root Port), 7 (AER Endpoint), 8 (AER Bridge), or 9 (Generic Hardware + Error Source). Firmware first error handling is possible if and only + if Trusted Firmware is being used on arm64. + + Must be supplied if RAS support is provided by the platform. It + is recommended this table be supplied. + +HPET Signature Reserved (signature == "HPET") + == High Precision Event timer Table == + x86 only table, will not be supported. 
+ +IBFT Signature Reserved (signature == "IBFT") + == iSCSI Boot Firmware Table == + Microsoft defined table, support TBD. + +IVRS Signature Reserved (signature == "IVRS") + == I/O Virtualization Reporting Structure == + x86_64 (AMD) only table, will not be supported. + +LPIT Signature Reserved (signature == "LPIT") + == Low Power Idle Table == + x86 only table as of ACPI 5.1; future versions have been adapted for + use with ARM and will be recommended in order to support ACPI power + management. + +MADT Section 5.2.12 (signature == "APIC") + == Multiple APIC Description Table == + Required for arm64. Only the GIC interrupt controller structures + should be used (types 0xA - 0xE). + +MCFG Signature Reserved (signature == "MCFG") + == Memory-mapped ConFiGuration space == + If the platform supports PCI/PCIe, an MCFG table is required. + +MCHI Signature Reserved (signature == "MCHI") + == Management Controller Host Interface table == + Optional, not currently supported. + +MPST Section 5.2.21 (signature == "MPST") + == Memory Power State Table == + Optional, not currently supported. + +MSDM Signature Reserved (signature == "MSDM") + == Microsoft Data Management table == + Microsoft only table, will not be supported. + +MSCT Section 5.2.19 (signature == "MSCT") + == Maximum System Characteristic Table == + Optional, not currently supported. + +RASF Section 5.2.20 (signature == "RASF") + == RAS Feature table == + Optional, not currently supported. + +RSDP Section 5.2.5 (signature == "RSD PTR") + == Root System Description PoinTeR == + Required for arm64. + +RSDT Section 5.2.7 (signature == "RSDT") + == Root System Description Table == + Since this table can only provide 32-bit addresses, it is deprecated + on arm64, and will not be used. + +SBST Section 5.2.14 (signature == "SBST") + == Smart Battery Subsystem Table == + Optional, not currently supported. 
+
+SLIC   Signature Reserved (signature == "SLIC")
+       == Software LIcensing table ==
+       Microsoft only table, will not be supported.
+
+SLIT   Section 5.2.17 (signature == "SLIT")
+       == System Locality distance Information Table ==
+       Optional in general, but required for NUMA systems.
+
+SPCR   Signature Reserved (signature == "SPCR")
+       == Serial Port Console Redirection table ==
+       Required for arm64.
+
+SPMI   Signature Reserved (signature == "SPMI")
+       == Server Platform Management Interface table ==
+       Optional, not currently supported.
+
+SRAT   Section 5.2.16 (signature == "SRAT")
+       == System Resource Affinity Table ==
+       Optional, but if used, only the GICC Affinity structures are read.
+       To support NUMA, this table is required.
+
+SSDT   Section 5.2.11.2 (signature == "SSDT")
+       == Secondary System Description Table ==
+       These tables are a continuation of the DSDT; these are recommended
+       for use with devices that can be added to a running system, but can
+       also serve the purpose of dividing up device descriptions into more
+       manageable pieces.
+
+       An SSDT can only ADD to the ACPI namespace. It cannot modify or
+       replace existing device descriptions already in the namespace.
+
+       These tables are optional, however. ACPI tables should contain only
+       one DSDT but can contain many SSDTs.
+
+TCPA   Signature Reserved (signature == "TCPA")
+       == Trusted Computing Platform Alliance table ==
+       Optional, not currently supported, and may need changes to fully
+       interoperate with arm64.
+
+TPM2   Signature Reserved (signature == "TPM2")
+       == Trusted Platform Module 2 table ==
+       Optional, not currently supported, and may need changes to fully
+       interoperate with arm64.
+
+UEFI   Signature Reserved (signature == "UEFI")
+       == UEFI ACPI data table ==
+       Optional, not currently supported. No known use case for arm64,
+       at present.
+
+WAET   Signature Reserved (signature == "WAET")
+       == Windows ACPI Emulated devices Table ==
+       Microsoft only table, will not be supported.
+
+WDAT   Signature Reserved (signature == "WDAT")
+       == Watch Dog Action Table ==
+       Microsoft only table, will not be supported.
+
+WDRT   Signature Reserved (signature == "WDRT")
+       == Watch Dog Resource Table ==
+       Microsoft only table, will not be supported.
+
+WPBT   Signature Reserved (signature == "WPBT")
+       == Windows Platform Binary Table ==
+       Microsoft only table, will not be supported.
+
+XSDT   Section 5.2.8 (signature == "XSDT")
+       == eXtended System Description Table ==
+       Required for arm64.
+
+
+ACPI Objects
+------------
+The expectations on individual ACPI objects are discussed in the list that
+follows:
+
+Name   Section         Usage for ARMv8 Linux
+----   ------------    -------------------------------------------------
+_ADR   6.1.1           Use as needed.
+
+_BBN   6.5.5           Use as needed; PCI-specific.
+
+_BDN   6.5.3           Optional; not likely to be used on arm64.
+
+_CCA   6.2.17          This method should be defined for all bus masters
+                       on arm64. While cache coherency is assumed, making
+                       it explicit ensures the kernel will set up DMA as
+                       it should.
+
+_CDM   6.2.1           Optional, to be used only for processor devices.
+
+_CID   6.1.2           Use as needed.
+
+_CLS   6.1.3           Use as needed.
+
+_CRS   6.2.2           Required on arm64.
+
+_DCK   6.5.2           Optional; not likely to be used on arm64.
+
+_DDN   6.1.4           This field can be used for a device name. However,
+                       it is meant for DOS device names (e.g., COM1), so be
+                       careful of its use across OSes.
+
+_DEP   6.5.8           Use as needed.
+
+_DIS   6.2.3           Optional, for power management use.
+
+_DLM   5.7.5           Optional.
+
+_DMA   6.2.4           Optional.
+
+_DSD   6.2.5           To be used with caution. If this object is used, try
+                       to use it within the constraints already defined by the
+                       Device Properties UUID. Only in rare circumstances
+                       should it be necessary to create a new _DSD UUID.
+
+                       In either case, submit the _DSD definition along with
+                       any driver patches for discussion, especially when
+                       device properties are used. A driver will not be
+                       considered complete without a corresponding _DSD
+                       description.
+                       Once approved by kernel maintainers,
+                       the UUID or device properties must then be registered
+                       with the UEFI Forum; this may cause some iteration as
+                       more than one OS will be registering entries.
+
+_DSM                   Do not use this method. It is not standardized, the
+                       return values are not well documented, and it is
+                       currently a frequent source of error.
+
+_DSW   7.2.1           Use as needed; power management specific.
+
+_EDL   6.3.1           Optional.
+
+_EJD   6.3.2           Optional.
+
+_EJx   6.3.3           Optional.
+
+_FIX   6.2.7           x86 specific, not used on arm64.
+
+_GL    5.7.1           This object is not to be used in hardware reduced
+                       mode, and therefore should not be used on arm64.
+
+_GLK   6.5.7           This object requires a global lock be defined; there
+                       is no global lock on arm64 since it runs in hardware
+                       reduced mode. Hence, do not use this object on arm64.
+
+_GPE   5.3.1           This namespace is for x86 use only. Do not use it
+                       on arm64.
+
+_GSB   6.2.7           Optional.
+
+_HID   6.1.5           Use as needed. This is the primary object to use in
+                       device probing, though _CID and _CLS may also be used.
+
+_HPP   6.2.8           Optional, PCI specific.
+
+_HPX   6.2.9           Optional, PCI specific.
+
+_HRV   6.1.6           Optional, use as needed to clarify device behavior; in
+                       some cases, this may be easier to use than _DSD.
+
+_INI   6.5.1           Not required, but can be useful in setting up devices
+                       when UEFI leaves them in a state that may not be what
+                       the driver expects before it starts probing.
+
+_IRC   7.2.15          Use as needed; power management specific.
+
+_LCK   6.3.4           Optional.
+
+_MAT   6.2.10          Optional; see also the MADT.
+
+_MLS   6.1.7           Optional, but highly recommended for use in
+                       internationalization.
+
+_OFF   7.1.2           It is recommended to define this method for any device
+                       that can be turned on or off.
+
+_ON    7.1.3           It is recommended to define this method for any device
+                       that can be turned on or off.
+
+_OS    5.7.3           This method will return "Linux" by default (this is
+                       the value of the macro ACPI_OS_NAME on Linux).
+                       The command line parameter acpi_os=<string> can be
+                       used to set it to some other value.
+
+_OSC   6.2.11          This method can be a global method in ACPI (i.e.,
+                       \_SB._OSC), or it may be associated with a specific
+                       device (e.g., \_SB.DEV0._OSC), or both. When used
+                       as a global method, only capabilities published in
+                       the ACPI specification are allowed. When used as
+                       a device-specific method, the process described for
+                       using _DSD MUST be used to create an _OSC definition;
+                       out-of-process use of _OSC is not allowed. That is,
+                       submit the device-specific _OSC usage description as
+                       part of the kernel driver submission, get it approved
+                       by the kernel community, then register it with the
+                       UEFI Forum.
+
+_OSI   5.7.2           Deprecated on ARM64. Any invocation of this method
+                       will print a warning on the console and return false.
+                       That is, as far as ACPI firmware is concerned, _OSI
+                       cannot be used to determine what sort of system is
+                       being used or what functionality is provided. The
+                       _OSC method is to be used instead.
+
+_OST   6.3.5           Optional.
+
+_PDC   8.4.1           Deprecated, do not use on arm64.
+
+_PIC   5.8.1           The method should not be used. On arm64, the only
+                       interrupt model available is GIC.
+
+_PLD   6.1.8           Optional.
+
+_PR    5.3.1           This namespace is for x86 use only on legacy systems.
+                       Do not use it on arm64.
+
+_PRS   6.2.12          Optional.
+
+_PRT   6.2.13          Required as part of the definition of all PCI root
+                       devices.
+
+_PRW   7.2.13          Use as needed; power management specific.
+
+_PRx   7.2.8-11        Use as needed; power management specific. If _PR0 is
+                       defined, _PR3 must also be defined.
+
+_PSC   7.2.6           Use as needed; power management specific.
+
+_PSE   7.2.7           Use as needed; power management specific.
+
+_PSW   7.2.14          Use as needed; power management specific.
+
+_PSx   7.2.2-5         Use as needed; power management specific. If _PS0 is
+                       defined, _PS3 must also be defined. If clocks or
+                       regulators need adjusting to be consistent with power
+                       usage, change them in these methods.
+
+_PTS   7.3.1           Use as needed; power management specific.
+
+_PXM   6.2.14          Optional.
+
+_REG   6.5.4           Use as needed.
+
+_REV   5.7.4           Always returns the latest version of ACPI supported.
+
+_RMV   6.3.6           Optional.
+
+_SB    5.3.1           Required on arm64; all devices must be defined in this
+                       namespace.
+
+_SEG   6.5.6           Use as needed; PCI-specific.
+
+_SI    5.3.1,          Optional.
+       9.1
+
+_SLI   6.2.15          Optional; recommended when SLIT table is in use.
+
+_STA   6.3.7,          It is recommended to define this method for any device
+       7.1.4           that can be turned on or off.
+
+_SRS   6.2.16          Optional; see also _PRS.
+
+_STR   6.1.10          Recommended for conveying device names to end users;
+                       this is preferred over using _DDN.
+
+_SUB   6.1.9           Use as needed; _HID or _CID are preferred.
+
+_SUN   6.1.11          Optional.
+
+_Sx    7.3.2           Use as needed; power management specific.
+
+_SxD   7.2.16-19       Use as needed; power management specific.
+
+_SxW   7.2.20-24       Use as needed; power management specific.
+
+_SWS   7.3.3           Use as needed; power management specific; this may
+                       require specification changes for use on arm64.
+
+_TTS   7.3.4           Use as needed; power management specific.
+
+_TZ    5.3.1           Optional.
+
+_UID   6.1.12          Recommended for distinguishing devices of the same
+                       class; define it if at all possible.
+
+_WAK   7.3.5           Use as needed; power management specific.
+
+
+ACPI Event Model
+----------------
+Do not use GPE block devices; these are not supported in the hardware reduced
+profile used by arm64. Since there are no GPE blocks defined for use on ARM
+platforms, GPIO-signaled interrupts should be used for creating system events.
+
+
+ACPI Processor Control
+----------------------
+Section 8 of the ACPI specification is currently undergoing change that
+should be completed in the 6.0 version of the specification. Processor
+performance control will be handled differently for arm64 at that point
+in time. Processor aggregator devices (section 8.5) will not be used,
+for example, but another similar mechanism instead.
+
+While UEFI constrains what we can say until the release of 6.0, it is
+recommended that CPPC (8.4.5) be used as the primary model. This will
+still be useful into the future. C-states and P-states will still be
+provided, but most of the current design work appears to favor CPPC.
+
+Further, it is essential that the ARMv8 SoC provide a fully functional
+implementation of PSCI; this will be the only mechanism supported by ACPI
+to control CPU power state (including secondary CPU booting).
+
+More details will be provided on the release of the ACPI 6.0 specification.
+
+
+ACPI System Address Map Interfaces
+----------------------------------
+In Section 15 of the ACPI specification, several methods are mentioned as
+possible mechanisms for conveying memory resource information to the kernel.
+For arm64, we will only support UEFI for booting with ACPI, hence the UEFI
+GetMemoryMap() boot service is the only mechanism that will be used.
+
+
+ACPI Platform Error Interfaces (APEI)
+-------------------------------------
+The APEI tables supported are described above.
+
+APEI requires the equivalent of an SCI and an NMI on ARMv8. The SCI is used
+to notify the OSPM of errors that have occurred but can be corrected and the
+system can continue correct operation, even if possibly degraded. The NMI is
+used to indicate fatal errors that cannot be corrected, and require immediate
+attention.
+
+Since there is no direct equivalent of the x86 SCI or NMI, arm64 handles
+these slightly differently. The SCI is handled as a normal GPIO-signaled
+interrupt; given that these are corrected (or correctable) errors being
+reported, this is sufficient. The NMI is emulated as the highest priority
+GPIO-signaled interrupt possible. This implies some caution must be used
+since there could be interrupts at higher privilege levels or even interrupts
+at the same priority as the emulated NMI. In Linux, this should not be the
+case but one should be aware it could happen.
+
+
+ACPI Objects Not Supported on ARM64
+-----------------------------------
+While this may change in the future, there are several classes of objects
+that can be defined, but are not currently of general interest to ARM servers.
+
+These are not supported:
+
+       -- Section 9.2: ambient light sensor devices
+
+       -- Section 9.3: battery devices
+
+       -- Section 9.4: lids (e.g., laptop lids)
+
+       -- Section 9.8.2: IDE controllers
+
+       -- Section 9.9: floppy controllers
+
+       -- Section 9.10: GPE block devices
+
+       -- Section 9.15: PC/AT RTC/CMOS devices
+
+       -- Section 9.16: user presence detection devices
+
+       -- Section 9.17: I/O APIC devices; all GICs must be enumerable via MADT
+
+       -- Section 9.18: time and alarm devices (see 9.15)
+
+
+ACPI Objects Not Yet Implemented
+--------------------------------
+While these objects have x86 equivalents, and they do make some sense in ARM
+servers, there is either no hardware available at present, or in some cases
+there may not yet be a non-ARM implementation. Hence, they are currently not
+implemented though that may change in the future.
+
+Not yet implemented are:
+
+       -- Section 10: power source and power meter devices
+
+       -- Section 11: thermal management
+
+       -- Section 12: embedded controllers interface
+
+       -- Section 13: SMBus interfaces
+
+       -- Section 17: NUMA support (prototypes have been submitted for
+          review)
+
diff --git a/Documentation/arm64/why_use_acpi.txt b/Documentation/arm64/why_use_acpi.txt
new file mode 100644
index 0000000..9bb583e
--- /dev/null
+++ b/Documentation/arm64/why_use_acpi.txt
@@ -0,0 +1,231 @@
+Why ACPI on ARM?
+----------------
+Copyright (c) 2015, Linaro, Ltd.
+Author: Grant Likely <grant.likely@linaro.org>
+
+Why are we doing ACPI on ARM? That question has been asked many times, but
+we haven’t yet had a good summary of the most important reasons for wanting
+ACPI on ARM. This article is an attempt to state the rationale clearly.
+
+During an email conversation late last year, Catalin Marinas asked for
+a summary of exactly why we want ACPI on ARM; Dong Wei replied with the
+following list:
+> 1. Support multiple OSes, including Linux and Windows
+> 2. Support device configurations
+> 3. Support dynamic device configurations (hot add/removal)
+> 4. Support hardware abstraction through control methods
+> 5. Support power management
+> 6. Support thermal management
+> 7. Support RAS interfaces
+
+The above list is certainly true in that all of them need to be supported.
+However, that list doesn’t give the rationale for choosing ACPI. We already
+have DT mechanisms for doing most of the above, and can certainly create
+new bindings for anything that is missing. So, if it isn’t an issue of
+functionality, then how does ACPI differ from DT and why is ACPI a better
+fit for general purpose ARM servers?
+
+The difference is in the support model. To explain what I mean, I’m first
+going to expand on each of the items above and discuss the similarities and
+differences between ACPI and DT. Then, with that as the groundwork, I’ll
+discuss how ACPI is a better fit for the general purpose hardware support
+model.
+
+
+Device Configurations
+---------------------
+2. Support device configurations
+3. Support dynamic device configurations (hot add/removal)
+
+From day one, DT was about device configurations. There isn’t any significant
+difference between ACPI & DT here. In fact, the majority of ACPI tables are
+completely analogous to DT descriptions. With the exception of the DSDT and
+SSDT tables, most ACPI tables are merely flat data used to describe hardware.
+
+DT platforms have also supported dynamic configuration and hotplug for years.
+There isn’t a lot here that differentiates between ACPI and DT.
+The biggest difference is that dynamic changes to the ACPI namespace can be
+triggered by ACPI methods, whereas for DT changes are received as messages
+from firmware and have been very much platform specific (e.g. IBM pSeries
+does this).
+
+
+Power Management
+----------------
+4. Support hardware abstraction through control methods
+5. Support power management
+6. Support thermal management
+
+Power, thermal, and clock management can all be dealt with as a group. ACPI
+defines a power management model (OSPM) that both the platform and the OS
+conform to. The OS implements the OSPM state machine, but the platform can
+provide state change behaviour in the form of bytecode methods. Methods can
+access hardware directly or hand off PM operations to a coprocessor. The OS
+really doesn’t have to care about the details as long as the platform obeys
+the rules of the OSPM model.
+
+With DT, the kernel has device drivers for each and every component in the
+platform, and configures them using DT data. DT itself doesn’t have a PM model.
+Rather the PM model is an implementation detail of the kernel. Device drivers
+use DT data to decide how to handle PM state changes. We have clock, pinctrl,
+and regulator frameworks in the kernel for working out runtime PM. However,
+this only works when all the drivers and support code have been merged into
+the kernel. When the kernel’s PM model doesn’t work for new hardware, then we
+change the model. This works very well for mobile/embedded because the vendor
+controls the kernel. We can change things when we need to, but we also struggle
+with getting board support mainlined.
+
+This difference has a big impact when it comes to OS support. Engineers from
+hardware vendors, Microsoft, and most vocally Red Hat have all told me bluntly
+that rebuilding the kernel doesn’t work for enterprise OS support. Their model
+is based around a fixed OS release that ideally boots out-of-the-box.
+It may still need additional device drivers for specific peripherals/features,
+but from a system view, the OS works. When additional drivers are provided
+separately, those drivers fit within the existing OSPM model for power
+management. This is where ACPI has a technical advantage over DT. The ACPI
+OSPM model and its bytecode give the HW vendors a level of abstraction
+under their control, not the kernel’s. When the hardware behaves differently
+from what the OS expects, the vendor is able to change the behaviour without
+changing the HW or patching the OS.
+
+At this point you’d be right to point out that it is harder to get the whole
+system working correctly when behaviour is split between the kernel and the
+platform. The OS must trust that the platform doesn’t violate the OSPM model.
+All manner of bad things happen if it does. That is exactly why the DT model
+doesn’t encode behaviour: It is easier to make changes and fix bugs when
+everything is within the same code base. We don’t need a platform/kernel
+split when we can modify the kernel.
+
+However, the enterprise folks don’t have that luxury. The platform/kernel
+split isn’t a design choice. It is a characteristic of the market. Hardware
+and OS vendors each have their own product timetables, and they don’t line
+up. The timeline for getting patches into the kernel and flowing through into
+OS releases puts OS support far downstream from the actual release of hardware.
+Hardware vendors simply cannot wait for OS support to come online to be able to
+release their products. They need to be able to work with available releases,
+and make their hardware behave in the way the OS expects. The advantage of ACPI
+OSPM is that it defines behaviour and limits what the hardware is allowed to do
+without involving the kernel.
+
+What remains is sorting out how we make sure everything works.
+How do we make sure there is enough cross platform testing to ensure new
+hardware doesn’t ship broken and that new OS releases don’t break on old
+hardware? Those are the reasons why a UEFI/ACPI firmware summit is being
+organized, it’s why the UEFI forum holds plugfests 3 times a year, and it is
+why we’re working on FWTS and LuvOS.
+
+
+Reliability, Availability & Serviceability (RAS)
+------------------------------------------------
+7. Support RAS interfaces
+
+This isn’t a question of whether or not DT can support RAS. Of course it can.
+Rather it is a matter of RAS bindings already existing for ACPI, including a
+usage model. We’ve barely begun to explore this on DT. This item doesn’t make
+ACPI technically superior to DT, but it certainly makes it more mature.
+
+
+Multiplatform Support
+---------------------
+1. Support multiple OSes, including Linux and Windows
+
+I’m tackling this item last because I think it is the most contentious for
+those of us in the Linux world. I wanted to get the other issues out of the
+way before addressing it.
+
+The separation between hardware vendors and OS vendors in the server market
+is new for ARM. For the first time ARM hardware and OS release cycles are
+completely decoupled from each other, and neither is expected to have specific
+knowledge of the other (i.e. the hardware vendor doesn’t control the choice of
+OS). ARM and their partners want to create an ecosystem of independent OSes
+and hardware platforms that don’t explicitly require the former to be ported
+to the latter.
+
+Now, one could argue that Linux is driving the potential market for ARM
+servers, and therefore Linux is the only thing that matters, but hardware
+vendors don’t see it that way. For hardware vendors it is in their best
+interest to support as wide a choice of OSes as possible in order to catch
+the widest potential customer base.
+Even if the majority choose Linux, some will choose BSD, some will choose
+Windows, and some will choose something else. Whether or not we think this
+is foolish is beside the point; it isn’t something we have influence over.
+
+During early ARM server planning meetings between ARM, its partners and other
+industry representatives (myself included) we discussed this exact point.
+Before us were two options, DT and ACPI. As one of the Linux people in the
+room, I advised that ACPI’s closed governance model was a show stopper for
+Linux and that DT is the working interface. Microsoft on the other hand made
+it abundantly clear that ACPI was the only interface that they would support.
+For their part, the hardware vendors stated the platform abstraction behaviour
+of ACPI is a hard requirement for their support model and that they would not
+close the door on either Linux or Windows.
+
+However, the one thing that all of us could agree on was that supporting
+multiple interfaces doesn’t help anyone: It would require twice as much
+effort on defining bindings (once for Linux-DT and once for Windows-ACPI)
+and it would require firmware to describe everything twice. Eventually we
+reached the compromise to use ACPI, but on the condition of opening the
+governance process to give Linux engineers equal influence over the
+specification. The fact that we now have a much better seat at the ACPI
+table, for both ARM and x86, is a direct result of these early ARM server
+negotiations. We are no longer second class citizens in the ACPI world and
+are actually driving much of the recent development.
+
+I know that this line of thought is more about market forces rather than a
+hard technical argument between ACPI and DT, but it is an equally significant
+one. Agreeing on a single way of doing things is important. The ARM server
+ecosystem is better for the agreement to use the same interface for all
+operating systems. This is what is meant by standards compliant.
+The standard is a codification of the mutually agreed interface. It provides
+confidence that all vendors are using the same rules for interoperability.
+
+
+Summary
+-------
+To summarize, here is the short form rationale for ACPI on ARM:
+
+-- ACPI’s bytecode allows the platform to encode behaviour. DT explicitly
+   does not support this. For hardware vendors, being able to encode behaviour
+   is an important tool for supporting operating system releases on new
+   hardware.
+
+-- ACPI’s OSPM defines a power management model that constrains what the
+   platform is allowed to do into a specific model while still having
+   flexibility in hardware design.
+
+-- For enterprise use-cases, ACPI has established bindings, such as for RAS,
+   which are used in production. DT does not. Yes, we can define those bindings
+   but doing so means ARM and x86 will use completely different code paths in
+   both firmware and the kernel.
+
+-- Choosing a single interface for platform/OS abstraction is important. It
+   is not reasonable to require vendors to implement both DT and ACPI if they
+   want to support multiple operating systems. Agreeing on a single interface
+   instead of being fragmented into per-OS interfaces makes for better
+   interoperability overall.
+
+-- The ACPI governance process works well and we’re at the same table as HW
+   vendors and other OS vendors. In fact, there is no longer any reason to
+   feel that ACPI is a Windows thing or that we are playing second fiddle to
+   Microsoft. The move of ACPI governance into the UEFI forum has significantly
+   opened up the processes, and currently, a large portion of the changes being
+   made to ACPI is being driven by Linux.
+
+At the beginning of this article I made the statement that the difference
+is in the support model. For servers, responsibility for hardware behaviour
+cannot be purely the domain of the kernel, but rather is split between the
+platform and the kernel.
+ACPI frees the OS from needing to understand all the minute details of the
+hardware so that the OS doesn’t need to be ported to each and every device
+individually. It allows the hardware vendors to take responsibility for PM
+behaviour without depending on an OS release cycle which is not under their
+control.
+
+ACPI is also important because hardware and OS vendors have already worked
+out how to use it to support the general purpose ecosystem. The infrastructure
+is in place, the bindings are in place, and the process is in place. DT does
+exactly what we need it to when working with vertically integrated devices,
+but we don’t have good processes for supporting what the server vendors need.
+We could potentially get there with DT, but doing so doesn’t buy us anything.
+ACPI already does what the hardware vendors need, Microsoft won’t collaborate
+with us on DT, and the hardware vendors would still need to provide two
+completely separate firmware interfaces: one for Linux and one for Windows.
+
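As an illustration of the FADT expectations listed in acpi_object_usage.txt (revision at least 5.1, HW_REDUCED_ACPI set, X_DSDT used instead of DSDT, PSCI flags carried in ARM_BOOT_ARCH), here is a minimal userspace sketch in C. The struct fadt_summary type, the macro names and both helper functions are hypothetical and are not taken from the kernel or from this patch set; only the bit positions are meant to follow the ACPI 5.1 FADT definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions per the ACPI 5.1 FADT; the macro names are hypothetical. */
#define FADT_FLAG_HW_REDUCED_ACPI  (1u << 20)  /* in the Flags field */
#define ARM_BOOT_PSCI_COMPLIANT    (1u << 0)   /* in ARM_BOOT_ARCH */
#define ARM_BOOT_PSCI_USE_HVC      (1u << 1)   /* in ARM_BOOT_ARCH */

/* Hypothetical, simplified view of the FADT fields discussed above. */
struct fadt_summary {
	uint8_t  major_version;  /* FADT major version */
	uint8_t  minor_version;  /* FADT minor version */
	uint32_t flags;          /* fixed feature flags */
	uint16_t arm_boot_arch;  /* ARM boot architecture flags */
	uint64_t x_dsdt;         /* 64-bit DSDT address */
};

enum psci_conduit { PSCI_NONE, PSCI_SMC, PSCI_HVC };

/* True when the FADT meets the arm64 expectations described in the text. */
static bool fadt_ok_for_arm64(const struct fadt_summary *f)
{
	assert(f != NULL);

	/* ACPI is only usable on arm64 from FADT revision 5.1 onwards. */
	if (f->major_version < 5 ||
	    (f->major_version == 5 && f->minor_version < 1))
		return false;

	/* arm64 runs in the hardware reduced profile only. */
	if (!(f->flags & FADT_FLAG_HW_REDUCED_ACPI))
		return false;

	/* The DSDT is required and must be reachable through X_DSDT. */
	return f->x_dsdt != 0;
}

/* Pick the PSCI calling method advertised in ARM_BOOT_ARCH. */
static enum psci_conduit fadt_psci_conduit(const struct fadt_summary *f)
{
	assert(f != NULL);

	if (!(f->arm_boot_arch & ARM_BOOT_PSCI_COMPLIANT))
		return PSCI_NONE;

	return (f->arm_boot_arch & ARM_BOOT_PSCI_USE_HVC) ? PSCI_HVC : PSCI_SMC;
}
```

A caller would fill in fadt_summary from the real table, decline to use ACPI when fadt_ok_for_arm64() returns false, and treat PSCI_NONE as "no PSCI advertised by firmware".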
Much removed to cut down the size on this and to highlight a couple of specific sections pertinent to the ACPI on ARMv8 TODO List.....
On 02/02/2015 05:45 AM, Hanjun Guo wrote:
From: Al Stone <al.stone@linaro.org>
Two more documentation files are also being added: (1) A verbatim copy of the "Why ACPI on ARM?" blog posting by Grant Likely, which is also summarized in arm-acpi.txt, and
(2) A section by section review of the ACPI spec (acpi_object_usage.txt) to note recommendations and prohibitions on the use of the numerous ACPI tables and objects. This sets out the current expectations of the firmware by Linux very explicitly (or as explicitly as I can, for now).
CC: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
CC: Yi Li <phoenix.liyi@huawei.com>
CC: Mark Langsdorf <mlangsdo@redhat.com>
CC: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Al Stone <al.stone@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
 Documentation/arm64/acpi_object_usage.txt | 592 ++++++++++++++++++++++++++++++
 Documentation/arm64/why_use_acpi.txt      | 231 ++++++++++++
 2 files changed, 823 insertions(+)
 create mode 100644 Documentation/arm64/acpi_object_usage.txt
 create mode 100644 Documentation/arm64/why_use_acpi.txt
diff --git a/Documentation/arm64/acpi_object_usage.txt b/Documentation/arm64/acpi_object_usage.txt
new file mode 100644
index 0000000..2c4f733
--- /dev/null
+++ b/Documentation/arm64/acpi_object_usage.txt
@@ -0,0 +1,592 @@
+ACPI Tables
+-----------
+The expectations of individual ACPI tables are discussed in the list that
+follows.
+
+If a section number is used, it refers to a section number in the ACPI
+specification where the object is defined. If "Signature Reserved" is used,
+the table signature (the first four bytes of the table) is the only portion
+of the table recognized by the specification, and the actual table is defined
+outside of the UEFI Forum (see Section 5.2.6 of the specification).
[snip....]
+ACPI Objects
+------------
+The expectations on individual ACPI objects are discussed in the list that
+follows:
+
+Name   Section         Usage for ARMv8 Linux
+----   ------------    -------------------------------------------------
+_ADR   6.1.1           Use as needed.
[snip....]
+_DMA 6.2.4 Optional.
+_DSD   6.2.5           To be used with caution. If this object is used, try
+                       to use it within the constraints already defined by the
+                       Device Properties UUID. Only in rare circumstances
+                       should it be necessary to create a new _DSD UUID.
+
+                       In either case, submit the _DSD definition along with
+                       any driver patches for discussion, especially when
+                       device properties are used. A driver will not be
+                       considered complete without a corresponding _DSD
+                       description. Once approved by kernel maintainers,
+                       the UUID or device properties must then be registered
+                       with the UEFI Forum; this may cause some iteration as
+                       more than one OS will be registering entries.
[snip...]
So, this is my attempt to encapsulate what I think people want to have happen around the use of _DSD; I just want to make sure I point it out so it doesn't inadvertently get lost somehow.
Is this far too little? Is it sufficient? If it only addresses part of the concerns, what did I miss?
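To make the "stay within the Device Properties UUID" guidance concrete, here is a minimal sketch, in userspace C, of a check a reviewer's tooling might apply to a _DSD entry. The helper names uuid_equal() and dsd_uses_device_properties() are hypothetical and do not exist in the kernel; the UUID string is the Device Properties UUID as I understand it to be published via the UEFI Forum, so treat it as an assumption to verify against the registry.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Device Properties UUID for _DSD (assumed value, check the UEFI registry). */
#define DSD_DEVICE_PROPERTIES_UUID "daffd814-6eba-4d8c-8a91-bc9bbf4aa301"

/* Case-insensitive comparison of two UUID strings. */
static bool uuid_equal(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);

	while (*a != '\0' && *b != '\0') {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return false;
		a++;
		b++;
	}
	/* Equal only if both strings ended at the same position. */
	return *a == '\0' && *b == '\0';
}

/*
 * Hypothetical review helper: true when a _DSD entry reuses the existing
 * Device Properties UUID, false when it introduces a new UUID that would
 * need its own discussion and registration.
 */
static bool dsd_uses_device_properties(const char *uuid)
{
	return uuid_equal(uuid, DSD_DEVICE_PROPERTIES_UUID);
}
```

A false result would not make a binding invalid; it only flags the "rare circumstances" case where a new _DSD UUID is being proposed and the full review and registration path applies.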
+_OSC   6.2.11          This method can be a global method in ACPI (i.e.,
+                       \_SB._OSC), or it may be associated with a specific
+                       device (e.g., \_SB.DEV0._OSC), or both. When used
+                       as a global method, only capabilities published in
+                       the ACPI specification are allowed. When used as
+                       a device-specific method, the process described for
+                       using _DSD MUST be used to create an _OSC definition;
+                       out-of-process use of _OSC is not allowed. That is,
+                       submit the device-specific _OSC usage description as
+                       part of the kernel driver submission, get it approved
+                       by the kernel community, then register it with the
+                       UEFI Forum.
Note that _OSC is very similar to _DSD in how it is defined in the ACPI spec. Hence, I suggest a very similar mechanism for vetting the use of _OSC in the kernel. Again: is this sufficient?
+_OSI   5.7.2           Deprecated on ARM64. Any invocation of this method
+                       will print a warning on the console and return false.
+                       That is, as far as ACPI firmware is concerned, _OSI
+                       cannot be used to determine what sort of system is
+                       being used or what functionality is provided. The
+                       _OSC method is to be used instead.
Just a side note that patches have been sent out to deprecate _OSI for arm64, and that a change request has been sent in to the ACPI spec committee to make it official (with an additional forewarning that _OSI will eventually be deprecated completely for all architectures).
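The _OSI behaviour described above (warn, then always answer false) can be modelled in a few lines of userspace C. This is only a sketch of the documented semantics: arm64_osi_query(), the osi_warnings counter and the wording of the warning are hypothetical and are not the code from the patches mentioned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Number of _OSI invocations seen so far (illustration only). */
static unsigned int osi_warnings;

/*
 * Hypothetical model of _OSI on arm64: whatever interface string the
 * firmware asks about, log a warning and report "not supported".
 */
static bool arm64_osi_query(const char *interface)
{
	assert(interface != NULL);

	fprintf(stderr,
		"ACPI: _OSI(\"%s\") is deprecated on arm64, use _OSC instead\n",
		interface);
	osi_warnings++;

	return false;
}
```

Because the answer never depends on the argument, firmware cannot branch on the OS identity this way, which is the point of steering it towards _OSC.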
On Tue, Feb 03, 2015 at 05:40:20PM -0700, Al Stone wrote:
Much removed to cut down the size on this and to highlight a couple of specific sections pertinent to the ACPI on ARMv8 TODO List.....
This is of course good practice when replying to anything!
+_DSD   6.2.5           To be used with caution. If this object is used, try
+                       to use it within the constraints already defined by the
+                       Device Properties UUID. Only in rare circumstances
+                       should it be necessary to create a new _DSD UUID.
+
+                       In either case, submit the _DSD definition along with
+                       any driver patches for discussion, especially when
+                       device properties are used. A driver will not be
+                       considered complete without a corresponding _DSD
+                       description. Once approved by kernel maintainers,
+                       the UUID or device properties must then be registered
+                       with the UEFI Forum; this may cause some iteration as
+                       more than one OS will be registering entries.
[snip...]
So, this is my attempt to encapsulate what I think people want to have happen around the use of _DSD; I just want to make sure I point it out so it doesn't inadvertently get lost somehow.
Is this far too little? Is it sufficient? If it only addresses part of the concerns, what did I miss?
This does take us back to the issue of how exactly one is supposed to register/approve _DSD bindings and what format they're written in, which I don't think we ever fully got to the bottom of (there's some stuff on the UEFI website but it's definitely looking a bit placeholderish).
On 02/04/2015 11:12 AM, Mark Brown wrote:
On Tue, Feb 03, 2015 at 05:40:20PM -0700, Al Stone wrote:
Much removed to cut down the size on this and to highlight a couple of specific sections pertinent to the ACPI on ARMv8 TODO List.....
This is of course good practice when replying to anything!
Yup :).
+_DSD 6.2.5 To be used with caution. If this object is used, try
+ to use it within the constraints already defined by the
+ Device Properties UUID. Only in rare circumstances
+ should it be necessary to create a new _DSD UUID.
+
+ In either case, submit the _DSD definition along with
+ any driver patches for discussion, especially when
+ device properties are used. A driver will not be
+ considered complete without a corresponding _DSD
+ description. Once approved by kernel maintainers,
+ the UUID or device properties must then be registered
+ with the UEFI Forum; this may cause some iteration as
+ more than one OS will be registering entries.
[snip...]
So, this is my attempt to encapsulate what I think people want to have happen around the use of _DSD; I just want to make sure I point it out so it doesn't inadvertently get lost somehow.
Is this far too little? Is it sufficient? If it only addresses part of the concerns, what did I miss?
This does take us back to the issue of how exactly one is supposed to register/approve _DSD bindings and what format they're written in, which I don't think we ever fully got to the bottom of (there's some stuff on the UEFI website but it's definitely looking a bit placeholderish).
Right; the UEFI stuff is indeed place-holder-ish. This is one of the places where Linux is really driving what happens in the spec, so it's a little bit of a chicken-and-egg problem. I will go repair the UEFI data once I have a better understanding of what's needed.
I guess what I'm trying to figure out is: how specific does this need to be? Does it need to be a step-by-step description, something like Documentation/devicetree/bindings/submitting-patches.txt, or something far more detailed than that, with templates to fill out, and circles and arrows and a paragraph on the back explaining each one [0] :)?
[0] http://en.wikipedia.org/wiki/Alice%27s_Restaurant
On Wed, Feb 04, 2015 at 12:06:14PM -0700, Al Stone wrote:
I guess what I'm trying to figure out is: how specific does this need to be? Does it need to be a step-by-step description, something like Documentation/devicetree/bindings/submitting-patches.txt, or something far more detailed than that, with templates to fill out, and circles and arrows and a paragraph on the back explaining each one [0] :)?
I'd say it should have a reference to where someone can learn how to do the right thing; explicitly going through every single thing here sounds like overkill and duplication but it's also bad if nobody can tell how they're supposed to diligently follow the instructions.
On Mon, Feb 02, 2015 at 12:45:28PM +0000, Hanjun Guo wrote:
[snip...]
I note that for ACPI the PMU interrupt information is stored in the GICC (as "Performance Interrupt" and "Performance Interrupt Mode"), but I don't see any code for handling that as part of this series.
Is anyone currently looking into that?
For those systems ACPI is being developed on, do we know that the GICC information for the PMU interrupts is sane?
I'm slightly worried about the prospect of adding support later only to find that the performance interrupt data in contemporary GICC tables is invalid; it's going to be extremely painful to detect that being the case in order to perform any kind of workaround.
Thanks, Mark.
On 02/03/2015 09:47 AM, Mark Rutland wrote:
I note that for ACPI the PMU interrupt information is stored in the GICC (as "Performance Interrupt" and "Performance Interrupt Mode"), but I don't see any code for handling that as part of this series.
Is anyone currently looking into that?
Yes. IIRC, it's a pretty small patch that I'll be including in the Seattle patches that build on top of this core set.
For those systems ACPI is being developed on, do we know that the GICC information for the PMU interrupts is sane?
Yes. We know this works for Seattle platforms, using their latest firmware.
I'm slightly worried about the prospect of adding support later only to find that the performance interrupt data in contemporary GICC tables is invalid; it's going to be extremely painful to detect that being the case in order to perform any kind of workaround.
That will depend on the error, of course. It was pretty straightforward when the interrupt value was set to zero in some of the early tables we used.
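The zero-valued case mentioned above is easy to sketch. The following is a minimal illustration, not code from this series or from the Seattle patches: the struct is a hypothetical two-field subset of a MADT GICC entry, and the helper names are made up. The flag bit assumes the ACPI 5.1 GICC layout, where bit 1 of the flags selects edge-triggered (1) versus level-triggered (0) for the performance interrupt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of a MADT GICC entry; field names are illustrative. */
struct gicc_pmu_info {
	uint32_t flags;                 /* GICC flags word */
	uint32_t performance_interrupt; /* GSIV of the PMU interrupt */
};

/* Assumed: bit 1 of the GICC flags is "Performance Interrupt Mode". */
#define GICC_FLAG_PERF_IRQ_EDGE (1u << 1)

/* A GSIV of zero is the "obviously unset" case described above. */
static bool gicc_pmu_irq_plausible(const struct gicc_pmu_info *gicc)
{
	return gicc->performance_interrupt != 0;
}

/* Decode the trigger mode; only meaningful if the GSIV is plausible. */
static bool gicc_pmu_irq_is_edge(const struct gicc_pmu_info *gicc)
{
	return (gicc->flags & GICC_FLAG_PERF_IRQ_EDGE) != 0;
}
```

A check like this only catches the trivially wrong tables. A non-zero but incorrect interrupt number would pass it, which is the harder case Mark is worried about.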
On 2015-02-04 01:43, Al Stone wrote:
[snip...]
If we are not worried about having too many patches in this series, I can cherry-pick Mark Salter's ACPI PMU patch for Seattle into the next version.
Thanks Hanjun
On 02/02/2015 06:45 AM, Hanjun Guo wrote:
Hi,
This is the v8 of ACPI core patches for ARM64 based on ACPI 5.1, there are some updates since v7:
All 21 patches:
Tested-by: Timur Tabi <timur@codeaurora.org>
I no longer need to use "acpi=force", so that's nice.
- Add two more documantation to explain why we need ACPI in ARM64 servers
"documentation"
by Grant, and recommendations and prohibitions on the use of the numerous ACPI tables and objects by Al Stone.
Add two patches which is need to map acpi tables after acpi_gbl_permanent_mmap is set
Add another patch "dt / chosen: Add linux,uefi-stub-generated-dtb property" to address that if firmware providing no dtb, we can try ACPI configuration data even if no "acpi=force" is passed in early parameters. (I think ACPI for XEN and kexec need consider sperately as disscussed, correct me if I'm wrong).
"need to be considered separately", "discussed"
Add CC in the patch to the subsystem maintainers and modify the subject of the patch to explicitly show the subsystem touched by this patch set, please help us to review and ack them if they make sense, thanks.
Add Tested-by from Qualcomm and Redhat;
Make ACPI depends on PCI suggested by Catalin;
Clean up SMP init function as Lorenzo suggested, remove physical CPU hot-plug code in the patch;
Address some comments from Marc and explicitly state that will implment statcked irqdomain and GIC init framework when GICv3 and
"implement", "stacked"
On 2015-02-05 04:29, Timur Tabi wrote:
On 02/02/2015 06:45 AM, Hanjun Guo wrote:
Hi,
This is the v8 of ACPI core patches for ARM64 based on ACPI 5.1, there are some updates since v7:
All 21 patches:
Tested-by: Timur Tabi <timur@codeaurora.org>
Hi Timur, thank you very much :)
I no longer need to use "acpi=force", so that's nice.
- Add two more documantation to explain why we need ACPI in ARM64 servers
"documentation"
[snip...]
(I think ACPI for XEN and kexec need consider sperately as disscussed, correct me if I'm wrong).
"need to be considered separately", "discussed"
[snip...]
Address some comments from Marc and explicitly state that will implment statcked irqdomain and GIC init framework when GICv3 and
"implement", "stacked"
Sorry about that. I'm not a native English speaker, but I'm glad that you got the meaning of what I said :)
Thank you again for the test.
Hanjun
On 02.02.15 20:45:28, Hanjun Guo wrote:
[snip...]
Patches tested on Cavium ThunderX. For the whole series:
Tested-by: Robert Richter <rrichter@cavium.com>
Acked-by: Robert Richter <rrichter@cavium.com>
Please apply.
Thanks,
-Robert
On 2015-02-12 18:02, Robert Richter wrote:
[snip...]
Patches tested on Cavium ThunderX. For the whole series:
Tested-by: Robert Richter <rrichter@cavium.com>
Acked-by: Robert Richter <rrichter@cavium.com>
Hi Robert, thank you very much.
I'm going to send out v9 to address some of the comments; I think your Acked-by still applies, if you have no objection :)
Thanks Hanjun
On 13.02.15 10:48:18, Hanjun Guo wrote:
[snip...]
I'm going to send out v9 to address some of the comments; I think your Acked-by still applies, if you have no objection :)
Right, unless you hear from me. ;)
-Robert