I started out tweaking some things for kexec, and then things got out of hand...
Anyway, there is hopefully some stuff here that we will reuse, but I am not getting my hopes up that this will land upstream unmodified.
The main premise of these patches is that, in order to support kexec, we need to add code to the kernel that is able to deal with the state of the firmware after SetVirtualAddressMap() has been called. However, if we are going to deal with that anyway, why not make that the default state and have only a single code path for both cases?
This means SVAM() needs to move to the stub, and hence the code that invents the layout needs to move with it. The result is that the kernel proper is entered with the virt_addr members of all EFI_MEMORY_RUNTIME regions assigned, and the mapping installed in UEFI. The kernel proper needs to set up the page tables, and switch to them while performing the runtime services calls. Note that there is also an efi_to_phys() to translate the values of the fw_vendor and tables fields of the EFI system table. Again, this is something we need to do anyway under kexec, or we end up handing over state between one kernel and the next, which implies different code paths between non-kexec and kexec.
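To illustrate the translation step, here is a minimal userspace sketch of what an efi_to_phys()-style lookup could look like. It is not the code from the series: struct sketch_md is a simplified stand-in for the kernel's memory descriptor, sketch_efi_to_phys() is a hypothetical name, and returning 0 when no runtime region covers the address is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFI_MEMORY_RUNTIME	0x8000000000000000ULL
#define EFI_PAGE_SHIFT		12	/* UEFI pages are 4 KB */

/* Simplified stand-in for the kernel's efi_memory_desc_t (assumption). */
struct sketch_md {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t virt_addr;	/* assigned by the stub before SVAM() */
	uint64_t num_pages;
	uint64_t attribute;
};

/*
 * Translate a firmware virtual address back to a physical address by
 * walking the runtime regions of the memory map. Returns 0 if no
 * EFI_MEMORY_RUNTIME region covers the address (assumed convention).
 */
static uint64_t sketch_efi_to_phys(const struct sketch_md *map, size_t n,
				   uint64_t addr)
{
	assert(map != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct sketch_md *md = &map[i];
		uint64_t size = md->num_pages << EFI_PAGE_SHIFT;

		/* Only runtime regions were given a virtual address. */
		if (!(md->attribute & EFI_MEMORY_RUNTIME))
			continue;
		if (addr >= md->virt_addr && addr - md->virt_addr < size)
			return md->phys_addr + (addr - md->virt_addr);
	}
	return 0;
}
```

With a lookup like this, fields such as fw_vendor and tables can be converted the same way whether the kernel was entered from the stub or via kexec.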
One thing that may stand out is the reordering of the memory map. The reason for doing this is so that we can use the same memory map as input to SVAM(). The alternative is allocating memory for it using boot services, but that clutters up the existing logic a bit between getting the memory map, populating the fdt, and looping again if it didn't fit. The current code works perfectly fine, but I am aware that it is an acquired taste :-)
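As a rough sketch of the reordering idea, assuming it amounts to moving the EFI_MEMORY_RUNTIME descriptors to the front of the buffer so that the leading entries can be handed to SVAM() without a second allocation. The struct and the function name sketch_runtime_first() are hypothetical, and the series may well order things differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EFI_MEMORY_RUNTIME	0x8000000000000000ULL

/* Simplified stand-in for the kernel's efi_memory_desc_t (assumption). */
struct sketch_md {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/*
 * Stable in-place partition: runtime regions first, everything else
 * after, both groups keeping their original relative order. Returns
 * the number of runtime descriptors, i.e. the length of the prefix
 * that could be passed on to SetVirtualAddressMap().
 */
static size_t sketch_runtime_first(struct sketch_md *map, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		struct sketch_md tmp;

		if (!(map[i].attribute & EFI_MEMORY_RUNTIME))
			continue;

		tmp = map[i];
		/* Shift the non-runtime entries in [count, i) up by one. */
		memmove(&map[count + 1], &map[count],
			(i - count) * sizeof(*map));
		map[count++] = tmp;
	}
	return count;
}
```

The quadratic shifting is fine for a sketch, since memory maps are short; the point is only that no extra buffer is needed.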
The first 2 patches are stuff that is missing from Matt Fleming's efi-next branch, which is what these patches are based on, so I included them for completeness. The meat is in patch #9; everything before that is groundwork and/or fixes, and everything after it drops stuff that we no longer need.
Ard Biesheuvel (9):
  arm64/efi: reserve regions of type ACPI_MEMORY_NVS
  arm64/efi: drop redundant set_bit(EFI_CONFIG_TABLES)
  arm64/efi: use UEFI memory map unconditionally if available
  arm64/mm: add explicit struct_mm argument to __create_mapping()
  arm64/mm: add create_pgd_mapping() to create private page tables
  efi: split off remapping code from efi_config_init()
  arm64/efi: move SetVirtualAddressMap() to UEFI stub
  arm64/efi: remove free_boot_services() and friends
  arm64/efi: remove idmap manipulations from UEFI code

Leif Lindholm (1):
  arm64: ignore DT memreserve entries when booting in UEFI mode

Semen Protsenko (1):
  efi/arm64: Store Runtime Services revision
 arch/arm64/include/asm/efi.h       |  23 ++-
 arch/arm64/include/asm/mmu.h       |  12 +-
 arch/arm64/kernel/efi.c            | 368 +++++++++++++------------------------
 arch/arm64/kernel/setup.c          |   2 +-
 arch/arm64/mm/init.c               |   4 +-
 arch/arm64/mm/mmu.c                |  57 +++---
 drivers/firmware/efi/efi.c         |  49 +++--
 drivers/firmware/efi/libstub/fdt.c | 110 ++++++++++-
 include/linux/efi.h                |   2 +
 9 files changed, 328 insertions(+), 299 deletions(-)
From: Semen Protsenko semen.protsenko@linaro.org
"efi" global data structure contains "runtime_version" field which must be assigned in order to use it later in Runtime Services virtual calls (virt_efi_* functions).
Before this patch "runtime_version" was unassigned (0), so each Runtime Service virtual call that checks revision would fail.
Signed-off-by: Semen Protsenko semen.protsenko@linaro.org
Acked-by: Ard Biesheuvel ard.biesheuvel@linaro.org
Cc: stable@vger.kernel.org
Signed-off-by: Matt Fleming matt.fleming@intel.com
---
 arch/arm64/kernel/efi.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 865fdf5c7344..219a59f2ae97 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -455,6 +455,8 @@ static int __init arm64_enter_virtual_mode(void)
 	efi_native_runtime_setup();
 	set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
 
+	efi.runtime_version = efi.systab->hdr.revision;
+
 	return 0;
 
 err_unmap:
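To make the failure mode concrete, here is a small userspace sketch of a revision-gated runtime call. All names and constants are stand-ins invented for illustration (SKETCH_* and sketch_query_variable_info() are not kernel identifiers), and the numeric status values are simplified; it only assumes, as the commit message says, that some wrappers compare the stored revision against a minimum before calling into the firmware.

```c
#include <stdint.h>

/* Stand-in for a "UEFI 2.00" revision: major in the high 16 bits. */
#define SKETCH_EFI_2_00_REVISION	((2u << 16) | 0u)

/* Simplified status codes for the sketch (not the real EFI encodings). */
#define SKETCH_EFI_SUCCESS	0
#define SKETCH_EFI_UNSUPPORTED	3

/* Minimal stand-in for the global "efi" structure. */
struct sketch_efi {
	uint32_t runtime_version;
};

/*
 * A revision-gated wrapper: services introduced in a later UEFI
 * revision are refused when the recorded revision is too old. With
 * runtime_version left at 0, this check can never pass.
 */
static int sketch_query_variable_info(const struct sketch_efi *efi)
{
	if (efi->runtime_version < SKETCH_EFI_2_00_REVISION)
		return SKETCH_EFI_UNSUPPORTED;

	/* The real wrapper would call into the firmware here. */
	return SKETCH_EFI_SUCCESS;
}
```

An unassigned runtime_version of 0 takes the unsupported branch regardless of what the firmware actually implements, which is the behaviour the patch above fixes by copying the revision from the system table header.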
From: Leif Lindholm leif.lindholm@linaro.org
UEFI provides its own method for marking regions to reserve, via the memory map which is also used to initialise memblock. So when using the UEFI memory map, ignore any memreserve entries present in the DT.
Reported-by: Mark Rutland mark.rutland@arm.com
Reviewed-by: Mark Rutland mark.rutland@arm.com
Acked-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Leif Lindholm leif.lindholm@linaro.org
Signed-off-by: Will Deacon will.deacon@arm.com
---
 arch/arm64/kernel/efi.c | 2 ++
 arch/arm64/mm/init.c    | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 219a59f2ae97..95c49ebc660d 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -175,6 +175,8 @@ static __init void reserve_regions(void)
 		if (uefi_debug)
 			pr_cont("\n");
 	}
+
+	set_bit(EFI_MEMMAP, &efi.flags);
 }
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 5b4526ee3a01..5472c2401876 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -32,6 +32,7 @@
 #include <linux/of_fdt.h>
 #include <linux/dma-mapping.h>
 #include <linux/dma-contiguous.h>
+#include <linux/efi.h>
 
 #include <asm/fixmap.h>
 #include <asm/sections.h>
@@ -148,7 +149,8 @@ void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
-	early_init_fdt_scan_reserved_mem();
+	if (!efi_enabled(EFI_MEMMAP))
+		early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */
 	if (IS_ENABLED(CONFIG_ZONE_DMA))
On Mon, Oct 20, 2014 at 6:19 PM, Ard Biesheuvel ard.biesheuvel@linaro.org wrote:
> I started out tweaking some things for kexec, and then things got out of hand...
>
> Anyway, there is hopefully some stuff here that we will reuse, but I am not getting my hopes up that this will land upstream unmodified.
>
> The main premise of these patches is that, in order to support kexec, we need to add code to the kernel that is able to deal with the state of the firmware after SetVirtualAddressMap() has been called. However, if we are going to deal with that anyway, why not make that the default state, and have only a single code path for both cases.
>
> This means SVAM() needs to move to the stub, and hence the code that invents the layout needs to move with it. The result is that the kernel proper is entered with the virt_addr members of all EFI_MEMORY_RUNTIME regions assigned, and the mapping installed in UEFI. The kernel proper needs to set up the page tables, and switch to them while performing the runtime services calls. Note that there is also an efi_to_phys() to translate the values of the fw_vendor and tables fields of the EFI system table. Again, this is something we need to do anyway under kexec, or we end up handing over state between one kernel and the next, which implies different code paths between non-kexec and kexec.
That sounds like a pretty sane approach, well done!
g.