This patch set enables kdump (crash dump kernel) support on arm64 on top of Geoff's kexec patchset.
In this version, there are some arm64-specific usage/constraints:
 1) "mem=" boot parameter must be specified on crash dump kernel
 2) KVM will not be enabled on crash dump kernel even if configured

See commit messages and Documentation/kdump/kdump.txt for details.
The only concern I have is whether or not we can use the exact same kernel as both system kernel and crash dump kernel. The current arm64 kernel is not relocatable in the exact sense but I have no problems in using the same binary for testing kdump.
I tested the code with
 - ATF v1.1 + EDK2(UEFI) v3.0-rc0
 - kernel v4.0-rc4 + Geoff's kexec v8
on Base fast model, using my yet-to-be-submitted kexec-tools [1].

You may want to start a kernel with the following boot parameter:
    crashkernel=64M@2240M
and try
    $ kexec -p --load <vmlinux> --append ...
    $ echo c > /proc/sysrq-trigger

To examine vmcore (/proc/vmcore), you may use
 - gdb v7.7 or later
 - crash + a small patch (to recognize v4.0 kernel)
[1] https://git.linaro.org/people/takahiro.akashi/kexec-tools.git
AKASHI Takahiro (5):
  arm64: kdump: reserve memory for crash dump kernel
  arm64: kdump: implement machine_crash_shutdown()
  arm64: kdump: do not go into EL2 before starting a crash dump kernel
  arm64: add kdump support
  arm64: enable kdump in the arm64 defconfig

 Documentation/kdump/kdump.txt | 31 ++++++++++++++-
 arch/arm64/Kconfig | 12 ++++++
 arch/arm64/configs/defconfig | 1 +
 arch/arm64/include/asm/kexec.h | 34 +++++++++++++++-
 arch/arm64/kernel/Makefile | 1 +
 arch/arm64/kernel/crash_dump.c | 71 +++++++++++++++++++++++++++++++++
 arch/arm64/kernel/machine_kexec.c | 55 +++++++++++++++++++++++++-
 arch/arm64/kernel/process.c | 7 +++-
 arch/arm64/kernel/setup.c | 78 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/smp.c | 10 ++++-
 10 files changed, 294 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm64/kernel/crash_dump.c
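For readers unfamiliar with the "crashkernel=X@Y" syntax used in the example above, the following is a minimal user-space sketch of how a size@base argument such as "64M@2240M" splits into a size and a base address. It is an illustration only, not the kernel's parse_crashkernel(); the helper names parse_size() and parse_size_at_base() are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a number with an optional K/M/G suffix and advance *endp past it.
 * Hypothetical helper for illustration; not a kernel function.
 */
uint64_t parse_size(const char *s, char **endp)
{
	uint64_t val = strtoull(s, endp, 0);

	switch (**endp) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		(*endp)++;
		break;
	default:
		break;
	}
	return val;
}

/*
 * Split "X@Y" into size (X) and base (Y).
 * Returns 0 on success, -1 if the string is malformed.
 */
int parse_size_at_base(const char *arg, uint64_t *size, uint64_t *base)
{
	const char *base_str;
	char *cur;

	*size = parse_size(arg, &cur);
	if (cur == arg || *size == 0 || *cur != '@')
		return -1;

	base_str = cur + 1;
	*base = parse_size(base_str, &cur);
	if (cur == base_str || *cur != '\0')
		return -1;

	return 0;
}
```

With "64M@2240M" this yields a size of 0x4000000 and a base of 0x8C000000, i.e. a 64MB region starting at the 2240MB mark.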
On system kernel, the memory region used by crash dump kernel must be specified by "crashkernel=X@Y" boot parameter. reserve_crashkernel() will allocate the region in "System RAM" and reserve it for later use.
On crash dump kernel, memory region information in system kernel is described in a specific region specified by "elfcorehdr=X@Y" boot parameter. reserve_elfcorehdr() will set aside the region to avoid data destruction by the kernel.
Crash dump kernel will access memory regions in system kernel via copy_oldmem_page(), which reads a page with ioremap'ing it assuming that such pages are not part of main memory of crash dump kernel. This is true under non-UEFI environment because kexec-tools modifies a device tree adding "usablemem" attributes to memory sections. Under UEFI, however, this is not true because UEFI remove memory sections in a device tree and export all the memory regions, even though they belong to system kernel.
So we should add "mem=X[MG]" boot parameter to limit the meory size and avoid hitting the following assertion in ioremap(): if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr)))) return NULL;
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
---
 arch/arm64/kernel/Makefile | 1 +
 arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/setup.c | 78 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 150 insertions(+)
 create mode 100644 arch/arm64/kernel/crash_dump.c

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index da9a7ee..3c24d4e 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -36,6 +36,7 @@ arm64-obj-$(CONFIG_EFI) += efi.o efi-stub.o efi-entry.o
 arm64-obj-$(CONFIG_PCI) += pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
 arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
+arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
 
 obj-y += $(arm64-obj-y) vdso/
 obj-m += $(arm64-obj-m)
diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
new file mode 100644
index 0000000..dd31b2e
--- /dev/null
+++ b/arch/arm64/kernel/crash_dump.c
@@ -0,0 +1,71 @@
+/*
+ * arch/arm64/kernel/crash_dump.c
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro takahiro.akashi@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crash_dump.h>
+#include <linux/errno.h>
+#include <linux/io.h>
+#include <linux/memblock.h>
+#include <linux/uaccess.h>
+#include <asm/memory.h>
+
+/**
+ * copy_oldmem_page() - copy one page from old kernel memory
+ * @pfn: page frame number to be copied
+ * @buf: buffer where the copied page is placed
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page
+ * @userbuf: if set, @buf is int he user address space
+ *
+ * This function copies one page from old kernel memory into buffer pointed by
+ * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+ * copied or negative error in case of failure.
+ */
+ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+			 size_t csize, unsigned long offset,
+			 int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	vaddr = ioremap(pfn << PAGE_SHIFT, PAGE_SIZE);
+	if (!vaddr)
+		return -ENOMEM;
+
+	if (userbuf) {
+		if (copy_to_user(buf, vaddr + offset, csize)) {
+			iounmap(vaddr);
+			return -EFAULT;
+		}
+	} else {
+		memcpy(buf, vaddr + offset, csize);
+	}
+
+	iounmap(vaddr);
+
+	return csize;
+}
+
+/**
+ * elfcorehdr_read - read from ELF core header
+ * @buf: buffer where the data is placed
+ * @csize: number of bytes to read
+ * @ppos: address in the memory
+ *
+ * This function reads @count bytes from elf core header which exists
+ * on crash dump kernel's memory.
+ */
+ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
+{
+	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
+	return count;
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index e8420f6..daaed93 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -323,6 +323,69 @@ static void __init setup_machine_fdt(phys_addr_t dt_phys)
 	dump_stack_set_arch_desc("%s (DT)", of_flat_dt_get_machine_name());
 }
 
+#ifdef CONFIG_KEXEC
+/*
+ * reserve_crashkernel() - reserves memory for crash kernel
+ *
+ * This function reserves memory area given in "crashkernel=" kernel command
+ * line parameter. The memory reserved is used by a dump capture kernel when
+ * primary kernel is crashing.
+ */
+static void __init reserve_crashkernel(void)
+{
+	unsigned long long crash_size, crash_base;
+	int ret;
+
+	/* use ULONG_MAX since we don't know system memory size here. */
+	ret = parse_crashkernel(boot_command_line, ULONG_MAX,
+				&crash_size, &crash_base);
+	if (ret)
+		return;
+
+	ret = memblock_reserve(crash_base, crash_size);
+	if (ret < 0) {
+		pr_warn("crashkernel reservation failed - memory is in use (0x%lx)\n",
+			(unsigned long)crash_base);
+		return;
+	}
+
+	pr_info("Reserving %ldMB of memory at %ldMB for crashkernel\n",
+		(unsigned long)(crash_size >> 20),
+		(unsigned long)(crash_base >> 20));
+
+	crashk_res.start = crash_base;
+	crashk_res.end = crash_base + crash_size - 1;
+}
+#endif /* CONFIG_KEXEC */
+
+#ifdef CONFIG_CRASH_DUMP
+/*
+ * reserve_elfcorehdr() - reserves memory for elf core header
+ *
+ * This function reserves memory area given in "elfcorehdr=" kernel command
+ * line parameter. The memory reserved is used by a dump capture kernel to
+ * identify the memory used by primary kernel.
+ */
+static void __init reserve_elfcorehdr(void)
+{
+	int ret;
+
+	if (!elfcorehdr_size)
+		return;
+
+	ret = memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
+	if (ret < 0) {
+		pr_warn("elfcorehdr reservation failed - memory is in use (0x%lx)\n",
+			(unsigned long)elfcorehdr_addr);
+		return;
+	}
+
+	pr_info("Reserving %ldKB of memory at %ldMB for elfcorehdr\n",
+		(unsigned long)(elfcorehdr_size >> 10),
+		(unsigned long)(elfcorehdr_addr >> 20));
+}
+#endif /* CONFIG_CRASH_DUMP */
+
 static void __init request_standard_resources(void)
 {
 	struct memblock_region *region;
@@ -378,10 +441,25 @@ void __init setup_arch(char **cmdline_p)
 	local_async_enable();
 
 	efi_init();
+	/*
+	 * reserve_crashkernel() and reserver_elfcorehdr() must be called
+	 * before arm64_bootmem_init() because dma_contiguous_reserve()
+	 * may conflict with those regions.
+	 */
+#ifdef CONFIG_KEXEC
+	reserve_crashkernel();
+#endif
+#ifdef CONFIG_CRASH_DUMP
+	reserve_elfcorehdr();
+#endif
 	arm64_memblock_init();
 
 	paging_init();
 	request_standard_resources();
+#ifdef CONFIG_KEXEC
+	/* kexec-tool will detect the region with /proc/iomem */
+	insert_resource(&iomem_resource, &crashk_res);
+#endif
 
 	early_ioremap_reset();
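Because copy_oldmem_page() handles a single page at a time (@pfn plus an @offset within that page), a caller reading an arbitrary byte range has to split the request into per-page chunks. The sketch below shows that arithmetic in user space. It is an illustration under stated assumptions: fake_oldmem stands in for old-kernel memory that the real code would ioremap(), and sketch_copy_page() and sketch_read_oldmem() are hypothetical names, not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Stand-in for old-kernel memory: four pages starting at "physical" 0. */
unsigned char fake_oldmem[4 * SKETCH_PAGE_SIZE];

/* Toy counterpart of copy_oldmem_page(): the copy must stay within one page. */
long sketch_copy_page(unsigned long pfn, char *buf, size_t csize,
		      unsigned long offset)
{
	size_t start = ((size_t)pfn << SKETCH_PAGE_SHIFT) + offset;

	if (!csize)
		return 0;
	if (offset + csize > SKETCH_PAGE_SIZE)
		return -1;		/* would cross a page boundary */
	if (start + csize > sizeof(fake_oldmem))
		return -1;		/* outside the simulated memory */

	memcpy(buf, fake_oldmem + start, csize);
	return (long)csize;
}

/* Read count bytes starting at byte position pos, one page at a time. */
long sketch_read_oldmem(char *buf, size_t count, uint64_t pos)
{
	size_t done = 0;

	while (done < count) {
		unsigned long pfn = (unsigned long)(pos >> SKETCH_PAGE_SHIFT);
		unsigned long offset =
			(unsigned long)(pos & (SKETCH_PAGE_SIZE - 1));
		size_t chunk = SKETCH_PAGE_SIZE - offset;
		long ret;

		if (chunk > count - done)
			chunk = count - done;

		ret = sketch_copy_page(pfn, buf + done, chunk, offset);
		if (ret < 0)
			return ret;

		done += (size_t)ret;
		pos += (uint64_t)ret;
	}
	return (long)done;
}
```

The first chunk is shortened so that it ends on a page boundary, and every later chunk starts at offset 0, which is why @offset and @csize never need to describe more than one page.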
Hi Takahiro,

On Thu, 2015-03-26 at 17:28 +0900, AKASHI Takahiro wrote:
> On system kernel, the memory region used by crash dump kernel must be specified by "crashkernel=X@Y" boot parameter. reserve_crashkernel() will allocate the region in "System RAM" and reserve it for later use.
>
> On crash dump kernel, memory region information in system kernel is described in a specific region specified by "elfcorehdr=X@Y" boot parameter. reserve_elfcorehdr() will set aside the region to avoid data destruction by the kernel.
>
> Crash dump kernel will access memory regions in system kernel via copy_oldmem_page(), which reads a page with ioremap'ing it assuming that

s/page with/page by/

> such pages are not part of main memory of crash dump kernel. This is true under non-UEFI environment because kexec-tools modifies a device tree adding "usablemem" attributes to memory sections. Under UEFI, however, this is not true because UEFI remove memory sections in a device tree and export all the memory regions, even though they belong to system kernel.
>
> So we should add "mem=X[MG]" boot parameter to limit the meory size and

s/meory/memory/

> avoid hitting the following assertion in ioremap():
>	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
>		return NULL;
>
> Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
> ---
>  arch/arm64/kernel/Makefile | 1 +
>  arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/setup.c | 78 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 150 insertions(+)
>  create mode 100644 arch/arm64/kernel/crash_dump.c
>
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index da9a7ee..3c24d4e 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -36,6 +36,7 @@ arm64-obj-$(CONFIG_EFI) += efi.o efi-stub.o efi-entry.o
>  arm64-obj-$(CONFIG_PCI) += pci.o
>  arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
>  arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
> +arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
>
>  obj-y += $(arm64-obj-y) vdso/
>  obj-m += $(arm64-obj-m)
> diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
> new file mode 100644
> index 0000000..dd31b2e
> --- /dev/null
> +++ b/arch/arm64/kernel/crash_dump.c
> @@ -0,0 +1,71 @@
> +/*
> + * arch/arm64/kernel/crash_dump.c

I would recommend against adding paths in source files. It often happens that files with paths are moved, but the file comments are not updated.

> + *
> + * Copyright (C) 2014 Linaro Limited
> + * Author: AKASHI Takahiro takahiro.akashi@linaro.org
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/crash_dump.h>
> +#include <linux/errno.h>
> +#include <linux/io.h>
> +#include <linux/memblock.h>
> +#include <linux/uaccess.h>
> +#include <asm/memory.h>
> +
> +/**
> + * copy_oldmem_page() - copy one page from old kernel memory
> + * @pfn: page frame number to be copied
> + * @buf: buffer where the copied page is placed
> + * @csize: number of bytes to copy
> + * @offset: offset in bytes into the page
> + * @userbuf: if set, @buf is int he user address space

s/is int he user/is a user/

> + *
> + * This function copies one page from old kernel memory into buffer pointed by
> + * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
> + * copied or negative error in case of failure.
> + */
> +ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
> +			 size_t csize, unsigned long offset,
> +			 int userbuf)

Since userbuf is a binary flag, I think type bool would be better. Change the comments from 'set' and '1' to 'true'.

Should offset be type size_t?

> +{
> +	void *vaddr;
> +
> +	if (!csize)
> +		return 0;
> +
> +	vaddr = ioremap(pfn << PAGE_SHIFT, PAGE_SIZE);
> +	if (!vaddr)
> +		return -ENOMEM;
> +
> +	if (userbuf) {
> +		if (copy_to_user(buf, vaddr + offset, csize)) {
> +			iounmap(vaddr);
> +			return -EFAULT;
> +		}
> +	} else {
> +		memcpy(buf, vaddr + offset, csize);
> +	}
> +
> +	iounmap(vaddr);
> +
> +	return csize;
> +}
> +
> +/**
> + * elfcorehdr_read - read from ELF core header
> + * @buf: buffer where the data is placed
> + * @csize: number of bytes to read
> + * @ppos: address in the memory
> + *
> + * This function reads @count bytes from elf core header which exists
> + * on crash dump kernel's memory.
> + */
> +ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)

Should ppos be type phys_addr_t *?

> +{
> +	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
> +	return count;
> +}
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index e8420f6..daaed93 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -323,6 +323,69 @@ static void __init setup_machine_fdt(phys_addr_t dt_phys)
>  	dump_stack_set_arch_desc("%s (DT)", of_flat_dt_get_machine_name());
>  }
>
> +#ifdef CONFIG_KEXEC
> +/*
> + * reserve_crashkernel() - reserves memory for crash kernel
> + *
> + * This function reserves memory area given in "crashkernel=" kernel command
> + * line parameter. The memory reserved is used by a dump capture kernel when
> + * primary kernel is crashing.
> + */
> +static void __init reserve_crashkernel(void)
> +{
> +	unsigned long long crash_size, crash_base;
> +	int ret;
> +
> +	/* use ULONG_MAX since we don't know system memory size here. */
> +	ret = parse_crashkernel(boot_command_line, ULONG_MAX,
> +				&crash_size, &crash_base);
> +	if (ret)
> +		return;
> +
> +	ret = memblock_reserve(crash_base, crash_size);
> +	if (ret < 0) {
> +		pr_warn("crashkernel reservation failed - memory is in use (0x%lx)\n",
> +			(unsigned long)crash_base);

Can we use 0x%llx here and below and avoid these casts?

> +		return;
> +	}
> +
> +	pr_info("Reserving %ldMB of memory at %ldMB for crashkernel\n",
> +		(unsigned long)(crash_size >> 20),
> +		(unsigned long)(crash_base >> 20));
> +
> +	crashk_res.start = crash_base;
> +	crashk_res.end = crash_base + crash_size - 1;
> +}
> +#endif /* CONFIG_KEXEC */
> +
> +#ifdef CONFIG_CRASH_DUMP
> +/*
> + * reserve_elfcorehdr() - reserves memory for elf core header
> + *
> + * This function reserves memory area given in "elfcorehdr=" kernel command
> + * line parameter. The memory reserved is used by a dump capture kernel to
> + * identify the memory used by primary kernel.
> + */
> +static void __init reserve_elfcorehdr(void)
> +{
> +	int ret;
> +
> +	if (!elfcorehdr_size)
> +		return;
> +
> +	ret = memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
> +	if (ret < 0) {
> +		pr_warn("elfcorehdr reservation failed - memory is in use (0x%lx)\n",
> +			(unsigned long)elfcorehdr_addr);
> +		return;
> +	}
> +
> +	pr_info("Reserving %ldKB of memory at %ldMB for elfcorehdr\n",
> +		(unsigned long)(elfcorehdr_size >> 10),
> +		(unsigned long)(elfcorehdr_addr >> 20));
> +}
> +#endif /* CONFIG_CRASH_DUMP */
> +
>  static void __init request_standard_resources(void)
>  {
>  	struct memblock_region *region;
> @@ -378,10 +441,25 @@ void __init setup_arch(char **cmdline_p)
>  	local_async_enable();
>
>  	efi_init();

Maybe have a blank line here?

> +	/*
> +	 * reserve_crashkernel() and reserver_elfcorehdr() must be called

s/reserver_elfcorehdr/reserve_elfcorehdr/

> +	 * before arm64_bootmem_init() because dma_contiguous_reserve()
> +	 * may conflict with those regions.
> +	 */
> +#ifdef CONFIG_KEXEC
> +	reserve_crashkernel();
> +#endif
> +#ifdef CONFIG_CRASH_DUMP
> +	reserve_elfcorehdr();
> +#endif
>  	arm64_memblock_init();
>
>  	paging_init();
>  	request_standard_resources();
> +#ifdef CONFIG_KEXEC
> +	/* kexec-tool will detect the region with /proc/iomem */

There are more kexec user programs than kexec-tools, so I think something like 'user space tools' would be better.

> +	insert_resource(&iomem_resource, &crashk_res);
> +#endif
>
>  	early_ioremap_reset();
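As a rough illustration of the 0x%llx suggestion above: crash_base and crash_size are declared unsigned long long in the patch, so the ll length modifier matches their type and the (unsigned long) casts become unnecessary. The user-space sketch below uses snprintf() in place of pr_warn()/pr_info(); the function names are hypothetical, and the switch from %ld to %llu for the megabyte values is an assumption about how "and below" would be applied.

```c
#include <stddef.h>
#include <stdio.h>

/* Format the failure message without casting the 64-bit base address. */
int format_reserve_failure(char *out, size_t len,
			   unsigned long long crash_base)
{
	return snprintf(out, len,
			"crashkernel reservation failed - memory is in use (0x%llx)\n",
			crash_base);
}

/* Format the informational message; both values are shifted to megabytes. */
int format_reserve_info(char *out, size_t len,
			unsigned long long crash_size,
			unsigned long long crash_base)
{
	return snprintf(out, len,
			"Reserving %lluMB of memory at %lluMB for crashkernel\n",
			crash_size >> 20, crash_base >> 20);
}
```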
Thank you, Geoff.

On 03/27/2015 01:43 PM, Geoff Levand wrote:
> Hi Takahiro,
>
> On Thu, 2015-03-26 at 17:28 +0900, AKASHI Takahiro wrote:
>> On system kernel, the memory region used by crash dump kernel must be specified by "crashkernel=X@Y" boot parameter. reserve_crashkernel() will allocate the region in "System RAM" and reserve it for later use.
>>
>> On crash dump kernel, memory region information in system kernel is described in a specific region specified by "elfcorehdr=X@Y" boot parameter. reserve_elfcorehdr() will set aside the region to avoid data destruction by the kernel.
>>
>> Crash dump kernel will access memory regions in system kernel via copy_oldmem_page(), which reads a page with ioremap'ing it assuming that
>
> s/page with/page by/

Fix it.

>> such pages are not part of main memory of crash dump kernel. This is true under non-UEFI environment because kexec-tools modifies a device tree adding "usablemem" attributes to memory sections. Under UEFI, however, this is not true because UEFI remove memory sections in a device tree and export all the memory regions, even though they belong to system kernel.
>>
>> So we should add "mem=X[MG]" boot parameter to limit the meory size and
>
> s/meory/memory/

Fix it.

>> avoid hitting the following assertion in ioremap():
>>	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
>>		return NULL;
>>
>> Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
>> ---
>>  arch/arm64/kernel/Makefile | 1 +
>>  arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++
>>  arch/arm64/kernel/setup.c | 78 ++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 150 insertions(+)
>>  create mode 100644 arch/arm64/kernel/crash_dump.c
>>
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index da9a7ee..3c24d4e 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -36,6 +36,7 @@ arm64-obj-$(CONFIG_EFI) += efi.o efi-stub.o efi-entry.o
>>  arm64-obj-$(CONFIG_PCI) += pci.o
>>  arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
>>  arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
>> +arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
>>
>>  obj-y += $(arm64-obj-y) vdso/
>>  obj-m += $(arm64-obj-m)
>> diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
>> new file mode 100644
>> index 0000000..dd31b2e
>> --- /dev/null
>> +++ b/arch/arm64/kernel/crash_dump.c
>> @@ -0,0 +1,71 @@
>> +/*
>> + * arch/arm64/kernel/crash_dump.c
>
> I would recommend against adding paths in source files. It often happens that files with paths are moved, but the file comments are not updated.

Yeah, will fix it.

>> + *
>> + * Copyright (C) 2014 Linaro Limited
>> + * Author: AKASHI Takahiro takahiro.akashi@linaro.org
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/crash_dump.h>
>> +#include <linux/errno.h>
>> +#include <linux/io.h>
>> +#include <linux/memblock.h>
>> +#include <linux/uaccess.h>
>> +#include <asm/memory.h>
>> +
>> +/**
>> + * copy_oldmem_page() - copy one page from old kernel memory
>> + * @pfn: page frame number to be copied
>> + * @buf: buffer where the copied page is placed
>> + * @csize: number of bytes to copy
>> + * @offset: offset in bytes into the page
>> + * @userbuf: if set, @buf is int he user address space
>
> s/is int he user/is a user/

Fix it. Sorry for bunch of typos.

>> + *
>> + * This function copies one page from old kernel memory into buffer pointed by
>> + * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
>> + * copied or negative error in case of failure.
>> + */
>> +ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> +			 size_t csize, unsigned long offset,
>> +			 int userbuf)
>
> Since userbuf is a binary flag, I think type bool would be better. Change the comments from 'set' and '1' to 'true'.
>
> Should offset be type size_t?

No. The prototype is already there in include/linux/crash_dump.h.

>> +{
>> +	void *vaddr;
>> +
>> +	if (!csize)
>> +		return 0;
>> +
>> +	vaddr = ioremap(pfn << PAGE_SHIFT, PAGE_SIZE);
>> +	if (!vaddr)
>> +		return -ENOMEM;
>> +
>> +	if (userbuf) {
>> +		if (copy_to_user(buf, vaddr + offset, csize)) {
>> +			iounmap(vaddr);
>> +			return -EFAULT;
>> +		}
>> +	} else {
>> +		memcpy(buf, vaddr + offset, csize);
>> +	}
>> +
>> +	iounmap(vaddr);
>> +
>> +	return csize;
>> +}
>> +
>> +/**
>> + * elfcorehdr_read - read from ELF core header
>> + * @buf: buffer where the data is placed
>> + * @csize: number of bytes to read
>> + * @ppos: address in the memory
>> + *
>> + * This function reads @count bytes from elf core header which exists
>> + * on crash dump kernel's memory.
>> + */
>> +ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>
> Should ppos be type phys_addr_t *?

No. The prototype is already there in include/linux/crash_dump.h.

>> +{
>> +	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
>> +	return count;
>> +}
>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> index e8420f6..daaed93 100644
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -323,6 +323,69 @@ static void __init setup_machine_fdt(phys_addr_t dt_phys)
>>  	dump_stack_set_arch_desc("%s (DT)", of_flat_dt_get_machine_name());
>>  }
>>
>> +#ifdef CONFIG_KEXEC
>> +/*
>> + * reserve_crashkernel() - reserves memory for crash kernel
>> + *
>> + * This function reserves memory area given in "crashkernel=" kernel command
>> + * line parameter. The memory reserved is used by a dump capture kernel when
>> + * primary kernel is crashing.
>> + */
>> +static void __init reserve_crashkernel(void)
>> +{
>> +	unsigned long long crash_size, crash_base;
>> +	int ret;
>> +
>> +	/* use ULONG_MAX since we don't know system memory size here. */
>> +	ret = parse_crashkernel(boot_command_line, ULONG_MAX,
>> +				&crash_size, &crash_base);
>> +	if (ret)
>> +		return;
>> +
>> +	ret = memblock_reserve(crash_base, crash_size);
>> +	if (ret < 0) {
>> +		pr_warn("crashkernel reservation failed - memory is in use (0x%lx)\n",
>> +			(unsigned long)crash_base);
>
> Can we use 0x%llx here and below and avoid these casts?

Fix it.

>> +		return;
>> +	}
>> +
>> +	pr_info("Reserving %ldMB of memory at %ldMB for crashkernel\n",
>> +		(unsigned long)(crash_size >> 20),
>> +		(unsigned long)(crash_base >> 20));
>> +
>> +	crashk_res.start = crash_base;
>> +	crashk_res.end = crash_base + crash_size - 1;
>> +}
>> +#endif /* CONFIG_KEXEC */
>> +
>> +#ifdef CONFIG_CRASH_DUMP
>> +/*
>> + * reserve_elfcorehdr() - reserves memory for elf core header
>> + *
>> + * This function reserves memory area given in "elfcorehdr=" kernel command
>> + * line parameter. The memory reserved is used by a dump capture kernel to
>> + * identify the memory used by primary kernel.
>> + */
>> +static void __init reserve_elfcorehdr(void)
>> +{
>> +	int ret;
>> +
>> +	if (!elfcorehdr_size)
>> +		return;
>> +
>> +	ret = memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
>> +	if (ret < 0) {
>> +		pr_warn("elfcorehdr reservation failed - memory is in use (0x%lx)\n",
>> +			(unsigned long)elfcorehdr_addr);
>> +		return;
>> +	}
>> +
>> +	pr_info("Reserving %ldKB of memory at %ldMB for elfcorehdr\n",
>> +		(unsigned long)(elfcorehdr_size >> 10),
>> +		(unsigned long)(elfcorehdr_addr >> 20));
>> +}
>> +#endif /* CONFIG_CRASH_DUMP */
>> +
>>  static void __init request_standard_resources(void)
>>  {
>>  	struct memblock_region *region;
>> @@ -378,10 +441,25 @@ void __init setup_arch(char **cmdline_p)
>>  	local_async_enable();
>>
>>  	efi_init();
>
> Maybe have a blank line here?

Add one.

>> +	/*
>> +	 * reserve_crashkernel() and reserver_elfcorehdr() must be called
>
> s/reserver_elfcorehdr/reserve_elfcorehdr/
>
>> +	 * before arm64_bootmem_init() because dma_contiguous_reserve()
>> +	 * may conflict with those regions.
>> +	 */
>> +#ifdef CONFIG_KEXEC
>> +	reserve_crashkernel();
>> +#endif
>> +#ifdef CONFIG_CRASH_DUMP
>> +	reserve_elfcorehdr();
>> +#endif
>>  	arm64_memblock_init();
>>
>>  	paging_init();
>>  	request_standard_resources();
>> +#ifdef CONFIG_KEXEC
>> +	/* kexec-tool will detect the region with /proc/iomem */
>
> There are more kexec user programs than kexec-tools, so I think something like 'user space tools' would be better.

Sure, fix it.

-Takahiro AKASHI

>> +	insert_resource(&iomem_resource, &crashk_res);
>> +#endif
>>
>>  	early_ioremap_reset();
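The comment under discussion says that user space finds the reserved region through /proc/iomem. As a hedged sketch of what such a tool might do, the function below parses one line in the usual "start-end : name" layout of that file and picks out the crash kernel entry. It assumes the resource is listed under the name "Crash kernel"; parse_crash_kernel_line() is a hypothetical helper, not code taken from kexec-tools.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/iomem-style line, e.g.
 *   "8c000000-8fffffff : Crash kernel"
 * Returns 0 and fills *start and *end (inclusive) if the line describes
 * the crash kernel region, -1 otherwise.
 */
int parse_crash_kernel_line(const char *line,
			    unsigned long long *start,
			    unsigned long long *end)
{
	unsigned long long s, e;
	int consumed = 0;

	/* %n records how many characters were consumed up to the name. */
	if (sscanf(line, " %llx-%llx : %n", &s, &e, &consumed) < 2 ||
	    consumed == 0)
		return -1;

	if (strncmp(line + consumed, "Crash kernel", 12) != 0)
		return -1;

	*start = s;
	*end = e;
	return 0;
}
```

Since crashk_res.end is set to crash_base + crash_size - 1 in the patch, the range is inclusive and the region size is end - start + 1.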
Hi Takahiro,
On Thursday 26 March 2015 01:58 PM, AKASHI Takahiro wrote:
> Crash dump kernel will access memory regions in system kernel via copy_oldmem_page(), which reads a page with ioremap'ing it assuming that such pages are not part of main memory of crash dump kernel. This is true under non-UEFI environment because kexec-tools modifies a device tree adding "usablemem" attributes to memory sections. Under UEFI, however, this is not true because UEFI remove memory sections in a device tree and export all the memory regions, even though they belong to system kernel.
>
> So we should add "mem=X[MG]" boot parameter to limit the meory size and avoid hitting the following assertion in ioremap():
>	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
>		return NULL;
Well I am using your updated kexec-tool which has support of automatic addition of "mem=" parameter. I found that this warning is still appearing and therefore another error about "Kdump: vmcore not initialized".
Memory address for which ioremap failed was almost at the top of crash_reserved_mem. So I modified kexec-tool [1] to accept user specific mem= parameter with a value lesser than physical location which was being remapped, however still the warning was there.
Further I noticed that there is no reserved memblock with nonzero memblock_region->size when early_mem -> memblock_enforce_memory_limit is called. Therefore this mem= param is not limiting memory location in my case.
I was just wondering, why do not we use ioremap_cache instead of ioremap in copy_oldmem_page?
~Pratyush
[1] https://github.com/pratyushanand/kexec-tools/commit/7dc38d587cb32d4522f6baf0...
Hi Pratyush,
On 04/09/2015 10:09 PM, Pratyush Anand wrote:
Hi Takahiro,
On Thursday 26 March 2015 01:58 PM, AKASHI Takahiro wrote:
Crash dump kernel will access memory regions in system kernel via copy_oldmem_page(), which reads a page by ioremap'ing it, assuming that such pages are not part of main memory of crash dump kernel. This is true under a non-UEFI environment because kexec-tools modifies the device tree, adding "usablemem" attributes to memory sections. Under UEFI, however, this is not true because UEFI removes memory sections from the device tree and exports all the memory regions, even though they belong to system kernel.
So we should add the "mem=X[MG]" boot parameter to limit the memory size and avoid hitting the following assertion in ioremap():
	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
		return NULL;
Well, I am using your updated kexec-tools, which automatically adds the "mem=" parameter. I found that this warning still appears, followed by another error: "Kdump: vmcore not initialized".
The memory address for which ioremap failed was almost at the top of crash_reserved_mem. So I modified kexec-tools [1] to accept a user-specified mem= parameter with a value lower than the physical address being remapped; however, the warning was still there.
Further, I noticed that there is no reserved memblock with nonzero memblock_region->size when early_mem -> memblock_enforce_memory_limit is called. Therefore the mem= parameter is not limiting memory in my case.
On the crash dump kernel? Sounds strange. Can you send me the following for both the 1st kernel and the crash dump kernel? (Add memblock_debug to the command line for verbose messages.)
- boot log (dmesg)
- cat /proc/iomem
Sending them in a private mail is fine.
I was just wondering, why don't we use ioremap_cache instead of ioremap in copy_oldmem_page?
Good point. My next version of kdump patch uses ioremap_cache() for another reason. # As I'm discussing with Mark about kvm issue, I'm holding off submitting it.
Thanks, -Takahiro AKASHI
~Pratyush
[1] https://github.com/pratyushanand/kexec-tools/commit/7dc38d587cb32d4522f6baf0...
On Friday 10 April 2015 11:27 AM, AKASHI Takahiro wrote:
Hi Pratyush,
On 04/09/2015 10:09 PM, Pratyush Anand wrote:
Hi Takahiro,
On Thursday 26 March 2015 01:58 PM, AKASHI Takahiro wrote:
Crash dump kernel will access memory regions in system kernel via copy_oldmem_page(), which reads a page by ioremap'ing it, assuming that such pages are not part of main memory of crash dump kernel. This is true under a non-UEFI environment because kexec-tools modifies the device tree, adding "usablemem" attributes to memory sections. Under UEFI, however, this is not true because UEFI removes memory sections from the device tree and exports all the memory regions, even though they belong to system kernel.
So we should add the "mem=X[MG]" boot parameter to limit the memory size and avoid hitting the following assertion in ioremap():
	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
		return NULL;
Well, I am using your updated kexec-tools, which automatically adds the "mem=" parameter. I found that this warning still appears, followed by another error: "Kdump: vmcore not initialized".
The memory address for which ioremap failed was almost at the top of crash_reserved_mem. So I modified kexec-tools [1] to accept a user-specified mem= parameter with a value lower than the physical address being remapped; however, the warning was still there.
Further, I noticed that there is no reserved memblock with nonzero memblock_region->size when early_mem -> memblock_enforce_memory_limit is called. Therefore the mem= parameter is not limiting memory in my case.
On the crash dump kernel? Sounds strange. Can you send me the following for both the 1st kernel and the crash dump kernel? (Add memblock_debug to the command line for verbose messages.)
- boot log (dmesg)
- cat /proc/iomem
sending them in a private mail is fine.
OK.
I was just wondering, why don't we use ioremap_cache instead of ioremap in copy_oldmem_page?
Good point. My next version of kdump patch uses ioremap_cache() for another reason.
Thanks, good that you have already changed it. I am using ioremap_cache and I am able to get /proc/vmcore :) ioremap allocates VAs with the DEVICE memory attribute, which might not be correct for RAM. ioremap_cache uses the NORMAL memory attribute, which seems more logical for RAM areas.
# As I'm discussing with Mark about kvm issue, I'm holding off submitting it.
I am using one more modification on top of your patches to find the crashkernel address automatically. If that seems fine to you, then maybe I can send you that patch and you can add it to your next revision.
Author: Pratyush Anand <panand@redhat.com>
Date: Tue Apr 7 08:14:55 2015 +0530

    arm64/kdump: Find free area for crash kernel memory

    If crashkernel=X@0 is passed to the kernel then this patch will find
    free memory area of size X for crash kernel.

    Signed-off-by: Pratyush Anand <panand@redhat.com>

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 94694897beea..90b0b9d138f6 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -368,10 +368,18 @@ static void __init reserve_crashkernel(void)
 	if (ret)
 		return;

-	ret = memblock_reserve(crash_base, crash_size);
-	if (ret < 0) {
+	/* 0 means: find the address automatically */
+	if (crash_base <= 0) {
+		crash_base = memblock_alloc(crash_size, 2 << 20);
+
+		if (!crash_base) {
+			pr_warn("Could not find free memory for crashkernel\n");
+			return;
+		}
+
+	} else if (memblock_reserve(crash_base, crash_size) < 0) {
 		pr_warn("crashkernel reservation failed - memory is in use (0x%lx)\n",
-			(unsigned long)crash_base);
+				(unsigned long)crash_base);
 		return;
 	}
~Pratyush
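As a rough illustration of the "crashkernel=X@Y" convention discussed above, here is a minimal user-space sketch in C. It is not the kernel's own parser: the helper names parse_size(), parse_crashkernel_arg() and align_up() are hypothetical, a base of 0 is taken to mean "find the address automatically" as the patch above proposes, and the 2 MiB alignment simply mirrors the memblock_alloc(crash_size, 2 << 20) call in that patch.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a number with an optional K/M/G suffix, in the spirit of the
 * kernel's memparse(). User-space stand-in for illustration only.
 */
static uint64_t parse_size(const char *s, const char **end)
{
	char *e;
	uint64_t v = strtoull(s, &e, 0);

	switch (*e) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		e++;
		break;
	default:
		break;
	}
	*end = e;
	return v;
}

/*
 * Hypothetical helper: split "X@Y" into size and base.
 * Returns 0 on success, -1 if the string is malformed.
 * A base of 0 is left for the caller to interpret as "pick one".
 */
static int parse_crashkernel_arg(const char *arg, uint64_t *size,
				 uint64_t *base)
{
	const char *p;

	*size = parse_size(arg, &p);
	if (p == arg || *size == 0 || *p != '@')
		return -1;

	arg = p + 1;
	*base = parse_size(arg, &p);
	if (p == arg || *p != '\0')
		return -1;

	return 0;
}

/* Round addr up to a power-of-two alignment (2 MiB in the patch above). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

With "64M@2240M" this yields a size of 64 MiB and a base of 2240 MiB; with "64M@0" the base comes back as 0 and the caller would choose an aligned address itself.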
kdump calls machine_crash_shutdown() to shut down non-boot cpus and save per-cpu general-purpose registers before restarting the crash dump kernel. See kernel_kexec(). ipi_cpu_stop() is used and a bit modified to support this behavior.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/include/asm/kexec.h    | 34 ++++++++++++++++++++++-
 arch/arm64/kernel/machine_kexec.c | 55 ++++++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/smp.c           | 10 +++++--
 3 files changed, 95 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 3530ff5..eaf3fcb 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -30,6 +30,8 @@

 #if !defined(__ASSEMBLY__)

+extern bool in_crash_kexec;
+
 /**
  * crash_setup_regs() - save registers for the panic kernel
  *
@@ -40,7 +42,37 @@ static inline void crash_setup_regs(struct pt_regs *newregs,
 				    struct pt_regs *oldregs)
 {
-	/* Empty routine needed to avoid build errors. */
+	if (oldregs) {
+		memcpy(newregs, oldregs, sizeof(*newregs));
+	} else {
+		__asm__ __volatile__ (
+			"stp x0, x1, [%3]\n\t"
+			"stp x2, x3, [%3, 0x10]\n\t"
+			"stp x4, x5, [%3, 0x20]\n\t"
+			"stp x6, x7, [%3, 0x30]\n\t"
+			"stp x8, x9, [%3, 0x40]\n\t"
+			"stp x10, x11, [%3, 0x50]\n\t"
+			"stp x12, x13, [%3, 0x60]\n\t"
+			"stp x14, x15, [%3, 0x70]\n\t"
+			"stp x16, x17, [%3, 0x80]\n\t"
+			"stp x18, x19, [%3, 0x90]\n\t"
+			"stp x20, x21, [%3, 0xa0]\n\t"
+			"stp x22, x23, [%3, 0xb0]\n\t"
+			"stp x24, x25, [%3, 0xc0]\n\t"
+			"stp x26, x27, [%3, 0xd0]\n\t"
+			"stp x28, x29, [%3, 0xe0]\n\t"
+			"str x30, [%3, 0xf0]\n\t"
+			"mov %0, sp\n\t"
+			"adr %1, 1f\n\t"
+			"mrs %2, spsr_el1\n\t"
+			"1:"
+			: "=r" (newregs->sp),
+			  "=r" (newregs->pc),
+			  "=r" (newregs->pstate)
+			: "r" (&newregs->regs)
+			: "memory"
+		);
+	}
 }

 #endif /* !defined(__ASSEMBLY__) */
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 7aa2fa2..375dd1f 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -9,6 +9,8 @@
  * published by the Free Software Foundation.
  */

+#include <linux/irq.h>
+#include <linux/kernel.h>
 #include <linux/kexec.h>
 #include <linux/of_fdt.h>
 #include <linux/slab.h>
@@ -24,6 +26,8 @@ extern unsigned long arm64_kexec_dtb_addr;
 extern unsigned long arm64_kexec_kimage_head;
 extern unsigned long arm64_kexec_kimage_start;

+bool in_crash_kexec = false;
+
 /**
  * kexec_is_dtb - Helper routine to check the device tree header signature.
  */
@@ -183,7 +187,56 @@ void machine_kexec(struct kimage *image)
 	soft_restart(reboot_code_buffer_phys);
 }

+static void machine_kexec_mask_interrupts(void)
+{
+	unsigned int i;
+	struct irq_desc *desc;
+
+	for_each_irq_desc(i, desc) {
+		struct irq_chip *chip;
+
+		chip = irq_desc_get_chip(desc);
+		if (!chip)
+			continue;
+
+		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
+			chip->irq_eoi(&desc->irq_data);
+
+		if (chip->irq_mask)
+			chip->irq_mask(&desc->irq_data);
+
+		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
+			chip->irq_disable(&desc->irq_data);
+	}
+}
+
+/**
+ * machine_crash_shutdown - shutdown non-boot cpus and save registers
+ */
 void machine_crash_shutdown(struct pt_regs *regs)
 {
-	/* Empty routine needed to avoid build errors. */
+	struct pt_regs dummy_regs;
+	int cpu;
+
+	local_irq_disable();
+
+	in_crash_kexec = true;
+
+	/*
+	 * clear and initialize the per-cpu info. This is necessary
+	 * because, otherwise, slots for offline cpus would not be
+	 * filled up. See smp_send_stop().
+	 */
+	memset(&dummy_regs, 0, sizeof(dummy_regs));
+	for_each_possible_cpu(cpu)
+		crash_save_cpu(&dummy_regs, cpu);
+
+	/* shutdown non-boot cpus */
+	smp_send_stop();
+
+	/* for this cpu */
+	crash_save_cpu(regs, smp_processor_id());
+	machine_kexec_mask_interrupts();
+
+	pr_info("Loading crashdump kernel...\n");
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 328b8ce..593d969 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -36,6 +36,7 @@
 #include <linux/completion.h>
 #include <linux/of.h>
 #include <linux/irq_work.h>
+#include <linux/kexec.h>

 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -541,7 +542,7 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
 /*
  * ipi_cpu_stop - handle IPI from smp_send_stop()
  */
-static void ipi_cpu_stop(unsigned int cpu)
+static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
 {
 	if (system_state == SYSTEM_BOOTING ||
 	    system_state == SYSTEM_RUNNING) {
@@ -555,6 +556,11 @@ static void ipi_cpu_stop(unsigned int cpu)

 	local_irq_disable();

+#ifdef CONFIG_KEXEC
+	if (in_crash_kexec)
+		crash_save_cpu(regs, cpu);
+#endif /* CONFIG_KEXEC */
+
 	while (1)
 		cpu_relax();
 }
@@ -585,7 +591,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)

 	case IPI_CPU_STOP:
 		irq_enter();
-		ipi_cpu_stop(cpu);
+		ipi_cpu_stop(cpu, regs);
 		irq_exit();
 		break;
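The immediates used by the stp/str sequence in crash_setup_regs() above follow from storing x0..x30 as consecutive 8-byte slots. A small sketch of that arithmetic, assuming a mock layout (struct mock_pt_regs and xreg_offset() are illustrative names, not kernel definitions) with 31 general-purpose registers followed by sp, pc and pstate:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Mock of the leading part of a pt_regs-like save area (an assumption
 * for illustration): x0..x30, then sp, pc and pstate, 8 bytes each.
 */
struct mock_pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

/* Byte offset of general-purpose register xN within the save area. */
static size_t xreg_offset(unsigned int n)
{
	return offsetof(struct mock_pt_regs, regs) + n * sizeof(uint64_t);
}
```

Each stp stores a pair, so the offsets advance by 0x10 per instruction (x2 at 0x10, x28 at 0xe0), and the final str places x30 at 0xf0, directly before sp in this layout.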
Unlike the normal kexec case, we don't have a chance to reset the EL2 context in a generic way because bad exceptions may directly invoke crash_kexec(). (See die().) KVM is not useful on the crash dump kernel anyway, so we leave it uninitialized across the reboot.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/kernel/process.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index d894d3e..9859f5c 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -48,6 +48,7 @@
 #include <asm/compat.h>
 #include <asm/cacheflush.h>
 #include <asm/fpsimd.h>
+#include <asm/kexec.h>
 #include <asm/mmu_context.h>
 #include <asm/processor.h>
 #include <asm/stacktrace.h>
@@ -64,7 +65,11 @@ void soft_restart(unsigned long addr)
 	setup_mm_for_reboot();

 	cpu_soft_restart(virt_to_phys(cpu_reset),
-		is_hyp_mode_available(), addr);
+#ifdef CONFIG_KEXEC
+		!in_crash_kexec &&
+#endif
+		is_hyp_mode_available(),
+		addr);

 	/* Should never get here */
 	BUG();
On Thu, 2015-03-26 at 17:28 +0900, AKASHI Takahiro wrote:
@@ -64,7 +65,11 @@ void soft_restart(unsigned long addr)
 	setup_mm_for_reboot();

 	cpu_soft_restart(virt_to_phys(cpu_reset),
-		is_hyp_mode_available(), addr);
+#ifdef CONFIG_KEXEC
+		!in_crash_kexec &&
+#endif
Why not define in_crash_kexec without condition on CONFIG_KEXEC, say here in process.c, and then avoid these preprocessor conditionals?
+		is_hyp_mode_available(),
+		addr);

 	/* Should never get here */
 	BUG();
On 03/27/2015 07:29 AM, Geoff Levand wrote:
On Thu, 2015-03-26 at 17:28 +0900, AKASHI Takahiro wrote:
@@ -64,7 +65,11 @@ void soft_restart(unsigned long addr)
 	setup_mm_for_reboot();

 	cpu_soft_restart(virt_to_phys(cpu_reset),
-		is_hyp_mode_available(), addr);
+#ifdef CONFIG_KEXEC
+		!in_crash_kexec &&
+#endif
Why not define in_crash_kexec without condition on CONFIG_KEXEC, say here in process.c, and then avoid these preprocessor conditionals?
Well, I thought of that, but as its name suggests, the variable should basically be part of the kdump code. Moreover, I suspect that someone might reject the whole idea of patch #3.
-Takahiro AKASHI
+		is_hyp_mode_available(),
+		addr);

 	/* Should never get here */
 	BUG();
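For reference, the alternative raised in the review can be sketched as follows. This is only an illustration of the idea, not code from the series: the flag is defined unconditionally so the call site needs no #ifdef, and restart_in_el2() is a hypothetical stand-in for the expression passed as the second argument of cpu_soft_restart().

```c
#include <stdbool.h>

/*
 * Defined unconditionally, as suggested in the review. When kdump is
 * not built in, nothing ever sets it and it simply stays false.
 */
static bool in_crash_kexec;

/*
 * Hypothetical helper standing in for the second argument of
 * cpu_soft_restart(): go through EL2 only for a normal kexec,
 * never on the crash path.
 */
static bool restart_in_el2(bool hyp_mode_available)
{
	return !in_crash_kexec && hyp_mode_available;
}
```

The trade-off discussed in the thread remains: the variable then lives outside the kdump code even though its name ties it to kdump.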
Please read the following commits for arm64-specific constraints:
    arm64: kdump: reserve memory for crash dump kernel
    arm64: kdump: do not go into EL2 before starting a crash dump kernel

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 Documentation/kdump/kdump.txt | 31 ++++++++++++++++++++++++++++++-
 arch/arm64/Kconfig            | 12 ++++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index bc4bd5a..f9cf6f5 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
 a remote system.

 Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
-s390x and arm architectures.
+s390x, arm and arm64 architectures.

 When the system kernel boots, it reserves a small section of memory for
 the dump-capture kernel. This ensures that ongoing Direct Memory Access
@@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)

     AUTO_ZRELADDR=y

+Dump-capture kernel config options (Arch Dependent, arm64)
+----------------------------------------------------------
+
+1) Disable symmetric multi-processing support under "Processor type and
+   features":
+
+   CONFIG_SMP=n
+
+   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
+   when loading the dump-capture kernel, see section "Load the Dump-capture
+   Kernel".)
+
+2) Under uefi, the maximum memory size on the dump-capture kernel must be
+   limited by specifying:
+
+   mem=X[MG]
+
+   where X should be less than or equal to the size in "crashkernel="
+   boot parameter. Kexec-tools will automatically add this.
+
+3) Currently, kvm will not be initialized on the dump-capture kernel even
+   if it is configured.
+
 Extended crashkernel syntax
 ===========================

@@ -312,6 +335,7 @@ Boot into System Kernel
    any space below the alignment point may be overwritten by the dump-capture kernel,
    which means it is possible that the vmcore is not that precise as expected.

+   On arm64, use "crashkernel=Y@X".

 Load the Dump-capture Kernel
 ============================
@@ -334,6 +358,8 @@ For s390x:
 	- Use image or bzImage
 For arm:
 	- Use zImage
+For arm64:
+	- Use vmlinux

 If you are using a uncompressed vmlinux image then use following command
 to load dump-capture kernel.
@@ -377,6 +403,9 @@ For s390x:
 For arm:
 	"1 maxcpus=1 reset_devices"

+For arm64:
+	"1 maxcpus=1 mem=X[MG] reset_devices"
+
 Notes on loading the dump-capture kernel:

 * By default, the ELF headers are stored in ELF64 format to support
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5a606d1..06a6ed4 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -537,6 +537,18 @@ config KEXEC
 	  but it is independent of the system firmware. And like a reboot
 	  you can start any kernel with it, not just Linux.

+config CRASH_DUMP
+	bool "Build kdump crash kernel"
+	help
+	  Generate crash dump after being started by kexec. This should
+	  be normally only set in special crash dump kernels which are
+	  loaded in the main kernel with kexec-tools into a specially
+	  reserved region and then later executed after a crash by
+	  kdump/kexec. The crash dump kernel must be compiled to a
+	  memory address not used by the main kernel.
+
+	  For more details see Documentation/kdump/kdump.txt
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 3f1c593..f5af17f 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -45,6 +45,7 @@ CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_CMA=y
 CONFIG_KEXEC=y
+CONFIG_CRASH_DUMP=y
 CONFIG_CMDLINE="console=ttyAMA0"
 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 CONFIG_COMPAT=y
Hi Akashi,
On Thursday 26 March 2015 01:58 PM, AKASHI Takahiro wrote:
This patch set enables kdump (crash dump kernel) support on arm64 on top of Geoff's kexec patchset.
In this version, there are some arm64-specific usage/constraints:
- "mem=" boot parameter must be specified on crash dump kernel
- Kvm will not be enabled on crash dump kernel even if configured
See commit messages and Documentation/kdump/kdump.txt for details.
The only concern I have is whether or not we can use the exact same kernel as both system kernel and crash dump kernel. The current arm64 kernel is not relocatable in the exact sense but I have no problems in using the same binary for testing kdump.
I tested the code with
- ATF v1.1 + EDK2(UEFI) v3.0-rc0
- kernel v4.0-rc4 + Geoff's kexec v8
on Base fast model, using my yet-to-be-submitted kexec-tools [1]. You may want to start a kernel with the following boot parameter:
    crashkernel=64M@2240M
and try
    $ kexec -p --load <vmlinux> --append ...
    $ echo c > /proc/sysrq-trigger
I tried to use your kdump patches for the kernel and kexec-tools, but I am not able to load the crash kernel properly.
I passed crashkernel=64M@259G (my 8GB of RAM starts at 256G) to the primary kernel.
I used the following to load the crash kernel:
    kexec -p --load vmlinux --append="$( cat /proc/cmdline ) maxcpus=1 mem=64M reset_devices"
I see:
    kexec_load failed: Cannot assign requested address
    entry = 0x40c40005d0 flags = 0xb70001
What I noticed is that arm64_load_other_segments does not calculate a correct load address within the crash kernel's reserved memory for purgatory, nor for the initrd if I pass one.
Shouldn't we have a function similar to locate_dtb_in_crashmem for purgatory as well as the initrd? I can try to fix this allocation, but I was just wondering whether I understood correctly or am missing something.
~Pratyush
Hi Pratyush,
On 04/02/2015 12:56 AM, Pratyush Anand wrote:
Hi Akashi,
On Thursday 26 March 2015 01:58 PM, AKASHI Takahiro wrote:
This patch set enables kdump (crash dump kernel) support on arm64 on top of Geoff's kexec patchset.
In this version, there are some arm64-specific usage/constraints:
- "mem=" boot parameter must be specified on crash dump kernel
- Kvm will not be enabled on crash dump kernel even if configured
See commit messages and Documentation/kdump/kdump.txt for details.
The only concern I have is whether or not we can use the exact same kernel as both system kernel and crash dump kernel. The current arm64 kernel is not relocatable in the exact sense but I have no problems in using the same binary for testing kdump.
I tested the code with
- ATF v1.1 + EDK2(UEFI) v3.0-rc0
- kernel v4.0-rc4 + Geoff's kexec v8
on Base fast model, using my yet-to-be-submitted kexec-tools [1]. You may want to start a kernel with the following boot parameter:
    crashkernel=64M@2240M
and try
    $ kexec -p --load <vmlinux> --append ...
    $ echo c > /proc/sysrq-trigger
I tried to use your kdump patches for the kernel and kexec-tools, but I am not able to load the crash kernel properly.
I passed crashkernel=64M@259G (my 8GB of RAM starts at 256G) to the primary kernel.
I used the following to load the crash kernel:
    kexec -p --load vmlinux --append="$( cat /proc/cmdline ) maxcpus=1 mem=64M reset_devices"
I see:
    kexec_load failed: Cannot assign requested address
    entry = 0x40c40005d0 flags = 0xb70001
What I noticed is that arm64_load_other_segments does not calculate a correct load address within the crash kernel's reserved memory for purgatory, nor for the initrd if I pass one.
It seems that you hit the same problem that I mentioned in: http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/334613.html
Please try my latest kexec-tools in my linaro repo (branch name is kdump/v0.11) and let me know the result.
Shouldn't we have a function similar to locate_dtb_in_crashmem for purgatory as well as the initrd? I can try to fix this allocation, but I was just wondering whether I understood correctly or am missing something.
I don't think a function for purgatory is necessary since we try to find a hole just after dtb.
Thanks, -Takahiro AKASHI
~Pratyush
On Thursday 02 April 2015 04:57 AM, AKASHI Takahiro wrote:
Please try my latest kexec-tools in my linaro repo (branch name is kdump/v0.11) and let me know the result.
Thanks a lot.. Just fetched your repo and found v.0.11.
With this crash kernel loaded successfully, if I do not use initrd.
With following I still see Overlapping memory segments
kexec -p /home/panand/work/kernel/bsa2_kdump/vmlinux --initrd=/boot/initramfs-3.19.0.bz1198945+.img --append="$( cat /proc/cmdline ) maxcpus=1 mem=64M reset_devices"
~Pratyush
Pratyush,
On 04/02/2015 01:58 PM, Pratyush Anand wrote:
On Thursday 02 April 2015 04:57 AM, AKASHI Takahiro wrote:
Please try my latest kexec-tools in my linaro repo (branch name is kdump/v0.11) and let me know the result.
Thanks a lot.. Just fetched your repo and found v.0.11.
With this crash kernel loaded successfully, if I do not use initrd.
With following I still see Overlapping memory segments
kexec -p /home/panand/work/kernel/bsa2_kdump/vmlinux --initrd=/boot/initramfs-3.19.0.bz1198945+.img --append="$( cat /proc/cmdline ) maxcpus=1 mem=64M reset_devices"
How big is your initrd? If it is reasonably small, please send me the segment info, i.e. the messages from add_segment_phys_virt() for all the segments.
FYI, my latest kexec-tools, ie. v0.11, automatically appends "mem=".
Thanks, -Takahiro AKASHI
~Pratyush
On Thursday 02 April 2015 11:07 AM, AKASHI Takahiro wrote:
Pratyush,
On 04/02/2015 01:58 PM, Pratyush Anand wrote:
On Thursday 02 April 2015 04:57 AM, AKASHI Takahiro wrote:
Please try my latest kexec-tools in my linaro repo (branch name is kdump/v0.11) and let me know the result.
Thanks a lot.. Just fetched your repo and found v.0.11.
With this crash kernel loaded successfully, if I do not use initrd.
With following I still see Overlapping memory segments
kexec -p /home/panand/work/kernel/bsa2_kdump/vmlinux --initrd=/boot/initramfs-3.19.0.bz1198945+.img --append="$( cat /proc/cmdline ) maxcpus=1 mem=64M reset_devices"
How big is your initrd? If it is reasonably small, please send me the segment info, i.e. the messages from add_segment_phys_virt() for all the segments.
add_segment_phys_virt: 000000000dcd0b90 - 000000000dcd0f90 (00000400) -> 00000040c3ff0000 - 00000040c4000000 (00010000)
add_segment_phys_virt: 000003ff88c10010 - 000003ff8984a010 (00c3a000) -> 00000040c0080000 - 00000040c1310000 (01290000)
add_segment_phys_virt: 000000000dcd53c0 - 000000000dcd96b8 (000042f8) -> 00000040c0000000 - 00000040c0010000 (00010000)
add_segment_phys_virt: 000003ff87360010 - 000003ff88bfcc2f (0189cc1f) -> 00000040c0010000 - 00000040c18b0000 (018a0000)
Overlapping memory segments at 0x40c18b0000
sort_segments failed
Why do we try to fit the dtb just after crash_reserved_mem.start? Shouldn't it start after crash_reserved_mem.start + arm64_mem.text_offset + arm64_mem.image_size?
I tried the following and it works perfectly:
diff --git a/kexec/arch/arm64/crashdump-arm64.c b/kexec/arch/arm64/crashdump-arm64.c
index 41266f294589..75f4e4d269ca 100644
--- a/kexec/arch/arm64/crashdump-arm64.c
+++ b/kexec/arch/arm64/crashdump-arm64.c
@@ -312,5 +312,6 @@ void set_crash_entry(struct mem_ehdr *ehdr, struct kexec_info *info)
 off_t locate_dtb_in_crashmem(struct kexec_info *info, off_t dtb_size)
 {
 	return locate_hole(info, dtb_size, 128UL * 1024,
-		crash_reserved_mem.start, crash_reserved_mem.end, 1);
+		crash_reserved_mem.start + arm64_mem.text_offset +
+		arm64_mem.image_size, crash_reserved_mem.end, 1);
 }
With this change the new allocations are:
add_segment_phys_virt: 0000000010350b90 - 0000000010350f90 (00000400) -> 00000040c3ff0000 - 00000040c4000000 (00010000)
add_segment_phys_virt: 000003ff7ad70010 - 000003ff7b9aa010 (00c3a000) -> 00000040c0080000 - 00000040c1310000 (01290000)
add_segment_phys_virt: 00000000103553c0 - 00000000103596b8 (000042f8) -> 00000040c1360000 - 00000040c1370000 (00010000)
add_segment_phys_virt: 000003ff794c0010 - 000003ff7ad5cc2f (0189cc1f) -> 00000040c1370000 - 00000040c2c10000 (018a0000)
add_segment_phys_virt: 00000000103596c0 - 0000000010360190 (00006ad0) -> 00000040c2c10000 - 00000040c2c20000 (00010000)
Crash kernel loaded upon panic.
~Pratyush
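To make the failure above concrete, here is a minimal sketch of an overlap check over destination ranges. It is an illustration under assumptions, not the kexec-tools implementation: struct segment, cmp_start() and first_overlap() are hypothetical names, and ranges are treated as half-open [start, end).

```c
#include <stdint.h>
#include <stdlib.h>

struct segment {
	uint64_t start;	/* first byte of the destination range */
	uint64_t end;	/* one past the last byte */
};

/* qsort() comparator: order segments by start address. */
static int cmp_start(const void *a, const void *b)
{
	const struct segment *x = a, *y = b;

	return (x->start > y->start) - (x->start < y->start);
}

/*
 * Sort by start address and return the end address of the first
 * segment that runs into its successor, or 0 if the layout is clean.
 */
static uint64_t first_overlap(struct segment *segs, size_t n)
{
	size_t i;

	qsort(segs, n, sizeof(*segs), cmp_start);
	for (i = 1; i < n; i++)
		if (segs[i - 1].end > segs[i].start)
			return segs[i - 1].end;
	return 0;
}
```

Fed the four destination ranges from the failing log, first_overlap() returns 0x40c18b0000, the address reported in the error: the segment placed at 0x40c0010000 extends past the kernel image at 0x40c0080000. The five ranges from the working layout return 0.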
On 04/02/2015 03:01 PM, Pratyush Anand wrote:
On Thursday 02 April 2015 11:07 AM, AKASHI Takahiro wrote:
Pratyush,
On 04/02/2015 01:58 PM, Pratyush Anand wrote:
On Thursday 02 April 2015 04:57 AM, AKASHI Takahiro wrote:
Please try my latest kexec-tools in my linaro repo (branch name is kdump/v0.11) and let me know the result.
Thanks a lot.. Just fetched your repo and found v.0.11.
With this crash kernel loaded successfully, if I do not use initrd.
With following I still see Overlapping memory segments
kexec -p /home/panand/work/kernel/bsa2_kdump/vmlinux --initrd=/boot/initramfs-3.19.0.bz1198945+.img --append="$( cat /proc/cmdline ) maxcpus=1 mem=64M reset_devices"
How big is your initrd? If it is reasonably small, please send me the segment info, i.e. the messages from add_segment_phys_virt() for all the segments.
add_segment_phys_virt: 000000000dcd0b90 - 000000000dcd0f90 (00000400) -> 00000040c3ff0000 - 00000040c4000000 (00010000)
add_segment_phys_virt: 000003ff88c10010 - 000003ff8984a010 (00c3a000) -> 00000040c0080000 - 00000040c1310000 (01290000)
add_segment_phys_virt: 000000000dcd53c0 - 000000000dcd96b8 (000042f8) -> 00000040c0000000 - 00000040c0010000 (00010000)
add_segment_phys_virt: 000003ff87360010 - 000003ff88bfcc2f (0189cc1f) -> 00000040c0010000 - 00000040c18b0000 (018a0000)
Overlapping memory segments at 0x40c18b0000
sort_segments failed
Why do we try to fit the dtb just after crash_reserved_mem.start? Shouldn't it start after crash_reserved_mem.start + arm64_mem.text_offset + arm64_mem.image_size?
Yeah, worth considering :)
-Takahiro AKASHI
I tried the following and it works perfectly:
diff --git a/kexec/arch/arm64/crashdump-arm64.c b/kexec/arch/arm64/crashdump-arm64.c
index 41266f294589..75f4e4d269ca 100644
--- a/kexec/arch/arm64/crashdump-arm64.c
+++ b/kexec/arch/arm64/crashdump-arm64.c
@@ -312,5 +312,6 @@ void set_crash_entry(struct mem_ehdr *ehdr, struct kexec_info *info)
 off_t locate_dtb_in_crashmem(struct kexec_info *info, off_t dtb_size)
 {
 	return locate_hole(info, dtb_size, 128UL * 1024,
-		crash_reserved_mem.start, crash_reserved_mem.end, 1);
+		crash_reserved_mem.start + arm64_mem.text_offset +
+		arm64_mem.image_size, crash_reserved_mem.end, 1);
 }
With this change the new allocations are:
add_segment_phys_virt: 0000000010350b90 - 0000000010350f90 (00000400) -> 00000040c3ff0000 - 00000040c4000000 (00010000)
add_segment_phys_virt: 000003ff7ad70010 - 000003ff7b9aa010 (00c3a000) -> 00000040c0080000 - 00000040c1310000 (01290000)
add_segment_phys_virt: 00000000103553c0 - 00000000103596b8 (000042f8) -> 00000040c1360000 - 00000040c1370000 (00010000)
add_segment_phys_virt: 000003ff794c0010 - 000003ff7ad5cc2f (0189cc1f) -> 00000040c1370000 - 00000040c2c10000 (018a0000)
add_segment_phys_virt: 00000000103596c0 - 0000000010360190 (00006ad0) -> 00000040c2c10000 - 00000040c2c20000 (00010000)
Crash kernel loaded upon panic.
~Pratyush
linaro-kernel@lists.linaro.org