Patches adding support for hibernation on ARM - ARM hibernation / suspend-to-disk - Change soft_restart to use non-tracing raw_local_irq_disable
Patches based on v3.13 tag, verified hibernation on beaglebone black on a branch based on 3.13 merged with initial omap support from Russ Dill which can be found here (includes v1 patchset): http://git.linaro.org/git-ro/people/sebastian.capella/linux.git hibernation_3.13_russMerge
[PATCH v6 1/2] ARM: avoid tracers in soft_restart
arch/arm/kernel/process.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Use raw_local_irq_disable in place of local_irq_disable to avoid infinite abort recursion while tracing. (unchanged since v3)
[PATCH v6 2/2] ARM hibernation / suspend-to-disk arch/arm/include/asm/memory.h | 1 + arch/arm/kernel/Makefile | 1 + arch/arm/kernel/hibernate.c | 113 +++++++++++++++++++++++++++++++++++++++++ arch/arm/mm/Kconfig | 5 ++ include/linux/suspend.h | 2 + 5 files changed, 122 insertions(+)
Adds support for ARM based hibernation
Additional notes: -----------------
There are two checkpatch warnings added by this patch. These follow behavior in existing hibernation implementations on other platforms.
WARNING: externs should be avoided in .c files #116: FILE: arch/arm/kernel/hibernate.c:26: +extern const void __nosave_begin, __nosave_end;
This extern is picking up the linker nosave region definitions, only used in hibernate. Follows same extern line used mips, powerpc, s390, sh, sparc, x86 & unicore32
WARNING: externs should be avoided in .c files #199: FILE: arch/arm/kernel/hibernate.c:109: +extern void call_with_stack(void (*fn)(void *), void *arg, void *sp);
This extern is used in the arch/arm/ in hibernate, process and bL_switcher
Changes in v6: -------------- * Simplify static variable names
Changes in v5: -------------- * Fixed checkpatch warning on trailing whitespace
Changes in v4: -------------- * updated comment for soft_restart with review feedback * dropped freeze_processes patch which was queued separately to 3.14 by Rafael Wysocki: https://lkml.org/lkml/2014/2/25/683
Changes in v3: -------------- * added comment to use of soft_restart * drop irq disable soft_restart patch * add patch to avoid tracers in soft_restart by using raw_local_irq_*
Changes in v2: -------------- * Removed unneeded flush_thread, use of __naked and cpu_init. * dropped Cyril Chemparathy cyril@ti.com from Cc: list as emails are bouncing.
Thanks,
Sebastian Capella
Use of tracers in local_irq_disable is causes recursive aborts when called with irqs disabled and using a temporary stack (hibernation). Replace local_irq_disable with raw_local_irq_disable instead to avoid tracers.
Signed-off-by: Sebastian Capella sebastian.capella@linaro.org Cc: Russell King linux@arm.linux.org.uk Cc: Andrew Morton akpm@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will.deacon@arm.com Cc: Robin Holt robin.m.holt@gmail.com Cc: Lorenzo Pieralisi lorenzo.pieralisi@arm.com --- arch/arm/kernel/process.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index 92f7b15..f58b723 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -100,7 +100,7 @@ void soft_restart(unsigned long addr) u64 *stack = soft_restart_stack + ARRAY_SIZE(soft_restart_stack);
/* Disable interrupts first */ - local_irq_disable(); + raw_local_irq_disable(); local_fiq_disable();
/* Disable the L2 if we're the last man standing. */
From: Russ Dill Russ.Dill@ti.com
Enable hibernation for ARM architectures and provide ARM architecture specific calls used during hibernation.
The swsusp hibernation framework depends on the platform first having functional suspend/resume.
Then, in order to enable hibernation on a given platform, a platform_hibernation_ops structure may need to be registered with the system in order to save/restore any SoC-specific / cpu specific state needing (re)init over a suspend-to-disk/resume-from-disk cycle.
For example:
- "secure" SoCs that have different sets of control registers and/or different CR reg access patterns.
- SoCs with L2 caches as the activation sequence there is SoC-dependent; a full off-on cycle for L2 is not done by the hibernation support code.
- SoCs requiring steps on wakeup _before_ the "generic" parts done by cpu_suspend / cpu_resume can work correctly.
- SoCs having persistent state which is maintained during suspend and resume, but will be lost during the power off cycle after suspend-to-disk.
This is a rebase/rework of Frank Hofmann's v5 hibernation patchset.
Acked-by: Russ Dill Russ.Dill@ti.com Cc: "Rafael J. Wysocki" rjw@rjwysocki.net Signed-off-by: Sebastian Capella sebastian.capella@linaro.org Acked-by: Pavel Machek pavel@ucw.cz Cc: Russell King linux@arm.linux.org.uk Cc: Len Brown len.brown@intel.com Cc: Nicolas Pitre nico@linaro.org Cc: Santosh Shilimkar santosh.shilimkar@ti.com Cc: Will Deacon will.deacon@arm.com Cc: Jonathan Austin jonathan.austin@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: "Uwe Kleine-König" u.kleine-koenig@pengutronix.de Cc: Stephen Boyd sboyd@codeaurora.org Cc: Lorenzo Pieralisi lorenzo.pieralisi@arm.com --- arch/arm/include/asm/memory.h | 1 + arch/arm/kernel/Makefile | 1 + arch/arm/kernel/hibernate.c | 113 +++++++++++++++++++++++++++++++++++++++++ arch/arm/mm/Kconfig | 5 ++ include/linux/suspend.h | 2 + 5 files changed, 122 insertions(+) create mode 100644 arch/arm/kernel/hibernate.c
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 8756e4b..1079ea8 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -291,6 +291,7 @@ static inline void *phys_to_virt(phys_addr_t x) */ #define __pa(x) __virt_to_phys((unsigned long)(x)) #define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x))) +#define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0)) #define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
extern phys_addr_t (*arch_virt_to_idmap)(unsigned long x); diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile index a30fc9b..8afa848 100644 --- a/arch/arm/kernel/Makefile +++ b/arch/arm/kernel/Makefile @@ -39,6 +39,7 @@ obj-$(CONFIG_ARTHUR) += arthur.o obj-$(CONFIG_ISA_DMA) += dma-isa.o obj-$(CONFIG_PCI) += bios32.o isa.o obj-$(CONFIG_ARM_CPU_SUSPEND) += sleep.o suspend.o +obj-$(CONFIG_HIBERNATION) += hibernate.o obj-$(CONFIG_SMP) += smp.o ifdef CONFIG_MMU obj-$(CONFIG_SMP) += smp_tlb.o diff --git a/arch/arm/kernel/hibernate.c b/arch/arm/kernel/hibernate.c new file mode 100644 index 0000000..a41e0e3 --- /dev/null +++ b/arch/arm/kernel/hibernate.c @@ -0,0 +1,113 @@ +/* + * Hibernation support specific for ARM + * + * Derived from work on ARM hibernation support by: + * + * Ubuntu project, hibernation support for mach-dove + * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu) + * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.) + * https://lkml.org/lkml/2010/6/18/4 + * https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html + * https://patchwork.kernel.org/patch/96442/ + * + * Copyright (C) 2006 Rafael J. Wysocki rjw@sisk.pl + * + * License terms: GNU General Public License (GPL) version 2 + */ + +#include <linux/mm.h> +#include <linux/suspend.h> +#include <asm/tlbflush.h> +#include <asm/cacheflush.h> +#include <asm/system_misc.h> +#include <asm/idmap.h> +#include <asm/suspend.h> + +extern const void __nosave_begin, __nosave_end; + +int pfn_is_nosave(unsigned long pfn) +{ + unsigned long nosave_begin_pfn = + __pa_symbol(&__nosave_begin) >> PAGE_SHIFT; + unsigned long nosave_end_pfn = + PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT; + + return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); +} + +void notrace save_processor_state(void) +{ + WARN_ON(num_online_cpus() != 1); + local_fiq_disable(); +} + +void notrace restore_processor_state(void) +{ + local_fiq_enable(); +} + +/* + * Snapshot kernel memory and reset the system. + * + * swsusp_save() is executed in the suspend finisher so that the CPU + * context pointer and memory are part of the saved image, which is + * required by the resume kernel image to restart execution from + * swsusp_arch_suspend(). + * + * soft_restart is not technically needed, but is used to get success + * returned from cpu_suspend. + * + * When soft reboot completes, the hibernation snapshot is written out. + */ +static int notrace arch_save_image(unsigned long unused) +{ + int ret; + + ret = swsusp_save(); + if (ret == 0) + soft_restart(virt_to_phys(cpu_resume)); + return ret; +} + +/* + * Save the current CPU state before suspend / poweroff. + */ +int notrace swsusp_arch_suspend(void) +{ + return cpu_suspend(0, arch_save_image); +} + +/* + * The framework loads the hibernation image into a linked list anchored + * at restore_pblist, for swsusp_arch_resume() to copy back to the proper + * destinations. + * + * To make this work if resume is triggered from initramfs, the + * pagetables need to be switched to allow writes to kernel mem. + */ +static void notrace arch_restore_image(void *unused) +{ + struct pbe *pbe; + + cpu_switch_mm(idmap_pgd, &init_mm); + for (pbe = restore_pblist; pbe; pbe = pbe->next) + copy_page(pbe->orig_address, pbe->address); + + soft_restart(virt_to_phys(cpu_resume)); +} + +static u8 resume_stack[PAGE_SIZE/2] __nosavedata; + +/* + * Resume from the hibernation image. + * Due to the kernel heap / data restore, stack contents change underneath + * and that would make function calls impossible; switch to a temporary + * stack within the nosave region to avoid that problem. + */ +int swsusp_arch_resume(void) +{ + extern void call_with_stack(void (*fn)(void *), void *arg, void *sp); + call_with_stack(arch_restore_image, 0, + resume_stack + sizeof(resume_stack)); + return 0; +} diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index 1f8fed9..83707702 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -611,6 +611,11 @@ config CPU_USE_DOMAINS config IO_36 bool
+config ARCH_HIBERNATION_POSSIBLE + bool + depends on MMU + default y if CPU_ARM920T || CPU_ARM926T || CPU_SA1100 || CPU_XSCALE || CPU_XSC3 || CPU_V6 || CPU_V6K || CPU_V7 + comment "Processor Features"
config ARM_LPAE diff --git a/include/linux/suspend.h b/include/linux/suspend.h index f73cabf..38bbf95 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -320,6 +320,8 @@ extern unsigned long get_safe_page(gfp_t gfp_mask); extern void hibernation_set_ops(const struct platform_hibernation_ops *ops); extern int hibernate(void); extern bool system_entering_hibernation(void); +asmlinkage int swsusp_save(void); +extern struct pbe *restore_pblist; #else /* CONFIG_HIBERNATION */ static inline void register_nosave_region(unsigned long b, unsigned long e) {} static inline void register_nosave_region_late(unsigned long b, unsigned long e) {}
On 02/27/14 15:57, Sebastian Capella wrote:
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 8756e4b..1079ea8 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -291,6 +291,7 @@ static inline void *phys_to_virt(phys_addr_t x) */ #define __pa(x) __virt_to_phys((unsigned long)(x)) #define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x))) +#define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0))
Just curious, is there a reason for the RELOC_HIDE() here? Or __pa_symbol() for that matter? It looks like only x86 uses this on the __nosave_{begin,end} symbol. Maybe it's copy-pasta?
I also wonder if anyone has thought about making a __weak pfn_is_nosave() function so that architectures don't need to implement the same thing every time. Consolidating those shouldn't be part of this patch though.
Quoting Stephen Boyd (2014-02-27 16:09:51)
On 02/27/14 15:57, Sebastian Capella wrote: Just curious, is there a reason for the RELOC_HIDE() here? Or __pa_symbol() for that matter? It looks like only x86 uses this on the __nosave_{begin,end} symbol. Maybe it's copy-pasta?
Hi Stephen,
Thanks for your comments!
I see this in the original version of the patch in 2011, we took it from there.
From what I read, it sounds like the RELOC_HIDE is used to prevent optimizations by gcc where it may assume that any arithmetic on the address of a symbol will remain within the bounds of the symbol. (http://lists.linuxcoding.com/kernel/2006-q3/msg17979.html)
Since we're using this symbol's address to get the physical address, which is well out of the symbol's bounds, and not really a C behavior, it seems like this would continue to be appropriate.
pfn_is_nosave seems like variations of the same checks on other architectures. Maybe some clever person will know a nice way to make it common :)
Thanks!
Sebastian
On 02/27/2014 04:09 PM, Stephen Boyd wrote:
On 02/27/14 15:57, Sebastian Capella wrote:
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 8756e4b..1079ea8 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -291,6 +291,7 @@ static inline void *phys_to_virt(phys_addr_t x) */ #define __pa(x) __virt_to_phys((unsigned long)(x)) #define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x))) +#define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0))
Just curious, is there a reason for the RELOC_HIDE() here? Or __pa_symbol() for that matter? It looks like only x86 uses this on the __nosave_{begin,end} symbol. Maybe it's copy-pasta?
From my understanding this needs to stick around so long as gcc 3.x is
supported (did it get dropped yet?) on ARM Linux since it doesn't support -fno-strict-overflow.
I also wonder if anyone has thought about making a __weak pfn_is_nosave() function so that architectures don't need to implement the same thing every time. Consolidating those shouldn't be part of this patch though.
Yes, I think just a couple of the architectures do anything besides checking if the address falls within the nosave section.
On 02/27/14 17:47, Russ Dill wrote:
On 02/27/2014 04:09 PM, Stephen Boyd wrote:
On 02/27/14 15:57, Sebastian Capella wrote:
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 8756e4b..1079ea8 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -291,6 +291,7 @@ static inline void *phys_to_virt(phys_addr_t x) */ #define __pa(x) __virt_to_phys((unsigned long)(x)) #define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x))) +#define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0))
Just curious, is there a reason for the RELOC_HIDE() here? Or __pa_symbol() for that matter? It looks like only x86 uses this on the __nosave_{begin,end} symbol. Maybe it's copy-pasta?
From my understanding this needs to stick around so long as gcc 3.x is supported (did it get dropped yet?) on ARM Linux since it doesn't support -fno-strict-overflow.
I don't think it's been dropped yet but I wonder if anyone has tried recent kernels with such a compiler?
Would the usage of &__pv_table_begin in arch/arm/mm/mmu.c also need the same treatment? Or the tagtable loop in atags_parse.c? Do the other architectures also need to be fixed? That link Sebastian points to says that ppc originally needed it but pfn_is_nosave() on ppc doesn't use RELOC_HIDE anywhere in their __pa() macro from what I can tell.
On Thu, Feb 27, 2014 at 06:19:49PM -0800, Stephen Boyd wrote:
On 02/27/14 17:47, Russ Dill wrote:
On 02/27/2014 04:09 PM, Stephen Boyd wrote:
On 02/27/14 15:57, Sebastian Capella wrote:
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 8756e4b..1079ea8 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -291,6 +291,7 @@ static inline void *phys_to_virt(phys_addr_t x) */ #define __pa(x) __virt_to_phys((unsigned long)(x)) #define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x))) +#define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0))
Just curious, is there a reason for the RELOC_HIDE() here? Or __pa_symbol() for that matter? It looks like only x86 uses this on the __nosave_{begin,end} symbol. Maybe it's copy-pasta?
From my understanding this needs to stick around so long as gcc 3.x is supported (did it get dropped yet?) on ARM Linux since it doesn't support -fno-strict-overflow.
I don't think it's been dropped yet but I wonder if anyone has tried recent kernels with such a compiler?
Would the usage of &__pv_table_begin in arch/arm/mm/mmu.c also need the same treatment?
We've never had to play these kinds of games on ARM irrespective of compiler version.
On Thu, Feb 27, 2014 at 11:57:58PM +0000, Sebastian Capella wrote:
[...]
diff --git a/arch/arm/kernel/hibernate.c b/arch/arm/kernel/hibernate.c new file mode 100644 index 0000000..a41e0e3 --- /dev/null +++ b/arch/arm/kernel/hibernate.c @@ -0,0 +1,113 @@ +/*
- Hibernation support specific for ARM
- Derived from work on ARM hibernation support by:
- Ubuntu project, hibernation support for mach-dove
- Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
- Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
- Copyright (C) 2006 Rafael J. Wysocki rjw@sisk.pl
- License terms: GNU General Public License (GPL) version 2
- */
+#include <linux/mm.h> +#include <linux/suspend.h> +#include <asm/tlbflush.h> +#include <asm/cacheflush.h>
You can drop tlbflush.h and cacheflush.h, they do not seem to be needed.
+#include <asm/system_misc.h> +#include <asm/idmap.h> +#include <asm/suspend.h>
+extern const void __nosave_begin, __nosave_end;
+int pfn_is_nosave(unsigned long pfn) +{
- unsigned long nosave_begin_pfn =
__pa_symbol(&__nosave_begin) >> PAGE_SHIFT;
- unsigned long nosave_end_pfn =
PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT;
- return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn);
+}
+void notrace save_processor_state(void) +{
- WARN_ON(num_online_cpus() != 1);
- local_fiq_disable();
+}
+void notrace restore_processor_state(void) +{
- local_fiq_enable();
+}
+/*
- Snapshot kernel memory and reset the system.
- swsusp_save() is executed in the suspend finisher so that the CPU
- context pointer and memory are part of the saved image, which is
- required by the resume kernel image to restart execution from
- swsusp_arch_suspend().
- soft_restart is not technically needed, but is used to get success
- returned from cpu_suspend.
- When soft reboot completes, the hibernation snapshot is written out.
- */
+static int notrace arch_save_image(unsigned long unused) +{
- int ret;
- ret = swsusp_save();
- if (ret == 0)
soft_restart(virt_to_phys(cpu_resume));
- return ret;
+}
+/*
- Save the current CPU state before suspend / poweroff.
- */
+int notrace swsusp_arch_suspend(void) +{
- return cpu_suspend(0, arch_save_image);
+}
+/*
- The framework loads the hibernation image into a linked list anchored
- at restore_pblist, for swsusp_arch_resume() to copy back to the proper
- destinations.
- To make this work if resume is triggered from initramfs, the
- pagetables need to be switched to allow writes to kernel mem.
- */
Comment above needs updating. We are switching page tables to a set of page tables that are certain to live at the same location in the older kernel, that's the only reason, as we discussed. soft_restart will make sure (again) to switch to 1:1 page tables so that we can call cpu_resume with the MMU off.
+static void notrace arch_restore_image(void *unused) +{
- struct pbe *pbe;
- cpu_switch_mm(idmap_pgd, &init_mm);
- for (pbe = restore_pblist; pbe; pbe = pbe->next)
copy_page(pbe->orig_address, pbe->address);
- soft_restart(virt_to_phys(cpu_resume));
+}
+static u8 resume_stack[PAGE_SIZE/2] __nosavedata;
+/*
- Resume from the hibernation image.
- Due to the kernel heap / data restore, stack contents change underneath
- and that would make function calls impossible; switch to a temporary
- stack within the nosave region to avoid that problem.
- */
+int swsusp_arch_resume(void) +{
- extern void call_with_stack(void (*fn)(void *), void *arg, void *sp);
- call_with_stack(arch_restore_image, 0,
resume_stack + sizeof(resume_stack));
This does not guarantee your stack is 8-byte aligned, that's not AAPCS compliant and might buy you trouble.
Either you align the stack or you align the pointer you are passing.
Please have a look at kernel/process.c
Thanks, Lorenzo
Sorry, I appear to be having problems with my emails where list emails are bouncing back to me. I'm working on correcting this now.
Quoting Lorenzo Pieralisi (2014-02-28 01:50:22)
On Thu, Feb 27, 2014 at 11:57:58PM +0000, Sebastian Capella wrote:
[...]
diff --git a/arch/arm/kernel/hibernate.c b/arch/arm/kernel/hibernate.c new file mode 100644 index 0000000..a41e0e3 --- /dev/null +++ b/arch/arm/kernel/hibernate.c
...
+#include <linux/suspend.h> +#include <asm/tlbflush.h> +#include <asm/cacheflush.h>
You can drop tlbflush.h and cacheflush.h, they do not seem to be needed.
Done! Thanks!
+/*
- The framework loads the hibernation image into a linked list anchored
- at restore_pblist, for swsusp_arch_resume() to copy back to the proper
- destinations.
- To make this work if resume is triggered from initramfs, the
- pagetables need to be switched to allow writes to kernel mem.
- */
Comment above needs updating. We are switching page tables to a set of page tables that are certain to live at the same location in the older kernel, that's the only reason, as we discussed. soft_restart will make sure (again) to switch to 1:1 page tables so that we can call cpu_resume with the MMU off.
How does this look?
The framework loads as much of the hibernation image to final physical pages as possible. Any pages that were in use, will need to be restored prior to the soft_restart. The pages to restore are maintained in the list anchored at restore_pblist. At this point, we can swap the pages to their final location. We must switch the mapping to 1:1 to ensure that when we overwrite the page table physical pages we're using a known physical location (idmap_pgd) with known contents.
+/*
- Resume from the hibernation image.
- Due to the kernel heap / data restore, stack contents change underneath
- and that would make function calls impossible; switch to a temporary
- stack within the nosave region to avoid that problem.
- */
+int swsusp_arch_resume(void) +{
extern void call_with_stack(void (*fn)(void *), void *arg, void *sp);
call_with_stack(arch_restore_image, 0,
resume_stack + sizeof(resume_stack));
This does not guarantee your stack is 8-byte aligned, that's not AAPCS compliant and might buy you trouble.
Either you align the stack or you align the pointer you are passing.
Please have a look at kernel/process.c
I've added this for now, do you see any issues?
-static u8 resume_stack[PAGE_SIZE/2] __nosavedata; +static u64 resume_stack[PAGE_SIZE/2/sizeof(u64)] __nosavedata; - resume_stack + sizeof(resume_stack)); + resume_stack + ARRAY_SIZE(resume_stack));
Thanks Lorenzo!
Sebastian
On Fri, Feb 28, 2014 at 08:15:57PM +0000, Sebastian Capella wrote:
[...]
+/*
- The framework loads the hibernation image into a linked list anchored
- at restore_pblist, for swsusp_arch_resume() to copy back to the proper
- destinations.
- To make this work if resume is triggered from initramfs, the
- pagetables need to be switched to allow writes to kernel mem.
- */
Comment above needs updating. We are switching page tables to a set of page tables that are certain to live at the same location in the older kernel, that's the only reason, as we discussed. soft_restart will make sure (again) to switch to 1:1 page tables so that we can call cpu_resume with the MMU off.
How does this look?
The framework loads as much of the hibernation image to final physical pages as possible. Any pages that were in use, will need to be restored prior to the soft_restart. The pages to restore are maintained in the list anchored at restore_pblist. At this point, we can swap the pages to their final location. We must switch the mapping to 1:1 to ensure that when we overwrite the page table physical pages we're using a known physical location (idmap_pgd) with known contents.
It is ok, a tad too verbose. All I care about is a comment describing what's really needed, the existing one was confusing and wrong.
+/*
- Resume from the hibernation image.
- Due to the kernel heap / data restore, stack contents change underneath
- and that would make function calls impossible; switch to a temporary
- stack within the nosave region to avoid that problem.
- */
+int swsusp_arch_resume(void) +{
extern void call_with_stack(void (*fn)(void *), void *arg, void *sp);
call_with_stack(arch_restore_image, 0,
resume_stack + sizeof(resume_stack));
This does not guarantee your stack is 8-byte aligned, that's not AAPCS compliant and might buy you trouble.
Either you align the stack or you align the pointer you are passing.
Please have a look at kernel/process.c
I've added this for now, do you see any issues?
-static u8 resume_stack[PAGE_SIZE/2] __nosavedata; +static u64 resume_stack[PAGE_SIZE/2/sizeof(u64)] __nosavedata;
resume_stack + sizeof(resume_stack));
resume_stack + ARRAY_SIZE(resume_stack));
I do not see why the stack depends on the PAGE_SIZE. I would be surprised if you need more than a few bytes (given that soft_restart switches stack again...), go through it with a debugger, it is easy to check the stack usage and allow for some extra buffer (but half a page is not needed).
My main concern was alignment, and now that's fixed.
Thanks ! Lorenzo
Quoting Lorenzo Pieralisi (2014-02-28 14:49:33)
On Fri, Feb 28, 2014 at 08:15:57PM +0000, Sebastian Capella wrote:
[...]
+/*
- The framework loads the hibernation image into a linked list anchored
- at restore_pblist, for swsusp_arch_resume() to copy back to the proper
- destinations.
- To make this work if resume is triggered from initramfs, the
- pagetables need to be switched to allow writes to kernel mem.
- */
Comment above needs updating. We are switching page tables to a set of page tables that are certain to live at the same location in the older kernel, that's the only reason, as we discussed. soft_restart will make sure (again) to switch to 1:1 page tables so that we can call cpu_resume with the MMU off.
How does this look?
The framework loads as much of the hibernation image to final physical pages as possible. Any pages that were in use, will need to be restored prior to the soft_restart. The pages to restore are maintained in the list anchored at restore_pblist. At this point, we can swap the pages to their final location. We must switch the mapping to 1:1 to ensure that when we overwrite the page table physical pages we're using a known physical location (idmap_pgd) with known contents.
It is ok, a tad too verbose. All I care about is a comment describing what's really needed, the existing one was confusing and wrong.
Maybe more like:
Restore physical pages that were in use while loading hibernation image. Use idmap_pgd so our page tables use the same physical address as the hibernation image. Will be overwriten with the same contents.
+/*
- Resume from the hibernation image.
- Due to the kernel heap / data restore, stack contents change underneath
- and that would make function calls impossible; switch to a temporary
- stack within the nosave region to avoid that problem.
- */
+int swsusp_arch_resume(void) +{
extern void call_with_stack(void (*fn)(void *), void *arg, void *sp);
call_with_stack(arch_restore_image, 0,
resume_stack + sizeof(resume_stack));
This does not guarantee your stack is 8-byte aligned, that's not AAPCS compliant and might buy you trouble.
Either you align the stack or you align the pointer you are passing.
Please have a look at kernel/process.c
I've added this for now, do you see any issues?
-static u8 resume_stack[PAGE_SIZE/2] __nosavedata; +static u64 resume_stack[PAGE_SIZE/2/sizeof(u64)] __nosavedata;
resume_stack + sizeof(resume_stack));
resume_stack + ARRAY_SIZE(resume_stack));
I do not see why the stack depends on the PAGE_SIZE. I would be surprised if you need more than a few bytes (given that soft_restart switches stack again...), go through it with a debugger, it is easy to check the stack usage and allow for some extra buffer (but half a page is not needed).
I assuming this is becase the no-save region is one page anyway (we skip restoring the no-save region physical page). So maybe 1/2 is a way to leave some room for whatever else may need to be here, but in any case the 4k is used for nosave. I think you're right that it can be much less.
I'll check into this...
Thanks!
Sebastian
Quoting Sebastian Capella (2014-02-28 15:38:54)
Quoting Lorenzo Pieralisi (2014-02-28 14:49:33)
On Fri, Feb 28, 2014 at 08:15:57PM +0000, Sebastian Capella wrote:
This does not guarantee your stack is 8-byte aligned, that's not AAPCS compliant and might buy you trouble.
Either you align the stack or you align the pointer you are passing.
Please have a look at kernel/process.c
I've added this for now, do you see any issues?
-static u8 resume_stack[PAGE_SIZE/2] __nosavedata; +static u64 resume_stack[PAGE_SIZE/2/sizeof(u64)] __nosavedata;
resume_stack + sizeof(resume_stack));
resume_stack + ARRAY_SIZE(resume_stack));
I do not see why the stack depends on the PAGE_SIZE. I would be surprised if you need more than a few bytes (given that soft_restart switches stack again...), go through it with a debugger, it is easy to check the stack usage and allow for some extra buffer (but half a page is not needed).
I assuming this is becase the no-save region is one page anyway (we skip restoring the no-save region physical page). So maybe 1/2 is a way to leave some room for whatever else may need to be here, but in any case the 4k is used for nosave. I think you're right that it can be much less.
Hi Lorenzo,
Are you ok with this just being half a page? Or do you want me to try to reduce the stack size? I am at Connect without my debugger, so in that case it would have to wait until next week.
The change for alignment is in as discussed.
Thanks!
Sebastian
On Tue, Mar 04, 2014 at 09:55:31AM +0000, Sebastian Capella wrote:
Quoting Sebastian Capella (2014-02-28 15:38:54)
Quoting Lorenzo Pieralisi (2014-02-28 14:49:33)
On Fri, Feb 28, 2014 at 08:15:57PM +0000, Sebastian Capella wrote:
This does not guarantee your stack is 8-byte aligned, that's not AAPCS compliant and might buy you trouble.
Either you align the stack or you align the pointer you are passing.
Please have a look at kernel/process.c
I've added this for now, do you see any issues?
-static u8 resume_stack[PAGE_SIZE/2] __nosavedata; +static u64 resume_stack[PAGE_SIZE/2/sizeof(u64)] __nosavedata;
resume_stack + sizeof(resume_stack));
resume_stack + ARRAY_SIZE(resume_stack));
I do not see why the stack depends on the PAGE_SIZE. I would be surprised if you need more than a few bytes (given that soft_restart switches stack again...), go through it with a debugger, it is easy to check the stack usage and allow for some extra buffer (but half a page is not needed).
I assuming this is becase the no-save region is one page anyway (we skip restoring the no-save region physical page). So maybe 1/2 is a way to leave some room for whatever else may need to be here, but in any case the 4k is used for nosave. I think you're right that it can be much less.
Hi Lorenzo,
Are you ok with this just being half a page? Or do you want me to try to reduce the stack size? I am at Connect without my debugger, so in that case it would have to wait until next week.
I am ok, either you leave that as it is (that multiple division looks horrible but it is just nitpicking on my side) or define it as an u8 array, stick __attribute__((aligned(8)) to the definition (and explain why) and be done with it.
You can add my:
Reviewed-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com
Quoting Lorenzo Pieralisi (2014-03-04 03:17:46)
On Tue, Mar 04, 2014 at 09:55:31AM +0000, Sebastian Capella wrote:
Quoting Sebastian Capella (2014-02-28 15:38:54)
Quoting Lorenzo Pieralisi (2014-02-28 14:49:33)
On Fri, Feb 28, 2014 at 08:15:57PM +0000, Sebastian Capella wrote:
This does not guarantee your stack is 8-byte aligned, that's not AAPCS compliant and might buy you trouble.
Either you align the stack or you align the pointer you are passing.
Please have a look at kernel/process.c
I've added this for now, do you see any issues?
-static u8 resume_stack[PAGE_SIZE/2] __nosavedata; +static u64 resume_stack[PAGE_SIZE/2/sizeof(u64)] __nosavedata;
resume_stack + sizeof(resume_stack));
resume_stack + ARRAY_SIZE(resume_stack));
I do not see why the stack depends on the PAGE_SIZE. I would be surprised if you need more than a few bytes (given that soft_restart switches stack again...), go through it with a debugger, it is easy to check the stack usage and allow for some extra buffer (but half a page is not needed).
I assuming this is becase the no-save region is one page anyway (we skip restoring the no-save region physical page). So maybe 1/2 is a way to leave some room for whatever else may need to be here, but in any case the 4k is used for nosave. I think you're right that it can be much less.
Hi Lorenzo,
Are you ok with this just being half a page? Or do you want me to try to reduce the stack size? I am at Connect without my debugger, so in that case it would have to wait until next week.
I am ok, either you leave that as it is (that multiple division looks horrible but it is just nitpicking on my side) or define it as an u8 array, stick __attribute__((aligned(8)) to the definition (and explain why) and be done with it.
You can add my:
Reviewed-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com
Thanks Lorenzo!
Sebastian
linaro-kernel@lists.linaro.org