Implement and enable context tracking for arm64 (which is a prerequisite for NO_HZ_FULL support). This patchset builds upon earlier work by Kevin Hilman and is based on Will Deacon's tree.
Changes v4 to v5:
* Improvement to code restoring far_el1 (suggested by Christopher Covington)
* Improvement to register save/restore in ct_user_enter
Changes v3 to v4:
* Rename parameter of ct_user_exit from save to restore
* Rebased patch to Will Deacon's tree (branch remotes/origin/aarch64 of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git)
Changes v2 to v3:
* Save/restore necessary registers in ct_user_enter and ct_user_exit
* Annotate "error paths" out of el0_sync with ct_user_exit
Changes v1 to v2:
* Save far_el1 in x26 temporarily
Larry Bassel (2):
  arm64: adjust el0_sync so that a function can be called
  arm64: enable context tracking
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/thread_info.h |  1 +
 arch/arm64/kernel/entry.S            | 69 ++++++++++++++++++++++++++++++++----
 3 files changed, 64 insertions(+), 7 deletions(-)
To implement the context tracker properly on arm64, a function call needs to be made after debugging and interrupts are turned on, but before the lr is changed to point to ret_to_user. If the function call is made after the lr is changed, the function will not return to the correct place.
For similar reasons, defer the setting of x0 so that it doesn't need to be saved around the function call (save far_el1 in x26 temporarily instead).
Signed-off-by: Larry Bassel <larry.bassel@linaro.org>
---
 arch/arm64/kernel/entry.S | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e8b23a3..c6bc1a3 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -354,7 +354,6 @@ el0_sync:
 	lsr	x24, x25, #ESR_EL1_EC_SHIFT	// exception class
 	cmp	x24, #ESR_EL1_EC_SVC64		// SVC in 64-bit state
 	b.eq	el0_svc
-	adr	lr, ret_to_user
 	cmp	x24, #ESR_EL1_EC_DABT_EL0	// data abort in EL0
 	b.eq	el0_da
 	cmp	x24, #ESR_EL1_EC_IABT_EL0	// instruction abort in EL0
@@ -383,7 +382,6 @@ el0_sync_compat:
 	lsr	x24, x25, #ESR_EL1_EC_SHIFT	// exception class
 	cmp	x24, #ESR_EL1_EC_SVC32		// SVC in 32-bit state
 	b.eq	el0_svc_compat
-	adr	lr, ret_to_user
 	cmp	x24, #ESR_EL1_EC_DABT_EL0	// data abort in EL0
 	b.eq	el0_da
 	cmp	x24, #ESR_EL1_EC_IABT_EL0	// instruction abort in EL0
@@ -426,22 +424,25 @@ el0_da:
 	/*
 	 * Data abort handling
 	 */
-	mrs	x0, far_el1
-	bic	x0, x0, #(0xff << 56)
+	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
+	bic	x0, x26, #(0xff << 56)
 	mov	x1, x25
 	mov	x2, sp
+	adr	lr, ret_to_user
 	b	do_mem_abort
 el0_ia:
 	/*
 	 * Instruction abort handling
 	 */
-	mrs	x0, far_el1
+	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
+	mov	x0, x26
 	orr	x1, x25, #1 << 24		// use reserved ISS bit for instruction aborts
 	mov	x2, sp
+	adr	lr, ret_to_user
 	b	do_mem_abort
 el0_fpsimd_acc:
 	/*
@@ -450,6 +451,7 @@ el0_fpsimd_acc:
 	enable_dbg
 	mov	x0, x25
 	mov	x1, sp
+	adr	lr, ret_to_user
 	b	do_fpsimd_acc
 el0_fpsimd_exc:
 	/*
@@ -458,16 +460,19 @@ el0_fpsimd_exc:
 	enable_dbg
 	mov	x0, x25
 	mov	x1, sp
+	adr	lr, ret_to_user
 	b	do_fpsimd_exc
 el0_sp_pc:
 	/*
 	 * Stack or PC alignment exception handling
 	 */
-	mrs	x0, far_el1
+	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
+	mov	x0, x26
 	mov	x1, x25
 	mov	x2, sp
+	adr	lr, ret_to_user
 	b	do_sp_pc_abort
 el0_undef:
 	/*
@@ -476,23 +481,27 @@ el0_undef:
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
 	mov	x0, sp
+	adr	lr, ret_to_user
 	b	do_undefinstr
 el0_dbg:
 	/*
 	 * Debug exception handling
 	 */
 	tbnz	x24, #0, el0_inv		// EL0 only
-	mrs	x0, far_el1
+	mrs	x26, far_el1
+	mov	x0, x26
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_debug_exception
 	enable_dbg
+	mov	x0, x26
 	b	ret_to_user
 el0_inv:
 	enable_dbg
 	mov	x0, sp
 	mov	x1, #BAD_SYNC
 	mrs	x2, esr_el1
+	adr	lr, ret_to_user
 	b	bad_mode
 ENDPROC(el0_sync)
Hi Larry,
On Mon, May 26, 2014 at 07:56:12PM +0100, Larry Bassel wrote:
To implement the context tracker properly on arm64, a function call needs to be made after debugging and interrupts are turned on, but before the lr is changed to point to ret_to_user(). If the function call is made after the lr is changed the function will not return to the correct place.
For similar reasons, defer the setting of x0 so that it doesn't need to be saved around the function call (save far_el1 in x26 temporarily instead).
Signed-off-by: Larry Bassel <larry.bassel@linaro.org>
[...]
@@ -476,23 +481,27 @@ el0_undef:
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
 	mov	x0, sp
+	adr	lr, ret_to_user
 	b	do_undefinstr
 el0_dbg:
 	/*
 	 * Debug exception handling
 	 */
 	tbnz	x24, #0, el0_inv		// EL0 only
-	mrs	x0, far_el1
+	mrs	x26, far_el1
+	mov	x0, x26
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_debug_exception
 	enable_dbg
+	mov	x0, x26
 	b	ret_to_user
Why have you added this mov instruction?
Will
On 28 May 14 12:27, Will Deacon wrote:
Hi Larry,
On Mon, May 26, 2014 at 07:56:12PM +0100, Larry Bassel wrote:
To implement the context tracker properly on arm64, a function call needs to be made after debugging and interrupts are turned on, but before the lr is changed to point to ret_to_user(). If the function call is made after the lr is changed the function will not return to the correct place.
For similar reasons, defer the setting of x0 so that it doesn't need to be saved around the function call (save far_el1 in x26 temporarily instead).
Signed-off-by: Larry Bassel <larry.bassel@linaro.org>
[...]
Why have you added this mov instruction?
I believe (please correct me if I'm wrong) that it is necessary. Here is why:
@@ -476,23 +481,27 @@ el0_undef:
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
 	mov	x0, sp
+	adr	lr, ret_to_user
 	b	do_undefinstr
 el0_dbg:
 	/*
 	 * Debug exception handling
 	 */
 	tbnz	x24, #0, el0_inv		// EL0 only
-	mrs	x0, far_el1
+	mrs	x26, far_el1
needed because do_debug_exception may clobber x0, so save far_el1 in x26 (as other parts of this patch do)
+	mov	x0, x26
needed because far_el1 is expected to be in x0 here
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_debug_exception
 	enable_dbg
[call to ct_user_exit will go here in the next patch, this may re-clobber x0]
+	mov	x0, x26
needed because far_el1 is expected to be in x0 here
Since the purpose of this patch is to make it possible to call a function in this code path, the "extra" mov instruction above is necessary, and IMHO it should be added in this patch rather than in the next one, whose purpose is to define the ct_user_* macros and add calls to them in the proper places.
b ret_to_user
Will
Larry
On Wed, May 28, 2014 at 08:35:51PM +0100, Larry Bassel wrote:
On 28 May 14 12:27, Will Deacon wrote:
On Mon, May 26, 2014 at 07:56:12PM +0100, Larry Bassel wrote:
To implement the context tracker properly on arm64, a function call needs to be made after debugging and interrupts are turned on, but before the lr is changed to point to ret_to_user(). If the function call is made after the lr is changed the function will not return to the correct place.
For similar reasons, defer the setting of x0 so that it doesn't need to be saved around the function call (save far_el1 in x26 temporarily instead).
Signed-off-by: Larry Bassel <larry.bassel@linaro.org>
[...]
Why have you added this mov instruction?
I believe (please correct me if I'm wrong) that it is necessary. Here is why:
@@ -476,23 +481,27 @@ el0_undef:
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
 	mov	x0, sp
+	adr	lr, ret_to_user
 	b	do_undefinstr
 el0_dbg:
 	/*
 	 * Debug exception handling
 	 */
 	tbnz	x24, #0, el0_inv		// EL0 only
-	mrs	x0, far_el1
+	mrs	x26, far_el1
needed because do_debug_exception may clobber x0, so save far_el1 in x26 (as other parts of this patch do)
Actually, do_debug_exception consumes the FAR as its first parameter, so you don't need to put this in x26 afaict.
+	mov	x0, x26
needed because far_el1 is expected to be in x0 here
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_debug_exception
 	enable_dbg
[call to ct_user_exit will go here in the next patch, this may re-clobber x0]
+	mov	x0, x26
needed because far_el1 is expected to be in x0 here
Is it? ret_to_user doesn't care. Does ct_user_exit use the FAR? I don't think it does...
Will
Make calls to ct_user_enter when the kernel is exited and ct_user_exit when the kernel is entered (in el0_da, el0_ia, el0_svc, el0_irq and all of the "error" paths).
These macros expand to function calls which will only work properly if el0_sync and related code has been rearranged (in a previous patch of this series).
The calls to ct_user_exit are made after hw debugging has been enabled (enable_dbg_and_irq).
The call to ct_user_enter is made at the beginning of the kernel_exit macro.
This patch is based on earlier work by Kevin Hilman. Save/restore optimizations were also done by Kevin.
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Larry Bassel <larry.bassel@linaro.org>
---
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/thread_info.h |  1 +
 arch/arm64/kernel/entry.S            | 46 ++++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e759af5..ef18ae5 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -55,6 +55,7 @@ config ARM64
 	select RTC_LIB
 	select SPARSE_IRQ
 	select SYSCTL_EXCEPTION_TRACE
+	select HAVE_CONTEXT_TRACKING
 	help
 	  ARM 64-bit (AArch64) Linux support.
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 720e70b..301ea6a 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -108,6 +108,7 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_SINGLESTEP		21
 #define TIF_32BIT		22	/* 32bit process */
 #define TIF_SWITCH_MM		23	/* deferred switch_mm */
+#define TIF_NOHZ		24

 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index c6bc1a3..0605963 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -30,6 +30,42 @@
 #include <asm/unistd32.h>

 /*
+ * Context tracking subsystem. Used to instrument transitions
+ * between user and kernel mode.
+ */
+	.macro ct_user_exit, restore = 0
+#ifdef CONFIG_CONTEXT_TRACKING
+	bl	context_tracking_user_exit
+	.if \restore == 1
+	/*
+	 * Save/restore needed during syscalls. Restore syscall arguments from
+	 * the values already saved on stack during kernel_entry.
+	 */
+	ldp	x0, x1, [sp]
+	ldp	x2, x3, [sp, #S_X2]
+	ldp	x4, x5, [sp, #S_X4]
+	ldp	x6, x7, [sp, #S_X6]
+	.endif
+#endif
+	.endm
+
+	.macro ct_user_enter, save = 0
+#ifdef CONFIG_CONTEXT_TRACKING
+	.if \save == 1
+	/*
+	 * We only have to save/restore x0 on the fast syscall path where
+	 * x0 contains the syscall return.
+	 */
+	mov	x19, x0
+	.endif
+	bl	context_tracking_user_enter
+	.if \save == 1
+	mov	x0, x19
+	.endif
+#endif
+	.endm
+
+/*
  * Bad Abort numbers
  *-----------------
  */
@@ -91,6 +127,7 @@
 	.macro	kernel_exit, el, ret = 0
 	ldp	x21, x22, [sp, #S_PC]		// load ELR, SPSR
 	.if	\el == 0
+	ct_user_enter \ret
 	ldr	x23, [sp, #S_SP]		// load return stack pointer
 	.endif
 	.if	\ret
@@ -318,6 +355,7 @@ el1_irq:
 	bl	trace_hardirqs_off
 #endif

+	ct_user_exit
 	irq_handler
 #ifdef CONFIG_PREEMPT
@@ -427,6 +465,7 @@ el0_da:
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
+	ct_user_exit
 	bic	x0, x26, #(0xff << 56)
 	mov	x1, x25
 	mov	x2, sp
@@ -439,6 +478,7 @@ el0_ia:
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
+	ct_user_exit
 	mov	x0, x26
 	orr	x1, x25, #1 << 24		// use reserved ISS bit for instruction aborts
 	mov	x2, sp
@@ -449,6 +489,7 @@ el0_fpsimd_acc:
 	 * Floating Point or Advanced SIMD access
 	 */
 	enable_dbg
+	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
 	adr	lr, ret_to_user
@@ -458,6 +499,7 @@ el0_fpsimd_exc:
 	 * Floating Point or Advanced SIMD exception
 	 */
 	enable_dbg
+	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
 	adr	lr, ret_to_user
@@ -480,6 +522,7 @@ el0_undef:
 	 */
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
+	ct_user_exit
 	mov	x0, sp
 	adr	lr, ret_to_user
 	b	do_undefinstr
@@ -494,10 +537,12 @@ el0_dbg:
 	mov	x2, sp
 	bl	do_debug_exception
 	enable_dbg
+	ct_user_exit
 	mov	x0, x26
 	b	ret_to_user
 el0_inv:
 	enable_dbg
+	ct_user_exit
 	mov	x0, sp
 	mov	x1, #BAD_SYNC
 	mrs	x2, esr_el1
@@ -618,6 +663,7 @@ el0_svc:
 el0_svc_naked:				// compat entry point
 	stp	x0, scno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
 	enable_dbg_and_irq
+	ct_user_exit 1

 	ldr	x16, [tsk, #TI_FLAGS]		// check for syscall tracing
 	tbnz	x16, #TIF_SYSCALL_TRACE, __sys_trace // are we tracing syscalls?
Hi Larry,
On Mon, May 26, 2014 at 07:56:13PM +0100, Larry Bassel wrote:
Make calls to ct_user_enter when the kernel is exited and ct_user_exit when the kernel is entered (in el0_da, el0_ia, el0_svc, el0_irq and all of the "error" paths).
These macros expand to function calls which will only work properly if el0_sync and related code has been rearranged (in a previous patch of this series).
The calls to ct_user_exit are made after hw debugging has been enabled (enable_dbg_and_irq).
The call to ct_user_enter is made at the beginning of the kernel_exit macro.
This patch is based on earlier work by Kevin Hilman. Save/restore optimizations were also done by Kevin.
Apologies if we've discussed this before (it rings a bell), but why are we penalising the fast syscall path with this? Shouldn't TIF_NOHZ contribute to our _TIF_WORK_MASK, then we could do the tracking on the syscall slow path?
I think that would tidy up your mov into x19 too.
Also -- how do you track ret_from_fork in the child with these patches?
Will
Hi Will,
Will Deacon will.deacon@arm.com writes:
On Mon, May 26, 2014 at 07:56:13PM +0100, Larry Bassel wrote:
Make calls to ct_user_enter when the kernel is exited and ct_user_exit when the kernel is entered (in el0_da, el0_ia, el0_svc, el0_irq and all of the "error" paths).
These macros expand to function calls which will only work properly if el0_sync and related code has been rearranged (in a previous patch of this series).
The calls to ct_user_exit are made after hw debugging has been enabled (enable_dbg_and_irq).
The call to ct_user_enter is made at the beginning of the kernel_exit macro.
This patch is based on earlier work by Kevin Hilman. Save/restore optimizations were also done by Kevin.
Apologies if we've discussed this before (it rings a bell), but why are we penalising the fast syscall path with this? Shouldn't TIF_NOHZ contribute to our _TIF_WORK_MASK, then we could do the tracking on the syscall slow path?
I'll answer here since Larry inherited this design decision from me.
I considered (and even implemented) forcing the slow syscall path based on TIF_NOHZ but decided (perhaps wrongly) not to. I guess the choice is between:
- forcing the overhead of syscall tracing path on all TIF_NOHZ processes
- forcing the (much smaller) ct_user_exit overhead on all syscalls, (including the fast syscall path)
I had decided that the former was better, but as I write this, I'm thinking that the NOHZ tasks should probably eat the extra overhead, since we expect their interactions with the kernel to be minimal anyway (part of the goal of full NOHZ).
Ultimately, I'm OK with either way and have the other version ready.
I think that would tidy up your mov into x19 too.
That's correct. If we force the syscall_trace path, the ct_user_enter wouldn't have to do any context save/restore.
Also -- how do you track ret_from_fork in the child with these patches?
Not sure I follow the question, but ret_from_fork calls ret_to_user, which calls kernel_exit, which calls ct_user_enter.
Kevin
On Wed, May 28, 2014 at 04:55:39PM +0100, Kevin Hilman wrote:
Hi Will,
Hey Kevin,
Will Deacon will.deacon@arm.com writes:
Apologies if we've discussed this before (it rings a bell), but why are we penalising the fast syscall path with this? Shouldn't TIF_NOHZ contribute to our _TIF_WORK_MASK, then we could do the tracking on the syscall slow path?
I'll answer here since Larry inherited this design decision from me.
I considered (and even implemented) forcing the slow syscall path based on TIF_NOHZ but decided (perhaps wrongly) not to. I guess the choice is between:
forcing the overhead of syscall tracing path on all TIF_NOHZ processes
forcing the (much smaller) ct_user_exit overhead on all syscalls, (including the fast syscall path)
I had decided that the former was better, but as I write this, I'm thinking that the NOHZ tasks should probably eat the extra overhead, since we expect their interactions with the kernel to be minimal anyway (part of the goal of full NOHZ).
Ultimately, I'm OK with either way and have the other version ready.
I was just going by the comment in kernel/context_tracking.c:
 * The context tracking uses the syscall slow path to implement its user-kernel
 * boundaries probes on syscalls. This way it doesn't impact the syscall fast
 * path on CPUs that don't do context tracking.
which doesn't match what the current patch does. It also makes it sound like context tracking is really a per-CPU thing, but I've never knowingly used it before.
I think putting this on the slow path is in line with the expectations in the core code.
I think that would tidy up your mov into x19 too.
That's correct. If we force the syscall_trace path, the ct_user_enter wouldn't have to do any context save/restore.
That would be nice.
Also -- how do you track ret_from_fork in the child with these patches?
Not sure I follow the question, but ret_from_fork calls ret_to_user, which calls kernel_exit, which calls ct_user_enter.
Sorry, I got myself in a muddle. I noticed that x19 is live in ret_from_fork, so I made a mental note to check that it is OK (I think it is), but then concluded incorrectly that you don't trace there.
Will