EL0 exception handlers should always call `exit_to_user_mode()` with interrupts unmasked. When handling a completed single-step, we skip the if block and `local_daif_restore(DAIF_PROCCTX)` never gets called, which ends up calling `exit_to_user_mode()` with interrupts masked.
This is broken if pNMI is in use, as `do_notify_resume()` will try to enable interrupts, but `local_irq_enable()` will only change the PMR, leaving interrupts masked via DAIF.
Move the call to `try_step_suspended_breakpoints()` outside of the check so that interrupts can be unmasked even if we don't call the step handler.
Fixes: 0ac7584c08ce ("arm64: debug: split single stepping exception entry") Cc: stable@vger.kernel.org # 6.17 Signed-off-by: Ada Couprie Diaz ada.coupriediaz@arm.com ---- This was already broken in a similar fashion in kernels prior to v6.17, as `local_daif_restore()` was called _after_ the debug handlers, with some calling `send_user_sigtrap()` which would unmask interrupts via PMR while leaving them masked by DAIF. --- arch/arm64/kernel/entry-common.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index 2b0c5925502e..780cbcf12695 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -832,6 +832,7 @@ static void noinstr el0_breakpt(struct pt_regs *regs, unsigned long esr)
static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr) { + bool step_done; if (!is_ttbr0_addr(regs->pc)) arm64_apply_bp_hardening();
@@ -842,10 +843,11 @@ static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr) * If we are stepping a suspended breakpoint there's nothing more to do: * the single-step is complete. */ - if (!try_step_suspended_breakpoints(regs)) { - local_daif_restore(DAIF_PROCCTX); + step_done = try_step_suspended_breakpoints(regs) + local_daif_restore(DAIF_PROCCTX); + if (!step_done) do_el0_softstep(esr, regs); - } + exit_to_user_mode(regs); }
base-commit: 449d48b1b99fdaa076166e200132705ac2bee711
EL0 exception handlers should always call `exit_to_user_mode()` with interrupts unmasked. When handling a completed single-step, we skip the if block and `local_daif_restore(DAIF_PROCCTX)` never gets called, which ends up calling `exit_to_user_mode()` with interrupts masked.
This is broken if pNMI is in use, as `do_notify_resume()` will try to enable interrupts, but `local_irq_enable()` will only change the PMR, leaving interrupts masked via DAIF.
Move the call to `try_step_suspended_breakpoints()` outside of the check so that interrupts can be unmasked even if we don't call the step handler.
Fixes: 0ac7584c08ce ("arm64: debug: split single stepping exception entry") Cc: stable@vger.kernel.org # 6.17 Signed-off-by: Ada Couprie Diaz ada.coupriediaz@arm.com ---- This was already broken in a similar fashion in kernels prior to v6.17, as `local_daif_restore()` was called _after_ the debug handlers, with some calling `send_user_sigtrap()` which would unmask interrupts via PMR while leaving them masked by DAIF.
v2: Add missing semicolon so that the patch compiles. My mistake for rushing a fix late, apologies. --- arch/arm64/kernel/entry-common.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index 2b0c5925502e..8473fe4791bc 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -832,6 +832,7 @@ static void noinstr el0_breakpt(struct pt_regs *regs, unsigned long esr)
static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr) { + bool step_done; if (!is_ttbr0_addr(regs->pc)) arm64_apply_bp_hardening();
@@ -842,10 +843,11 @@ static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr) * If we are stepping a suspended breakpoint there's nothing more to do: * the single-step is complete. */ - if (!try_step_suspended_breakpoints(regs)) { - local_daif_restore(DAIF_PROCCTX); + step_done = try_step_suspended_breakpoints(regs); + local_daif_restore(DAIF_PROCCTX); + if (!step_done) do_el0_softstep(esr, regs); - } + exit_to_user_mode(regs); }
base-commit: 449d48b1b99fdaa076166e200132705ac2bee711
On Tue, Oct 14, 2025 at 10:25:36AM +0100, Ada Couprie Diaz wrote:
EL0 exception handlers should always call `exit_to_user_mode()` with interrupts unmasked. When handling a completed single-step, we skip the if block and `local_daif_restore(DAIF_PROCCTX)` never gets called, which ends up calling `exit_to_user_mode()` with interrupts masked.
This is broken if pNMI is in use, as `do_notify_resume()` will try to enable interrupts, but `local_irq_enable()` will only change the PMR, leaving interrupts masked via DAIF.
I think we might want to spell thius out a bit more, e.g.
| We intend that EL0 exception handlers unmask all DAIF exceptions | before calling exit_to_user_mode(). | | When completing single-step of a suspended breakpoint, we do not call | local_daif_restore(DAIF_PROCCTX) before calling exit_to_user_mode(), | leaving all DAIF exceptions masked. | | When pseudo-NMIs are not in use this is benign. | | When pseudo-NMIs are in use, this is unsound. At this point interrupts | are masked by both DAIF.IF and PMR_EL1, and subsequent irq flag | manipulation may not work correctly. For example, a subsequent | local_irq_enable() within exit_to_user_mode_loop() will only unmask | interrupts via PMR_EL1 (leaving those masked via DAIF.IF), and | anything depending on interrupts being unmasked (e.g. delivery of | signals) will not work correctly. | | This can by detected by CONFIG_ARM64_DEBUG_PRIORITY_MASKING.
With or without that, this looks good to me, so:
Acked-by: Mark Rutland mark.rutland@arm.com
Mark.
Move the call to `try_step_suspended_breakpoints()` outside of the check so that interrupts can be unmasked even if we don't call the step handler.
Fixes: 0ac7584c08ce ("arm64: debug: split single stepping exception entry") Cc: stable@vger.kernel.org # 6.17 Signed-off-by: Ada Couprie Diaz ada.coupriediaz@arm.com
This was already broken in a similar fashion in kernels prior to v6.17, as `local_daif_restore()` was called _after_ the debug handlers, with some calling `send_user_sigtrap()` which would unmask interrupts via PMR while leaving them masked by DAIF.
v2: Add missing semicolon so that the patch compiles. My mistake for rushing a fix late, apologies.
arch/arm64/kernel/entry-common.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index 2b0c5925502e..8473fe4791bc 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -832,6 +832,7 @@ static void noinstr el0_breakpt(struct pt_regs *regs, unsigned long esr) static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr) {
- bool step_done; if (!is_ttbr0_addr(regs->pc)) arm64_apply_bp_hardening();
@@ -842,10 +843,11 @@ static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr) * If we are stepping a suspended breakpoint there's nothing more to do: * the single-step is complete. */
- if (!try_step_suspended_breakpoints(regs)) {
local_daif_restore(DAIF_PROCCTX);
- step_done = try_step_suspended_breakpoints(regs);
- local_daif_restore(DAIF_PROCCTX);
- if (!step_done) do_el0_softstep(esr, regs);
- }
- exit_to_user_mode(regs);
}
base-commit: 449d48b1b99fdaa076166e200132705ac2bee711
2.43.0
On 14/10/2025 10:46, Mark Rutland wrote:
On Tue, Oct 14, 2025 at 10:25:36AM +0100, Ada Couprie Diaz wrote:
EL0 exception handlers should always call `exit_to_user_mode()` with interrupts unmasked. When handling a completed single-step, we skip the if block and `local_daif_restore(DAIF_PROCCTX)` never gets called, which ends up calling `exit_to_user_mode()` with interrupts masked.
This is broken if pNMI is in use, as `do_notify_resume()` will try to enable interrupts, but `local_irq_enable()` will only change the PMR, leaving interrupts masked via DAIF.
I think we might want to spell thius out a bit more, e.g.
That's fair, your version lays it out better and is probably more accessible !
| We intend that EL0 exception handlers unmask all DAIF exceptions | before calling exit_to_user_mode(). | | When completing single-step of a suspended breakpoint, we do not call | local_daif_restore(DAIF_PROCCTX) before calling exit_to_user_mode(), | leaving all DAIF exceptions masked. | | When pseudo-NMIs are not in use this is benign. | | When pseudo-NMIs are in use, this is unsound. At this point interrupts | are masked by both DAIF.IF and PMR_EL1, and subsequent irq flag | manipulation may not work correctly. For example, a subsequent | local_irq_enable() within exit_to_user_mode_loop() will only unmask | interrupts via PMR_EL1 (leaving those masked via DAIF.IF), and | anything depending on interrupts being unmasked (e.g. delivery of | signals) will not work correctly. | | This can by detected by CONFIG_ARM64_DEBUG_PRIORITY_MASKING.
This could be "This was detected [...]", as that's what I did to detect the error, but happy with either : we don't need to bikeshed this !
With or without that, this looks good to me, so:
Acked-by: Mark Rutland mark.rutland@arm.com
Mark.
Thanks for the quick review, Ada
linux-stable-mirror@lists.linaro.org