Otherwise when the tracer changes syscall number to -1, the kernel fails to initialize a0 with -ENOSYS and subsequently fails to return the error code of the failed syscall to userspace. For example, it will break strace syscall tampering.
Fixes: 52449c17bdd1 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1") Reported-by: "Dmitry V. Levin" ldv@strace.io Reviewed-by: Björn Töpel bjorn@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Celeste Liu CoelacanthusHex@gmail.com --- arch/riscv/kernel/traps.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 05a16b1f0aee..51ebfd23e007 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -319,6 +319,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
regs->epc += 4; regs->orig_a0 = regs->a0; + regs->a0 = -ENOSYS;
riscv_v_vstate_discard(regs);
@@ -328,8 +329,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
if (syscall >= 0 && syscall < NR_syscalls) syscall_handler(regs, syscall); - else if (syscall != -1) - regs->a0 = -ENOSYS; + /* * Ultimately, this value will get limited by KSTACK_OFFSET_MAX(), * so the maximum stack offset is 1k bytes (10 bits).
Hello:
This patch was applied to riscv/linux.git (fixes) by Palmer Dabbelt palmer@rivosinc.com:
On Thu, 27 Jun 2024 22:23:39 +0800 you wrote:
Otherwise when the tracer changes syscall number to -1, the kernel fails to initialize a0 with -ENOSYS and subsequently fails to return the error code of the failed syscall to userspace. For example, it will break strace syscall tampering.
Fixes: 52449c17bdd1 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1") Reported-by: "Dmitry V. Levin" ldv@strace.io Reviewed-by: Björn Töpel bjorn@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Celeste Liu CoelacanthusHex@gmail.com
[...]
Here is the summary with links: - [v2] riscv: entry: always initialize regs->a0 to -ENOSYS https://git.kernel.org/riscv/c/61119394631f
You are awesome, thank you!
On Thu, Jun 27, 2024 at 10:23:39PM GMT, Celeste Liu wrote:
Otherwise when the tracer changes syscall number to -1, the kernel fails to initialize a0 with -ENOSYS and subsequently fails to return the error code of the failed syscall to userspace. For example, it will break strace syscall tampering.
Fixes: 52449c17bdd1 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1") Reported-by: "Dmitry V. Levin" ldv@strace.io Reviewed-by: Björn Töpel bjorn@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Celeste Liu CoelacanthusHex@gmail.com
arch/riscv/kernel/traps.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 05a16b1f0aee..51ebfd23e007 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -319,6 +319,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
regs->epc += 4; regs->orig_a0 = regs->a0;
regs->a0 = -ENOSYS;
riscv_v_vstate_discard(regs);
@@ -328,8 +329,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
if (syscall >= 0 && syscall < NR_syscalls) syscall_handler(regs, syscall);
else if (syscall != -1)
regs->a0 = -ENOSYS;
- /*
- Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
- so the maximum stack offset is 1k bytes (10 bits).
Hi,
this change seems to have broken strace's test suite.
In particular, the "legacy_syscall_info" test, which is meant to verify that strace behaves correctly when PTRACE_GET_SYSCALL_INFO is not available, reports a bogus value for the first argument of the syscall (the one passed via a0).
The bogus value comes directly from the ptrace() call, before strace has a chance to meddle with it, hence why the maintainer suggested that the issue would likely be traced back to the kernel.
I have built a kernel with this change reverted and, as expected, the strace test suite passes. Admittedly I've used the 6.11-rc7 Fedora kernel as the baseline for this test, but none of the Fedora patches touch the RISC-V code at all and the file itself hasn't been touched since rc7, so I'm fairly confident the same behavior is present in vanilla 6.11 too.
See
https://github.com/strace/strace/issues/315
for the original report. Please let me know if I need to provide additional information, report this anywhere else (bugzilla?), and so on...
Thanks!
On Tue, Sep 17, 2024 at 01:49:52AM +0900, Andrea Bolognani wrote:
On Thu, Jun 27, 2024 at 10:23:39PM GMT, Celeste Liu wrote:
Otherwise when the tracer changes syscall number to -1, the kernel fails to initialize a0 with -ENOSYS and subsequently fails to return the error code of the failed syscall to userspace. For example, it will break strace syscall tampering.
Fixes: 52449c17bdd1 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1") Reported-by: "Dmitry V. Levin" ldv@strace.io Reviewed-by: Björn Töpel bjorn@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Celeste Liu CoelacanthusHex@gmail.com
arch/riscv/kernel/traps.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 05a16b1f0aee..51ebfd23e007 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -319,6 +319,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
regs->epc += 4; regs->orig_a0 = regs->a0;
regs->a0 = -ENOSYS;
riscv_v_vstate_discard(regs);
@@ -328,8 +329,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
if (syscall >= 0 && syscall < NR_syscalls) syscall_handler(regs, syscall);
else if (syscall != -1)
regs->a0 = -ENOSYS;
- /*
- Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
- so the maximum stack offset is 1k bytes (10 bits).
Hi,
this change seems to have broken strace's test suite.
In particular, the "legacy_syscall_info" test, which is meant to verify that strace behaves correctly when PTRACE_GET_SYSCALL_INFO is not available, reports a bogus value for the first argument of the syscall (the one passed via a0).
The bogus value comes directly from the ptrace() call, before strace has a chance to meddle with it, hence why the maintainer suggested that the issue would likely be traced back to the kernel.
I have built a kernel with this change reverted and, as expected, the strace test suite passes. Admittedly I've used the 6.11-rc7 Fedora kernel as the baseline for this test, but none of the Fedora patches touch the RISC-V code at all and the file itself hasn't been touched since rc7, so I'm fairly confident the same behavior is present in vanilla 6.11 too.
See
https://github.com/strace/strace/issues/315
for the original report. Please let me know if I need to provide additional information, report this anywhere else (bugzilla?), and so on...
By the way, in strace we had to apply a workaround [1] for the riscv ptrace regression caused by commit 52449c17bdd1540940e21511612b58acebc49c06.
As result, reverting commit 61119394631f219e23ce98bcc3eb993a64a8ea64 that fixed the regression but introduced a PTRACE_GETREGSET syscall argument clobbering which is more serious regression seems to be the least of two evils.
This essentially means strace would have to keep the workaround indefinitely, but we can live with that.
[1] https://github.com/strace/strace/commit/c3ae2b27732952663a3600269884e363cb77...
linux-stable-mirror@lists.linaro.org