Re: next-20250605: Test regression: qemu-x86_64-compat mode ltp tracing Oops int3 kernel panic

10 Jun 2025

On Tue, 10 Jun 2025 18:50:05 +0530
Naresh Kamboju naresh.kamboju@linaro.org wrote:
...
On Mon, 9 Jun 2025 at 18:39, Masami Hiramatsu mhiramat@kernel.org wrote:
...
On Thu, 5 Jun 2025 17:12:10 +0530
Naresh Kamboju naresh.kamboju@linaro.org wrote:
...
Regressions found on qemu-x86_64 in compat mode (64-bit kernel
with 32-bit userspace) while running the LTP tracing test suite
on the Linux next-20250605 tag kernel.
Regressions found on
 - LTP tracing

Regression Analysis:

New regression? Yes
Reproducible? Intermittent

Test regression: qemu-x86_64-compat mode ltp tracing Oops int3 kernel panic
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
## Test log
ftrace-stress-test: <12>[   21.971153] /usr/local/bin/kirk[277]:
starting test ftrace-stress-test (ftrace_stress_test.sh 90)
<4>[   58.997439] Oops: int3: 0000 [#1] SMP PTI
<4>[   58.998089] CPU: 0 UID: 0 PID: 323 Comm: sh Not tainted
6.15.0-next-20250605 #1 PREEMPT(voluntary)
<4>[   58.998152] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS 1.16.3-debian-1.16.3-2 04/01/2014
<4>[   58.998260] RIP: 0010:_raw_spin_lock+0x5/0x50
Interesting. This hits a stray int3 for ftrace on _raw_spin_lock.
Here is the compiled code of _raw_spin_lock.
ffffffff825daa00 <_raw_spin_lock>:
ffffffff825daa00:       f3 0f 1e fa             endbr64
ffffffff825daa04:       e8 47 a6 d5 fe          call   ffffffff81335050 <__fentry__>
Since the int3 exception is raised after the 1-byte int3 has been
decoded, the reported RIP `_raw_spin_lock+0x05` is not an instruction
boundary of the original code.
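The offsets above can be checked with a little arithmetic. The following is only an illustrative sketch in Python (the helper names are made up for this example, nothing here is kernel code): it decodes the `call` rel32 from the disassembly and computes the RIP that a 1-byte int3 placed at the same address would report.

```python
import struct

FUNC_BASE = 0xFFFFFFFF825DAA00           # _raw_spin_lock, from the disassembly
CALL_ADDR = 0xFFFFFFFF825DAA04           # address of the 5-byte call
CALL_BYTES = bytes.fromhex("e847a6d5fe")  # e8 47 a6 d5 fe


def call_target(insn_addr, insn_bytes):
    """Target of a 5-byte `call rel32`: next-instruction address + signed rel32."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not a call rel32")
    (rel32,) = struct.unpack("<i", insn_bytes[1:])
    return (insn_addr + 5 + rel32) & 0xFFFFFFFFFFFFFFFF


def int3_reported_rip(insn_addr):
    """int3 is a trap, so the saved RIP points just past the 1-byte 0xcc."""
    return insn_addr + 1


print(hex(call_target(CALL_ADDR, CALL_BYTES)))
print(hex(int3_reported_rip(CALL_ADDR) - FUNC_BASE))
```

The first value should match the `__fentry__` address shown in the disassembly, and the second is the offset into _raw_spin_lock that the oops reports.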
...
<4>[   58.998563] Code: 5d e9 ff 12 00 00 66 66 2e 0f 1f 84 00 00 00
00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
0f 1e fa 0f <1f> 44 00 00 55 48 89 e5 53 48 89 fb bf 01 00 00 00 e8 15
12 e4 fe
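The Code: line can also be read mechanically. The sketch below is a hypothetical helper (not an existing tool) that parses the dump, finds the byte marked with `<..>` (the byte at the reported RIP), and prints the bytes around the patch site, assuming the patched instruction starts one byte before RIP as discussed above.

```python
CODE = (
    "5d e9 ff 12 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 "
    "90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f "
    "<1f> 44 00 00 55 48 89 e5 53 48 89 fb bf 01 00 00 00 e8 15 12 e4 fe"
)


def parse_code_line(text):
    """Return (bytes, index of the byte marked <..>, i.e. the byte at RIP)."""
    data = bytearray()
    rip_index = None
    for tok in text.split():
        if tok.startswith("<") and tok.endswith(">"):
            rip_index = len(data)
            tok = tok[1:-1]
        data.append(int(tok, 16))
    if rip_index is None:
        raise ValueError("no <..> marker found")
    return bytes(data), rip_index


code, rip = parse_code_line(CODE)
prefix = code[rip - 5 : rip - 1]      # bytes at +0x0..+0x3 of the function
patch_site = code[rip - 1 : rip + 4]  # five bytes starting at +0x4
print(prefix.hex(" "), "|", patch_site.hex(" "))
```

With the dump from this report, the prefix is the endbr64 encoding and the five bytes at the patch site are `0f 1f 44 00 00`, not the `e8 ...` call and not a `cc`.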
And the call had already been changed back to a 5-byte nop by the
time the code was dumped. Thus the CPU may have hit the intermediate
int3 that is used while the code is being transformed:
e8 47 a6 d5 fe
 (first step)
cc 47 a6 d5 fe
 (second step)
cc 1f 44 00 00 <- hit?
 (third step)
0f 1f 44 00 00 <- handle int3
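The byte images in the steps above can be reproduced with a small model. This is a sketch of the byte transitions only, written in Python for illustration; `poke_states` is a made-up name, and the core synchronization that the kernel performs between the steps is not modeled.

```python
INT3 = 0xCC


def poke_states(old, new):
    """Byte images of one patch site after each step of the int3-based update."""
    if len(old) != len(new) or not old:
        raise ValueError("old and new must have the same non-zero length")
    step1 = bytes([INT3]) + old[1:]  # first step: int3 over the first byte
    step2 = bytes([INT3]) + new[1:]  # second step: write the tail of the new insn
    step3 = new                      # third step: replace int3 with the new first byte
    return [old, step1, step2, step3]


CALL = bytes.fromhex("e847a6d5fe")  # call __fentry__
NOP5 = bytes.fromhex("0f1f440000")  # 5-byte nop

labels = ("original", "first", "second", "third")
for label, image in zip(labels, poke_states(CALL, NOP5)):
    print(f"{label:>8}: {image.hex(' ')}")
```

Only the images after the first and second steps start with `cc`, so those are the only windows in which a CPU fetching from +0x4 can trap.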
It is a very unlikely scenario (and I'm not sure qemu can correctly
emulate it). But suppose a CPU hits the int3 (cc) on
_raw_spin_lock()+0x4 before another CPU (CPU') runs the third step in
smp_text_poke_batch_finish(). If, before the first CPU runs
smp_text_poke_int3_handler(), CPU' runs the third step and sets
text_poke_array_refs to 0, then smp_text_poke_int3_handler() returns 0
and causes the same problem.
<CPU0>                                  <CPU1>
                                        Start smp_text_poke_batch_finish().
                                        Finish second step.
Hit int3 (*)
                                        Finish third step.
                                        Run smp_text_poke_sync_each_cpu().(**)
                                        Clear text_poke_array_refs[cpu0]
Start smp_text_poke_int3_handler()
Failed to get text_poke_array_refs[cpu0]
Oops: int3
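The ordering in the diagram can be replayed with a toy model. This is only a sketch of the hypothesis in Python: the dictionary standing in for the per-CPU reference, the event names, and the `run` helper are all assumptions made for illustration and do not mirror the kernel's real data structures or locking.

```python
def int3_handler(refs, cpu, poked_offsets, rip):
    """Simplified handler: 1 if the trap belongs to an in-flight poke, else 0."""
    if refs.get(cpu, 0) == 0:  # could not get a reference: treat trap as not ours
        return 0
    # int3 is 1 byte, so the trapping instruction started at rip - 1.
    return 1 if (rip - 1) in poked_offsets else 0


def run(schedule):
    """Replay a fixed event order; return the handler result seen by cpu0."""
    refs = {0: 1}            # hypothetical reference state for cpu0
    poked_offsets = {0x4}    # patched byte inside _raw_spin_lock
    result = None
    for event in schedule:
        if event == "cpu1_clear_refs":
            refs[0] = 0
        elif event == "cpu0_handler":
            result = int3_handler(refs, 0, poked_offsets, rip=0x5)
        else:
            raise ValueError(f"unknown event: {event}")
    return result


expected_order = ["cpu0_handler", "cpu1_clear_refs"]
suspected_order = ["cpu1_clear_refs", "cpu0_handler"]
print(run(expected_order), run(suspected_order))
```

In the model, the handler claims the trap only when it runs before the reference is cleared. In the reversed order it returns 0, which corresponds to the "Oops: int3" path in the diagram. The model says nothing about whether that ordering can actually occur.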
But as I said, this is very unlikely because, as far as I know:
(*) smp_text_poke_int3_handler() is called directly from exc_int3(),
   which is handled like an NMI, so no other interrupt should run.
(**) In the third step, smp_text_poke_batch_finish() sends an IPI to
   sync cores after removing the int3. Thus any int3 exception
   handling should have finished by then.
Has this bug become easier to reproduce recently?
Yes. It is easy to reproduce.
Good, can you test the following 2 patches (I'll send a series)?
I think [1/2] may avoid the kernel crash, but still shows a
warning, and [2/2] may fix it if my guess is correct.
Thank you,
...
...
Thanks,
--
Masami Hiramatsu (Google) mhiramat@kernel.org

Naresh

-- 
Masami Hiramatsu (Google) mhiramat@kernel.org
