On Tue, 17 Jun 2025 10:29:51 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:
From e2a49c7cefb4148ea3142c752396d39f103c9f4d Mon Sep 17 00:00:00 2001
From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Date: Tue, 17 Jun 2025 19:18:37 +0900
Subject: [PATCH] x86: alternative: Fix int3 handling failure from broken text_poke array
Since smp_text_poke_single() does not expect that another text_poke request may already be queued, it can leave text_poke_array unsorted or overflow text_poke_array.vec[]. This causes an Oops in int3, or a kernel page fault if the buffer overflows.
I would add more of what you found above in the change log. And the issue that was triggered I don't think was because of a buffer overflow. It was because an entry was added to the text_poke_array out of order causing the bsearch to fail.
I saw two patterns of bugs: one is "Oops: int3" and the other is "#PF in smp_text_poke_batch_finish (or smp_text_poke_int3_handler)". The latter comes from a buffer overflow.
-----
[ 164.164215] BUG: unable to handle page fault for address: ffffffff32c00000
[ 164.166999] #PF: supervisor read access in kernel mode
[ 164.169096] #PF: error_code(0x0000) - not-present page
[ 164.171143] PGD 8364b067 P4D 8364b067 PUD 0
[ 164.172954] Oops: Oops: 0000 [#1] SMP PTI
[ 164.174581] CPU: 4 UID: 0 PID: 2702 Comm: sh Tainted: G W 6.15.0-next-20250606-00002-g75b4e49588c2 #239 PREEMPT(voluntary)
[ 164.179193] Tainted: [W]=WARN
[ 164.180926] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 164.184696] RIP: 0010:smp_text_poke_batch_finish+0xb9/0x400
[ 164.186873] Code: e4 4c 8d 6d c2 85 c9 74 39 48 63 03 b9 01 00 00 00 4c 89 ea 41 83 c4 01 48 c7 c7 d0 f7 f7 b2 48 83 c3 10 48 8d b0 00 00 c0 b2 <0f> b6 80 00 00 c0 b2 88 43 ff e8 68 e3 ff ff 44 3b 25 d1 29 5f 02
-----
This is because smp_text_poke_single() writes to text_poke_array.vec[TEXT_POKE_ARRAY_MAX], which overlaps nr_entries (and the variables located after text_poke_array).
-----
static struct smp_text_poke_array {
	struct smp_text_poke_loc vec[TEXT_POKE_ARRAY_MAX];
	int nr_entries;
} text_poke_array;
-----
Please add to the change log that the issue is that smp_text_poke_single() can be called while smp_text_poke_batch*() is being used. The locking is around the called functions but nothing prevents them from being intermingled.
OK.
This means that if we have:
  CPU 0                        CPU 1                        CPU 2

  smp_text_poke_batch_add()
                               smp_text_poke_single() <<-- Adds out of order
                                                            <int3>
                                                            [Fails to find address in text_poke_array]
                                                            OOPS!
Thanks for the chart!
No overflow. This could possibly happen with just two entries!
Yes, that is actually what I observed (with a debug patch).
Use smp_text_poke_batch_add() instead of __smp_text_poke_batch_add() so that it correctly flushes the queue if needed.
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/all/CA+G9fYsLu0roY3DV=tKyqP7FEKbOEETRvTDhnpPxJGbA=Cg...
Fixes: 8976ade0c1b ("x86/alternatives: Simplify smp_text_poke_single() by using tp_vec and existing APIs")
Signed-off-by: Masami Hiramatsu (Google)
Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org
Thank you!
-- Steve
<mhiramat@kernel.org>
---
 arch/x86/kernel/alternative.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index ecfe7b497cad..8038951650c6 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -3107,6 +3107,6 @@ void __ref smp_text_poke_batch_add(void *addr, const void *opcode, size_t len, c
  */
 void __ref smp_text_poke_single(void *addr, const void *opcode, size_t len, const void *emulate)
 {
-	__smp_text_poke_batch_add(addr, opcode, len, emulate);
+	smp_text_poke_batch_add(addr, opcode, len, emulate);
 	smp_text_poke_batch_finish();
 }