Hi Ivan,
On Sat, 2025-02-01 at 10:46 +0100, John Paul Adrian Glaubitz wrote:
On Fri, 2025-01-31 at 11:41 +0100, Ivan Kokshaysky wrote:
This series fixes oopses on Alpha/SMP observed since kernel v6.9. [1] Thanks to Magnus Lindholm for identifying that remarkably longstanding bug.
The problem is that GCC has expected 16-byte alignment of the incoming stack since early 2004, as Maciej found out [2]:

  Having actually dug speculatively I can see that the psABI was changed in GCC 3.5 with commit e5e10fb4a350 ("re PR target/14539 (128-bit long double improperly aligned)") back in Mar 2004, when the stack pointer alignment was increased from 8 bytes to 16 bytes, and arch/alpha/kernel/entry.S has various suspicious stack pointer adjustments, starting with SP_OFF which is not a whole multiple of 16.
Also, as Magnus noted, "ALPHA Calling Standard" [3] required the same:

  D.3.1 Stack Alignment
  This standard requires that stacks be octaword aligned at the time a new procedure is invoked.
However:
- the "normal" kernel stack is always misaligned by 8 bytes, thanks to the odd number of 64-bit words in 'struct pt_regs', which is the very first thing pushed onto the kernel thread stack;
- syscall, fault, interrupt etc. handlers may, or may not, receive aligned stack depending on numerous factors.
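The arithmetic behind the first point can be sketched in a few lines of C. This is only an illustration: `frame_odd`, `frame_even`, `push_frame()` and `misalignment()` are hypothetical names, and the 3-word frame is not the real layout of the alpha 'struct pt_regs'. It merely shows that pushing a frame with an odd number of 64-bit words onto a 16-byte aligned stack top leaves the stack pointer off by 8 bytes, while padding the frame to an even word count keeps it aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_ALIGN 16u  /* alignment GCC assumes on function entry */

/* Hypothetical frames, NOT the real alpha 'struct pt_regs' layout. */
struct frame_odd  { uint64_t w[3]; };               /* odd word count: 24 bytes */
struct frame_even { uint64_t w[3]; uint64_t pad; }; /* padded: 32 bytes */

/* The stack grows down: pushing a frame subtracts its size from sp. */
static inline uintptr_t push_frame(uintptr_t sp, size_t frame_size)
{
	return sp - frame_size;
}

/* Distance in bytes from sp down to the nearest 16-byte boundary. */
static inline unsigned misalignment(uintptr_t sp)
{
	return (unsigned)(sp & (STACK_ALIGN - 1));
}

/* Both checks start from an aligned, made-up stack top of 0x4000. */
void check_alignment_examples(void)
{
	assert(misalignment(push_frame(0x4000, sizeof(struct frame_odd))) == 8);
	assert(misalignment(push_frame(0x4000, sizeof(struct frame_even))) == 0);
}
```

Any code entered with the first kind of stack pointer sees a stack that is only 8-byte aligned, which is the situation the list above describes for the "normal" kernel stack.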
Somehow we got away with it until recently, when we ended up with a stack corruption in kernel/smp.c:smp_call_function_single() due to its use of 32-byte aligned local data and the compiler doing clever things allocating it on the stack.
Patches 1-2 are preparatory; 3 - the main fix; 4 - fixes remaining special cases.
Ivan.
[1] https://lore.kernel.org/rcu/CA+=Fv5R9NG+1SHU9QV9hjmavycHKpnNyerQ=Ei90G98ukRc...
[2] https://lore.kernel.org/rcu/alpine.DEB.2.21.2501130248010.18889@angie.orcam....
[3] https://bitsavers.org/pdf/dec/alpha/Alpha_Calling_Standard_Rev_2.0_19900427....
Changes in v2:
- patch #1: provide empty 'struct pt_regs' to fix compile failure in libbpf, reported by John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>; update comment and commit message accordingly;
- cc'ed stable@vger.kernel.org as older kernels ought to be fixed as well.
Ivan Kokshaysky (4):
  alpha/uapi: do not expose kernel-only stack frame structures
  alpha: replace hardcoded stack offsets with autogenerated ones
  alpha: make stack 16-byte aligned (most cases)
  alpha: align stack for page fault and user unaligned trap handlers
 arch/alpha/include/asm/ptrace.h      | 64 ++++++++++++++++++++++++++-
 arch/alpha/include/uapi/asm/ptrace.h | 65 ++--------------------------
 arch/alpha/kernel/asm-offsets.c      |  4 ++
 arch/alpha/kernel/entry.S            | 24 +++++-----
 arch/alpha/kernel/traps.c            |  2 +-
 arch/alpha/mm/fault.c                |  4 +-
 6 files changed, 83 insertions(+), 80 deletions(-)
Thanks, I'm testing the v2 series of the patches now.
I have applied the series, but I am seeing gcc crashes from time to time:
/build/reproducible-path/palapeli-24.12.1/obj-alpha-linux-gnu/mime/palathumbcreator_autogen/include/thumbnail-creator.moc: In function ‘QObject* qt_plugin_instance()’:
/build/reproducible-path/palapeli-24.12.1/obj-alpha-linux-gnu/mime/palathumbcreator_autogen/include/thumbnail-creator.moc:328:1: error: unrecognizable insn:
  328 | QT_MOC_EXPORT_PLUGIN_V2(palathumbcreator_factory, palathumbcreator_factory, qt_pluginMetaDataV2_palathumbcreator_factory)
      | ^~~~~~~~~~~~~~~~~~~~~~~
(jump_insn 331 295 332 3 (set (pc) (address:DI 1)) -1 (nil) -> 40)
during RTL pass: sched1
/build/reproducible-path/palapeli-24.12.1/obj-alpha-linux-gnu/mime/palathumbcreator_autogen/include/thumbnail-creator.moc:328:1: internal compiler error: in extract_insn, at recog.cc:2812
0x12195fc8b internal_error(char const*, ...)
        ???:0
0x1201f37b7 fancy_abort(char const*, int, char const*)
        ???:0
0x1201f0a6f _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ???:0
0x1201f0ab7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        ???:0
0x120b5ff97 extract_insn(rtx_insn*)
        ???:0
0x12179d003 deps_analyze_insn(deps_desc*, rtx_insn*)
        ???:0
0x12179d98f sched_analyze(deps_desc*, rtx_insn*, rtx_insn*)
        ???:0
0x120bb0517 sched_rgn_compute_dependencies(int)
        ???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See file:///usr/share/doc/gcc-14/README.Bugs for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.
See: https://buildd.debian.org/status/fetch.php?pkg=palapeli&arch=alpha&v...
However, this might be related to CONFIG_COMPACTION, as Michael Cree already mentioned: that option is enabled by default in Debian on all architectures except m68k.
Adrian