On Sat, 2024-12-07 at 10:32 +0100, Greg Kroah-Hartman wrote:
On Sat, Dec 07, 2024 at 05:21:00PM +0800, Huacai Chen wrote:
Hi, Greg,
On Fri, Dec 6, 2024 at 9:04 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Fri, Dec 06, 2024 at 04:58:07PM +0800, Huacai Chen wrote:
Backport this series to 6.1&6.6 because LoongArch gets build errors with latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print error about linking without -fPIC or -fPIE flag in more detail").
CC .vmlinux.export.o UPD include/generated/utsversion.h CC init/version-timestamp.o LD .tmp_vmlinux.kallsyms1 loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673
In theory 5.10&5.15 also need this, but since LoongArch get upstream at 5.19, so I just ignore them because there is no error report about other archs now.
Odd, why doesn't this affect other arches as well using new binutils? I hate to have to backport all of this just for one arch, as that feels odd.
The related binutils commit is only for LoongArch, so build errors only occured on LoongArch. I don't know why other archs have no problem exactly, but may be related to their CFLAGS (for example, if we disable CONFIG_RELOCATABLE, LoongArch also has no build errors because CFLAGS changes).
does LoongArch depend on that option?
"That option" is -mdirect-extern-access. Without it we'll use GOT in the kernel image to address anything out of the current TU, bloating the kernel size and making it slower.
The problem is the linker failed to handle a direct access to undefined weak symbol on LoongArch. With GCC 14.2 and Binutils 2.43:
$ cat t.c extern int x __attribute__ ((weak));
int main() { __builtin_printf("%p\n", &x); } $ cc t.c -mdirect-extern-access -static-pie -fPIE $ ./a.out 0x7ffff27ac000
The output should be (nil) instead, as an undefined weak symbol should be resolved to address 0. I'm not sure why the kernel was not blown up by this issue.
With Binutils trunk, an error is emitted instead of silently producing buggy executable. Still I don't think emitting an error is correct when linking a static PIE (our vmlinux is a static PIE). Instead the linker should just rewrite
pcalau12i rd, %pc_hi20(undef_weak)
to
move rd, $zero
Also the "recompile with -fPIE" suggestion in the error message is completely misleading. We are *already* compiling relocatable kernel with -fPIE.
I'm making some Binutils patches to implement the rewrite and reword the error message (for instances where emitting an error is the correct thing, e.g. someone attempts to build a dynamically linked program with -mdirect-extern-access).
What happens if it is enabled for other arches? Why doesn't it break them?
The other arches have copy relocation, so their -mdirect-extern-access is intended to work with dynamically linked executable, thus it's the default and not as strong as ours. On them -mdirect-extern-access still uses GOT to address weak symbols.
We don't have copy relocation, thus our default is -mno-direct-extern- access, and -mdirect-extern-access is only intended for static executables (including OS kernel, embedded firmware, etc). So it's designed to be stronger, unfortunately the toolchain failed to implement it correctly.