Hi, Willy
Here is the powerpc support, includes 32-bit big-endian powerpc, 64-bit little endian and big endian powerpc.
All of them passes run-user with the default powerpc toolchain from Ubuntu 20.04:
$ make run-user DEFCONFIG=tinyconfig XARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- | grep status 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning $ make run-user DEFCONFIG=tinyconfig XARCH=powerpc64 CROSS_COMPILE=powerpc64le-linux-gnu- | grep status 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning $ make run-user DEFCONFIG=tinyconfig XARCH=powerpc64le CROSS_COMPILE=powerpc64le-linux-gnu- | grep status 165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
For the slow 'run' target, I have run with defconfig before, and just verified them via the fast tinyconfig + run with a new patch from next patchset, all of them passes:
165 test(s): 156 passed, 9 skipped, 0 failed => status: warning
Note, the big endian crosstool powerpc64-linux-gcc from https://mirrors.edge.kernel.org/pub/tools/crosstool/ has been used to test both little endian and big endian powerpc64 too, both passed.
Here simply explain what they are:
* tools/nolibc: add support for powerpc tools/nolibc: add support for powerpc64
32-bit & 64-bit powerpc support of nolibc.
* selftests/nolibc: select_null: fix up for big endian powerpc64
fix up a test case for big endian powerpc64.
* selftests/nolibc: add extra config file customize support
add extconfig target to allow enable extra config options via configs/<ARCH>.config
applied suggestion from Thomas to use config files instead of config lines.
* selftests/nolibc: add XARCH and ARCH mapping support
applied suggestions from Willy, use XARCH as the input of our nolibc test, use ARCH as the pure kernel input, at last build the mapping between XARCH and ARCH.
Customize the variables via the input XARCH.
* selftests/nolibc: add test support for powerpc
Require to use extconfig to enable the console options specified in configs/powerpc.config
currently, we should manually run extconfig after defconfig, in next patchset, we will do this automatically.
* selftests/nolibc: add test support for powerpc64le selftests/nolibc: add test support for powerpc64
Very simple, but customize CFLAGS carefully to let them work with powerpc64le-linux-gnu-gcc (from Linux distributions) and powerpc64-linux-gcc (from mirrors.edge.kernel.org)
The next patchset will not be tinyconfig, but some prepare patches, will be sent out soon.
Best regards, Zhangjin ---
Zhangjin Wu (8): tools/nolibc: add support for powerpc tools/nolibc: add support for powerpc64 selftests/nolibc: select_null: fix up for big endian powerpc64 selftests/nolibc: add extra config file customize support selftests/nolibc: add XARCH and ARCH mapping support selftests/nolibc: add test support for powerpc selftests/nolibc: add test support for powerpc64le selftests/nolibc: add test support for powerpc64
tools/include/nolibc/arch-powerpc.h | 170 ++++++++++++++++++ tools/testing/selftests/nolibc/Makefile | 55 ++++-- .../selftests/nolibc/configs/powerpc.config | 3 + tools/testing/selftests/nolibc/nolibc-test.c | 2 +- 4 files changed, 217 insertions(+), 13 deletions(-) create mode 100644 tools/include/nolibc/arch-powerpc.h create mode 100644 tools/testing/selftests/nolibc/configs/powerpc.config
Both syscall declarations and _start code definition are added for powerpc to nolibc.
Like mips, powerpc uses a register (exactly, the summary overflow bit) to record the error occurred, and uses another register to return the value [1]. So, the return value of every syscall declaration must be normalized to easier the __sysret helper, return -value when there is an error, otheriwse, return value directly.
Glibc and musl use different methods to check the summary overflow bit, glibc (sysdeps/unix/sysv/linux/powerpc/sysdep.h) saves the cr register to r0 at first, and then check the summary overflow bit in cr0:
mfcr r0 r0 & (1 << 28) ? -r3 : r3
-->
10003c14: 7c 00 00 26 mfcr r0 10003c18: 74 09 10 00 andis. r9,r0,4096 10003c1c: 41 82 00 08 beq 0x10003c24 10003c20: 7c 63 00 d0 neg r3,r3
Musl (arch/powerpc/syscall_arch.h) directly checks the summary overflow bit with the 'bns' instruction:
/* no summary overflow bit means no error, return value directly */ bns+ 1f /* otherwise, return negated value */ neg r3, r3 1:
-->
10000418: 40 a3 00 08 bns 0x10000420 1000041c: 7c 63 00 d0 neg r3,r3
The later one is smaller, here applies it.
arch/powerpc/include/asm/vdso/gettimeofday.h file uses the smaller method for do_syscall_2() too.
[1]: https://man7.org/linux/man-pages/man2/syscall.2.html
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/include/nolibc/arch-powerpc.h | 156 ++++++++++++++++++++++++++++ 1 file changed, 156 insertions(+) create mode 100644 tools/include/nolibc/arch-powerpc.h
diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h new file mode 100644 index 000000000000..100ec0f412dc --- /dev/null +++ b/tools/include/nolibc/arch-powerpc.h @@ -0,0 +1,156 @@ +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */ +/* + * PowerPC specific definitions for NOLIBC + * Copyright (C) 2023 Zhangjin Wu falcon@tinylab.org + */ + +#ifndef _NOLIBC_ARCH_POWERPC_H +#define _NOLIBC_ARCH_POWERPC_H + +#include "compiler.h" +#include "crt.h" + +/* Syscalls for PowerPC : + * - stack is 16-byte aligned + * - syscall number is passed in r0 + * - arguments are in r3, r4, r5, r6, r7, r8, r9 + * - the system call is performed by calling "sc" + * - syscall return comes in r3, and the summary overflow bit is checked + * to know if an error occurred, in which case errno is in r3. + * - the arguments are cast to long and assigned into the target + * registers which are then simply passed as registers to the asm code, + * so that we don't have to experience issues with register constraints. + */ + +#define _NOLIBC_SYSCALL_CLOBBERLIST \ + "memory", "cr0", "r9", "r10", "r11", "r12" + +#define my_syscall0(num) \ +({ \ + register long r0 __asm__ ("r0") = (num); \ + register long r3 __asm__ ("r3"); \ + \ + __asm__ volatile ( \ + "sc; bns+ 1f; neg %1, %1; 1:\n" \ + : "+r"(r0), "=r"(r3) \ + :: "r4", "r5", "r6", "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST \ + ); \ + r3; \ +}) + +#define my_syscall1(num, arg1) \ +({ \ + register long r0 __asm__ ("r0") = (num); \ + register long r3 __asm__ ("r3") = (long)(arg1); \ + \ + __asm__ volatile ( \ + "sc; bns+ 1f; neg %1, %1; 1:\n" \ + : "+r"(r0), "+r"(r3) \ + :: "r4", "r5", "r6", "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST \ + ); \ + r3; \ +}) + + +#define my_syscall2(num, arg1, arg2) \ +({ \ + register long r0 __asm__ ("r0") = (num); \ + register long r3 __asm__ ("r3") = (long)(arg1); \ + register long r4 __asm__ ("r4") = (long)(arg2); \ + \ + __asm__ volatile ( \ + "sc; bns+ 1f; neg %1, %1; 1:\n" \ + : "+r"(r0), "+r"(r3), \ + "+r"(r4) \ + :: "r5", "r6", "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST \ + ); \ + r3; \ +}) + + +#define my_syscall3(num, arg1, arg2, arg3) \ +({ \ + register long r0 __asm__ ("r0") = (num); \ + register long r3 __asm__ ("r3") = (long)(arg1); \ + register long r4 __asm__ ("r4") = (long)(arg2); \ + register long r5 __asm__ ("r5") = (long)(arg3); \ + \ + __asm__ volatile ( \ + "sc; bns+ 1f; neg %1, %1; 1:\n" \ + : "+r"(r0), "+r"(r3), \ + "+r"(r4), "+r"(r5) \ + :: "r6", "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST \ + ); \ + r3; \ +}) + + +#define my_syscall4(num, arg1, arg2, arg3, arg4) \ +({ \ + register long r0 __asm__ ("r0") = (num); \ + register long r3 __asm__ ("r3") = (long)(arg1); \ + register long r4 __asm__ ("r4") = (long)(arg2); \ + register long r5 __asm__ ("r5") = (long)(arg3); \ + register long r6 __asm__ ("r6") = (long)(arg4); \ + \ + __asm__ volatile ( \ + "sc; bns+ 1f; neg %1, %1; 1:\n" \ + : "+r"(r0), "+r"(r3), \ + "+r"(r4), "+r"(r5), "+r"(r6) \ + :: "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST \ + ); \ + r3; \ +}) + + +#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \ +({ \ + register long r0 __asm__ ("r0") = (num); \ + register long r3 __asm__ ("r3") = (long)(arg1); \ + register long r4 __asm__ ("r4") = (long)(arg2); \ + register long r5 __asm__ ("r5") = (long)(arg3); \ + register long r6 __asm__ ("r6") = (long)(arg4); \ + register long r7 __asm__ ("r7") = (long)(arg5); \ + \ + __asm__ volatile ( \ + "sc; bns+ 1f; neg %1, %1; 1:\n" \ + : "+r"(r0), "+r"(r3), \ + "+r"(r4), "+r"(r5), "+r"(r6), "+r"(r7) \ + :: "r8", _NOLIBC_SYSCALL_CLOBBERLIST \ + ); \ + r3; \ +}) + +#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ +({ \ + register long r0 __asm__ ("r0") = (num); \ + register long r3 __asm__ ("r3") = (long)(arg1); \ + register long r4 __asm__ ("r4") = (long)(arg2); \ + register long r5 __asm__ ("r5") = (long)(arg3); \ + register long r6 __asm__ ("r6") = (long)(arg4); \ + register long r7 __asm__ ("r7") = (long)(arg5); \ + register long r8 __asm__ ("r8") = (long)(arg6); \ + \ + __asm__ volatile ( \ + "sc; bns+ 1f; neg %1, %1; 1:\n" \ + : "+r"(r0), "+r"(r3), \ + "+r"(r4), "+r"(r5), "+r"(r6), "+r"(r7), "+r"(r8) \ + :: _NOLIBC_SYSCALL_CLOBBERLIST \ + ); \ + r3; \ +}) + +/* startup code */ +void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void) +{ + __asm__ volatile ( + "mr 3, 1\n" /* save stack pointer to r3, as arg1 of _start_c */ + "clrrwi 1, 1, 4\n" /* align the stack to 16 bytes */ + "li 0, 0\n" /* zero the frame pointer */ + "stwu 1, -16(1)\n" /* the initial stack frame */ + "bl _start_c\n" /* transfer to c runtime */ + ); + __builtin_unreachable(); +} + +#endif /* _NOLIBC_ARCH_POWERPC_H */
On 2023-07-19 05:10:48+0800, Zhangjin Wu wrote:
Both syscall declarations and _start code definition are added for powerpc to nolibc.
Like mips, powerpc uses a register (exactly, the summary overflow bit) to record the error occurred, and uses another register to return the value [1]. So, the return value of every syscall declaration must be normalized to easier the __sysret helper, return -value when there is an error, otheriwse, return value directly.
Glibc and musl use different methods to check the summary overflow bit, glibc (sysdeps/unix/sysv/linux/powerpc/sysdep.h) saves the cr register to r0 at first, and then check the summary overflow bit in cr0:
mfcr r0 r0 & (1 << 28) ? -r3 : r3 --> 10003c14: 7c 00 00 26 mfcr r0 10003c18: 74 09 10 00 andis. r9,r0,4096 10003c1c: 41 82 00 08 beq 0x10003c24 10003c20: 7c 63 00 d0 neg r3,r3
Musl (arch/powerpc/syscall_arch.h) directly checks the summary overflow bit with the 'bns' instruction:
/* no summary overflow bit means no error, return value directly */ bns+ 1f /* otherwise, return negated value */ neg r3, r3 1: --> 10000418: 40 a3 00 08 bns 0x10000420 1000041c: 7c 63 00 d0 neg r3,r3
The later one is smaller, here applies it.
arch/powerpc/include/asm/vdso/gettimeofday.h file uses the smaller method for do_syscall_2() too.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/arch-powerpc.h | 156 ++++++++++++++++++++++++++++
This also should be added to nolibc/arch.h.
1 file changed, 156 insertions(+) create mode 100644 tools/include/nolibc/arch-powerpc.h
diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h new file mode 100644 index 000000000000..100ec0f412dc --- /dev/null +++ b/tools/include/nolibc/arch-powerpc.h @@ -0,0 +1,156 @@ +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */ +/*
- PowerPC specific definitions for NOLIBC
- Copyright (C) 2023 Zhangjin Wu falcon@tinylab.org
If it is taken from musl, shouldn't there also be a musl copyright?
[..]
On Sun, Jul 23, 2023 at 09:32:37AM +0200, Thomas Weißschuh wrote:
On 2023-07-19 05:10:48+0800, Zhangjin Wu wrote:
diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h new file mode 100644 index 000000000000..100ec0f412dc --- /dev/null +++ b/tools/include/nolibc/arch-powerpc.h @@ -0,0 +1,156 @@ +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */ +/*
- PowerPC specific definitions for NOLIBC
- Copyright (C) 2023 Zhangjin Wu falcon@tinylab.org
If it is taken from musl, shouldn't there also be a musl copyright?
In fact it depends. If code was taken there, not only the copyright is needed, but the license' compatibility must be verified. If, however, the code was only disassembled to be understood and reimplemented (as it seems to me), then no code was taken there and it's not needed.
Thanks, Willy
Hi, Thomas, Willy
On Sun, Jul 23, 2023 at 09:32:37AM +0200, Thomas Wei�schuh wrote:
On 2023-07-19 05:10:48+0800, Zhangjin Wu wrote:
diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h new file mode 100644 index 000000000000..100ec0f412dc --- /dev/null +++ b/tools/include/nolibc/arch-powerpc.h @@ -0,0 +1,156 @@ +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */ +/*
- PowerPC specific definitions for NOLIBC
- Copyright (C) 2023 Zhangjin Wu falcon@tinylab.org
If it is taken from musl, shouldn't there also be a musl copyright?
In fact it depends. If code was taken there, not only the copyright is needed, but the license' compatibility must be verified. If, however, the code was only disassembled to be understood and reimplemented (as it seems to me), then no code was taken there and it's not needed.
This discussion does inspire me a lot to shrink the whole architecture specific nolibc my_syscall<N>() macros, like crt.h, a common syscall.h is added to do so. I have finished most of them except the ones passing arguments via stack, still trying to merge these ones.
With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h will only require to add ~10 lines to define their own syscall instructions, registers and clobberlist, which looks like this (for powerpc):
#define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg %0, %0; 1:"
/* PowerPC doesn't always restore r3-r12 for us */ #define _NOLIBC_SYSCALL_CLOBBERLIST "memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4"
/* PowerPC write GPRS in kernel side but not restore them */ #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
#define _NOLIBC_REG_NUM "r0" #define _NOLIBC_REG_RET "r3" #define _NOLIBC_REG_arg1 "r3" #define _NOLIBC_REG_arg2 "r4" #define _NOLIBC_REG_arg3 "r5" #define _NOLIBC_REG_arg4 "r6" #define _NOLIBC_REG_arg5 "r7" #define _NOLIBC_REG_arg6 "r8"
Before:
$ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done 157 tools/include/nolibc/arch-aarch64.h 199 tools/include/nolibc/arch-arm.h 178 tools/include/nolibc/arch-i386.h 164 tools/include/nolibc/arch-loongarch.h 195 tools/include/nolibc/arch-mips.h 0 tools/include/nolibc/arch-powerpc.h 160 tools/include/nolibc/arch-riscv.h 186 tools/include/nolibc/arch-s390.h 176 tools/include/nolibc/arch-x86_64.h
After:
$ wc -l tools/include/nolibc/arch-*.h 54 tools/include/nolibc/arch-aarch64.h 84 tools/include/nolibc/arch-arm.h 90 tools/include/nolibc/arch-i386.h /* the last one use stack to pass arguments, reserve as-is */ 59 tools/include/nolibc/arch-loongarch.h 120 tools/include/nolibc/arch-mips.h /* the last two use stack to pass arguments, reserve as-is */ 73 tools/include/nolibc/arch-powerpc.h 58 tools/include/nolibc/arch-riscv.h 87 tools/include/nolibc/arch-s390.h 67 tools/include/nolibc/arch-x86_64.h
syscall.h itself:
$ wc -l tools/include/nolibc/syscall.h 112 tools/include/nolibc/syscall.h
Willy, do we need to rename my_syscall<N> to _nolibc_syscall<N> to limit these macros nolibc internally? I plan to rename all of the new adding macros with _nolibc_ (for funcs) or _NOLIBC_ (for variables).
Thomas, do we need to merge the syscall macros from unistd.h to this new syscall.h? we do reuse the macros between them, like the _syscall_narg* ones.
Since this new syscall.h does help a lot to shrink the arch-powerpc.h, I plan to send this syscall.h at first and then renew our powerpc patchset, what's your idea?
Thanks, Zhangjin
Thanks, Willy
On Tue, Jul 25, 2023 at 01:44:14PM +0800, Zhangjin Wu wrote:
This discussion does inspire me a lot to shrink the whole architecture specific nolibc my_syscall<N>() macros, like crt.h, a common syscall.h is added to do so. I have finished most of them except the ones passing arguments via stack, still trying to merge these ones.
With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h will only require to add ~10 lines to define their own syscall instructions, registers and clobberlist, which looks like this (for powerpc):
#define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg %0, %0; 1:" /* PowerPC doesn't always restore r3-r12 for us */ #define _NOLIBC_SYSCALL_CLOBBERLIST "memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4" /* PowerPC write GPRS in kernel side but not restore them */ #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
#define _NOLIBC_REG_NUM "r0" #define _NOLIBC_REG_RET "r3" #define _NOLIBC_REG_arg1 "r3" #define _NOLIBC_REG_arg2 "r4" #define _NOLIBC_REG_arg3 "r5" #define _NOLIBC_REG_arg4 "r6" #define _NOLIBC_REG_arg5 "r7" #define _NOLIBC_REG_arg6 "r8"
Before:
$ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done 157 tools/include/nolibc/arch-aarch64.h 199 tools/include/nolibc/arch-arm.h 178 tools/include/nolibc/arch-i386.h 164 tools/include/nolibc/arch-loongarch.h 195 tools/include/nolibc/arch-mips.h 0 tools/include/nolibc/arch-powerpc.h 160 tools/include/nolibc/arch-riscv.h 186 tools/include/nolibc/arch-s390.h 176 tools/include/nolibc/arch-x86_64.h
After:
$ wc -l tools/include/nolibc/arch-*.h 54 tools/include/nolibc/arch-aarch64.h 84 tools/include/nolibc/arch-arm.h 90 tools/include/nolibc/arch-i386.h /* the last one use stack to pass arguments, reserve as-is */ 59 tools/include/nolibc/arch-loongarch.h 120 tools/include/nolibc/arch-mips.h /* the last two use stack to pass arguments, reserve as-is */ 73 tools/include/nolibc/arch-powerpc.h 58 tools/include/nolibc/arch-riscv.h 87 tools/include/nolibc/arch-s390.h 67 tools/include/nolibc/arch-x86_64.h
syscall.h itself:
$ wc -l tools/include/nolibc/syscall.h 112 tools/include/nolibc/syscall.h
The important thing to consider is not the number of lines but the *maintainability*. You factored the syscall code so much above with all these macros that I don't even understand how they're going to interact with each other, especially "%0". Also I don't know what the macro _NOLIBC_GPRS_AS_OUTPUT_OPERANDS does. And when someone reports a bug like we had in the past with programs randomly crashing depending on stack alignment and such, it becomes particularly tricky to figure what is expected and how it differs from reality.
Maybe it's possible to do something in between, like defining a few more slightly larger blocks, I don't know. I still tend to think that this significantly complicates the understanding of the whole thing.
Also, looking at different archs' syscall code, they're not all defined the same way. Some like i386 use "=a" for the return value. Others use "+r" as an input/output value, others "=r", others "+d" or "=d". And reading the comments, there are some arch-specific choices for these declarations, that are sometimes there to force the toolchain a little bit. Others like MIPS pass some of their args in the stack, so a name only is not sufficient. Thus, trying to factor all of this will not only make it very hard to insert arch-specific adjusments, but it will also make the code much less readable.
Also think about this for a moment: you had the opportunity to add two new archs by simply restarting from existing ones. s390 and loongarch were added the same way, leaving enough freedom to the implementers to adjust the definitions to fit their environment's constraints. If you make it very complicated, I suspect we won't get any such contributions anymore just because nobody figures how to find their way through it, or it would require to change again for everyone just to add one specific case. I tend to think that the current situation is already fairly minimal with quite readable and debuggable calls, and that it's what *really* matters.
When you're trying to reorganize code, it's important to ask yourself whether you would prefer to debug the old one or the new one at 3am. Hint: at 3am, the more abstractions there are, the less understandable it becomes.
Willy, do we need to rename my_syscall<N> to _nolibc_syscall<N> to limit these macros nolibc internally? I plan to rename all of the new adding macros with _nolibc_ (for funcs) or _NOLIBC_ (for variables).
Why not, I'm not opposed to that. Just keep in mind that every single rename complicates backports (or may even silently break them), so it's important to know where you want to go and to try to make changes converge towards something stable and durable.
Thanks, Willy
Hi, Willy
Thanks very much for your comments on the new syscall.h proposal, just replied more, it is a very long email, hope it explains more clearly ;-)
Also CCed i386 and s390 committers, welcome your dicussion.
On Tue, Jul 25, 2023 at 01:44:14PM +0800, Zhangjin Wu wrote:
This discussion does inspire me a lot to shrink the whole architecture specific nolibc my_syscall<N>() macros, like crt.h, a common syscall.h is added to do so. I have finished most of them except the ones passing arguments via stack, still trying to merge these ones.
With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h will only require to add ~10 lines to define their own syscall instructions, registers and clobberlist, which looks like this (for powerpc):
#define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg %0, %0; 1:" /* PowerPC doesn't always restore r3-r12 for us */ #define _NOLIBC_SYSCALL_CLOBBERLIST "memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4" /* PowerPC write GPRS in kernel side but not restore them */ #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
#define _NOLIBC_REG_NUM "r0" #define _NOLIBC_REG_RET "r3" #define _NOLIBC_REG_arg1 "r3" #define _NOLIBC_REG_arg2 "r4" #define _NOLIBC_REG_arg3 "r5" #define _NOLIBC_REG_arg4 "r6" #define _NOLIBC_REG_arg5 "r7" #define _NOLIBC_REG_arg6 "r8"
Before:
$ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done 157 tools/include/nolibc/arch-aarch64.h 199 tools/include/nolibc/arch-arm.h 178 tools/include/nolibc/arch-i386.h 164 tools/include/nolibc/arch-loongarch.h 195 tools/include/nolibc/arch-mips.h 0 tools/include/nolibc/arch-powerpc.h 160 tools/include/nolibc/arch-riscv.h 186 tools/include/nolibc/arch-s390.h 176 tools/include/nolibc/arch-x86_64.h
After:
$ wc -l tools/include/nolibc/arch-*.h 54 tools/include/nolibc/arch-aarch64.h 84 tools/include/nolibc/arch-arm.h 90 tools/include/nolibc/arch-i386.h /* the last one use stack to pass arguments, reserve as-is */ 59 tools/include/nolibc/arch-loongarch.h 120 tools/include/nolibc/arch-mips.h /* the last two use stack to pass arguments, reserve as-is */ 73 tools/include/nolibc/arch-powerpc.h 58 tools/include/nolibc/arch-riscv.h 87 tools/include/nolibc/arch-s390.h 67 tools/include/nolibc/arch-x86_64.h
syscall.h itself:
$ wc -l tools/include/nolibc/syscall.h 112 tools/include/nolibc/syscall.h
The important thing to consider is not the number of lines but the *maintainability*.
The original goal is not really the number of lines (only a 'side-effect'), but is exactly easier porting/maintainability with clearer code architecture, I have wrotten another message to explain more about this background, so, let's directly reply here and discuss more.
You factored the syscall code so much above with all these macros that I don't even understand how they're going to interact with each other, especially "%0".
Yeah, it is my fault, this should be cleaned up with the return register directly:
#define _NOLIBC_SYSCALL_CALL \ "sc; bns+ 1f; neg 3, 3; 1:"
Also I don't know what the macro _NOLIBC_GPRS_AS_OUTPUT_OPERANDS does.
This is the root cause to inspire me to add the new syscall.h, let's explain the background step by step.
All of the other architectures (except PowerPC) restore GPRS for us when return from syscall, so, their clobber list simply not include the GPRS and only need to add the input registers in the 'INPUT Operands' list.
But PowerPC doesn't restore such GPRS for us, I'm not sure if it is a feature (Maybe) or a bug. for PowerPC32, the following line will restore the GPRS for us, but may be not a good idea to do so for it may decrease the syscall performance although save some instructions in user-space and also, the other libcs also follow the current rule, so, this may be a design intention we must follow (welcome your suggestions).
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index fe27d41f9a3d..1ed535e9144c 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -155,6 +155,7 @@ syscall_exit_finish: bne 3f mtcr r5
+ REST_GPRS(3, 12, r1) 1: REST_GPR(2, r1) REST_GPR(1, r1) rfi
For PowerPC, if with the previous method like the other architectures, the clobber list differs for every my_syscall<N> and all of the input registers must be put in the 'OUTPUT Operands' too to avoid compiler to save and resue a variable in such GPRS across my_syscall<N> calls.
Originally in my local new version of arch-powerpc.h, we got such code for every my_syscall<N>, use my_syscall6() as an example:
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ ({ \ register long _ret __asm__ ("r3"); \ register long _num __asm__ ("r0") = (num); \ register long _arg1 __asm__ ("r3") = (long)(arg1); \ register long _arg2 __asm__ ("r4") = (long)(arg2); \ register long _arg3 __asm__ ("r5") = (long)(arg3); \ register long _arg4 __asm__ ("r6") = (long)(arg4); \ register long _arg5 __asm__ ("r7") = (long)(arg5); \ register long _arg6 __asm__ ("r8") = (long)(arg6); \ \ __asm__ volatile ( \ " sc\n" \ " bns+ 1f\n" \ " neg %0, %0\n" \ "1:\n" \ : "=r"(_ret), \ "+r"(_arg2), "+r"(_arg3), "+r"(_arg4), \ "+r"(_arg5), "+r"(_arg6) \ : "0"(_arg1), "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST \ ); \ _ret; \ })
It almost aligns with the other architectures, but the full clobber list differs for every my_syscall<N>, the basic one is:
/* PowerPC kernel doesn't always restore r4-r12 for us */ #define _NOLIBC_SYSCALL_CLOBBERLIST \ "memory", "cr0", "r12", "r11", "r10", "r9",
Use my_syscall0() as a further example, we need something like this:
#define my_syscall0(num) \ ({ \ register long _ret __asm__ ("r3"); \ register long _num __asm__ ("r0") = (num); \ \ __asm__ volatile ( \ " sc\n" \ " bns+ 1f\n" \ " neg %0, %0\n" \ "1:\n" \ : "=r"(_ret) \ : "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST, "r8", "r7", "r6", "r5", "r4" \ ); \ _ret; \ })
The additional "r8"..."r4" must be appended to the clobber list for they can not be put together for every my_syscall<N> due to conflicts between they between the clobber list and the "OUTPUT/INPUT operands".
I found a solution to share the same _NOLIBC_SYSCALL_CLOBBERLIST, that is split the "OUTPUT/INPUT Operands" list out of the core syscall assembly:
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ ({ \ register long _ret __asm__ ("r3"); \ register long _num __asm__ ("r0") = (num); \ register long _arg1 __asm__ ("r3") = (long)(arg1); \ register long _arg2 __asm__ ("r4") = (long)(arg2); \ register long _arg3 __asm__ ("r5") = (long)(arg3); \ register long _arg4 __asm__ ("r6") = (long)(arg4); \ register long _arg5 __asm__ ("r7") = (long)(arg5); \ register long _arg6 __asm__ ("r8") = (long)(arg6); \ \ __asm__ volatile ("": "+r"(_arg2), "+r"(_arg3), "+r"(_arg4), \ "+r"(_arg5), "+r"(_arg6)::); \ \ __asm__ volatile ( \ " sc\n" \ " bns+ 1f\n" \ " neg %0, %0\n" \ "1:\n" \ : "=r"(_ret) \ : "0"(_arg1), "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST \ ); \ _ret; \ })
Note, a question here is, the above split still require more discussion to make sure it does work for different toolchains (only test for gcc currently) or even if this method is right from scratch, welcome your suggestion.
As a result, all of the my_syscall<N> are able to share the core syscall calling assembly block, so, here is what we at last have:
#define _my_syscall_tail() \ __asm__ volatile ( \ _NOLIBC_SYSCALL_CALL \ : "=r"(_ret) \ : "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST \ ); \ _ret
And further, we also found it was possible to share most of them among these not ugly but completely duplicated lines:
register long _ret __asm__ ("r3"); \ register long _num __asm__ ("r0") = (num); \ register long _arg1 __asm__ ("r3") = (long)(arg1); \ register long _arg2 __asm__ ("r4") = (long)(arg2); \ register long _arg3 __asm__ ("r5") = (long)(arg3); \ register long _arg4 __asm__ ("r6") = (long)(arg4); \ register long _arg5 __asm__ ("r7") = (long)(arg5); \ register long _arg6 __asm__ ("r8") = (long)(arg6); \ \ __asm__ volatile ("": "+r"(_arg2), "+r"(_arg3), "+r"(_arg4), \ "+r"(_arg5), "+r"(_arg6)::);
That's why at last we have such blocks (of course, for PowerPC itself it is a big change and not necessary):
#define _my_syscall_head(num) \ register long _ret __asm__ (_NOLIBC_REG_RET); \ register long _num __asm__ (_NOLIBC_REG_NUM) = (num) \
#ifdef _NOLIBC_REG_ERR #define _NOLIBC_REG_EXTRA _NOLIBC_REG_ERR #endif
#ifdef _NOLIBC_REG_EXTRA #define _my_syscall_extra() \ register long reg_extra __asm__ (_NOLIBC_REG_EXTRA); \ __asm__ volatile ("": "=r"(reg_extra)::) #else #define _my_syscall_extra() #endif
/* Architectures like PowerPC write GPRS in kernel side and not restore them */ #ifndef _NOLIBC_GPRS_AS_OUTPUT_OPERANDS #define _my_syscall_argn(n, arg) \ register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg); \ __asm__ volatile ("":: "r"(_arg##n):) #else #define _my_syscall_argn(n, arg) \ register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg); \ __asm__ volatile ("": "+r"(_arg##n)::) #endif
And this further build args for us:
#define _my_syscall_args0()
#define _my_syscall_args1(...) \ _my_syscall_args0(); \ _my_syscall_argn(1, __VA_ARGS__)
#define _my_syscall_args2(arg2, ...) \ _my_syscall_args1(__VA_ARGS__); \ _my_syscall_argn(2, arg2)
#define _my_syscall_args3(arg3, ...) \ _my_syscall_args2(__VA_ARGS__); \ _my_syscall_argn(3, arg3)
#define _my_syscall_args4(arg4, ...) \ _my_syscall_args3(__VA_ARGS__); \ _my_syscall_argn(4, arg4)
#define _my_syscall_args5(arg5, ...) \ _my_syscall_args4(__VA_ARGS__); \ _my_syscall_argn(5, arg5)
#define _my_syscall_args6(arg6, ...) \ _my_syscall_args5(__VA_ARGS__); \ _my_syscall_argn(6, arg6)
And at last:
#define __my_syscall_args(N, ...) _my_syscall_args##N(__VA_ARGS__) #define _my_syscall_args(N, ...) __my_syscall_args(N, ##__VA_ARGS__)
#define __my_syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N #define _my_syscall_narg(...) __my_syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
#define __my_syscall_argsn(N, argn, ...) \ _my_syscall_args(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__); \ _my_syscall_argn(N, argn)
#define _my_syscall_argsn(...) __my_syscall_argsn(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__)
/* Note, my_syscall0() has no argument, can not use my_syscalln() */ #define my_syscall0(num) \ ({ \ _my_syscall_head(num); \ _my_syscall_extra(); \ _my_syscall_tail(); \ })
#define my_syscalln(num, ...) \ ({ \ _my_syscall_head(num); \ _my_syscall_extra(); \ _my_syscall_argsn(__VA_ARGS__); \ _my_syscall_tail(); \ })
#define my_syscall1(num, arg1) my_syscalln(num, arg1) #define my_syscall2(num, arg1, arg2) my_syscalln(num, arg2, arg1) #define my_syscall3(num, arg1, arg2, arg3) my_syscalln(num, arg3, arg2, arg1) #define my_syscall4(num, arg1, arg2, arg3, arg4) my_syscalln(num, arg4, arg3, arg2, arg1)
#ifndef my_syscall5 #define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) my_syscalln(num, arg5, arg4, arg3, arg2, arg1) #endif #ifndef my_syscall6 #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) my_syscalln(num, arg6, arg5, arg4, arg3, arg2, arg1) #endif
At last, I found this worked on all of the supported architectures, so, the new syscall.h is proposed.
BTW, another question here is, to utilize the feature of __VA_ARGS__ to easier getting the last argument, the order of arguments are reversed during the declarations of the my_syscall<N>, any suggestion on this part? is it possible to not reverse the order? If possible, the my_syscall<N> declarations may be simplified to:
#define my_syscall1(...) my_syscalln(__VA_ARGS__)
And even be possible to define syscall() from unistd.h as a alias of my_syscalln():
#define syscall(...) my_syscalln(__VA_ARGS__)
And further, these three from unistd.h are sharable:
#define __syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N #define _syscall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0) #define _syscall_n(N, ...) _syscall(N, __VA_ARGS__)
And when someone reports a bug like we had in the past with programs randomly crashing depending on stack alignment and such, it becomes particularly tricky to figure what is expected and how it differs from reality.
Macros are really hard to debug, the above code lines cost me two days ;-)
but after the common syscall.h, the left architecture specific parts are very few and easier to debug and even less copy and paste.
Maybe it's possible to do something in between, like defining a few more slightly larger blocks,
Yeah, here is the method we applied, new blocks added are:
/* for ret and num */ #define _my_syscall_head(num) \
/* for extra err output */ #define _my_syscall_extra() \
/* for every argument, with additional OUTPUT/INPUT operands setting */ #define _my_syscall_argn(n, arg) \
/* for the core syscall calling and return */ #define _my_syscall_tail() \
/* for variable number of args */ #define _my_syscall_args0() #define _my_syscall_args1(...) \ #define _my_syscall_args2(arg2, ...) \ #define _my_syscall_args3(arg3, ...) \ #define _my_syscall_args4(arg4, ...) \ #define _my_syscall_args5(arg5, ...) \ #define _my_syscall_args6(arg6, ...) \
/* unified variable number of args */ #define _my_syscall_argsn(...)
/* my_syscall0 */ #define my_syscall0(num) \
/* my_syscalln */ #define my_syscalln(num, ...) \
Still require more discussion on their names or even more simplification.
I don't know. I still tend to think that this significantly complicates the understanding of the whole thing.
Willy, don't worry, I do think it make things easier, the worse case is using the old code or only use part of our new blocks helpers ;-)
For this new syscall.h, it mainly clear the inline assembly codes, as we know, inline assembly code is a very complicated thing, If we clear the logic carefully (as we target but not yet) in our common code, architecture developers only require to focus on the platform specific definitions, it should be better for portability, review and maintainability.
It is very hard to learn the meanning of the OUTPUT operands, INPUT operands and even the clobber list and even the flags you mentioned below, clearing up them is really required, this new syscall.h also require more comments on the macro functions and variables.
Also, looking at different archs' syscall code, they're not all defined the same way. Some like i386 use "=a" for the return value. Others use "+r" as an input/output value, others "=r",
Agree, this is a very hard part of the inline assembly codes, clearing up the generic versions in syscall.h with additional comments may really help a lot.
For OUTPUT/INPUT operands, they may be enough for the generic versions:
#define _my_syscall_tail() \ __asm__ volatile ( \ _NOLIBC_SYSCALL_CALL \ : "=r"(_ret) \ : "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST \ ); \ _ret
#ifdef _NOLIBC_REG_ERR #define _NOLIBC_REG_EXTRA _NOLIBC_REG_ERR #endif
#ifdef _NOLIBC_REG_EXTRA #define _my_syscall_extra() \ register long reg_extra __asm__ (_NOLIBC_REG_EXTRA); \ __asm__ volatile ("": "=r"(reg_extra)::) #else #define _my_syscall_extra() #endif
/* Architectures like PowerPC write GPRS in kernel side and not restore them */ #ifndef _NOLIBC_GPRS_AS_OUTPUT_OPERANDS #define _my_syscall_argn(n, arg) \ register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg); \ __asm__ volatile ("":: "r"(_arg##n):) #else #define _my_syscall_argn(n, arg) \ register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg); \ __asm__ volatile ("": "+r"(_arg##n)::) #endif
- "=r" is only required for output registers, it is mainly for "RET" register and "ERR" register.
- "r" is for input, mainly for "NUM" register and the input arguments (sometimes, the input arguments are used as "RET" and "ERR" too).
- "+r" means both input and output, only used by powerpc currently.
For the input registers used as "ERR" or "RET" output, "+r" is used before, but now, we split them to two parts, one is "=r"(_ret) in _my_syscall_tail(), another is in _my_syscall_argn(n, arg), they together work for "+r" = "=r" + "r"
Btw, have checked "=r" instead of "=a" works on i386 too for we already bind the _ret variable with "RET" register, but still need to check if "=a" is necessary?
others "+d" or "=d". And reading the comments, there are some arch-specific choices for these declarations, that are sometimes there to force the toolchain a little bit.
Even for s390, test shows, "r" instead of "d" works too (boot and test passed, not yet checked the size and codes generated), but I didn't find any document about "d" for s390 from the gcc online doc. This part (include all of the supported architectures) should be carefully checked if really adding our new syscall.h. add s390 nolibc committer to CC: list.
If "r" really not works on a target, we can use the logic like this (include syscall.h after architecture specific definitions):
#ifndef my_syscall5 #define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) my_syscalln(num, arg5, arg4, arg3, arg2, arg1) #endif #ifndef my_syscall6 #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) my_syscalln(num, arg6, arg5, arg4, arg3, arg2, arg1) #endif
The blocks from syscalls.h are only provided as generic support, the architectures can define their own versions. all of our new blocks can be protected like my_syscall<N> above, for example:
#ifndef _my_syscall_tail #define _my_syscall_tail ... #endif
But for s390 here, if it really requires "d" instead of "r", we are able to allow architectures define their own modifier like this:
#ifndef _NOLIBC_INLINE_ASM_MODIFIER #define _NOLIBC_INLINE_ASM_MODIFIER "r" #endif
Then, we can define _NOLIBC_INLINE_ASM_MODIFIER for s390:
#define _NOLIBC_INLINE_ASM_MODIFIER "d"
But architectures like i386, If "=a", "=b", ... modifiers are necessary, new versions of the blocks should be added for these architectures.
As a short summary for this part, the modifiers used by different architectures should be carefully checked, welcome more suggestions from the toolchains developers or the architecture maintainers, I will compare the code generated too.
Others like MIPS pass some of their args in the stack, so a name only is not sufficient. Thus, trying to factor all of this will not only make it very hard to insert arch-specific adjusments, but it will also make the code much less readable.
Currently, I didn't yet work on merging the 'stack' support, we used '#ifdef's in syscall.h:
#ifndef my_syscall5 #define my_syscall5 ... #endif #ifndef my_syscall6 #define my_syscall6 ... #endif
So, they are the same as before:
$ grep "#define my_syscall" -ur tools/include/nolibc/arch-*.h tools/include/nolibc/arch-i386.h:#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ tools/include/nolibc/arch-mips.h:#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \ tools/include/nolibc/arch-mips.h:#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)
Still require more work on how to support these ones, but it is not that urgent ;-)
Also think about this for a moment: you had the opportunity to add two new archs by simply restarting from existing ones. s390 and loongarch were added the same way, leaving enough freedom to the implementers to adjust the definitions to fit their environment's constraints.
I think this reserves as-is if the new arch-<ARCH>.h simply not include our new syscalls.h, so, the new syscalls.h is optional, the worse case is as-is, no regression, enough freedom as before ;-)
And further, some architectures may resue some helpers from our new syscalls.h or at least learn something from what we have done for all of the supported architectures.
And eventually, if our generic versions fits, just defining the syscall instructions, registers and clobber list should be enough.
If you make it very complicated, I suspect we won't get any such contributions anymore just because nobody figures how to find their way through it, or it would require to change again for everyone just to add one specific case. I tend to think that the current situation is already fairly minimal with quite readable and debuggable calls, and that it's what *really* matters.
This may really influence the upstream and review flow:
- If people may send a new architecture support, if just fits the new syscall.h, we may need to review carefully about the test results, especially for the input/output operands, error register.
As tests for powerpc shows, the above issues should be covered by our nolibc-test.
- If people may send a new architecture support as before, If we find it fits our new syscalls.h or it can apply part of the blocks, we can give some suggestions.
- If people send something not just fit our new syscall.h, we may be able to think about if a new 'hook' is required, but it is not necessary, we can delay this change requirement to after a careful design (just like the argument passing via 'stack' case currently) .
When you're trying to reorganize code, it's important to ask yourself whether you would prefer to debug the old one or the new one at 3am. Hint: at 3am, the more abstractions there are, the less understandable it becomes.
Interesting and agree, but this abstraction does clear something to be more undersatndable too ;-)
I do hate hard-debuggable macros, but as we mentioned above, the inline assembly code is another harder parts, the new carefully tuned blocks may really help us to avoid introduce bugs with manually wrotten new codes and also it may help us to avoid copy and paste multiple duplicated lines of the same codes.
Willy, do we need to rename my_syscall<N> to _nolibc_syscall<N> to limit these macros nolibc internally? I plan to rename all of the new adding macros with _nolibc_ (for funcs) or _NOLIBC_ (for variables).
Why not, I'm not opposed to that. Just keep in mind that every single rename complicates backports (or may even silently break them), so it's important to know where you want to go and to try to make changes converge towards something stable and durable.
Yeah, let's keep the changes minimal, we could prefix the new macros with _nolibc_ or _NOLIBC_, the old ones can be kept as-is.
Best regards, Zhangjin
Thanks, Willy
Hi Zhangjin,
On Tue, Jul 25, 2023 at 07:02:55PM +0800, Zhangjin Wu wrote:
Btw, have checked "=r" instead of "=a" works on i386 too for we already bind the _ret variable with "RET" register, but still need to check if "=a" is necessary?
I need to tell you that syscall6() for i386 can't use "r" and "=r" because there was a historical bug that made GCC stuck in a loop forever when compiling the nolibc code. It's already fixed in the latest version of GCC, but we should still support older compilers.
Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032
I discovered that bug in 2022 in the latest version of GCC at that time, so it's pretty new, and those buggy versions are very likely still in the wild today.
Hi,
Hi Zhangjin,
On Tue, Jul 25, 2023 at 07:02:55PM +0800, Zhangjin Wu wrote:
Btw, have checked "=r" instead of "=a" works on i386 too for we already bind the _ret variable with "RET" register, but still need to check if "=a" is necessary?
I need to tell you that syscall6() for i386 can't use "r" and "=r" because there was a historical bug that made GCC stuck in a loop forever when compiling the nolibc code. It's already fixed in the latest version of GCC, but we should still support older compilers.
Thanks very much, this information is really important.
My old 'reply' is not rigorous, since the syscall6() uses stack to pass the 6th argument, so, our new syscall.h didn't support it currently, the syscalls I have tested about "=r" instead of "=a" were only syscall1-5().
Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032
I discovered that bug in 2022 in the latest version of GCC at that time, so it's pretty new, and those buggy versions are very likely still in the wild today.
Ok, so, with the new syscalls.h proposed, we'd better keep i386 syscall6() as-is.
For the left syscall1-5(), is there any risk when use '=r' instead of 'r'?
Thanks, Zhangjin
-- Ammar Faizi
On Wed, Jul 26, 2023 at 01:04:26AM +0800, Zhangjin Wu wrote:
My old 'reply' is not rigorous, since the syscall6() uses stack to pass the 6th argument, so, our new syscall.h didn't support it currently, the syscalls I have tested about "=r" instead of "=a" were only syscall1-5().
Yeah, it won't fit with the new design.
i386 runs out of GPRs very quickly. Given that, it had a hard time implementing syscall6() properly in nolibc. The calling convention itself actually doesn't require stack for executing 'int $0x80'.
The reason of why it uses stack is because the %ebp register cannot be listed in the clobber list nor in the constraint if -fomit-frame-pointer is not activated. Thus, we have to carefully preserve the value on the stack before using %ebp as the 6-th argument to the syscall. It's a hack to make it work on i386.
Ok, so, with the new syscalls.h proposed, we'd better keep i386 syscall6() as-is.
For the left syscall1-5(), is there any risk when use '=r' instead of 'r'?
Using "=r" instead of "r" doesn't make sense.
Did you mean "=r" instead of "=a"?
If that's what you mean:
So far I don't see the risk of using "=r" instead of "=a" as long as the variable is properly marked as 'register' + asm("eax").
On Wed, Jul 26, 2023 at 01:04:26AM +0800, Zhangjin Wu wrote:
My old 'reply' is not rigorous, since the syscall6() uses stack to pass the 6th argument, so, our new syscall.h didn't support it currently, the syscalls I have tested about "=r" instead of "=a" were only syscall1-5().
Yeah, it won't fit with the new design.
i386 runs out of GPRs very quickly. Given that, it had a hard time implementing syscall6() properly in nolibc. The calling convention itself actually doesn't require stack for executing 'int $0x80'.
The reason of why it uses stack is because the %ebp register cannot be listed in the clobber list nor in the constraint if -fomit-frame-pointer is not activated. Thus, we have to carefully preserve the value on the stack before using %ebp as the 6-th argument to the syscall. It's a hack to make it work on i386.
Ok, so, with the new syscalls.h proposed, we'd better keep i386 syscall6() as-is.
For the left syscall1-5(), is there any risk when use '=r' instead of 'r'?
Using "=r" instead of "r" doesn't make sense.
Did you mean "=r" instead of "=a"?
Yeah, sorry.
If that's what you mean:
So far I don't see the risk of using "=r" instead of "=a" as long as the variable is properly marked as 'register' + asm("eax").
Thanks. Zhangjin
-- Ammar Faizi
Hi Zhangjin,
On Tue, Jul 25, 2023 at 07:02:55PM +0800, Zhangjin Wu wrote:
With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h will only require to add ~10 lines to define their own syscall instructions, registers and clobberlist, which looks like this (for powerpc):
#define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg %0, %0; 1:" /* PowerPC doesn't always restore r3-r12 for us */ #define _NOLIBC_SYSCALL_CLOBBERLIST "memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4" /* PowerPC write GPRS in kernel side but not restore them */ #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
#define _NOLIBC_REG_NUM "r0" #define _NOLIBC_REG_RET "r3" #define _NOLIBC_REG_arg1 "r3" #define _NOLIBC_REG_arg2 "r4" #define _NOLIBC_REG_arg3 "r5" #define _NOLIBC_REG_arg4 "r6" #define _NOLIBC_REG_arg5 "r7" #define _NOLIBC_REG_arg6 "r8"
Before:
$ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done 157 tools/include/nolibc/arch-aarch64.h 199 tools/include/nolibc/arch-arm.h 178 tools/include/nolibc/arch-i386.h 164 tools/include/nolibc/arch-loongarch.h 195 tools/include/nolibc/arch-mips.h 0 tools/include/nolibc/arch-powerpc.h 160 tools/include/nolibc/arch-riscv.h 186 tools/include/nolibc/arch-s390.h 176 tools/include/nolibc/arch-x86_64.h
After:
$ wc -l tools/include/nolibc/arch-*.h 54 tools/include/nolibc/arch-aarch64.h 84 tools/include/nolibc/arch-arm.h 90 tools/include/nolibc/arch-i386.h /* the last one use stack to pass arguments, reserve as-is */ 59 tools/include/nolibc/arch-loongarch.h 120 tools/include/nolibc/arch-mips.h /* the last two use stack to pass arguments, reserve as-is */ 73 tools/include/nolibc/arch-powerpc.h 58 tools/include/nolibc/arch-riscv.h 87 tools/include/nolibc/arch-s390.h 67 tools/include/nolibc/arch-x86_64.h
syscall.h itself:
$ wc -l tools/include/nolibc/syscall.h 112 tools/include/nolibc/syscall.h
The important thing to consider is not the number of lines but the *maintainability*.
The original goal is not really the number of lines (only a 'side-effect'), but is exactly easier porting/maintainability with clearer code architecture,
I do feel the exact opposite. One is totally straightforward with self-explanatory function names and their equivalent machine-specific asm() statements, the other one involves countless cryptic macros for which it is particularly difficult to figure what depends on what.
You factored the syscall code so much above with all these macros that I don't even understand how they're going to interact with each other, especially "%0".
Yeah, it is my fault, this should be cleaned up with the return register directly:
#define _NOLIBC_SYSCALL_CALL \ "sc; bns+ 1f; neg 3, 3; 1:"
This doesn't change my point of view on it, really.
Also I don't know what the macro _NOLIBC_GPRS_AS_OUTPUT_OPERANDS does.
This is the root cause to inspire me to add the new syscall.h, let's explain the background step by step.
All of the other architectures (except PowerPC) restore GPRS for us when return from syscall, so, their clobber list simply not include the GPRS and only need to add the input registers in the 'INPUT Operands' list.
I still have no idea what a GPRS is.
But PowerPC doesn't restore such GPRS for us, I'm not sure if it is a feature (Maybe) or a bug.
We don't really care. The *exact* purpose of an asm() statement is to write stuff that cannot be expressed at a higher level. Sure sometimes you can abuse macros. But this should be extremely light. Here you seem to be using a common asm statement for everyone and going to stuff the combination of all these macros into it. asm() statements are already quite cryptic for a lot of people, and the minimum required is that they are easy to read so that the few who know what these are doing can help debug them. When Ammar spotted the alignment bug in our _start code, it didn't take long to figure the root cause of the issue nor to fix it, precisely because that code was straightforward for someone with a bit of asm skills. But how do you want anyone to figure what's happening in something full of abstractions ? Look for example, in order to add the stackprot support, Thomas just had to append a call at various points. When you need to do that in factored code that's forcefully arranged to try to suit all archs and toolchains at once, it may end up being almost impossible without breaking the organization and starting to create arch-specific definitions again.
for PowerPC32, the following line will restore the GPRS for us, but may be not a good idea to do so for it may decrease the syscall performance although save some instructions in user-space and also, the other libcs also follow the current rule, so, this may be a design intention we must follow (welcome your suggestions).
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index fe27d41f9a3d..1ed535e9144c 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -155,6 +155,7 @@ syscall_exit_finish: bne 3f mtcr r5
+ REST_GPRS(3, 12, r1) 1: REST_GPR(2, r1) REST_GPR(1, r1) rfi
I don't know PPC and I have zero opinion there. For this we'll ask one of the PPC maintainers for guidance, and since we have clean asm code, they will easily be able to say "yes that's fine", "hmmm no that's not the right way to do it", or "I suspect you forgot to save flags here", or anything else, because the code will match a pattern they know well. With all the macros hell, it will just be "hmmm good luck".
For PowerPC, if with the previous method like the other architectures, the clobber list differs for every my_syscall<N> and all of the input registers must be put in the 'OUTPUT Operands' too to avoid compiler to save and resue a variable in such GPRS across my_syscall<N> calls.
But do you realize that you're proposing to write macros to factor things between archs that are already different within a single arch ? How is this supposed to help at doing anything ?
Originally in my local new version of arch-powerpc.h, we got such code for every my_syscall<N>, use my_syscall6() as an example:
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ ({ \ register long _ret __asm__ ("r3"); \ register long _num __asm__ ("r0") = (num); \ register long _arg1 __asm__ ("r3") = (long)(arg1); \ register long _arg2 __asm__ ("r4") = (long)(arg2); \ register long _arg3 __asm__ ("r5") = (long)(arg3); \ register long _arg4 __asm__ ("r6") = (long)(arg4); \ register long _arg5 __asm__ ("r7") = (long)(arg5); \ register long _arg6 __asm__ ("r8") = (long)(arg6); \ \ __asm__ volatile ( \ " sc\n" \ " bns+ 1f\n" \ " neg %0, %0\n" \ "1:\n" \ : "=r"(_ret), \ "+r"(_arg2), "+r"(_arg3), "+r"(_arg4), \ "+r"(_arg5), "+r"(_arg6) \ : "0"(_arg1), "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST \ ); \ _ret; \ })
It almost aligns with the other architectures, but the full clobber list differs for every my_syscall<N>, the basic one is:
/* PowerPC kernel doesn't always restore r4-r12 for us */ #define _NOLIBC_SYSCALL_CLOBBERLIST \ "memory", "cr0", "r12", "r11", "r10", "r9",
Use my_syscall0() as a further example, we need something like this:
#define my_syscall0(num) \ ({ \ register long _ret __asm__ ("r3"); \ register long _num __asm__ ("r0") = (num); \ \ __asm__ volatile ( \ " sc\n" \ " bns+ 1f\n" \ " neg %0, %0\n" \ "1:\n" \ : "=r"(_ret) \ : "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST, "r8", "r7", "r6", "r5", "r4" \ ); \ _ret; \ })
The additional "r8"..."r4" must be appended to the clobber list for they can not be put together for every my_syscall<N> due to conflicts between they between the clobber list and the "OUTPUT/INPUT operands".
Perfect, yet another example of the real purpose of asm() statements. They're not generic and are made to finely tune what you're inserting.
I found a solution to share the same _NOLIBC_SYSCALL_CLOBBERLIST, that is split the "OUTPUT/INPUT Operands" list out of the core syscall assembly:
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \ ({ \ register long _ret __asm__ ("r3"); \ register long _num __asm__ ("r0") = (num); \ register long _arg1 __asm__ ("r3") = (long)(arg1); \ register long _arg2 __asm__ ("r4") = (long)(arg2); \ register long _arg3 __asm__ ("r5") = (long)(arg3); \ register long _arg4 __asm__ ("r6") = (long)(arg4); \ register long _arg5 __asm__ ("r7") = (long)(arg5); \ register long _arg6 __asm__ ("r8") = (long)(arg6); \ \ __asm__ volatile ("": "+r"(_arg2), "+r"(_arg3), "+r"(_arg4), \ "+r"(_arg5), "+r"(_arg6)::); \ \ __asm__ volatile ( \ " sc\n" \ " bns+ 1f\n" \ " neg %0, %0\n" \ "1:\n" \ : "=r"(_ret) \ : "0"(_arg1), "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST \ ); \ _ret; \ })
So basically "it happens to work but we don't know why, but this is still much more maintainable" ? Please no, really no, no, no. That's ugly, tricky and you don't even know what the compiler will do between these two statements.
Note, a question here is, the above split still require more discussion to make sure it does work for different toolchains (only test for gcc currently) or even if this method is right from scratch, welcome your suggestion.
asm() statements are used to work around toolchain limitations/differences and bugs. I've seen Ammar's response with the link to the gcc bug, that's a good example as well of the reasons why we MUST NOT do these hacks.
As a result, all of the my_syscall<N> are able to share the core syscall calling assembly block, so, here is what we at last have:
#define _my_syscall_tail() \ __asm__ volatile ( \ _NOLIBC_SYSCALL_CALL \ : "=r"(_ret) \ : "r"(_num) \ : _NOLIBC_SYSCALL_CLOBBERLIST \ ); \ _ret
And further, we also found it was possible to share most of them among these not ugly but completely duplicated lines:
But please, from the beginning, all I understand is "it is possible to", but I still fail to understand the ultimate goal. Making the code uglier and unmaintainable because it is possible is not a valid argument. For doing stuff like above, there must be a serious limitation to work around that has no other solution, and even then a huge #if/#endif could possibly do it.
That's why at last we have such blocks (of course, for PowerPC itself it is a big change and not necessary):
#define _my_syscall_head(num) \ register long _ret __asm__ (_NOLIBC_REG_RET); \ register long _num __asm__ (_NOLIBC_REG_NUM) = (num) \
#ifdef _NOLIBC_REG_ERR #define _NOLIBC_REG_EXTRA _NOLIBC_REG_ERR #endif #ifdef _NOLIBC_REG_EXTRA #define _my_syscall_extra() \ register long reg_extra __asm__ (_NOLIBC_REG_EXTRA); \ __asm__ volatile ("": "=r"(reg_extra)::) #else #define _my_syscall_extra() #endif /* Architectures like PowerPC write GPRS in kernel side and not restore them */ #ifndef _NOLIBC_GPRS_AS_OUTPUT_OPERANDS #define _my_syscall_argn(n, arg) \ register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg); \ __asm__ volatile ("":: "r"(_arg##n):) #else #define _my_syscall_argn(n, arg) \ register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg); \ __asm__ volatile ("": "+r"(_arg##n)::) #endif
And someone is able to help us work around a compiler or assembler bug in this ? I can't even spend enough concentration on the whole block to understand what it's trying to do or what interacts with what. I'm sorry, that's not a way to deal with asm, nor code shared with multiple developers in general.
(...)
And at last:
#define __my_syscall_args(N, ...) _my_syscall_args##N(__VA_ARGS__) #define _my_syscall_args(N, ...) __my_syscall_args(N, ##__VA_ARGS__) #define __my_syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N #define _my_syscall_narg(...) __my_syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0) #define __my_syscall_argsn(N, argn, ...) \ _my_syscall_args(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__); \ _my_syscall_argn(N, argn)
#define _my_syscall_argsn(...) __my_syscall_argsn(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__)
/* Note, my_syscall0() has no argument, can not use my_syscalln() */ #define my_syscall0(num) \ ({ \ _my_syscall_head(num); \ _my_syscall_extra(); \ _my_syscall_tail(); \ })
#define my_syscalln(num, ...) \ ({ \ _my_syscall_head(num); \ _my_syscall_extra(); \ _my_syscall_argsn(__VA_ARGS__); \ _my_syscall_tail(); \ }) #define my_syscall1(num, arg1) my_syscalln(num, arg1) #define my_syscall2(num, arg1, arg2) my_syscalln(num, arg2, arg1) #define my_syscall3(num, arg1, arg2, arg3) my_syscalln(num, arg3, arg2, arg1) #define my_syscall4(num, arg1, arg2, arg3, arg4) my_syscalln(num, arg4, arg3, arg2, arg1) #ifndef my_syscall5 #define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) my_syscalln(num, arg5, arg4, arg3, arg2, arg1) #endif #ifndef my_syscall6 #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) my_syscalln(num, arg6, arg5, arg4, arg3, arg2, arg1) #endif
At last, I found this worked on all of the supported architectures, so, the new syscall.h is proposed.
If the ultimate goal is *just* to provide my_syscalln(), it's not needed to rework all archs like this. Just doing this does the job as well, it will allow my_syscalln(syscall_num, ...) to call the respective my_syscall0/1/2/3/4/5/6 depending on the number of arguments:
/* my_syscalln() will automatically map to my_syscall<n>() depending * on the number of arguments after the syscall, from 0 to 6. It uses * positional arguments after a VA_ARGS to resolve as an argument * count that's then used to build the underlying macro's name. */ #define _my_syscall0(num, a, b, c, d, e, f) my_syscall0(num) #define _my_syscall1(num, a, b, c, d, e, f) my_syscall1(num, a) #define _my_syscall2(num, a, b, c, d, e, f) my_syscall2(num, a, b) #define _my_syscall3(num, a, b, c, d, e, f) my_syscall3(num, a, b, c) #define _my_syscall4(num, a, b, c, d, e, f) my_syscall4(num, a, b, c, d) #define _my_syscall5(num, a, b, c, d, e, f) my_syscall5(num, a, b, c, d, e) #define _my_syscall6(num, a, b, c, d, e, f) my_syscall6(num, a, b, c, d, e, f) #define _my_syscalln(num, a, b, c, d, e, f, g, ...) _my_syscall##g(num, a, b, c, d, e, f) #define my_syscalln(num, ...) _my_syscalln(num, ##__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
This way there's no need to modify the arch-specific syscall definitions, this will simply rely on them and preserve their maintainability.
BTW, another question here is, to utilize the feature of __VA_ARGS__ to easier getting the last argument, the order of arguments are reversed during the declarations of the my_syscall<N>, any suggestion on this part? is it possible to not reverse the order?
There's no reverse order, it's a well-known method consisting in making a number appear at a fixed position depending on the number of preceeding arguments:
my_syscalln(n, a, b, c, d, e, f) becomes _my_syscalln(n, a, b, c, d, e, f, 6, 5, 4, 3, 2, 1, 0) ^ This macro extracts this number above to build the next macro name: _my_syscall6(n, a, b, c, d, e, f) my_syscall6(n, a, b, c, d, e, f)
If you use less arguments, say 3, you get this:
my_syscalln(n, a, b, c) _my_syscalln(n, a, b, c, 6, 5, 4, 3, 2, 1, 0) _my_syscall3(n, a, b, c, d, e, f) my_syscall3(n, a, b, c) // d, e, f are lost
The last level of macro is used to silently drop the extra args. When target is already a macro, it's not even necessary as the macro definition could already end with ", ...".
I've been using a similar one above in other projects for quite a while and I know that it worked at least in gcc-3.4, so it's definitely safe.
And when someone reports a bug like we had in the past with programs randomly crashing depending on stack alignment and such, it becomes particularly tricky to figure what is expected and how it differs from reality.
Macros are really hard to debug, the above code lines cost me two days ;-)
Someone once said that it requires a must stronger brain to solve a problem than the one that created it. If it took you two days to arrange this, imagine how long it will take to someone having not designed this to debug it! The time it took you is definitely not an argument for adopting this, quite the opposite. Instead it should have convinced you that this was going to become unmaintainable.
but after the common syscall.h, the left architecture specific parts are very few and easier to debug and even less copy and paste.
Copy-paste is a problem when bugs need to be fixed. Here there's almost no copy-paste, copy-paste is done initially to create a new arch but in fact we're reusing a skeletton to write completely different code. Because code for PPC and MIPS are different, there's no point imagining that once a bug affects MIPS we need to automatically apply the same fix to PPC, because it will be different.
I don't know. I still tend to think that this significantly complicates the understanding of the whole thing.
Willy, don't worry, I do think it make things easier, the worse case is using the old code or only use part of our new blocks helpers ;-)
Sorry and don't take it bad, I don't want you to feel it as being rude, but for me the worst case would be to use the new method precisely because for now only you probably know how that's supposed to work and nobody can help us with side effects affecting it.
For this new syscall.h, it mainly clear the inline assembly codes, as we know, inline assembly code is a very complicated thing,
That's why it must remain crystal clear.
If we clear the logic carefully (as we target but not yet) in our common code, architecture developers only require to focus on the platform specific definitions, it should be better for portability, review and maintainability.
In order to work on assembly you first need to be able to locate it and read it as a sequence of instructions. Here really, you need a pen and paper to start to resolve it.
It is very hard to learn the meanning of the OUTPUT operands, INPUT operands and even the clobber list and even the flags you mentioned below,
One more reason for not passing through this!
Also, looking at different archs' syscall code, they're not all defined the same way. Some like i386 use "=a" for the return value. Others use "+r" as an input/output value, others "=r",
Agree, this is a very hard part of the inline assembly codes, clearing up the generic versions in syscall.h with additional comments may really help a lot.
No precisely not. It's a hard part for people who don't deal with *that* arch. But you will not find a developer at ease with all archs. Each one has its own specifics. However you will find one or a few developers that are experts on each architecture and who will instantly be able to correct some of our mistakes, or warn us against toolchain bugs they're aware of that we need to take care of. They will also know which constraints to use. The constraint definitions are per-architecture, and for example "a" on x86 is the accumulator (eax/rax). On other archs it can be something else. There are archs which support register pairs, some must be aligned on an even number and depending on how the calls are declared, the compiler may improperly assign them and emit code that is impossible to assemble (I've met this several times with the initial ARM code that we managed to stabilize).
For the input registers used as "ERR" or "RET" output, "+r" is used before, but now, we split them to two parts, one is "=r"(_ret) in _my_syscall_tail(), another is in _my_syscall_argn(n, arg), they together work for "+r" = "=r" + "r"
Please do not generalize. The example I gave above indicate stuff that was initially hard to adjust precisely to help the compiler emit correct code with a wide enough range of tools. The example Ammar talked about is a perfect such example.
Even for s390, test shows, "r" instead of "d" works too
With your toolchain and the code you tested with. There might be a particular reason, I don't know. Maybe the maintainer is used to using this because it also works this way on another compiler and he will be more fluent with this one. That's something important as well when dealing with asm statements.
(boot and test passed, not yet checked the size and codes generated), but I didn't find any document about "d" for s390 from the gcc online doc. This part (include all of the supported architectures) should be carefully checked if really adding our new syscall.h. add s390 nolibc committer to CC: list.
I do trust the s390 maintainers who contributed this code to know better than either you and me, and it's certainly not up to us to ask them to justify their choice. Actually it would be the other way around, you would need a solid argument for changing code that works.
But architectures like i386, If "=a", "=b", ... modifiers are necessary, new versions of the blocks should be added for these architectures.
You'll just end up with as many blocks at the end, but dealing only with a union of exception. That's exactly the worst that can be imagined for maintenance.
And further, some architectures may resue some helpers from our new syscalls.h or at least learn something from what we have done for all of the supported architectures.
The arch-specific code is already minimal. We have 7 asm statements for 7 syscall conventions. That's ridiculously low and they contain what any such arch maintainer would need to find to extend or fix them.
If you make it very complicated, I suspect we won't get any such contributions anymore just because nobody figures how to find their way through it, or it would require to change again for everyone just to add one specific case. I tend to think that the current situation is already fairly minimal with quite readable and debuggable calls, and that it's what *really* matters.
This may really influence the upstream and review flow:
- If people may send a new architecture support, if just fits the new syscall.h, we may need to review carefully about the test results, especially for the input/output operands, error register.
First they'd need to be able to figure what to put in what. Look, they know what are the 3 instructions they need to put in an asm statement and the list of registers, and suddenly they'd need to figure how to spread them into cryptic macros, some of which are sometimes used, others always etc. That turns a 20-minute test into half a day, without big assurance at the end of the day that everything is right.
As tests for powerpc shows, the above issues should be covered by our nolibc-test.
If people may send a new architecture support as before, If we find it fits our new syscalls.h or it can apply part of the blocks, we can give some suggestions.
If people send something not just fit our new syscall.h, we may be able to think about if a new 'hook' is required, but it is not necessary, we can delay this change requirement to after a careful design (just like the argument passing via 'stack' case currently) .
That's already the case with i386, s390 and so on. Except that it adds significant burden for that person.
When you're trying to reorganize code, it's important to ask yourself whether you would prefer to debug the old one or the new one at 3am. Hint: at 3am, the more abstractions there are, the less understandable it becomes.
Interesting and agree, but this abstraction does clear something to be more undersatndable too ;-)
It's really the first time I hear that abstractions makes one-liner ASM code clearer and more understandable. I'm sorry but not, really, that's exactly the opposite.
I do hate hard-debuggable macros, but as we mentioned above, the inline assembly code is another harder parts, the new carefully tuned blocks may really help us to avoid introduce bugs with manually wrotten new codes and also it may help us to avoid copy and paste multiple duplicated lines of the same codes.
No, the asm blocks are trivial for those who speak this language and are hard for other ones. The macros are significantly harder and for everyone. I prefer to ask an s390 or PPC maintainer when I need help with their code rather than tweak the generic code adding a "+r" for every arch then read about reports saying that this arch breaks with that version of the compiler on that program with that version of the assembler.
Please again, don't take any of this personally, I'm just feeling that you tried to address a difficulty to dig into some arch-specific code, that you wanted to hide and that you feel like it is more maintainable, but it's not. Maintainability in a shared project doesn't mean that you are suddenly skilled on everything, but that you are able to find someone skilled on your problem. It's not necessarily your task to debug an architecture you don't know (though it's often very instructive), there are other people for this and that's perfectly fine. We need to make the task easy for them so that they don't have to learn all the nolibc tricks to share their knowledge. In the current form with the asm statements it's perfectly feasible and that's what matters.
Hoping this clarifies my position on this.
Thanks, Willy
Hi Willy,
On Tue, Jul 25, 2023 at 07:02:55PM +0800, Zhangjin Wu wrote:
With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h will only require to add ~10 lines to define their own syscall instructions, registers and clobberlist, which looks like this (for powerpc):
#define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg %0, %0; 1:" /* PowerPC doesn't always restore r3-r12 for us */ #define _NOLIBC_SYSCALL_CLOBBERLIST "memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4" /* PowerPC write GPRS in kernel side but not restore them */ #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
#define _NOLIBC_REG_NUM "r0" #define _NOLIBC_REG_RET "r3" #define _NOLIBC_REG_arg1 "r3" #define _NOLIBC_REG_arg2 "r4" #define _NOLIBC_REG_arg3 "r5" #define _NOLIBC_REG_arg4 "r6" #define _NOLIBC_REG_arg5 "r7" #define _NOLIBC_REG_arg6 "r8"
Before:
$ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done 157 tools/include/nolibc/arch-aarch64.h 199 tools/include/nolibc/arch-arm.h 178 tools/include/nolibc/arch-i386.h 164 tools/include/nolibc/arch-loongarch.h 195 tools/include/nolibc/arch-mips.h 0 tools/include/nolibc/arch-powerpc.h 160 tools/include/nolibc/arch-riscv.h 186 tools/include/nolibc/arch-s390.h 176 tools/include/nolibc/arch-x86_64.h
After:
$ wc -l tools/include/nolibc/arch-*.h 54 tools/include/nolibc/arch-aarch64.h 84 tools/include/nolibc/arch-arm.h 90 tools/include/nolibc/arch-i386.h /* the last one use stack to pass arguments, reserve as-is */ 59 tools/include/nolibc/arch-loongarch.h 120 tools/include/nolibc/arch-mips.h /* the last two use stack to pass arguments, reserve as-is */ 73 tools/include/nolibc/arch-powerpc.h 58 tools/include/nolibc/arch-riscv.h 87 tools/include/nolibc/arch-s390.h 67 tools/include/nolibc/arch-x86_64.h
syscall.h itself:
$ wc -l tools/include/nolibc/syscall.h 112 tools/include/nolibc/syscall.h
[...]
Hoping this clarifies my position on this.
Willy, Thanks very much for your detailed reply, based on your reply, I plan to renew the powerpc patchset itself at first since both you and Thomas have already reviewed it carefully.
After that, I will come back to read your reply again and discuss more about our new syscall.h, I still think it is something valuable to take a look at, although something about it still need more attention, perhaps a RFC patchset is better for more discuss, it may show us the profile easily.
Best regards, Zhangjin
Thanks, Willy
Hi, Thomas
On 2023-07-19 05:10:48+0800, Zhangjin Wu wrote:
Both syscall declarations and _start code definition are added for powerpc to nolibc.
[...]
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/arch-powerpc.h | 156 ++++++++++++++++++++++++++++
This also should be added to nolibc/arch.h.
Thanks, it should be.
1 file changed, 156 insertions(+) create mode 100644 tools/include/nolibc/arch-powerpc.h
diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h new file mode 100644 index 000000000000..100ec0f412dc --- /dev/null +++ b/tools/include/nolibc/arch-powerpc.h @@ -0,0 +1,156 @@ +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */ +/*
- PowerPC specific definitions for NOLIBC
- Copyright (C) 2023 Zhangjin Wu falcon@tinylab.org
If it is taken from musl, shouldn't there also be a musl copyright?
For this copyright issue, I have prepared two new versions without a line from musl. even in our old version, most of them are different except the 'sc; bns+ 1f; neg %1, %1; 1:' line and the register variables.
Seems 'sc; bns+ 1f; neg %1, %1; 1:' is also used in linux kernel: arch/powerpc/include/asm/vdso/gettimeofday.h, so, it should be ok enough to apply it.
The register varibles have been changed and aligned with othe arch-<ARCH>.h locally, they are completely different now, and even further with the new syscall.h mentioned in this reply [1], the file will be completely different.
Thomas, Have added your Reviewed-by lines too, thanks a lot!
Best regards, Zhangjin ---- [1]: https://lore.kernel.org/lkml/20230725054414.15055-1-falcon@tinylab.org/
This follows the 64-bit PowerPC ABI [1], refers to the slides: "A new ABI for little-endian PowerPC64 Design & Implementation" [2] and the musl code in arch/powerpc64/crt_arch.h.
Firstly, stdu and clrrdi are used instead of stwu and clrrwi for powerpc64.
Second, the stack frame size is increased to 32 bytes for powerpc64, 32 bytes is the minimal stack frame size supported described in [2].
Besides, the TOC pointer (GOT pointer) must be saved to r2.
This works on both little endian and big endian 64-bit PowerPC.
[1]: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf [2]: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/include/nolibc/arch-powerpc.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h index 100ec0f412dc..7b28ebcfcc23 100644 --- a/tools/include/nolibc/arch-powerpc.h +++ b/tools/include/nolibc/arch-powerpc.h @@ -143,6 +143,19 @@ /* startup code */ void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void) { +#ifdef __powerpc64__ + /* On 64-bit PowerPC, save TOC/GOT pointer to r2 */ + extern char TOC __asm__ (".TOC."); + register volatile long r2 __asm__ ("r2") = (void *)&TOC - (void *)_start; + + __asm__ volatile ( + "mr 3, 1\n" /* save stack pointer to r3, as arg1 of _start_c */ + "clrrdi 1, 1, 4\n" /* align the stack to 16 bytes */ + "li 0, 0\n" /* zero the frame pointer */ + "stdu 1, -32(1)\n" /* the initial stack frame */ + "bl _start_c\n" /* transfer to c runtime */ + ); +#else __asm__ volatile ( "mr 3, 1\n" /* save stack pointer to r3, as arg1 of _start_c */ "clrrwi 1, 1, 4\n" /* align the stack to 16 bytes */ @@ -150,6 +163,7 @@ void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_ "stwu 1, -16(1)\n" /* the initial stack frame */ "bl _start_c\n" /* transfer to c runtime */ ); +#endif __builtin_unreachable(); }
The following error reported while running nolibc-test on the big endian 64-bit PowerPC kernel compiled with powerpc64le-linux-gnu-gcc in Ubuntu 20.04.
56 select_nullinit[1]: illegal instruction (4) at 100042a8 nip 100042a8 lr 100042a8 code 1 in init[10000000+10000] init[1]: code: 7c6307b4 7c234840 4081f580 7c6300d0 907d0000 3860ffff 4bfff570 3ca2fffe init[1]: code: 38800038 38a5d547 7fc3f378 4bffcd65 <1000038c> 38c10060 38a00000 38800000
Let's explicitly initialize all of the timeval members to zero.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/nolibc-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index 03b1d30f5507..ec2c7774522e 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -858,7 +858,7 @@ int run_syscall(int min, int max) CASE_TEST(read_badf); EXPECT_SYSER(1, read(-1, &tmp, 1), -1, EBADF); break; CASE_TEST(rmdir_blah); EXPECT_SYSER(1, rmdir("/blah"), -1, ENOENT); break; CASE_TEST(sched_yield); EXPECT_SYSZR(1, sched_yield()); break; - CASE_TEST(select_null); EXPECT_SYSZR(1, ({ struct timeval tv = { 0 }; select(0, NULL, NULL, NULL, &tv); })); break; + CASE_TEST(select_null); EXPECT_SYSZR(1, ({ struct timeval tv = { 0, 0 }; select(0, NULL, NULL, NULL, &tv); })); break; CASE_TEST(select_stdout); EXPECT_SYSNE(1, ({ fd_set fds; FD_ZERO(&fds); FD_SET(1, &fds); select(2, NULL, &fds, NULL, NULL); }), -1); break; CASE_TEST(select_fault); EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break; CASE_TEST(stat_blah); EXPECT_SYSER(1, stat("/proc/self/blah", &stat_buf), -1, ENOENT); break;
As this would be a generic bugfix it should be at the front of the series, but...
On 2023-07-19 05:13:01+0800, Zhangjin Wu wrote:
The following error reported while running nolibc-test on the big endian 64-bit PowerPC kernel compiled with powerpc64le-linux-gnu-gcc in Ubuntu 20.04.
56 select_nullinit[1]: illegal instruction (4) at 100042a8 nip 100042a8 lr 100042a8 code 1 in init[10000000+10000] init[1]: code: 7c6307b4 7c234840 4081f580 7c6300d0 907d0000 3860ffff 4bfff570 3ca2fffe init[1]: code: 38800038 38a5d547 7fc3f378 4bffcd65 <1000038c> 38c10060 38a00000 38800000
Let's explicitly initialize all of the timeval members to zero.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/nolibc-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index 03b1d30f5507..ec2c7774522e 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -858,7 +858,7 @@ int run_syscall(int min, int max) CASE_TEST(read_badf); EXPECT_SYSER(1, read(-1, &tmp, 1), -1, EBADF); break; CASE_TEST(rmdir_blah); EXPECT_SYSER(1, rmdir("/blah"), -1, ENOENT); break; CASE_TEST(sched_yield); EXPECT_SYSZR(1, sched_yield()); break;
CASE_TEST(select_null); EXPECT_SYSZR(1, ({ struct timeval tv = { 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
CASE_TEST(select_null); EXPECT_SYSZR(1, ({ struct timeval tv = { 0, 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
This doesn't really make sense. Firstly, "{ 0 }" zeroes the whole structure.
Also the warning talks about "illegal instruction" while this structure is data and should never be executed as code.
Is this failure reproducible? Maybe the error is actually in the syscall wrapper? I'll also take a look tomorrow.
CASE_TEST(select_stdout); EXPECT_SYSNE(1, ({ fd_set fds; FD_ZERO(&fds); FD_SET(1, &fds); select(2, NULL, &fds, NULL, NULL); }), -1); break; CASE_TEST(select_fault); EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break; CASE_TEST(stat_blah); EXPECT_SYSER(1, stat("/proc/self/blah", &stat_buf), -1, ENOENT); break;
-- 2.25.1
Hi, Thomas
As this would be a generic bugfix it should be at the front of the series, but...
Yes, moved it but not as the first.
On 2023-07-19 05:13:01+0800, Zhangjin Wu wrote:
The following error reported while running nolibc-test on the big endian 64-bit PowerPC kernel compiled with powerpc64le-linux-gnu-gcc in Ubuntu 20.04.
56 select_nullinit[1]: illegal instruction (4) at 100042a8 nip 100042a8 lr 100042a8 code 1 in init[10000000+10000] init[1]: code: 7c6307b4 7c234840 4081f580 7c6300d0 907d0000 3860ffff 4bfff570 3ca2fffe init[1]: code: 38800038 38a5d547 7fc3f378 4bffcd65 <1000038c> 38c10060 38a00000 38800000
Let's explicitly initialize all of the timeval members to zero.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/nolibc-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index 03b1d30f5507..ec2c7774522e 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -858,7 +858,7 @@ int run_syscall(int min, int max) CASE_TEST(read_badf); EXPECT_SYSER(1, read(-1, &tmp, 1), -1, EBADF); break; CASE_TEST(rmdir_blah); EXPECT_SYSER(1, rmdir("/blah"), -1, ENOENT); break; CASE_TEST(sched_yield); EXPECT_SYSZR(1, sched_yield()); break;
CASE_TEST(select_null); EXPECT_SYSZR(1, ({ struct timeval tv = { 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
CASE_TEST(select_null); EXPECT_SYSZR(1, ({ struct timeval tv = { 0, 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
This doesn't really make sense. Firstly, "{ 0 }" zeroes the whole structure.
Will compare the difference carefully ...
Also the warning talks about "illegal instruction" while this structure is data and should never be executed as code.
Yeah, I'm a little lazy, test shows no issue happen, so, not looked into it, this may really hide some issues.
Is this failure reproducible?
It could be reproduced with powerpc64le-linux-gnu-gcc (9.3.0) + run:
$ make run XARCH=powerpc64 CROSS_COMPILE=powerpc64le-linux-gnu-
but not happen with powerpc64le-linux-gnu-gcc (9.3.0) + run-user:
$ make run-user XARCH=powerpc64 CROSS_COMPILE=powerpc64le-linux-gnu-
And neither reproduce if switch to the big endian powerpc64-linux-gcc 13.1.0 toolchain from https://mirrors.edge.kernel.org/pub/tools/crosstool/
Maybe the error is actually in the syscall wrapper? I'll also take a look tomorrow.
ok, just rechecked this, found the nolibc side is:
int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeval *timeout) --> return my_syscall5(__NR__newselect, nfds, rfds, wfds, efds, timeout);
And the bad code is (even with -O0):
1000e3ac: 10 00 03 8c vspltisw v0,0 1000e3b0: 39 3f 00 e0 addi r9,r31,224 1000e3b4: 7c 00 4f 99 stxvd2x vs32,0,r9 1000e3b8: 39 3f 00 e0 addi r9,r31,224 1000e3bc: 7d 27 4b 78 mr r7,r9
The error log:
56 select_nullinit[1]: illegal instruction (4) at 1000e3ac nip 1000e3ac lr 1000e398 code 1 in init[10000000+14000] init[1]: code: e93f0076 3ca2fffe 38a5aad0 7d244b78 3c62fffe 3863a630 4bffaedd 7c691b78 init[1]: code: 7d2a4b78 813f008c 7d295214 913f008c <1000038c> 393f00e0 7c004f99 393f00e0 Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
So, the critical info "illegal instruction" means the vspltisw instruction is not supported by this compiled kernel.
Let's look at the good one (only plus one instruction):
1000e3ac: 39 20 00 00 li r9,0 1000e3b0: f9 3f 00 e0 std r9,224(r31) 1000e3b4: 39 20 00 00 li r9,0 1000e3b8: f9 3f 00 e8 std r9,232(r31) 1000e3bc: 39 3f 00 e0 addi r9,r31,224 1000e3c0: 7d 27 4b 78 mr r7,r9
It means, adding one more 0 will let the compiler generate different code, it will not use the vspltisw instruction any more, so, different result.
It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
CONFIG_ALTIVEC CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
Or we can disable the vsx instructions explicitly:
-mno-vsx
Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
+CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
So, this patch itself is wrong, let's drop it from the next revision.
Thanks very much, Zhangjin
CASE_TEST(select_stdout); EXPECT_SYSNE(1, ({ fd_set fds; FD_ZERO(&fds); FD_SET(1, &fds); select(2, NULL, &fds, NULL, NULL); }), -1); break; CASE_TEST(select_fault); EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break; CASE_TEST(stat_blah); EXPECT_SYSER(1, stat("/proc/self/blah", &stat_buf), -1, ENOENT); break;
-- 2.25.1
Hi Zhangjin,
On Wed, Jul 19, 2023 at 07:56:37AM +0800, Zhangjin Wu wrote:
It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
CONFIG_ALTIVEC CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
Or we can disable the vsx instructions explicitly:
-mno-vsx
Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
+CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
So, this patch itself is wrong, let's drop it from the next revision.
Better explicitly disable it in the CFLAGS (2nd option) if we want to make sure we don't want to rely on this, at least for portability purposes.
Willy
Hi, Willy
Hi Zhangjin,
On Wed, Jul 19, 2023 at 07:56:37AM +0800, Zhangjin Wu wrote:
It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
CONFIG_ALTIVEC CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
Or we can disable the vsx instructions explicitly:
-mno-vsx
Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
+CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
So, this patch itself is wrong, let's drop it from the next revision.
Better explicitly disable it in the CFLAGS (2nd option) if we want to make sure we don't want to rely on this, at least for portability purposes.
Ok, thanks, have updated CFLAGS in these two patches locally:
[PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
what about the other ones? I'm ready to send v2 ;-)
Best regards, Zhangjin
Willy
Hi Zhangjin,
On Wed, Jul 19, 2023 at 02:49:12PM +0800, Zhangjin Wu wrote:
Hi, Willy
Hi Zhangjin,
On Wed, Jul 19, 2023 at 07:56:37AM +0800, Zhangjin Wu wrote:
It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
CONFIG_ALTIVEC CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
Or we can disable the vsx instructions explicitly:
-mno-vsx
Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
+CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
So, this patch itself is wrong, let's drop it from the next revision.
Better explicitly disable it in the CFLAGS (2nd option) if we want to make sure we don't want to rely on this, at least for portability purposes.
Ok, thanks, have updated CFLAGS in these two patches locally:
[PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
what about the other ones? I'm ready to send v2 ;-)
I have not had the time to review them yet. Please just don't send another series yet, that just adds more noise and makes it hard to distinguish all of them. I hope to be able to check these and hopefully the tinyconfig series by the week-end.
Thanks, Willy
Hi Zhangjin,
On 2023-07-19 14:49:12+0800, Zhangjin Wu wrote:
On Wed, Jul 19, 2023 at 07:56:37AM +0800, Zhangjin Wu wrote:
It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
CONFIG_ALTIVEC CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
Or we can disable the vsx instructions explicitly:
-mno-vsx
Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
+CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
So, this patch itself is wrong, let's drop it from the next revision.
Better explicitly disable it in the CFLAGS (2nd option) if we want to make sure we don't want to rely on this, at least for portability purposes.
Ok, thanks, have updated CFLAGS in these two patches locally:
[PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
what about the other ones? I'm ready to send v2 ;-)
Unfortunately I won't have the time for a proper review this week.
Thomas
The default DEFCONFIG_<ARCH> may not always work for all architectures, some architectures require to add extra kernel config options, this adds a new 'extconfig' target for this requirement.
It allows to customize extra kernel config options via the common common.config and the architecture specific <ARCH>.config, at last trigger 'allnoconfig' to let them take effect with missing config options as disabled.
The scripts/kconfig/merge_config.sh tool is used to merge the extra config files.
Suggested-by: Thomas Weißschuh linux@weissschuh.net Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/ Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/Makefile | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index f42adef87e12..08a5ca5f418b 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -39,6 +39,9 @@ DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig DEFCONFIG = $(DEFCONFIG_$(ARCH))
+# extra kernel config files under configs/, include common + architecture specific +EXTCONFIG = common.config $(ARCH).config + # optional tests to run (default = all) TEST =
@@ -162,6 +165,10 @@ initramfs: nolibc-test defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
+extconfig: + $(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c)) + $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig + kernel: initramfs $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) $(IMAGE_NAME) CONFIG_INITRAMFS_SOURCE=$(CURDIR)/initramfs
On Wed, Jul 19, 2023 at 05:14:07AM +0800, Zhangjin Wu wrote:
The default DEFCONFIG_<ARCH> may not always work for all architectures, some architectures require to add extra kernel config options, this adds a new 'extconfig' target for this requirement.
It allows to customize extra kernel config options via the common common.config and the architecture specific <ARCH>.config, at last trigger 'allnoconfig' to let them take effect with missing config options as disabled.
The scripts/kconfig/merge_config.sh tool is used to merge the extra config files.
Suggested-by: Thomas Weißschuh linux@weissschuh.net Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/ Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/Makefile | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index f42adef87e12..08a5ca5f418b 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -39,6 +39,9 @@ DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig DEFCONFIG = $(DEFCONFIG_$(ARCH)) +# extra kernel config files under configs/, include common + architecture specific +EXTCONFIG = common.config $(ARCH).config
# optional tests to run (default = all) TEST = @@ -162,6 +165,10 @@ initramfs: nolibc-test defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare +extconfig:
- $(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
- $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
Please also mention this entry in the "help" message. Other than that, OK.
Willy
Hi, Willy
On Wed, Jul 19, 2023 at 05:14:07AM +0800, Zhangjin Wu wrote:
The default DEFCONFIG_<ARCH> may not always work for all architectures, some architectures require to add extra kernel config options, this adds a new 'extconfig' target for this requirement.
It allows to customize extra kernel config options via the common common.config and the architecture specific <ARCH>.config, at last trigger 'allnoconfig' to let them take effect with missing config options as disabled.
The scripts/kconfig/merge_config.sh tool is used to merge the extra config files.
Suggested-by: Thomas Wei�schuh linux@weissschuh.net Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/ Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/Makefile | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index f42adef87e12..08a5ca5f418b 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -39,6 +39,9 @@ DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig DEFCONFIG = $(DEFCONFIG_$(ARCH)) +# extra kernel config files under configs/, include common + architecture specific +EXTCONFIG = common.config $(ARCH).config
# optional tests to run (default = all) TEST = @@ -162,6 +165,10 @@ initramfs: nolibc-test defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare +extconfig:
- $(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
- $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
Please also mention this entry in the "help" message. Other than that, OK.
Willy, as we discussed in another tinyconfig series, in order to avoid add more complexity to users, I plan to drop this standalone 'extconfig' target and move the extra config operations to defconfig target like this:
defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare $(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c)) $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
This extra config options are really critical to default boot and test, so, it should be a 'default' config action as the 'defconfig' target originally mean.
Will test carefully about this change, what's your idea?
Thanks, Zhangjin
Willy
Hi Zhangjin,
On Tue, Jul 25, 2023 at 10:30:06PM +0800, Zhangjin Wu wrote:
Hi, Willy
On Wed, Jul 19, 2023 at 05:14:07AM +0800, Zhangjin Wu wrote:
The default DEFCONFIG_<ARCH> may not always work for all architectures, some architectures require to add extra kernel config options, this adds a new 'extconfig' target for this requirement.
It allows to customize extra kernel config options via the common common.config and the architecture specific <ARCH>.config, at last trigger 'allnoconfig' to let them take effect with missing config options as disabled.
The scripts/kconfig/merge_config.sh tool is used to merge the extra config files.
Suggested-by: Thomas Wei?schuh linux@weissschuh.net Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/ Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/Makefile | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index f42adef87e12..08a5ca5f418b 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -39,6 +39,9 @@ DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig DEFCONFIG = $(DEFCONFIG_$(ARCH)) +# extra kernel config files under configs/, include common + architecture specific +EXTCONFIG = common.config $(ARCH).config
# optional tests to run (default = all) TEST = @@ -162,6 +165,10 @@ initramfs: nolibc-test defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare +extconfig:
- $(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
- $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
Please also mention this entry in the "help" message. Other than that, OK.
Willy, as we discussed in another tinyconfig series, in order to avoid add more complexity to users, I plan to drop this standalone 'extconfig' target and move the extra config operations to defconfig target like this:
defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare $(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c)) $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
This extra config options are really critical to default boot and test, so, it should be a 'default' config action as the 'defconfig' target originally mean.
Will test carefully about this change, what's your idea?
Well, *iff* we need to have per-arch config settings, probably to better support Qemu, the console output, etc, then indeed we'd rather add them to all configs indeed. However the example above is bogus. First you create a default config, then prepare the rest of the kernel, then merge that config with the extensions (based on the arch and not the sub-arch BTW) then erase everything to restart from an allnoconfig.
Also if you're using merge_config you'll need -Q to disable warning about overridden options since you're starting from a defconfig which contains plenty of them. Usually I just do defconfig, append the few changes, then oldconfig and that's done. But merge_config can probably be fine as well. Also make prepare should be done last, to make sure that if it depends on anything in the config (e.g. 32 vs 64 bit etc) it's up to date.
Willy
Hi Zhangjin,
On Tue, Jul 25, 2023 at 10:30:06PM +0800, Zhangjin Wu wrote:
Hi, Willy
On Wed, Jul 19, 2023 at 05:14:07AM +0800, Zhangjin Wu wrote:
The default DEFCONFIG_<ARCH> may not always work for all architectures, some architectures require to add extra kernel config options, this adds a new 'extconfig' target for this requirement.
It allows to customize extra kernel config options via the common common.config and the architecture specific <ARCH>.config, at last trigger 'allnoconfig' to let them take effect with missing config options as disabled.
The scripts/kconfig/merge_config.sh tool is used to merge the extra config files.
Suggested-by: Thomas Wei?schuh linux@weissschuh.net Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/ Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/Makefile | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index f42adef87e12..08a5ca5f418b 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -39,6 +39,9 @@ DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig DEFCONFIG = $(DEFCONFIG_$(ARCH)) +# extra kernel config files under configs/, include common + architecture specific +EXTCONFIG = common.config $(ARCH).config
# optional tests to run (default = all) TEST = @@ -162,6 +165,10 @@ initramfs: nolibc-test defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare +extconfig:
- $(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
- $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
Please also mention this entry in the "help" message. Other than that, OK.
Willy, as we discussed in another tinyconfig series, in order to avoid add more complexity to users, I plan to drop this standalone 'extconfig' target and move the extra config operations to defconfig target like this:
defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare $(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c)) $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
This extra config options are really critical to default boot and test, so, it should be a 'default' config action as the 'defconfig' target originally mean.
Will test carefully about this change, what's your idea?
Well, *iff* we need to have per-arch config settings, probably to better support Qemu, the console output, etc, then indeed we'd rather add them to all configs indeed. However the example above is bogus. First you create a default config, then prepare the rest of the kernel , then merge that config with the extensions (based on the arch and not the sub-arch BTW) then erase everything to restart from an allnoconfig.
allnoconfig is based on the "$(srctree)/.config" generated by defconfig plus the extra ARCH specific config options, so, it should be additional?
Btw, what do you mean? "based on the arch and not the sub-arch BTW", in reality, the ARCH will be changed to XARCH, so, this is sub-arch/variant specific.
Also if you're using merge_config you'll need -Q to disable warning about overridden options since you're starting from a defconfig which contains plenty of them.
Yeah, we do need this to silence some warnings.
Usually I just do defconfig, append the few changes, then oldconfig and that's done.
I have used olddefconfig (similar to oldconfig but without prompts) before, but allnoconfig is used to mainly to only compile the ones we explicitly enable, with the new additional options as No, which may be more deterministic, it should not touch the old configured ones in .config? I have learned its usage from the merge_config.sh
But merge_config can probably be fine as well. Also make prepare should be done last, to make sure that if it depends on anything in the config (e.g. 32 vs 64 bit etc) it's up to date.
Ok, let's move 'prepare' at the end of all above.
And also add the -Q option to silence the potential warnings:
defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) $(Q)$(srctree)/scripts/kconfig/merge_config.sh -Q -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c)) $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig $(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) prepare
We may still need to clear the usage of allnoconfig, In my test, it does work as I explained above. If I really misused it, let's change it back to olddefconfig.
Thanks, Zhangjin
Willy
To test the architectures not directly supported by kernel, let's add a new XARCH as our own test input and map between XARCH and ARCH to make sure pass a right ARCH to kernel for a XARCH input and also configure a default XARCH for ARCH.
ARCH is a subset of XARCH, to test more architectures than the ARCH variable directly supported by kernel, the old architecture specific variables used by our test are converted to use XARCH instead of ARCH.
Suggested-by: Willy Tarreau w@1wt.eu Link: https://lore.kernel.org/lkml/20230702171715.GD16233@1wt.eu/ Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/Makefile | 34 +++++++++++++++---------- 1 file changed, 21 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index 08a5ca5f418b..b17a82efe6de 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -14,6 +14,13 @@ include $(srctree)/scripts/subarch.include ARCH = $(SUBARCH) endif
+# XARCH and ARCH mapping +# XARCH is specified by user +XARCH ?= $(or $(XARCH_$(ARCH)),$(ARCH)) + +# ARCH is supported by kernel +ARCH := $(or $(ARCH_$(XARCH)),$(XARCH)) + # kernel image names by architecture IMAGE_i386 = arch/x86/boot/bzImage IMAGE_x86_64 = arch/x86/boot/bzImage @@ -24,7 +31,7 @@ IMAGE_mips = vmlinuz IMAGE_riscv = arch/riscv/boot/Image IMAGE_s390 = arch/s390/boot/bzImage IMAGE_loongarch = arch/loongarch/boot/vmlinuz.efi -IMAGE = $(IMAGE_$(ARCH)) +IMAGE = $(IMAGE_$(XARCH)) IMAGE_NAME = $(notdir $(IMAGE))
# default kernel configurations that appear to be usable @@ -37,10 +44,10 @@ DEFCONFIG_mips = malta_defconfig DEFCONFIG_riscv = defconfig DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig -DEFCONFIG = $(DEFCONFIG_$(ARCH)) +DEFCONFIG = $(DEFCONFIG_$(XARCH))
# extra kernel config files under configs/, include common + architecture specific -EXTCONFIG = common.config $(ARCH).config +EXTCONFIG = common.config $(XARCH).config
# optional tests to run (default = all) TEST = @@ -55,7 +62,7 @@ QEMU_ARCH_mips = mipsel # works with malta_defconfig QEMU_ARCH_riscv = riscv64 QEMU_ARCH_s390 = s390x QEMU_ARCH_loongarch = loongarch64 -QEMU_ARCH = $(QEMU_ARCH_$(ARCH)) +QEMU_ARCH = $(QEMU_ARCH_$(XARCH))
# QEMU_ARGS : some arch-specific args to pass to qemu QEMU_ARGS_i386 = -M pc -append "console=ttyS0,9600 i8042.noaux panic=-1 $(TEST:%=NOLIBC_TEST=%)" @@ -67,7 +74,7 @@ QEMU_ARGS_mips = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)" -QEMU_ARGS = $(QEMU_ARGS_$(ARCH)) $(QEMU_ARGS_EXTRA) +QEMU_ARGS = $(QEMU_ARGS_$(XARCH)) $(QEMU_ARGS_EXTRA)
# OUTPUT is only set when run from the main makefile, otherwise # it defaults to this nolibc directory. @@ -84,7 +91,7 @@ CFLAGS_mips = -EL CFLAGS_STACKPROTECTOR ?= $(call cc-option,-mstack-protector-guard=global $(call cc-option,-fstack-protector-all)) CFLAGS ?= -Os -fno-ident -fno-asynchronous-unwind-tables -std=c89 \ $(call cc-option,-fno-stack-protector) \ - $(CFLAGS_$(ARCH)) $(CFLAGS_STACKPROTECTOR) + $(CFLAGS_$(XARCH)) $(CFLAGS_STACKPROTECTOR) LDFLAGS := -s
REPORT ?= awk '/[OK][\r]*$$/{p++} /[FAIL][\r]*$$/{if (!f) printf("\n"); f++; print;} /[SKIPPED][\r]*$$/{s++} \ @@ -99,24 +106,25 @@ help: @echo " sysroot create the nolibc sysroot here (uses $$ARCH)" @echo " nolibc-test build the executable (uses $$CC and $$CROSS_COMPILE)" @echo " libc-test build an executable using the compiler's default libc instead" - @echo " run-user runs the executable under QEMU (uses $$ARCH, $$TEST)" + @echo " run-user runs the executable under QEMU (uses $$XARCH, $$TEST)" @echo " initramfs prepare the initramfs with nolibc-test" - @echo " defconfig create a fresh new default config (uses $$ARCH)" - @echo " kernel (re)build the kernel with the initramfs (uses $$ARCH)" - @echo " run runs the kernel in QEMU after building it (uses $$ARCH, $$TEST)" - @echo " rerun runs a previously prebuilt kernel in QEMU (uses $$ARCH, $$TEST)" + @echo " defconfig create a fresh new default config (uses $$XARCH)" + @echo " kernel (re)build the kernel with the initramfs (uses $$XARCH)" + @echo " run runs the kernel in QEMU after building it (uses $$XARCH, $$TEST)" + @echo " rerun runs a previously prebuilt kernel in QEMU (uses $$XARCH, $$TEST)" @echo " clean clean the sysroot, initramfs, build and output files" @echo "" @echo "The output file is "run.out". Test ranges may be passed using $$TEST." @echo "" @echo "Currently using the following variables:" @echo " ARCH = $(ARCH)" + @echo " XARCH = $(XARCH)" @echo " CROSS_COMPILE = $(CROSS_COMPILE)" @echo " CC = $(CC)" @echo " OUTPUT = $(OUTPUT)" @echo " TEST = $(TEST)" - @echo " QEMU_ARCH = $(if $(QEMU_ARCH),$(QEMU_ARCH),UNKNOWN_ARCH) [determined from $$ARCH]" - @echo " IMAGE_NAME = $(if $(IMAGE_NAME),$(IMAGE_NAME),UNKNOWN_ARCH) [determined from $$ARCH]" + @echo " QEMU_ARCH = $(if $(QEMU_ARCH),$(QEMU_ARCH),UNKNOWN_ARCH) [determined from $$XARCH]" + @echo " IMAGE_NAME = $(if $(IMAGE_NAME),$(IMAGE_NAME),UNKNOWN_ARCH) [determined from $$XARCH]" @echo ""
all: run
On Wed, Jul 19, 2023 at 05:15:13AM +0800, Zhangjin Wu wrote:
- @echo " run-user runs the executable under QEMU (uses $$ARCH, $$TEST)"
- @echo " run-user runs the executable under QEMU (uses $$XARCH, $$TEST)"
Most users will neither care about nor need XARCH and will wonder what to put there. Since there's a fallback on ARCH, better indicate something like:
(uses $XARCH or $ARCH, and $TEST)
Willy
The default qemu-system-ppc g3beige machine [1] is used to run 32-bit powerpc kernel.
The pmac32_defconfig is used with extra PMACZILOG console options to enable normal print.
Note, zImage doesn't boot due to "qemu-system-ppc: Some ROM regions are overlapping" error, so, vmlinux is used instead.
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powermac.html
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/Makefile | 4 ++++ tools/testing/selftests/nolibc/configs/powerpc.config | 3 +++ 2 files changed, 7 insertions(+) create mode 100644 tools/testing/selftests/nolibc/configs/powerpc.config
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index b17a82efe6de..9c375fab84e5 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -28,6 +28,7 @@ IMAGE_x86 = arch/x86/boot/bzImage IMAGE_arm64 = arch/arm64/boot/Image IMAGE_arm = arch/arm/boot/zImage IMAGE_mips = vmlinuz +IMAGE_powerpc = vmlinux IMAGE_riscv = arch/riscv/boot/Image IMAGE_s390 = arch/s390/boot/bzImage IMAGE_loongarch = arch/loongarch/boot/vmlinuz.efi @@ -41,6 +42,7 @@ DEFCONFIG_x86 = defconfig DEFCONFIG_arm64 = defconfig DEFCONFIG_arm = multi_v7_defconfig DEFCONFIG_mips = malta_defconfig +DEFCONFIG_powerpc = pmac32_defconfig DEFCONFIG_riscv = defconfig DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig @@ -59,6 +61,7 @@ QEMU_ARCH_x86 = x86_64 QEMU_ARCH_arm64 = aarch64 QEMU_ARCH_arm = arm QEMU_ARCH_mips = mipsel # works with malta_defconfig +QEMU_ARCH_powerpc = ppc QEMU_ARCH_riscv = riscv64 QEMU_ARCH_s390 = s390x QEMU_ARCH_loongarch = loongarch64 @@ -71,6 +74,7 @@ QEMU_ARGS_x86 = -M pc -append "console=ttyS0,9600 i8042.noaux panic=-1 $( QEMU_ARGS_arm64 = -M virt -cpu cortex-a53 -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_arm = -M virt -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_mips = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" +QEMU_ARGS_powerpc = -M g3beige -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)" diff --git a/tools/testing/selftests/nolibc/configs/powerpc.config b/tools/testing/selftests/nolibc/configs/powerpc.config new file mode 100644 index 000000000000..b1975f8253f7 --- /dev/null +++ b/tools/testing/selftests/nolibc/configs/powerpc.config @@ -0,0 +1,3 @@ +CONFIG_SERIAL_PMACZILOG=y +CONFIG_SERIAL_PMACZILOG_TTYS=y +CONFIG_SERIAL_PMACZILOG_CONSOLE=y
Here adds test support for little endian 64-bit PowerPC.
The powernv machine of qemu-system-ppc64le is used for there is just a working powernv_defconfig.
As the document [1] shows:
PowerNV (as Non-Virtualized) is the “bare metal” platform using the OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be used as an hypervisor OS, running KVM guests, or simply as a host OS.
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/Makefile | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index 9c375fab84e5..fbdf7fd9bf96 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -19,6 +19,7 @@ endif XARCH ?= $(or $(XARCH_$(ARCH)),$(ARCH))
# ARCH is supported by kernel +ARCH_powerpc64le = powerpc ARCH := $(or $(ARCH_$(XARCH)),$(XARCH))
# kernel image names by architecture @@ -29,6 +30,7 @@ IMAGE_arm64 = arch/arm64/boot/Image IMAGE_arm = arch/arm/boot/zImage IMAGE_mips = vmlinuz IMAGE_powerpc = vmlinux +IMAGE_powerpc64le= arch/powerpc/boot/zImage IMAGE_riscv = arch/riscv/boot/Image IMAGE_s390 = arch/s390/boot/bzImage IMAGE_loongarch = arch/loongarch/boot/vmlinuz.efi @@ -43,6 +45,7 @@ DEFCONFIG_arm64 = defconfig DEFCONFIG_arm = multi_v7_defconfig DEFCONFIG_mips = malta_defconfig DEFCONFIG_powerpc = pmac32_defconfig +DEFCONFIG_powerpc64le= powernv_defconfig DEFCONFIG_riscv = defconfig DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig @@ -62,6 +65,7 @@ QEMU_ARCH_arm64 = aarch64 QEMU_ARCH_arm = arm QEMU_ARCH_mips = mipsel # works with malta_defconfig QEMU_ARCH_powerpc = ppc +QEMU_ARCH_powerpc64le= ppc64le QEMU_ARCH_riscv = riscv64 QEMU_ARCH_s390 = s390x QEMU_ARCH_loongarch = loongarch64 @@ -75,6 +79,7 @@ QEMU_ARGS_arm64 = -M virt -cpu cortex-a53 -append "panic=-1 $(TEST:%=NOLIBC QEMU_ARGS_arm = -M virt -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_mips = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_powerpc = -M g3beige -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" +QEMU_ARGS_powerpc64le= -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)" @@ -90,6 +95,7 @@ else Q=@ endif
+CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc CFLAGS_s390 = -m64 CFLAGS_mips = -EL CFLAGS_STACKPROTECTOR ?= $(call cc-option,-mstack-protector-guard=global $(call cc-option,-fstack-protector-all))
On Wed, Jul 19, 2023 at 05:17:26AM +0800, Zhangjin Wu wrote:
Here adds test support for little endian 64-bit PowerPC.
The powernv machine of qemu-system-ppc64le is used for there is just a working powernv_defconfig.
As the document [1] shows:
PowerNV (as Non-Virtualized) is the "bare metal" platform using the OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be used as an hypervisor OS, running KVM guests, or simply as a host OS.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/Makefile | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index 9c375fab84e5..fbdf7fd9bf96 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -19,6 +19,7 @@ endif XARCH ?= $(or $(XARCH_$(ARCH)),$(ARCH)) # ARCH is supported by kernel +ARCH_powerpc64le = powerpc
Given that this one will only be used as an alias, I really think you should call it "ppc64le" and not with that long a name. Everyone knows that arch under the name ppc64 anyway so it's not like it would cause any confusion.
Willy
Here adds test support for big endian 64-bit PowerPC.
The powernv machine of qemu-system-ppc64 is used with powernv_be_defconfig.
As the document [1] shows:
PowerNV (as Non-Virtualized) is the “bare metal” platform using the OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be used as an hypervisor OS, running KVM guests, or simply as a host OS.
Note, differs from little endian 64-bit PowerPC, vmlinux is used instead of zImage, because big endian zImage [2] only boot on qemu with x-vof=on (added from qemu v7.0) and a fixup patch [3] for qemu v7.0.51:
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html [2]: https://github.com/linuxppc/issues/issues/402 [3]: https://lore.kernel.org/qemu-devel/20220504065536.3534488-1-aik@ozlabs.ru/
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/Makefile | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index fbdf7fd9bf96..cced1d60ecf9 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -19,6 +19,7 @@ endif XARCH ?= $(or $(XARCH_$(ARCH)),$(ARCH))
# ARCH is supported by kernel +ARCH_powerpc64 = powerpc ARCH_powerpc64le = powerpc ARCH := $(or $(ARCH_$(XARCH)),$(XARCH))
@@ -30,6 +31,7 @@ IMAGE_arm64 = arch/arm64/boot/Image IMAGE_arm = arch/arm/boot/zImage IMAGE_mips = vmlinuz IMAGE_powerpc = vmlinux +IMAGE_powerpc64 = vmlinux IMAGE_powerpc64le= arch/powerpc/boot/zImage IMAGE_riscv = arch/riscv/boot/Image IMAGE_s390 = arch/s390/boot/bzImage @@ -45,6 +47,7 @@ DEFCONFIG_arm64 = defconfig DEFCONFIG_arm = multi_v7_defconfig DEFCONFIG_mips = malta_defconfig DEFCONFIG_powerpc = pmac32_defconfig +DEFCONFIG_powerpc64 = powernv_be_defconfig DEFCONFIG_powerpc64le= powernv_defconfig DEFCONFIG_riscv = defconfig DEFCONFIG_s390 = defconfig @@ -65,6 +68,7 @@ QEMU_ARCH_arm64 = aarch64 QEMU_ARCH_arm = arm QEMU_ARCH_mips = mipsel # works with malta_defconfig QEMU_ARCH_powerpc = ppc +QEMU_ARCH_powerpc64 = ppc64 QEMU_ARCH_powerpc64le= ppc64le QEMU_ARCH_riscv = riscv64 QEMU_ARCH_s390 = s390x @@ -79,6 +83,7 @@ QEMU_ARGS_arm64 = -M virt -cpu cortex-a53 -append "panic=-1 $(TEST:%=NOLIBC QEMU_ARGS_arm = -M virt -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_mips = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_powerpc = -M g3beige -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" +QEMU_ARGS_powerpc64 = -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_powerpc64le= -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" @@ -95,6 +100,7 @@ else Q=@ endif
+CFLAGS_powerpc64 = -m64 -mbig-endian -mmultiple -Wl,-EB,-melf64ppc CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc CFLAGS_s390 = -m64 CFLAGS_mips = -EL
On Wed, Jul 19, 2023 at 05:18:32AM +0800, Zhangjin Wu wrote:
# ARCH is supported by kernel +ARCH_powerpc64 = powerpc ARCH_powerpc64le = powerpc
And similarly let's simply call this one ppc64.
Aside these few comments, the series looks nice, thanks! Willy
Hi, Willy
On Wed, Jul 19, 2023 at 05:18:32AM +0800, Zhangjin Wu wrote:
# ARCH is supported by kernel +ARCH_powerpc64 = powerpc ARCH_powerpc64le = powerpc
And similarly let's simply call this one ppc64.
Well, I like these short ones too, what about also a ppc alias for powerpc?
ARCH_ppc = powerpc ARCH_ppc64 = powerpc ARCH_ppc64le = powerpc
Aside these few comments, the series looks nice, thanks!
Have applied all of your suggestions, thanks very much.
Best regards, Zhangjin
Willy
On Tue, Jul 25, 2023 at 01:50:31PM +0800, Zhangjin Wu wrote:
Hi, Willy
On Wed, Jul 19, 2023 at 05:18:32AM +0800, Zhangjin Wu wrote:
# ARCH is supported by kernel +ARCH_powerpc64 = powerpc ARCH_powerpc64le = powerpc
And similarly let's simply call this one ppc64.
Well, I like these short ones too, what about also a ppc alias for powerpc?
ARCH_ppc = powerpc ARCH_ppc64 = powerpc ARCH_ppc64le = powerpc
I thought about it as well. It could avoid the confusion between the arch name (powerpc) that's used for both 32/64 and the user-requested one.
Willy
On 2023-07-19 05:09:42+0800, Zhangjin Wu wrote:
Here is the powerpc support, includes 32-bit big-endian powerpc, 64-bit little endian and big endian powerpc.
[..]
Zhangjin Wu (8): tools/nolibc: add support for powerpc tools/nolibc: add support for powerpc64 selftests/nolibc: select_null: fix up for big endian powerpc64 selftests/nolibc: add extra config file customize support selftests/nolibc: add XARCH and ARCH mapping support selftests/nolibc: add test support for powerpc selftests/nolibc: add test support for powerpc64le selftests/nolibc: add test support for powerpc64
For the full series, after comment for patch 1 is addressed:
Reviewed-by: Thomas Weißschuh linux@weissschuh.net
linux-kselftest-mirror@lists.linaro.org