Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc

25 Jul 2023


      Hi, Willy
Thanks very much for your comments on the new syscall.h proposal, just
replied more, it is a very long email, hope it explains more clearly ;-)
Also CCed i386 and s390 committers, welcome your dicussion.
...
On Tue, Jul 25, 2023 at 01:44:14PM +0800, Zhangjin Wu wrote:
...
This discussion does inspire me a lot to shrink the whole architecture
specific nolibc my_syscall<N>() macros, like crt.h, a common syscall.h
is added to do so. I have finished most of them except the ones passing
arguments via stack, still trying to merge these ones.
With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h
will only require to add ~10 lines to define their own syscall
instructions, registers and clobberlist, which looks like this (for
powerpc):
#define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg  %0, %0; 1:"

/* PowerPC doesn't always restore r3-r12 for us */
#define _NOLIBC_SYSCALL_CLOBBERLIST 
	"memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4"

/* PowerPC write GPRS in kernel side but not restore them */
#define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS

#define _NOLIBC_REG_NUM  "r0"
    #define _NOLIBC_REG_RET  "r3"
    #define _NOLIBC_REG_arg1 "r3"
    #define _NOLIBC_REG_arg2 "r4"
    #define _NOLIBC_REG_arg3 "r5"
    #define _NOLIBC_REG_arg4 "r6"
    #define _NOLIBC_REG_arg5 "r7"
    #define _NOLIBC_REG_arg6 "r8"
Before:
$ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done
157 tools/include/nolibc/arch-aarch64.h
199 tools/include/nolibc/arch-arm.h
178 tools/include/nolibc/arch-i386.h
164 tools/include/nolibc/arch-loongarch.h
195 tools/include/nolibc/arch-mips.h
0 tools/include/nolibc/arch-powerpc.h
160 tools/include/nolibc/arch-riscv.h
186 tools/include/nolibc/arch-s390.h
176 tools/include/nolibc/arch-x86_64.h


After:
$ wc -l tools/include/nolibc/arch-*.h
   54 tools/include/nolibc/arch-aarch64.h
   84 tools/include/nolibc/arch-arm.h
   90 tools/include/nolibc/arch-i386.h                        /* the last one use stack to pass arguments, reserve as-is */
   59 tools/include/nolibc/arch-loongarch.h
  120 tools/include/nolibc/arch-mips.h                        /* the last two use stack to pass arguments, reserve as-is */
   73 tools/include/nolibc/arch-powerpc.h
   58 tools/include/nolibc/arch-riscv.h
   87 tools/include/nolibc/arch-s390.h
   67 tools/include/nolibc/arch-x86_64.h


syscall.h itself:
$ wc -l tools/include/nolibc/syscall.h
112 tools/include/nolibc/syscall.h 

The important thing to consider is not the number of lines but the
*maintainability*.
The original goal is not really the number of lines (only a
'side-effect'), but is exactly easier porting/maintainability with
clearer code architecture, I have wrotten another message to explain
more about this background, so, let's directly reply here and discuss
more.
...
You factored the syscall code so much above with all
these macros that I don't even understand how they're going to interact
with each other, especially "%0".
Yeah, it is my fault, this should be cleaned up with the return register
directly:
#define _NOLIBC_SYSCALL_CALL \
    	"sc; bns+ 1f; neg 3, 3; 1:"
...
Also I don't know what the macro
_NOLIBC_GPRS_AS_OUTPUT_OPERANDS does.
This is the root cause to inspire me to add the new syscall.h, let's
explain the background step by step.
All of the other architectures (except PowerPC) restore GPRS for us when
return from syscall, so, their clobber list simply not include the GPRS
and only need to add the input registers in the 'INPUT Operands' list.
But PowerPC doesn't restore such GPRS for us, I'm not sure if it is a
feature (Maybe) or a bug. for PowerPC32, the following line will restore
the GPRS for us, but may be not a good idea to do so for it may decrease
the syscall performance although save some instructions in user-space
and also, the other libcs also follow the current rule, so, this may be
a design intention we must follow (welcome your suggestions).
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
    index fe27d41f9a3d..1ed535e9144c 100644
    --- a/arch/powerpc/kernel/entry_32.S
    +++ b/arch/powerpc/kernel/entry_32.S
    @@ -155,6 +155,7 @@ syscall_exit_finish:
            bne     3f
            mtcr    r5
+       REST_GPRS(3, 12, r1)
     1:     REST_GPR(2, r1)
            REST_GPR(1, r1)
            rfi
For PowerPC, if with the previous method like the other architectures,
the clobber list differs for every my_syscall<N> and all of the input
registers must be put in the 'OUTPUT Operands' too to avoid compiler to
save and resue a variable in such GPRS across my_syscall<N> calls.
Originally in my local new version of arch-powerpc.h, we got such code
for every my_syscall<N>, use my_syscall6() as an example:
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                 \
    ({                                                                           \
            register long _ret  __asm__ ("r3");                                  \
            register long _num  __asm__ ("r0") = (num);                          \
            register long _arg1 __asm__ ("r3") = (long)(arg1);                   \
            register long _arg2 __asm__ ("r4") = (long)(arg2);                   \
            register long _arg3 __asm__ ("r5") = (long)(arg3);                   \
            register long _arg4 __asm__ ("r6") = (long)(arg4);                   \
            register long _arg5 __asm__ ("r7") = (long)(arg5);                   \
            register long _arg6 __asm__ ("r8") = (long)(arg6);                   \
                                                                                 \
            __asm__ volatile (                                                   \
                    "       sc\n"                                                \
                    "       bns+ 1f\n"                                           \
                    "       neg  %0, %0\n"                                       \
                    "1:\n"                                                       \
                    : "=r"(_ret),                                                \
    	      "+r"(_arg2), "+r"(_arg3), "+r"(_arg4),                     \
                      "+r"(_arg5), "+r"(_arg6)                                   \
                    : "0"(_arg1), "r"(_num)                                      \
                    : _NOLIBC_SYSCALL_CLOBBERLIST                                \
            );                                                                   \
            _ret;                                                                \
    })
It almost aligns with the other architectures, but the full clobber list
differs for every my_syscall<N>, the basic one is:
/* PowerPC kernel doesn't always restore r4-r12 for us */
    #define _NOLIBC_SYSCALL_CLOBBERLIST \
        "memory", "cr0", "r12", "r11", "r10", "r9",
Use my_syscall0() as a further example, we need something like this:
#define my_syscall0(num)                                                     \
    ({                                                                           \
            register long _ret  __asm__ ("r3");                                  \
            register long _num  __asm__ ("r0") = (num);                          \
                                                                                 \
            __asm__ volatile (                                                   \
                    "       sc\n"                                                \
                    "       bns+ 1f\n"                                           \
                    "       neg  %0, %0\n"                                       \
                    "1:\n"                                                       \
                    : "=r"(_ret)                                                 \
                    : "r"(_num)                                                  \
                    : _NOLIBC_SYSCALL_CLOBBERLIST, "r8", "r7", "r6", "r5", "r4"  \
            );                                                                   \
            _ret;                                                                \
    })
The additional "r8"..."r4" must be appended to the clobber list for they
can not be put together for every my_syscall<N> due to conflicts between
they between the clobber list and the "OUTPUT/INPUT operands".
I found a solution to share the same _NOLIBC_SYSCALL_CLOBBERLIST, that
is split the "OUTPUT/INPUT Operands" list out of the core syscall
assembly:
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                 \
    ({                                                                           \
            register long _ret  __asm__ ("r3");                                  \
            register long _num  __asm__ ("r0") = (num);                          \
            register long _arg1 __asm__ ("r3") = (long)(arg1);                   \
            register long _arg2 __asm__ ("r4") = (long)(arg2);                   \
            register long _arg3 __asm__ ("r5") = (long)(arg3);                   \
            register long _arg4 __asm__ ("r6") = (long)(arg4);                   \
            register long _arg5 __asm__ ("r7") = (long)(arg5);                   \
            register long _arg6 __asm__ ("r8") = (long)(arg6);                   \
                                                                                 \
            __asm__ volatile ("": "+r"(_arg2), "+r"(_arg3), "+r"(_arg4),         \
                                  "+r"(_arg5), "+r"(_arg6)::);                   \
                                                                                 \
            __asm__ volatile (                                                   \
                    "       sc\n"                                                \
                    "       bns+ 1f\n"                                           \
                    "       neg  %0, %0\n"                                       \
                    "1:\n"                                                       \
                    : "=r"(_ret)                                                 \
                    : "0"(_arg1), "r"(_num)                                      \
                    : _NOLIBC_SYSCALL_CLOBBERLIST                                \
            );                                                                   \
            _ret;                                                                \
    })
Note, a question here is, the above split still require more discussion
to make sure it does work for different toolchains (only test for gcc
currently) or even if this method is right from scratch, welcome your
suggestion.
As a result, all of the my_syscall<N> are able to share the core syscall
calling assembly block, so, here is what we at last have:
#define _my_syscall_tail()                                              \
    	__asm__ volatile (                                                  \
    		_NOLIBC_SYSCALL_CALL                                        \
    		: "=r"(_ret)                                                \
    		: "r"(_num)                                                 \
    		: _NOLIBC_SYSCALL_CLOBBERLIST                               \
    	);                                                                  \
    	_ret
And further, we also found it was possible to share most of them among
these not ugly but completely duplicated lines:
register long _ret  __asm__ ("r3");                                  \
            register long _num  __asm__ ("r0") = (num);                          \
            register long _arg1 __asm__ ("r3") = (long)(arg1);                   \
            register long _arg2 __asm__ ("r4") = (long)(arg2);                   \
            register long _arg3 __asm__ ("r5") = (long)(arg3);                   \
            register long _arg4 __asm__ ("r6") = (long)(arg4);                   \
            register long _arg5 __asm__ ("r7") = (long)(arg5);                   \
            register long _arg6 __asm__ ("r8") = (long)(arg6);                   \
                                                                                 \
            __asm__ volatile ("": "+r"(_arg2), "+r"(_arg3), "+r"(_arg4),         \
                                  "+r"(_arg5), "+r"(_arg6)::);
That's why at last we have such blocks (of course, for PowerPC itself
it is a big change and not necessary):
#define _my_syscall_head(num)                                               \
    	register long _ret __asm__ (_NOLIBC_REG_RET);                           \
    	register long _num __asm__ (_NOLIBC_REG_NUM) = (num)                    \
#ifdef _NOLIBC_REG_ERR
    #define _NOLIBC_REG_EXTRA _NOLIBC_REG_ERR
    #endif
#ifdef _NOLIBC_REG_EXTRA
    #define _my_syscall_extra() \
    	register long reg_extra __asm__ (_NOLIBC_REG_EXTRA);                   \
    	__asm__ volatile ("": "=r"(reg_extra)::)
    #else
    #define _my_syscall_extra()
    #endif
/* Architectures like PowerPC write GPRS in kernel side and not restore them */
    #ifndef _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
    #define _my_syscall_argn(n, arg)                                            \
    	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
    	__asm__ volatile ("":: "r"(_arg##n):)
    #else
    #define _my_syscall_argn(n, arg)                                            \
    	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
    	__asm__ volatile ("": "+r"(_arg##n)::)
    #endif
And this further build args for us:
#define _my_syscall_args0()
#define _my_syscall_args1(...) \
    	_my_syscall_args0(); \
    	_my_syscall_argn(1, __VA_ARGS__)
#define _my_syscall_args2(arg2, ...) \
    	_my_syscall_args1(__VA_ARGS__); \
    	_my_syscall_argn(2, arg2)
#define _my_syscall_args3(arg3, ...) \
    	_my_syscall_args2(__VA_ARGS__); \
    	_my_syscall_argn(3, arg3)
#define _my_syscall_args4(arg4, ...) \
    	_my_syscall_args3(__VA_ARGS__); \
    	_my_syscall_argn(4, arg4)
#define _my_syscall_args5(arg5, ...) \
    	_my_syscall_args4(__VA_ARGS__); \
    	_my_syscall_argn(5, arg5)
#define _my_syscall_args6(arg6, ...) \
    	_my_syscall_args5(__VA_ARGS__); \
    	_my_syscall_argn(6, arg6)
And at last:
#define __my_syscall_args(N, ...) _my_syscall_args##N(__VA_ARGS__)
    #define _my_syscall_args(N, ...) __my_syscall_args(N, ##__VA_ARGS__)
#define __my_syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
    #define _my_syscall_narg(...) __my_syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
#define __my_syscall_argsn(N, argn, ...) \
    	_my_syscall_args(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__); \
    	_my_syscall_argn(N, argn)
#define _my_syscall_argsn(...) __my_syscall_argsn(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__)
/* Note, my_syscall0() has no argument, can not use my_syscalln() */
    #define my_syscall0(num)                                                           \
    ({                                                                                 \
    	_my_syscall_head(num);                                                         \
    	_my_syscall_extra();                                                           \
    	_my_syscall_tail();                                                            \
    })
#define my_syscalln(num, ...)                                                      \
    ({                                                                                 \
    	_my_syscall_head(num);                                                         \
    	_my_syscall_extra();                                                           \
    	_my_syscall_argsn(__VA_ARGS__);                                                \
    	_my_syscall_tail();                                                            \
    })
#define my_syscall1(num, arg1)                               my_syscalln(num, arg1)
    #define my_syscall2(num, arg1, arg2)                         my_syscalln(num, arg2, arg1)
    #define my_syscall3(num, arg1, arg2, arg3)                   my_syscalln(num, arg3, arg2, arg1)
    #define my_syscall4(num, arg1, arg2, arg3, arg4)             my_syscalln(num, arg4, arg3, arg2, arg1)
#ifndef my_syscall5
    #define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)       my_syscalln(num, arg5, arg4, arg3, arg2, arg1)
    #endif
    #ifndef my_syscall6
    #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) my_syscalln(num, arg6, arg5, arg4, arg3, arg2, arg1)
    #endif
At last, I found this worked on all of the supported architectures, so,
the new syscall.h is proposed.
BTW, another question here is, to utilize the feature of __VA_ARGS__ to
easier getting the last argument, the order of arguments are reversed
during the declarations of the my_syscall<N>, any suggestion on this
part? is it possible to not reverse the order? If possible, the
my_syscall<N> declarations may be simplified to:
#define my_syscall1(...) my_syscalln(__VA_ARGS__)
And even be possible to define syscall() from unistd.h as a alias of
my_syscalln():
#define syscall(...) my_syscalln(__VA_ARGS__)
And further, these three from unistd.h are sharable:
#define __syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
    #define _syscall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
    #define _syscall_n(N, ...) _syscall(N, __VA_ARGS__)
...
And when someone reports a bug
like we had in the past with programs randomly crashing depending on
stack alignment and such, it becomes particularly tricky to figure what
is expected and how it differs from reality.
Macros are really hard to debug, the above code lines cost me two days ;-)
but after the common syscall.h, the left architecture specific parts are
very few and easier to debug and even less copy and paste.
...
Maybe it's possible to do something in between, like defining a few
more slightly larger blocks,
Yeah, here is the method we applied, new blocks added are:
/* for ret and num */
    #define _my_syscall_head(num)                                           \
/* for extra err output */
    #define _my_syscall_extra()                                             \
/* for every argument, with additional OUTPUT/INPUT operands setting */
    #define _my_syscall_argn(n, arg)                                        \
/* for the core syscall calling and return */
    #define _my_syscall_tail()                                              \
/* for variable number of args */
    #define _my_syscall_args0()
    #define _my_syscall_args1(...) \
    #define _my_syscall_args2(arg2, ...) \
    #define _my_syscall_args3(arg3, ...) \
    #define _my_syscall_args4(arg4, ...) \
    #define _my_syscall_args5(arg5, ...) \
    #define _my_syscall_args6(arg6, ...) \
/* unified variable number of args */
    #define _my_syscall_argsn(...)
/* my_syscall0 */
    #define my_syscall0(num)                                                               \
/* my_syscalln */
    #define my_syscalln(num, ...) \
Still require more discussion on their names or even more
simplification.
...
I don't know. I still tend to think that
this significantly complicates the understanding of the whole thing.
Willy, don't worry, I do think it make things easier, the worse case is
using the old code or only use part of our new blocks helpers ;-)
For this new syscall.h, it mainly clear the inline assembly codes, as we
know, inline assembly code is a very complicated thing, If we clear the
logic carefully (as we target but not yet) in our common code,
architecture developers only require to focus on the platform specific
definitions, it should be better for portability, review and
maintainability.
It is very hard to learn the meanning of the OUTPUT operands, INPUT
operands and even the clobber list and even the flags you mentioned
below, clearing up them is really required, this new syscall.h also
require more comments on the macro functions and variables.
...
Also, looking at different archs' syscall code, they're not all defined
the same way. Some like i386 use "=a" for the return value. Others use
"+r" as an input/output value, others "=r",
Agree, this is a very hard part of the inline assembly codes, clearing
up the generic versions in syscall.h with additional comments may really
help a lot.
For OUTPUT/INPUT operands, they may be enough for the generic versions:
#define _my_syscall_tail()                                              \
    	__asm__ volatile (                                                  \
    		_NOLIBC_SYSCALL_CALL                                        \
    		: "=r"(_ret)                                                \
    		: "r"(_num)                                                 \
    		: _NOLIBC_SYSCALL_CLOBBERLIST                               \
    	);                                                                  \
    	_ret
#ifdef _NOLIBC_REG_ERR
    #define _NOLIBC_REG_EXTRA _NOLIBC_REG_ERR
    #endif
#ifdef _NOLIBC_REG_EXTRA
    #define _my_syscall_extra() \
    	register long reg_extra __asm__ (_NOLIBC_REG_EXTRA);                   \
    	__asm__ volatile ("": "=r"(reg_extra)::)
    #else
    #define _my_syscall_extra()
    #endif
/* Architectures like PowerPC write GPRS in kernel side and not restore them */
    #ifndef _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
    #define _my_syscall_argn(n, arg)                                            \
    	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
    	__asm__ volatile ("":: "r"(_arg##n):)
    #else
    #define _my_syscall_argn(n, arg)                                            \
    	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
    	__asm__ volatile ("": "+r"(_arg##n)::)
    #endif
- "=r" is only required for output registers, it is mainly for "RET"
  register and "ERR" register.
- "r" is for input, mainly for "NUM" register and the input arguments
  (sometimes, the input arguments are used as "RET" and "ERR" too).
- "+r" means both input and output, only used by powerpc currently.
For the input registers used as "ERR" or "RET" output, "+r" is used
  before, but now, we split them to two parts, one is "=r"(_ret) in
  _my_syscall_tail(), another is in _my_syscall_argn(n, arg), they
  together work for "+r" = "=r" + "r"
Btw, have checked "=r" instead of "=a" works on i386 too for we already
bind the _ret variable with "RET" register, but still need to check if
"=a" is necessary?
...
others "+d" or "=d". And
reading the comments, there are some arch-specific choices for these
declarations, that are sometimes there to force the toolchain a little
bit.
Even for s390, test shows, "r" instead of "d" works too (boot and test
passed, not yet checked the size and codes generated), but I didn't find
any document about "d" for s390 from the gcc online doc. This part
(include all of the supported architectures) should be carefully checked
if really adding our new syscall.h. add s390 nolibc committer to CC: list.
If "r" really not works on a target, we can use the logic like this
(include syscall.h after architecture specific definitions):
#ifndef my_syscall5
    #define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)       my_syscalln(num, arg5, arg4, arg3, arg2, arg1)
    #endif
    #ifndef my_syscall6
    #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) my_syscalln(num, arg6, arg5, arg4, arg3, arg2, arg1)
    #endif
The blocks from syscalls.h are only provided as generic support, the
architectures can define their own versions. all of our new blocks can
be protected like my_syscall<N> above, for example:
#ifndef _my_syscall_tail
    #define _my_syscall_tail ...
    #endif
But for s390 here, if it really requires "d" instead of "r", we are able
to allow architectures define their own modifier like this:
#ifndef _NOLIBC_INLINE_ASM_MODIFIER
    #define _NOLIBC_INLINE_ASM_MODIFIER "r"
    #endif
Then, we can define _NOLIBC_INLINE_ASM_MODIFIER for s390:
#define _NOLIBC_INLINE_ASM_MODIFIER "d"
But architectures like i386, If "=a", "=b", ... modifiers are necessary,
new versions of the blocks should be added for these architectures.
As a short summary for this part, the modifiers used by different
architectures should be carefully checked, welcome more suggestions from
the toolchains developers or the architecture maintainers, I will
compare the code generated too.
...
Others like MIPS pass some of their args in the stack, so a name
only is not sufficient. Thus, trying to factor all of this will not only
make it very hard to insert arch-specific adjusments, but it will also
make the code much less readable.
Currently, I didn't yet work on merging the 'stack' support, we used
'#ifdef's in syscall.h:
#ifndef my_syscall5
    #define my_syscall5 ...
    #endif
    #ifndef my_syscall6
    #define my_syscall6 ...
    #endif
So, they are the same as before:
$ grep "#define my_syscall" -ur tools/include/nolibc/arch-*.h
    tools/include/nolibc/arch-i386.h:#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
    tools/include/nolibc/arch-mips.h:#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)                        \
    tools/include/nolibc/arch-mips.h:#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)
Still require more work on how to support these ones, but it is not that
urgent ;-)
...
Also think about this for a moment: you had the opportunity to add two
new archs by simply restarting from existing ones. s390 and loongarch
were added the same way, leaving enough freedom to the implementers to
adjust the definitions to fit their environment's constraints.
I think this reserves as-is if the new arch-<ARCH>.h simply not include
our new syscalls.h, so, the new syscalls.h is optional, the worse case
is as-is, no regression, enough freedom as before ;-)
And further, some architectures may resue some helpers from our new syscalls.h
or at least learn something from what we have done for all of the supported
architectures.
And eventually, if our generic versions fits, just defining the
syscall instructions, registers and clobber list should be enough.
...
If you
make it very complicated, I suspect we won't get any such contributions
anymore just because nobody figures how to find their way through it,
or it would require to change again for everyone just to add one specific
case. I tend to think that the current situation is already fairly
minimal with quite readable and debuggable calls, and that it's what
*really* matters.
This may really influence the upstream and review flow:
- If people may send a new architecture support, if just fits the new
  syscall.h, we may need to review carefully about the test results,
  especially for the input/output operands, error register.
As tests for powerpc shows, the above issues should be covered by our
  nolibc-test.
- If people may send a new architecture support as before, If we find it
  fits our new syscalls.h or it can apply part of the blocks, we can
  give some suggestions.
- If people send something not just fit our new syscall.h, we may be
  able to think about if a new 'hook' is required, but it is not
  necessary, we can delay this change requirement to after a careful
  design (just like the argument passing via 'stack' case currently) .
...
When you're trying to reorganize code, it's important to ask yourself
whether you would prefer to debug the old one or the new one at 3am.
Hint: at 3am, the more abstractions there are, the less understandable
it becomes.
Interesting and agree, but this abstraction does clear something to be more
undersatndable too ;-)
I do hate hard-debuggable macros, but as we mentioned above, the inline
assembly code is another harder parts, the new carefully tuned blocks may really
help us to avoid introduce bugs with manually wrotten new codes and also it may
help us to avoid copy and paste multiple duplicated lines of the same codes.
...
...
Willy, do we need to rename my_syscall<N> to _nolibc_syscall<N> to limit
these macros nolibc internally? I plan to rename all of the new adding
macros with _nolibc_ (for funcs) or _NOLIBC_ (for variables).
Why not, I'm not opposed to that. Just keep in mind that every single
rename complicates backports (or may even silently break them), so it's
important to know where you want to go and to try to make changes converge
towards something stable and durable.
Yeah, let's keep the changes minimal, we could prefix the new macros
with _nolibc_ or _NOLIBC_, the old ones can be kept as-is.
Best regards,
Zhangjin
...
Thanks,
Willy

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc