Hi, Thomas
On 2023-07-08 23:29:58+0800, Zhangjin Wu wrote:
As the environ and _auxv support added for nolibc, the assembly _start function becomes more and more complex and therefore makes the porting of nolibc to new architectures harder and harder.
To simplify portability, this c version of _start_c() is added to do most of the assembly start operations in C, which reduces the complexity a lot and will eventually simplify the porting of nolibc to the new architectures.
The new _start_c() only requires a stack pointer argument, it will find argv, envp and _auxv for us, and then call main(), finally, it exit() with main's return status. With this new _start_c(), the future new architectures only require to add very few assembly instructions.
I like it!
A quick test indicates that the initialization of the stackprotectors could also be moved into the C function.
Cool, do you mean directly call __stack_chk_init() at the beginning of _start_c()?
It also seems like a good opportunity to add some tests for argv/environment variable passing.
Yes, and even further, we can do more on auxv, just like musl does in src/env/__libc_start_main.c, not that urgent currently:
libc.auxv = auxv = (void *)(envp+i+1); ... __hwcap = aux[AT_HWCAP]; if (aux[AT_SYSINFO]) __sysinfo = aux[AT_SYSINFO]; ... libc.page_size = aux[AT_PAGESZ];
if (!pn) pn = (void*)aux[AT_EXECFN]; if (!pn) pn = ""; __progname = __progname_full = pn; for (i=0; pn[i]; i++) if (pn[i]=='/') __progname = pn+i+1;
__init_tls(aux); __init_ssp((void *)aux[AT_RANDOM]);
if (aux[AT_UID]==aux[AT_EUID] && aux[AT_GID]==aux[AT_EGID] && !aux[AT_SECURE]) return;
... libc.secure = 1;
And as general note to the full series I think that splitting the arch files is not necessary and confusing.
Ok, welcome to discuss more in this thread:
https://lore.kernel.org/lkml/20230710072340.10798-1-falcon@tinylab.org/
and let's choose a better method as possible as we can, Just replied Willy to explain more.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/crt.h | 44 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+)
diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h index 221b7c5346ca..b269294e9664 100644 --- a/tools/include/nolibc/crt.h +++ b/tools/include/nolibc/crt.h @@ -13,4 +13,48 @@ char **environ __attribute__((weak));
The old code seems to avoid putting "environ" into the global symbol namespace. Could this declaration be moved into the function like in getenv()?
ok, do you mean just move it to stdlib.h like this? I moved _auxv (used by getauxv()) to stdlib.h too:
tools/nolibc: move environ and _auxv from crt.h to stdlib.h
Move the definitions of environ and _auxv from crt.h to stdlib.h, where the place who uses those definitions.
- getenv uses environ - getauxv uses _auxcv
Signed-off-by: Zhangjin Wu falcon@tinylab.org
diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h index b269294e9664..d2f84cbe73d0 100644 --- a/tools/include/nolibc/crt.h +++ b/tools/include/nolibc/crt.h @@ -10,14 +10,13 @@ #include "compiler.h" #include "crt_arch.h"
-char **environ __attribute__((weak)); -const unsigned long *_auxv __attribute__((weak)); - int main(int argc, char *argv[], char **envp); static void exit(int);
void _start_c(long *sp) { + extern char **environ; + extern const unsigned long *_auxv; int argc, i; char **argv; char **envp; diff --git a/tools/include/nolibc/stdlib.h b/tools/include/nolibc/stdlib.h index 2f9b4b3c6d26..5eadadc2d0f5 100644 --- a/tools/include/nolibc/stdlib.h +++ b/tools/include/nolibc/stdlib.h @@ -14,6 +14,9 @@ #include "string.h" #include <linux/auxvec.h>
+char **environ __attribute__((weak)); +const unsigned long *_auxv __attribute__((weak)); + struct nolibc_heap { size_t len; char user_p[] __attribute__((__aligned__));
const unsigned long *_auxv __attribute__((weak)); +int main(int argc, char *argv[], char **envp);
This will lead to conflicting declarations if the users use a different signature. I'm not (yet?) sure how to work around this.
Ah yes, I forgot this critical case, people may use something like:
int main(void) int main(int argc, char *argv[])
Just applied the method suggested by you in anothe reply [1]:
diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h index d2f84cbe73d0..8fe38ef8c5b2 100644 --- a/tools/include/nolibc/crt.h +++ b/tools/include/nolibc/crt.h @@ -10,7 +10,7 @@ #include "compiler.h" #include "crt_arch.h"
-int main(int argc, char *argv[], char **envp); +typedef int (_nolibc_main_fn)(int, char **, char **); static void exit(int);
void _start_c(long *sp) @@ -20,6 +20,7 @@ void _start_c(long *sp) int argc, i; char **argv; char **envp; + _nolibc_main_fn _nolibc_main __asm__ ("main");
/* * sp : argc <-- argument count, required by main() @@ -53,7 +54,7 @@ void _start_c(long *sp) _auxv = (void *)(envp + i + 1);
/* go to application */ - exit(main(argc, argv, envp)); + exit(_nolibc_main(argc, argv, envp)); }
#endif /* _NOLIBC_CRT_H */
It works as expected, thanks very much!
Also how is the case handled where main() returns "void"?
Do you mean the main() without an explicit "return 0"?
int main(int argc, char *argv[], char **envp) { printf("Hello\n"); }
Tested c89, c99, c2x, c23, gnu11, the same result as yours:
std | return value ----------|------------ c89 | 6 /* return value of printf() */ c99 | 0 /* set by compiler with: xorl %eax, %eax */ c2x | 0 gnu11 | 0
I'm not sure how this is currently handled or if the compiler takes care of returning 0 in this case.
void main(void) { printf("Hello\n"); }
Tested c89, c99, c2x, c23, gnu11, the same result as yours, the return value is 6, the return value is the return status of the last called functon.
Thanks, Zhangjin --- [1]: https://lore.kernel.org/lkml/90fdd255-32f4-4caf-90ff-06456b53dac3@t-8ch.de/