Hi, Willy
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
Introduction ============
This series is based on the 20230524-nolibc-rv32+stkp4 branch of [3], it includes 3 parts, they work together to add full rv32 support:
* Reverts two old out-of-day patches * Revert "tools/nolibc: riscv: Support __NR_llseek for rv32" * Revert "selftests/nolibc: Fix up compile error for rv32"
(these two and the reverted ones:
* commit 606343b7478c ("selftests/nolibc: Fix up compile error for rv32") * commit d2c3acba6d66 ("tools/nolibc: riscv: Support __NR_llseek for rv32")
can be removed from the git repo completely, there are two new ones to replace them)
* Compile and test support patches * selftests/nolibc: print name instead of number for EOVERFLOW * selftests/nolibc: syscall_args: use __NR_statx for rv32 * --> replace the old one 606343b7478, use statx instead of read * selftests/nolibc: riscv: customize makefile for rv32 * selftests/nolibc: allow specify a bios for qemu * selftests/nolibc: remove the duplicated gettimeofday_bad2
* Fix up some missing syscalls, mainly time32 syscalls * tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32 * --> replace the old one d2c3acba6d66, cleaned up * tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32 * tools/nolibc: ppoll/ppoll_time64: Add a missing argument * tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32 * tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32 * tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for rv32
Compile =======
For rv64:
$ make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- nolibc-test $ file nolibc-test nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
$ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test $ file nolibc-test nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
For rv32:
$ make ARCH=riscv CONFIG_32BIT=1 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test $ file nolibc-test nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
$ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test $ file nolibc-test nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
Testing =======
Environment:
// gcc toolchain $ riscv64-linux-gnu-gcc --version riscv64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
// glibc >= 2.33 required, for older glibc, must upgrade include/bits/wordsize.h $ dpkg -l | grep libc6-dev | grep riscv ii libc6-dev-riscv64-cross 2.31-0ubuntu7cross1
// glibc include/bits/wordsize.h: manually upgraded to >= 2.33 // without this, can not build tools/testing/selftests/nolibc/nolibc-test.c $ cat /usr/riscv64-linux-gnu/include/bits/wordsize.h #if __riscv_xlen == (__SIZEOF_POINTER__ * 8) # define __WORDSIZE __riscv_xlen #else # error unsupported ABI #endif
# define __WORDSIZE_TIME64_COMPAT32 1
#if __WORDSIZE == 32 # define __WORDSIZE32_SIZE_ULONG 0 # define __WORDSIZE32_PTRDIFF_LONG 0 #endif
// higher qemu version is better, latest version is v8.0.0+ $ qemu-system-riscv64 --version QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.18) Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
// opensbi version, higher is better, must match kernel version and qemu version // rv64: used version is 1.2, latest is 1.2 $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1 OpenSBI v1.2-116-g7919530 // rv32: used version is v0.9, latest is 1.2 $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1 OpenSBI v0.9-152-g754d511
For rv64:
$ pwd /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- defconfig $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv64/virt/bsp/bios/opensbi/generic/fw_jump.elf run MKDIR sysroot/riscv/include make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc' make[2]: Entering directory '/labs/linux-lab/src/linux-stable' make[2]: Leaving directory '/labs/linux-lab/src/linux-stable' make[2]: Entering directory '/labs/linux-lab/src/linux-stable' INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include make[2]: Leaving directory '/labs/linux-lab/src/linux-stable' make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc' CC nolibc-test MKDIR initramfs INSTALL initramfs/init make[1]: Entering directory '/labs/linux-lab/src/linux-stable' ... LD vmlinux NM System.map SORTTAB vmlinux OBJCOPY arch/riscv/boot/Image Kernel: arch/riscv/boot/Image is ready make[1]: Leaving directory '/labs/linux-lab/src/linux-stable' 135 test(s) passed. $ file ../../../../vmlinux ../../../../vmlinux: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=b8e1cea5122b04bce540b4022f0d6f171ffe615a, not stripped
For rv32:
$ pwd /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- defconfig $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv32/virt/bsp/bios/opensbi/generic/fw_jump.elf run MKDIR sysroot/riscv/include make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc' make[2]: Entering directory '/labs/linux-lab/src/linux-stable' make[2]: Leaving directory '/labs/linux-lab/src/linux-stable' make[2]: Entering directory '/labs/linux-lab/src/linux-stable' INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include make[2]: Leaving directory '/labs/linux-lab/src/linux-stable' make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc' CC nolibc-test MKDIR initramfs INSTALL initramfs/init make[1]: Entering directory '/labs/linux-lab/src/linux-stable' CALL scripts/checksyscalls.sh GEN usr/initramfs_data.cpio COPY usr/initramfs_inc_data AS usr/initramfs_data.o AR usr/built-in.a GEN security/selinux/flask.h security/selinux/av_permissions.h CC security/selinux/avc.o CC security/selinux/hooks.o CC security/selinux/selinuxfs.o CC security/selinux/nlmsgtab.o CC security/selinux/netif.o CC security/selinux/netnode.o CC security/selinux/netport.o CC security/selinux/status.o CC security/selinux/ss/services.o AR security/selinux/built-in.a AR security/built-in.a AR built-in.a AR vmlinux.a LD vmlinux.o OBJCOPY modules.builtin.modinfo GEN modules.builtin MODPOST vmlinux.symvers UPD include/generated/utsversion.h CC init/version-timestamp.o LD .tmp_vmlinux.kallsyms1 NM .tmp_vmlinux.kallsyms1.syms KSYMS .tmp_vmlinux.kallsyms1.S AS .tmp_vmlinux.kallsyms1.S LD .tmp_vmlinux.kallsyms2 NM .tmp_vmlinux.kallsyms2.syms KSYMS .tmp_vmlinux.kallsyms2.S AS .tmp_vmlinux.kallsyms2.S LD vmlinux NM System.map SORTTAB vmlinux OBJCOPY arch/riscv/boot/Image Kernel: arch/riscv/boot/Image is ready make[1]: Leaving directory '/labs/linux-lab/src/linux-stable' 135 test(s) passed. $ file ../../../../vmlinux ../../../../vmlinux: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=bad4c1f3899f47355d2a2010bade56972fd94b9d, not stripped
The full rv64 testing result (run.out) is uploaded at [4]. The full rv32 testing result (run.out) is uploaded at [5].
That's all, thanks!
Best regards, Zhangjin Wu ---
[1]: https://lore.kernel.org/linux-riscv/20230520143154.68663-1-falcon@tinylab.or... [2]: https://lore.kernel.org/linux-riscv/20230520135235.68155-1-falcon@tinylab.or... [3]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git [4]: https://pastebin.com/3L0nV78u [5]: https://pastebin.com/RadrXdta
Zhangjin Wu (13): Revert "tools/nolibc: riscv: Support __NR_llseek for rv32" Revert "selftests/nolibc: Fix up compile error for rv32" selftests/nolibc: print name instead of number for EOVERFLOW selftests/nolibc: syscall_args: use __NR_statx for rv32 selftests/nolibc: riscv: customize makefile for rv32 selftests/nolibc: allow specify a bios for qemu selftests/nolibc: remove the duplicated gettimeofday_bad2 tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32 tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32 tools/nolibc: ppoll/ppoll_time64: Add a missing argument tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32 tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32 tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for rv32
tools/include/nolibc/std.h | 1 + tools/include/nolibc/sys.h | 135 +++++++++++++++++-- tools/include/nolibc/types.h | 21 ++- tools/testing/selftests/nolibc/Makefile | 14 +- tools/testing/selftests/nolibc/nolibc-test.c | 15 ++- 5 files changed, 167 insertions(+), 19 deletions(-)
This reverts commit d2c3acba6d66 to prepare for a whole new patch later.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/include/nolibc/std.h | 1 - tools/include/nolibc/sys.h | 19 ------------------- 2 files changed, 20 deletions(-)
diff --git a/tools/include/nolibc/std.h b/tools/include/nolibc/std.h index 83c0b0cb9564..933bc0be7e1c 100644 --- a/tools/include/nolibc/std.h +++ b/tools/include/nolibc/std.h @@ -32,6 +32,5 @@ typedef signed long off_t; typedef signed long blksize_t; typedef signed long blkcnt_t; typedef signed long time_t; -typedef long long loff_t;
#endif /* _NOLIBC_STD_H */ diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 7874062bea95..d5792a5de70b 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -671,26 +671,7 @@ int link(const char *old, const char *new) static __attribute__((unused)) off_t sys_lseek(int fd, off_t offset, int whence) { -#ifdef __NR_lseek return my_syscall3(__NR_lseek, fd, offset, whence); -#elif defined(__NR_llseek) - loff_t res; - off_t retval; - - int rc = my_syscall5(__NR_llseek, fd, (long) (((uint64_t) (offset)) >> 32), (long) offset, &res, whence); - - if (rc) - return rc; - - retval = (off_t) res; - if (retval == res) - return retval; - - SET_ERRNO(EOVERFLOW); - return (off_t) -1; -#else -#error Neither __NR_lseek nor __NR_llseek defined, cannot implement sys_lseek() -#endif }
static __attribute__((unused))
This reverts commit 606343b7478 to prepare for a whole new patch later.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/nolibc-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index 6e0a4dbe321e..c570bb848c1a 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -596,7 +596,7 @@ int run_syscall(int min, int max) CASE_TEST(write_badf); EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break; CASE_TEST(write_zero); EXPECT_SYSZR(1, write(1, &tmp, 0)); break; CASE_TEST(syscall_noargs); EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break; - CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_read, -1, &tmp, 1), -1, EBADF); break; + CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break; case __LINE__: return ret; /* must be last */ /* note: do not set any defaults so as to permit holes above */
EOVERFLOW will be used in the coming time64 syscalls support.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/nolibc-test.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index c570bb848c1a..227ef2a3ebba 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -106,6 +106,7 @@ const char *errorname(int err) CASE_ERR(EDOM); CASE_ERR(ERANGE); CASE_ERR(ENOSYS); + CASE_ERR(EOVERFLOW); default: return itoa(err); }
On 2023-05-25 01:46:54+0800, Zhangjin Wu wrote:
EOVERFLOW will be used in the coming time64 syscalls support.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
Reviewed-by: Thomas Weißschuh linux@weissschuh.net
tools/testing/selftests/nolibc/nolibc-test.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index c570bb848c1a..227ef2a3ebba 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -106,6 +106,7 @@ const char *errorname(int err) CASE_ERR(EDOM); CASE_ERR(ERANGE); CASE_ERR(ENOSYS);
- CASE_ERR(EOVERFLOW); default: return itoa(err); }
-- 2.25.1
When compile nolibc-test.c for rv32, we got such error:
tools/testing/selftests/nolibc/nolibc-test.c:599:57: error: ‘__NR_fstat’ undeclared (first use in this function) 599 | CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
The generic include/uapi/asm-generic/unistd.h used by rv32 doesn't support __NR_fstat, use __NR_statx instead:
Running test 'syscall' 69 syscall_noargs = 1 [OK] 70 syscall_args = -1 EFAULT [OK]
As tools/include/nolibc/sys.h shows, __NR_statx is either not supported by all platforms, so, both __NR_fstat and __NR_statx are required.
Btw, the latest riscv libc6-dev package is required, otherwise, we would also get such error:
In file included from /usr/riscv64-linux-gnu/include/sys/cdefs.h:452, from /usr/riscv64-linux-gnu/include/features.h:461, from /usr/riscv64-linux-gnu/include/bits/libc-header-start.h:33, from /usr/riscv64-linux-gnu/include/limits.h:26, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:194, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/syslimits.h:7, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:34, from /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/nolibc-test.c:6: /usr/riscv64-linux-gnu/include/bits/wordsize.h:28:3: error: #error "rv32i-based targets are not supported" 28 | # error "rv32i-based targets are not supported"
The glibc commit 5b6113d62efa ("RISC-V: Support the 32-bit ABI implementation") fixed up above error, so, glibc >= 2.33 (who includes this commit) is required.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/nolibc-test.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index 227ef2a3ebba..c86ff6018c7f 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -501,6 +501,17 @@ static int test_fork(void) } }
+static int test_syscall_args(void) +{ +#ifdef __NR_fstat + return syscall(__NR_fstat, 0, NULL); +#elif defined(__NR_statx) + return syscall(__NR_statx, 0, NULL, 0, 0, NULL); +#else +#error Neither __NR_fstat nor __NR_statx defined, cannot implement syscall_args test +#endif +} + /* Run syscall tests between IDs <min> and <max>. * Return 0 on success, non-zero on failure. */ @@ -597,7 +608,7 @@ int run_syscall(int min, int max) CASE_TEST(write_badf); EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break; CASE_TEST(write_zero); EXPECT_SYSZR(1, write(1, &tmp, 0)); break; CASE_TEST(syscall_noargs); EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break; - CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break; + CASE_TEST(syscall_args); EXPECT_SYSER(1, test_syscall_args(), -1, EFAULT); break; case __LINE__: return ret; /* must be last */ /* note: do not set any defaults so as to permit holes above */
On 2023-05-25 01:48:11+0800, Zhangjin Wu wrote:
When compile nolibc-test.c for rv32, we got such error:
tools/testing/selftests/nolibc/nolibc-test.c:599:57: error: ‘__NR_fstat’ undeclared (first use in this function) 599 | CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
The generic include/uapi/asm-generic/unistd.h used by rv32 doesn't support __NR_fstat, use __NR_statx instead:
Running test 'syscall' 69 syscall_noargs = 1 [OK] 70 syscall_args = -1 EFAULT [OK]
As tools/include/nolibc/sys.h shows, __NR_statx is either not supported by all platforms, so, both __NR_fstat and __NR_statx are required.
Btw, the latest riscv libc6-dev package is required, otherwise, we would also get such error:
In file included from /usr/riscv64-linux-gnu/include/sys/cdefs.h:452, from /usr/riscv64-linux-gnu/include/features.h:461, from /usr/riscv64-linux-gnu/include/bits/libc-header-start.h:33, from /usr/riscv64-linux-gnu/include/limits.h:26, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:194, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/syslimits.h:7, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:34, from /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/nolibc-test.c:6: /usr/riscv64-linux-gnu/include/bits/wordsize.h:28:3: error: #error "rv32i-based targets are not supported" 28 | # error "rv32i-based targets are not supported"
The glibc commit 5b6113d62efa ("RISC-V: Support the 32-bit ABI implementation") fixed up above error, so, glibc >= 2.33 (who includes this commit) is required.
It seems weird to require limits.h from the system libc at all.
The only thing used from there are INT_MAX and INT_MIN. Instead we could define our own versions of INT_MAX and INT_MIN in stdint.h.
#ifndef INT_MAX #define INT_MAX __INT_MAX__ #endif
#ifndef INT_MIN #define INT_MIN (- __INT_MAX__ - 1) #endif
Thomas
Hi, Thomas
On 2023-05-25 01:48:11+0800, Zhangjin Wu wrote:
When compile nolibc-test.c for rv32, we got such error:
tools/testing/selftests/nolibc/nolibc-test.c:599:57: error: ‘__NR_fstat’ undeclared (first use in this function) 599 | CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
The generic include/uapi/asm-generic/unistd.h used by rv32 doesn't support __NR_fstat, use __NR_statx instead:
Running test 'syscall' 69 syscall_noargs = 1 [OK] 70 syscall_args = -1 EFAULT [OK]
As tools/include/nolibc/sys.h shows, __NR_statx is either not supported by all platforms, so, both __NR_fstat and __NR_statx are required.
Btw, the latest riscv libc6-dev package is required, otherwise, we would also get such error:
In file included from /usr/riscv64-linux-gnu/include/sys/cdefs.h:452, from /usr/riscv64-linux-gnu/include/features.h:461, from /usr/riscv64-linux-gnu/include/bits/libc-header-start.h:33, from /usr/riscv64-linux-gnu/include/limits.h:26, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:194, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/syslimits.h:7, from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:34, from /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/nolibc-test.c:6: /usr/riscv64-linux-gnu/include/bits/wordsize.h:28:3: error: #error "rv32i-based targets are not supported" 28 | # error "rv32i-based targets are not supported"
The glibc commit 5b6113d62efa ("RISC-V: Support the 32-bit ABI implementation") fixed up above error, so, glibc >= 2.33 (who includes this commit) is required.
It seems weird to require limits.h from the system libc at all.
The only thing used from there are INT_MAX and INT_MIN. Instead we could define our own versions of INT_MAX and INT_MIN in stdint.h.
#ifndef INT_MAX #define INT_MAX __INT_MAX__ #endif
#ifndef INT_MIN #define INT_MIN (- __INT_MAX__ - 1) #endif
Just verified and prepared a patch, it did work perfectly, thanks.
The above commit message exactly the error info will be cleaned up in v2.
Best regards, Zhangjin
Thomas
On Wed, May 24, 2023, at 19:48, Zhangjin Wu wrote:
+static int test_syscall_args(void) +{ +#ifdef __NR_fstat
- return syscall(__NR_fstat, 0, NULL);
+#elif defined(__NR_statx)
- return syscall(__NR_statx, 0, NULL, 0, 0, NULL);
+#else +#error Neither __NR_fstat nor __NR_statx defined, cannot implement syscall_args test +#endif +}
Does this need to work on old kernels? My impression was that this is only intended to be used with the kernel that ships the copy, so you can just rely on statx to be defined and drop the old fallbacks (same as for pselect6_time64 as I commented).
Arnd
Hi Arnd,
On Fri, May 26, 2023 at 11:21:02AM +0200, Arnd Bergmann wrote:
On Wed, May 24, 2023, at 19:48, Zhangjin Wu wrote:
+static int test_syscall_args(void) +{ +#ifdef __NR_fstat
- return syscall(__NR_fstat, 0, NULL);
+#elif defined(__NR_statx)
- return syscall(__NR_statx, 0, NULL, 0, 0, NULL);
+#else +#error Neither __NR_fstat nor __NR_statx defined, cannot implement syscall_args test +#endif +}
Does this need to work on old kernels? My impression was that this is only intended to be used with the kernel that ships the copy, so you can just rely on statx to be defined and drop the old fallbacks (same as for pselect6_time64 as I commented).
Yes, as much as possible this is desirable because the purpose is clearly to simplify the build of applications. The reason is that some applications might want to run as well on older kernels, but may miss some syscalls on the nolibc shipped with these older ones. Since the project is quite young, it lags a lot behind what each kernel supports, so it's expected that users will take the most recent nolibc version to benefit from support of syscalls that were already present in older ones.
The alternative would be to take the project out of the kernel as it was before but this would likely complicate contributions.
However the selftest code is clearly specific to a kernel IMHO, since its goal is to be used to check that the shipped version of nolibc works on this kernel.
As such, my view on this is that as long as it doesn't require unreasonable efforts to support older kernels for the userland code, we should try. If sometimes it's a big pain, we should not insist too much, but at least making sure that only the feature in question will not work would be desirable.
Thanks, Willy
Hi, Arnd
On Wed, May 24, 2023, at 19:48, Zhangjin Wu wrote:
+static int test_syscall_args(void) +{ +#ifdef __NR_fstat
- return syscall(__NR_fstat, 0, NULL);
+#elif defined(__NR_statx)
- return syscall(__NR_statx, 0, NULL, 0, 0, NULL);
+#else +#error Neither __NR_fstat nor __NR_statx defined, cannot implement syscall_args test +#endif +}
Does this need to work on old kernels? My impression was that this is only intended to be used with the kernel that ships the copy, so you can just rely on statx to be defined and drop
Willy suggested this before, besides the generic unistd.h users, I only found one __NR_statx in arm64 and forgot the ones in the *.tbl files:
$ grep -n statx -ur arch/ | cut -d ':' -f1,2 | tr ':' ' ' | awk '{printf("git blame -L %s,%s %s\n",$2,$2,$1);}' | bash cabcebd33b8b8 (Firoz Khan 2018-11-13 15:01:52 +0530 453) 522 common statx sys_statx a1016e94cce9f (Russell King 2017-03-09 17:14:32 +0000 414) 397 common statx sys_statx 7349ee3a97edb (Arnd Bergmann 2018-12-30 22:25:07 +0100 338) 326 common statx sys_statx fd81414666cf6 (Firoz Khan 2018-11-13 11:30:28 +0530 389) 379 common statx sys_statx 9bcbf97c62931 (Firoz Khan 2018-12-13 14:37:38 +0530 341) 330 n32 statx sys_statx 9bcbf97c62931 (Firoz Khan 2018-12-13 14:37:38 +0530 337) 326 n64 statx sys_statx 9bcbf97c62931 (Firoz Khan 2018-12-13 14:37:38 +0530 380) 366 o32 statx sys_statx 85e69701f58c9 (Firoz Khan 2018-09-09 07:22:50 +0530 395) 349 common statx sys_statx 90856087daca9 (Arnd Bergmann 2019-01-16 14:15:23 +0100 389) 379 common statx sys_statx sys_statx d25a122afd437 (Arnd Bergmann 2018-12-30 23:04:30 +0100 393) 383 common statx sys_statx 6ff645dd683af (Firoz Khan 2018-11-14 10:56:30 +0530 433) 360 common statx sys_statx fc06bac35c8c7 (Firoz Khan 2018-11-13 11:34:33 +0530 408) 398 common statx sys_statx aff8503932004 (Firoz Khan 2018-12-17 16:10:34 +0530 474) 383 nospu statx sys_statx a845a6cf1dad1 (Brian Gerst 2020-03-13 15:51:39 -0400 397) 383 i386 statx sys_statx cab56d3484d4b (Brian Gerst 2020-03-13 15:51:38 -0400 343) 332 common statx sys_statx c7914ef69dbb3 (Firoz Khan 2018-11-13 15:49:29 +0530 374) 351 common statx sys_statx 713cc9df6473f (Will Deacon 2017-03-21 18:04:26 +0000 807) #define __NR_statx 397 713cc9df6473f (Will Deacon 2017-03-21 18:04:26 +0000 808) __SYSCALL(__NR_statx, sys_statx)
(2020: x86 changed the format of the *.tbl) (2019: s390 changed the format of the *.tbl)
$ grep -n asm-generic/unistd.h -ur arch/ | cut -d ':' -f1,2 | tr ':' ' ' | awk '{printf("git blame -L %s,%s %s\n",$2,$2,$1);}' | bash 4adeefe161a74 arch/arc/include/asm/unistd.h (Vineet Gupta 2013-01-18 15:12:18 +0530 31) #include <asm-generic/unistd.h> 91e040a79df73 (Vineet Gupta 2016-10-20 07:39:45 -0700 35) /* Generic syscall (fs/filesystems.c - lost in asm-generic/unistd.h */ 4262a727621ce (David Howells 2012-10-11 11:05:13 +0100 25) #include <asm-generic/unistd.h> 4859bfca11c7d (Guo Ren 2018-09-05 14:25:08 +0800 9) #include <asm-generic/unistd.h> b9398a84590be arch/hexagon/include/asm/unistd.h (Richard Kuo 2011-10-31 18:35:16 -0500 40) #include <asm-generic/unistd.h> 1000197d80132 (Ley Foon Tan 2014-11-06 15:19:57 +0800 27) #include <asm-generic/unistd.h> 09abb90107202 arch/openrisc/include/asm/unistd.h (Jonas Bonn 2011-06-04 22:26:51 +0300 30) #include <asm-generic/unistd.h> 27f8899d6002e (David Abdurachmanov 2018-11-08 20:02:39 +0100 26) #include <asm-generic/unistd.h> be769645a2aef (Huacai Chen 2022-05-31 18:04:11 +0800 5) #include <asm-generic/unistd.h>
(2022: the year loongarch added)
$ grep -n statx, -ur fs/stat.c 677:SYSCALL_DEFINE5(statx, $ git blame -L 677,677 fs/stat.c a528d35e8bfcc (David Howells 2017-01-31 16:46:22 +0000 677) SYSCALL_DEFINE5(statx,
So, statx has been added from v4.10 (2017, a528d35e8bfcc) and the last arch support is at least from v4.20 (2018, aff8503932004), perhaps it is safe enough to only reserve the statx now, will simply replace fstat with statx in the coming new revision of this patch, thanks very much.
the old fallbacks (same as for pselect6_time64 as I commented).
Did the same search for this:
$ grep -n pselect6_time64 -ur arch/ | cut -d ':' -f1,2 | tr ':' ' ' | awk '{printf("git blame -L %s,+1 %s\n",$2,$1);}' | bash 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 430) 413 common pselect6_time64 sys_pselect6 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 416) 413 common pselect6_time64 sys_pselect6 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 355) 413 n32 pselect6_time64 compat_sys_pselect6_time64 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 404) 413 o32 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 414) 413 32 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 419) 413 32 pselect6_time64 - compat_sys_pselect6_time64 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 419) 413 common pselect6_time64 sys_pselect6 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 462) 413 32 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 422) 413 common pselect6_time64 sys_pselect6 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 503) 413 32 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64 cab56d3484d4b (Brian Gerst 2020-03-13 15:51:38 -0400 421) 413 i386 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 387) 413 common pselect6_time64 sys_pselect6 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 838) #define __NR_pselect6_time64 413 48166e6ea47d2 (Arnd Bergmann 2019-01-10 12:45:11 +0100 839) __SYSCALL(__NR_pselect6_time64, compat_sys_pselect6_time64)
(not sure if we have missed some archs)
$ grep -n pselect6_time64, fs/select.c 1368:COMPAT_SYSCALL_DEFINE6(pselect6_time64, int, n, compat_ulong_t __user *, inp, $ git blame -L 1368,+1 fs/select.c e024707bccae1 (Deepa Dinamani 2018-09-19 21:41:07 -0700 1368) COMPAT_SYSCALL_DEFINE6(pselect6_time64, int, n, compat_ulong_t __user *, inp,
pselect6_time64 has been added from v4.20 and the last arch support is at least from v5.0.0.
As we discussed in another reply, will add pselect6_time64 at first and reserve the drop patch of the already added oldselect, newselect in the future.
Thanks very much, Zhangjin
Arnd
both riscv64 and riscv32 have the same ARCH value, it is riscv, the default defconfig enables 64bit, to support riscv32, let's allow pass "ARCH=riscv32" or "ARCH=riscv CONFIG_32BIT=1" to customize riscv32 setting.
Note: glibc >= 2.33 is required to avoid a bug of the old bits/wordsize.h.
/usr/riscv64-linux-gnu/include/bits/wordsize.h:28:3: error: #error "rv32i-based targets are not supported" 28 | # error "rv32i-based targets are not supported"
This can not pass compile, several time64 syscalls are still missing.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/Makefile | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index 47c3c89092e4..9adc8944dd80 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -14,6 +14,12 @@ include $(srctree)/scripts/subarch.include ARCH = $(SUBARCH) endif
+# Allow pass ARCH=riscv|riscv32|riscv64, riscv implies riscv64 +ifneq ($(findstring xriscv,x$(ARCH)),) + CONFIG_32BIT := $(if $(findstring 32x,$(ARCH)x),1) + override ARCH := riscv +endif + # kernel image names by architecture IMAGE_i386 = arch/x86/boot/bzImage IMAGE_x86_64 = arch/x86/boot/bzImage @@ -34,7 +40,7 @@ DEFCONFIG_x86 = defconfig DEFCONFIG_arm64 = defconfig DEFCONFIG_arm = multi_v7_defconfig DEFCONFIG_mips = malta_defconfig -DEFCONFIG_riscv = defconfig +DEFCONFIG_riscv = $(if $(CONFIG_32BIT),rv32_defconfig,defconfig) DEFCONFIG_s390 = defconfig DEFCONFIG_loongarch = defconfig DEFCONFIG = $(DEFCONFIG_$(ARCH)) @@ -49,7 +55,7 @@ QEMU_ARCH_x86 = x86_64 QEMU_ARCH_arm64 = aarch64 QEMU_ARCH_arm = arm QEMU_ARCH_mips = mipsel # works with malta_defconfig -QEMU_ARCH_riscv = riscv64 +QEMU_ARCH_riscv = $(if $(CONFIG_32BIT),riscv32,riscv64) QEMU_ARCH_s390 = s390x QEMU_ARCH_loongarch = loongarch64 QEMU_ARCH = $(QEMU_ARCH_$(ARCH)) @@ -76,6 +82,7 @@ else Q=@ endif
+CFLAGS_riscv = $(if $(CONFIG_32BIT),-march=rv32im -mabi=ilp32) CFLAGS_s390 = -m64 CFLAGS_STACKPROTECTOR ?= $(call cc-option,-mstack-protector-guard=global $(call cc-option,-fstack-protector-all)) CFLAGS ?= -Os -fno-ident -fno-asynchronous-unwind-tables -std=c89 \
On 2023-05-25 01:50:57+0800, Zhangjin Wu wrote:
both riscv64 and riscv32 have the same ARCH value, it is riscv, the default defconfig enables 64bit, to support riscv32, let's allow pass "ARCH=riscv32" or "ARCH=riscv CONFIG_32BIT=1" to customize riscv32 setting.
What's the advantage of doing CONFIG_32BIT? For i386/x86_64, arm/arm64 it's not necessary either. (Let's ignore the "x86" case)
If for riscv the "normal" version is riscv64 then adding a new "riscv32" that works the same as the other architectures seems nicer and easier.
Hi, Thomas
On 2023-05-25 01:50:57+0800, Zhangjin Wu wrote:
both riscv64 and riscv32 have the same ARCH value, it is riscv, the default defconfig enables 64bit, to support riscv32, let's allow pass "ARCH=riscv32" or "ARCH=riscv CONFIG_32BIT=1" to customize riscv32 setting.
What's the advantage of doing CONFIG_32BIT? For i386/x86_64, arm/arm64 it's not necessary either. (Let's ignore the "x86" case)
Very good question, thanks.
This requirement may happen on mips, loongarch and even powerpc too, both x86 and arm are just the 'exception'.
It is 'designed' as a temp flag/variable to specifiy that current arch is riscv32, not riscv64, of course, we can use something like this or even use a meaningless 't' variable:
# Allow pass ARCH=riscv|riscv32|riscv64, riscv implies riscv64 ifneq ($(findstring xriscv,x$(ARCH)),) riscv32 := $(if $(findstring riscv32x,$(ARCH)x),1) override ARCH := riscv endif
Using CONFIG_32BIT instead of riscv32 has some extra considerations:
* Using it in command line is a 'side effect', if it is a meaningless variable, we will not use it for we can not remember it, here use it as a choice, riscv32 is enough, we can simply remove this comment from the commit message.
* The architectures like riscv, mips, loongarch share the same source code tree between 32bit and 64bit and even 128bit in the future, x86 and arm just not do so.
The ARCH specified here differs from the one to kernel make, we should provide a flag/variable or anther ARCH variant to show the difference, _ARCH or XARCH have been used in my local patch.
CONFIG_32BIT is meaningful to reflect the difference, even for future, we can use the same thing CONFIG_64BIT, CONFIG_128BIT, so simply BITS=32, BITS=64, BITS=128, but that is hard to be used as a oneline condition statement (although we can use something like findstring).
$(if $(findstring x32y,x$(BITS)y),something whatevever)
v.s.
$(if $(CONFIG_32BIT),something whatevever)
If not use a tmp flag/variable, this works too, but duplicated :-)
$(if $(findstring xriscv32y,x$(ARCH)y),something whatevever)
* We are able to auto detect this config from include/config/auto.conf, there will be something like CONFIG_32BIT=y there.
we did use such auto detect logic in my local patch, but it has some issues if we want a riscv64 build after we did a riscv32 config if we only pass ARCH=riscv, so, I just removed that logic but reserved the pontential for future.
If for riscv the "normal" version is riscv64 then adding a new "riscv32" that works the same as the other architectures seems nicer and easier.
The complexity here is what just explained above: The ARCH specified here differs from the one to kernel make.
It is ok to add riscv32 like the other architectures as following:
ifeq ($(ARCH),riscv32) _ARCH := riscv else _ARCH := $(ARCH) endif
IMAGE_riscv32 = arch/riscv/boot/Image DEFCONFIG_riscv32 = rv32_defconfig QEMU_ARCH_riscv32 = riscv32 QEMU_ARGS_riscv32 = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" CFLAGS_riscv32 = -march=rv32im -mabi=ilp32
And all of the other 'ARCH' variables passed to kernel 'make' should be changed to $(_ARCH), include most of the core targets, like:
sysroot/$(ARCH)/include: $(Q)rm -rf sysroot/$(ARCH) sysroot/sysroot $(QUIET_MKDIR)mkdir -p sysroot $(Q)$(MAKE) -C ../../../include/nolibc ARCH=$(_ARCH) OUTPUT=$(CURDIR)/sysroot/ headers_standalone $(Q)mv sysroot/sysroot sysroot/$(ARCH)
defconfig: $(Q)$(MAKE) -C $(srctree) ARCH=$(_ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
kernel: initramfs $(Q)$(MAKE) -C $(srctree) ARCH=$(_ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) $(IMAGE_NAME) CONFIG_INITRAMFS_SOURCE=$(CURDIR)/initramfs
It is not that easier, it touched more source code and make things a little complex, we must mixes the using of ARCH and _ARCH in the whole Makefile and that is not comfortable and may introduce more complexity, for example, we may be worry about if the directories should be named with the new $(_ARCH) ;-)
And CONFIG_32BIT variable is better than riscv32, because, we can share this meaningful variable among mips, loongarch in the future if their maintainers want to add more variants support for such platforms, they will meet the same requirement.
Thanks very much.
Best regards, Zhangjin
riscv qemu has a builtin bios (opensbi), but it may not match the latest kernel and some old versions may hang during boot, let's allow user pass a newer version to qemu via the -bios option.
we can use it like this:
$ make run BIOS=/path/to/new-bios ...
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index 9adc8944dd80..9213763ab3b6 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -70,7 +70,8 @@ QEMU_ARGS_mips = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)" -QEMU_ARGS = $(QEMU_ARGS_$(ARCH)) +QEMU_ARGS_BIOS = $(if $(BIOS),-bios $(BIOS)) +QEMU_ARGS = $(QEMU_ARGS_$(ARCH)) $(QEMU_ARGS_BIOS)
# OUTPUT is only set when run from the main makefile, otherwise # it defaults to this nolibc directory.
On 2023-05-25 01:52:29+0800, Zhangjin Wu wrote:
riscv qemu has a builtin bios (opensbi), but it may not match the latest kernel and some old versions may hang during boot, let's allow user pass a newer version to qemu via the -bios option.
Nitpick:
This seems very specific and hopefully only necessary temporarily.
Instead it could be changed to some generic mechanim like "QEMU_ARGS_EXTRA"?
we can use it like this:
$ make run BIOS=/path/to/new-bios ...
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index 9adc8944dd80..9213763ab3b6 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -70,7 +70,8 @@ QEMU_ARGS_mips = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)" -QEMU_ARGS = $(QEMU_ARGS_$(ARCH)) +QEMU_ARGS_BIOS = $(if $(BIOS),-bios $(BIOS)) +QEMU_ARGS = $(QEMU_ARGS_$(ARCH)) $(QEMU_ARGS_BIOS) # OUTPUT is only set when run from the main makefile, otherwise
# it defaults to this nolibc directory.
2.25.1
Hi, Thomas
On 2023-05-25 01:52:29+0800, Zhangjin Wu wrote:
riscv qemu has a builtin bios (opensbi), but it may not match the latest kernel and some old versions may hang during boot, let's allow user pass a newer version to qemu via the -bios option.
Nitpick:
This seems very specific and hopefully only necessary temporarily.
RISC-V is such a new ISA and the Spec (especially the SBI) changes very frequently ;-)
Instead it could be changed to some generic mechanim like "QEMU_ARGS_EXTRA"?
Good point, will apply it.
Thanks, Zhangjin
we can use it like this:
$ make run BIOS=/path/to/new-bios ...
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/testing/selftests/nolibc/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile index 9adc8944dd80..9213763ab3b6 100644 --- a/tools/testing/selftests/nolibc/Makefile +++ b/tools/testing/selftests/nolibc/Makefile @@ -70,7 +70,8 @@ QEMU_ARGS_mips = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)" QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)" -QEMU_ARGS = $(QEMU_ARGS_$(ARCH)) +QEMU_ARGS_BIOS = $(if $(BIOS),-bios $(BIOS)) +QEMU_ARGS = $(QEMU_ARGS_$(ARCH)) $(QEMU_ARGS_BIOS) # OUTPUT is only set when run from the main makefile, otherwise # it defaults to this nolibc directory.
On Fri, May 26, 2023 at 06:25:18PM +0800, Zhangjin Wu wrote:
On 2023-05-25 01:52:29+0800, Zhangjin Wu wrote:
riscv qemu has a builtin bios (opensbi), but it may not match the latest kernel and some old versions may hang during boot, let's allow user pass a newer version to qemu via the -bios option.
Nitpick:
This seems very specific and hopefully only necessary temporarily.
RISC-V is such a new ISA and the Spec (especially the SBI) changes very frequently ;-)
Huh. Could you please expand on which versions of QEMU will hang while booting an upstream or stable kernel? Which kernels would be good to know too.
Thanks, Conor.
Hi, Conor.
On Fri, May 26, 2023 at 06:25:18PM +0800, Zhangjin Wu wrote:
On 2023-05-25 01:52:29+0800, Zhangjin Wu wrote:
riscv qemu has a builtin bios (opensbi), but it may not match the latest kernel and some old versions may hang during boot, let's allow user pass a newer version to qemu via the -bios option.
Nitpick:
This seems very specific and hopefully only necessary temporarily.
RISC-V is such a new ISA and the Spec (especially the SBI) changes very frequently ;-)
Huh. Could you please expand on which versions of QEMU will hang while booting an upstream or stable kernel? Which kernels would be good to know too.
As the cover letter listed (in the Environment section), the softwares we used are:
// higher qemu version is better, latest version is v8.0.0+ $ qemu-system-riscv64 --version QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.18) Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
// opensbi version, higher is better, must match kernel version and qemu version // rv64: used version is 1.2, latest is 1.2 $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1 OpenSBI v1.2-116-g7919530 // rv32: used version is v0.9, latest is 1.2 $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1 OpenSBI v0.9-152-g754d511
The kernel version is the one this patchset based on (Willy's nolibc repo), it is v6.4-rc1.
qemu v4.2.1 is the one systematically installed (/usr/bin) from the qemu-system-misc package and used to test this patchset in my Ubuntu 20.04 based test docker image.
Just installed a v7.0.0 qemu from ppa:canonical-server/server-backports, there is no default opensbi, and re-checked, there is one prebuilt opensbi for rv64, but still no prebuilt opensbi for rv32.
$ sudo add-apt-repository ppa:canonical-server/server-backports $ sudo apt install qemu-system-misc
$ sudo apt install opensbi $ dpkg -S opensbi | grep -E "fw_*bin|elf" qemu-system-data: /usr/share/qemu/opensbi-riscv64-generic-fw_dynamic.elf opensbi: /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_dynamic.elf opensbi: /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf
$ qemu-system-riscv32 -display none -no-reboot -kernel build/riscv32/virt/linux/v6.4-rc1/arch/riscv/boot/Image -serial stdio -M virt -append "console=ttyS0 panic=-1" qemu-system-riscv32: Unable to load the RISC-V firmware "opensbi-riscv32-generic-fw_dynamic.bin"
I used the one built myself, If not want to build such opensbi manually, we can download one (1.2 currently) from qemu source code:
https://gitlab.com/qemu-project/qemu/-/blob/master/pc-bios/opensbi-riscv32-g...
Then, we can use it like this:
$ qemu-system-riscv32 -display none -no-reboot -kernel build/riscv32/virt/linux/v6.4-rc1/arch/riscv/boot/Image -serial stdio -M virt -append "console=ttyS0 panic=-1" -bios /path/to/opensbi-riscv32-generic-fw_dynamic.bin
Will append this extra info in the commit message of the coming new revision of this patch, thanks a lot.
The hang issue I mentioned may be using one of my older prebuilt version of opensbi, I can not find which one it exactly is, so, please ignore that info, will update that description too.
Btw, something not about this patch: qemu v8.0.0 seems not boot non-mmu v6.3, both sides have issues, not dig into it carefully, so, not report it yet.
Thanks, Zhangjin
Thanks, Conor.
On Fri, May 26, 2023 at 09:38:25PM +0800, Zhangjin Wu wrote:
Hi, Conor.
On Fri, May 26, 2023 at 06:25:18PM +0800, Zhangjin Wu wrote:
On 2023-05-25 01:52:29+0800, Zhangjin Wu wrote:
riscv qemu has a builtin bios (opensbi), but it may not match the latest kernel and some old versions may hang during boot, let's allow user pass a newer version to qemu via the -bios option.
Nitpick:
This seems very specific and hopefully only necessary temporarily.
RISC-V is such a new ISA and the Spec (especially the SBI) changes very frequently ;-)
Huh. Could you please expand on which versions of QEMU will hang while booting an upstream or stable kernel? Which kernels would be good to know too.
As the cover letter listed (in the Environment section), the softwares we used are:
Not super interested in those ones since they work ;)
The kernel version is the one this patchset based on (Willy's nolibc repo), it is v6.4-rc1.
qemu v4.2.1 is the one systematically installed (/usr/bin) from the qemu-system-misc package and used to test this patchset in my Ubuntu 20.04 based test docker image.
Okay, in the context of RISC-V, that is pretty ancient ;)
Just installed a v7.0.0 qemu from ppa:canonical-server/server-backports, there is no default opensbi, and re-checked, there is one prebuilt opensbi for rv64, but still no prebuilt opensbi for rv32.
Ah, I see.
The hang issue I mentioned may be using one of my older prebuilt version of opensbi, I can not find which one it exactly is, so, please ignore that info, will update that description too.
Okay. If you do manage to reproduce it, LMK! I was/am just worried we have some regressions because you should be able to keep booting with those older opensbi versions, modulo some Kconfig changes - although if it is something like qemu 4.2.1 specific I don't think I care all that much about dinosaurs ;)
Btw, something not about this patch: qemu v8.0.0 seems not boot non-mmu v6.3, both sides have issues, not dig into it carefully, so, not report it yet.
Cool. Feel free to CC me on whatever you discover. nommu gets little enough testing in mainline, and even less in stable kernels. That reminds me, I do need to add 32-bit nommu to the patchwork automation for linux-riscv.
Thanks, Conor.
On Fri, May 26, 2023 at 09:00:02AM +0200, Thomas Weißschuh wrote:
On 2023-05-25 01:52:29+0800, Zhangjin Wu wrote:
riscv qemu has a builtin bios (opensbi), but it may not match the latest kernel and some old versions may hang during boot, let's allow user pass a newer version to qemu via the -bios option.
Nitpick:
This seems very specific and hopefully only necessary temporarily.
Instead it could be changed to some generic mechanim like "QEMU_ARGS_EXTRA"?
FWIW I, too, think that QEMU_ARGS_EXTRA could be very convenient for various test cases on any architecture (change number of CPUs, RAM size, boot options etc).
Willy
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/testing/selftests/nolibc/nolibc-test.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index c86ff6018c7f..a9c07018ac9d 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -575,7 +575,6 @@ int run_syscall(int min, int max) #ifdef NOLIBC CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; - CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; #endif CASE_TEST(getpagesize); EXPECT_SYSZR(1, test_getpagesize()); break; CASE_TEST(ioctl_tiocinq); EXPECT_SYSZR(1, ioctl(0, TIOCINQ, &tmp)); break;
riscv uses the generic include/uapi/asm-generic/unistd.h, it has code like this:
#if __BITS_PER_LONG == 64 && !defined(__SYSCALL_COMPAT) #define __NR_lseek __NR3264_lseek #else #define __NR_llseek __NR3264_lseek #endif
There is no __NR_lseek for rv32, use __NR_llseek instead.
This code is based on sysdeps/unix/sysv/linux/lseek.c of glibc.
Signed-off-by: Zhangjin Wu falcon@tinylab.org Signed-off-by: Willy Tarreau w@1wt.eu --- tools/include/nolibc/std.h | 1 + tools/include/nolibc/sys.h | 18 ++++++++++++++++++ 2 files changed, 19 insertions(+)
diff --git a/tools/include/nolibc/std.h b/tools/include/nolibc/std.h index 933bc0be7e1c..83c0b0cb9564 100644 --- a/tools/include/nolibc/std.h +++ b/tools/include/nolibc/std.h @@ -32,5 +32,6 @@ typedef signed long off_t; typedef signed long blksize_t; typedef signed long blkcnt_t; typedef signed long time_t; +typedef long long loff_t;
#endif /* _NOLIBC_STD_H */ diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index d5792a5de70b..0ff77c0a06d7 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -671,7 +671,25 @@ int link(const char *old, const char *new) static __attribute__((unused)) off_t sys_lseek(int fd, off_t offset, int whence) { +#ifdef __NR_lseek return my_syscall3(__NR_lseek, fd, offset, whence); +#elif defined(__NR_llseek) + loff_t result; + off_t retval; + int ret; + + ret = my_syscall5(__NR_llseek, fd, (long) (((uint64_t) (offset)) >> 32), (long) offset, &result, whence); + if (ret) + return ret; + + retval = (off_t) result; + if (retval != result) + return -EOVERFLOW; + + return retval; +#else +#error Neither __NR_lseek nor __NR_llseek defined, cannot implement sys_lseek() +#endif }
static __attribute__((unused))
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_ppoll after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_ppoll_time64 instead.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/include/nolibc/std.h | 1 + tools/include/nolibc/sys.h | 7 ++++++- tools/include/nolibc/types.h | 6 ++++++ 3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/std.h b/tools/include/nolibc/std.h index 83c0b0cb9564..221385c0e823 100644 --- a/tools/include/nolibc/std.h +++ b/tools/include/nolibc/std.h @@ -32,6 +32,7 @@ typedef signed long off_t; typedef signed long blksize_t; typedef signed long blkcnt_t; typedef signed long time_t; +typedef long long time64_t; typedef long long loff_t;
#endif /* _NOLIBC_STD_H */ diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 0ff77c0a06d7..08d38175bd7b 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -923,8 +923,13 @@ int pivot_root(const char *new, const char *old) static __attribute__((unused)) int sys_poll(struct pollfd *fds, int nfds, int timeout) { -#if defined(__NR_ppoll) +#if defined(__NR_ppoll) || defined(__NR_ppoll_time64) +#ifdef __NR_ppoll struct timespec t; +#else + struct timespec64 t; +#define __NR_ppoll __NR_ppoll_time64 +#endif
if (timeout >= 0) { t.tv_sec = timeout / 1000; diff --git a/tools/include/nolibc/types.h b/tools/include/nolibc/types.h index 15b0baffd336..ee914391439c 100644 --- a/tools/include/nolibc/types.h +++ b/tools/include/nolibc/types.h @@ -203,6 +203,12 @@ struct stat { time_t st_ctime; /* time of last status change */ };
+/* needed by time64 syscalls */ +struct timespec64 { + time64_t tv_sec; /* seconds */ + long tv_nsec; /* nanoseconds */ +}; + /* WARNING, it only deals with the 4096 first majors and 256 first minors */ #define makedev(major, minor) ((dev_t)((((major) & 0xfff) << 8) | ((minor) & 0xff))) #define major(dev) ((unsigned int)(((dev) >> 8) & 0xfff))
On 2023-05-25 01:57:24+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_ppoll after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_ppoll_time64 instead.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/std.h | 1 + tools/include/nolibc/sys.h | 7 ++++++- tools/include/nolibc/types.h | 6 ++++++ 3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/std.h b/tools/include/nolibc/std.h index 83c0b0cb9564..221385c0e823 100644 --- a/tools/include/nolibc/std.h +++ b/tools/include/nolibc/std.h @@ -32,6 +32,7 @@ typedef signed long off_t; typedef signed long blksize_t; typedef signed long blkcnt_t; typedef signed long time_t; +typedef long long time64_t; typedef long long loff_t; #endif /* _NOLIBC_STD_H */ diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 0ff77c0a06d7..08d38175bd7b 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -923,8 +923,13 @@ int pivot_root(const char *new, const char *old) static __attribute__((unused)) int sys_poll(struct pollfd *fds, int nfds, int timeout) { -#if defined(__NR_ppoll) +#if defined(__NR_ppoll) || defined(__NR_ppoll_time64) +#ifdef __NR_ppoll struct timespec t; +#else
- struct timespec64 t;
+#define __NR_ppoll __NR_ppoll_time64 +#endif if (timeout >= 0) { t.tv_sec = timeout / 1000; diff --git a/tools/include/nolibc/types.h b/tools/include/nolibc/types.h index 15b0baffd336..ee914391439c 100644 --- a/tools/include/nolibc/types.h +++ b/tools/include/nolibc/types.h @@ -203,6 +203,12 @@ struct stat { time_t st_ctime; /* time of last status change */ }; +/* needed by time64 syscalls */ +struct timespec64 {
- time64_t tv_sec; /* seconds */
- long tv_nsec; /* nanoseconds */
+};
A question to you and Willy, as it's also done the same for other types:
What is the advantage of custom definitions over using the one from the kernel (maybe via a typedef).
From linux/time_types.h:
struct __kernel_timespec { __kernel_time64_t tv_set; long long tv_nsec; };
/* WARNING, it only deals with the 4096 first majors and 256 first minors */ #define makedev(major, minor) ((dev_t)((((major) & 0xfff) << 8) | ((minor) & 0xff)))
#define major(dev) ((unsigned int)(((dev) >> 8) & 0xfff))
2.25.1
On Fri, May 26, 2023, at 09:15, Thomas Weißschuh wrote:
On 2023-05-25 01:57:24+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no
+/* needed by time64 syscalls */ +struct timespec64 {
- time64_t tv_sec; /* seconds */
- long tv_nsec; /* nanoseconds */
+};
A question to you and Willy, as it's also done the same for other types:
What is the advantage of custom definitions over using the one from the kernel (maybe via a typedef).
From linux/time_types.h:
struct __kernel_timespec { __kernel_time64_t tv_set; long long tv_nsec; };
I agree the __kernel_* types are what we should be using when interacting with system calls directly, that is definitely what they are intended for.
I would go further here and completely drop support for 32-bit time_t/off_t and derived types in nolibc. Unfortunately, the kernel's include/uapi/linux/time.h header still defines the old types, this is one of the last remnants the time32 syscalls definitions in the kernel headers, and this already conflicts with the glibc and musl definitions, so anything that includes this header is broken on real systems. I think it makes most sense for nolibc to just use the linux/time_types.h header instead and use something like
#define timespec __kernel_timespec #define itimerspec __kernel_itimerspec typedef __kernel_time64_t time_t; /* timeval is only provided for users, not compatible with syscalls */ struct timeval { __kernel_time64_t tv_sec; __s64 tv_nsec; };
so we can drop all the fallbacks for old 32-bit targets. This also allows running with CONFIG_COMPAT_32BIT_TIME disabled.
Arnd
Hi, Arnd, Thomas, Willy
On Fri, May 26, 2023, at 09:15, Thomas Wei=C3=9Fschuh wrote:
On 2023-05-25 01:57:24+0800, Zhangjin Wu wrote:
+/* needed by time64 syscalls */ +struct timespec64 {
- time64_t tv_sec; /* seconds */
- long tv_nsec; /* nanoseconds */
+};
A question to you and Willy, as it's also done the same for other types:
What is the advantage of custom definitions over using the one from the kernel (maybe via a typedef).
From linux/time_types.h:
struct __kernel_timespec { __kernel_time64_t tv_set; long long tv_nsec; };
I agree the __kernel_* types are what we should be using when interacting with system calls directly, that is definitely what they are intended for.
I would go further here and completely drop support for 32-bit time_t/off_t and derived types in nolibc. Unfortunately, the kernel's include/uapi/linux/time.h header still defines the old types, this is one of the last remnants the time32 syscalls definitions in the kernel headers, and this already conflicts with the glibc and musl definitions, so anything that includes this header is broken on real systems. I think it makes most sense for nolibc to just use the linux/time_types.h header instead and use something like
#define timespec __kernel_timespec #define itimerspec __kernel_itimerspec typedef __kernel_time64_t time_t; /* timeval is only provided for users, not compatible with syscalls */ struct timeval { __kernel_time64_t tv_sec; __s64 tv_nsec; };
so we can drop all the fallbacks for old 32-bit targets. This also allows running with CONFIG_COMPAT_32BIT_TIME disabled.
Just a status update ...
I'm working on the pure time64 and 64bit off_t syscalls support, it almost worked (tested on rv32/64, arm32/64), thanks very much for your suggestions.
It includes:
* Based on linux/types.h and * Use 64bit off_t * Use 64bit time_t * the new std.h looks like this
typedef uint32_t __kernel_dev_t;
typedef __kernel_dev_t dev_t; typedef __kernel_ulong_t ino_t; typedef __kernel_mode_t mode_t; typedef __kernel_pid_t pid_t; typedef __kernel_uid32_t uid_t; typedef __kernel_gid32_t gid_t; typedef __kernel_loff_t off_t; typedef __kernel_time64_t time_t; typedef uint32_t nlink_t; typedef uint64_t blksize_t; typedef uint64_t blkcnt_t;
* Use __kernel_timespec as timespec * Use 64bit time_t based struct timeval * Disable gettimeofday syscall completely for 32bit platforms * And disable the gettimeofday_bad1/2 test case too * Remove the oldselect and newslect path completely * The new types.h looks this:
/* always use time64 structs in user-space even on 32bit platforms */ #define timespec __kernel_timespec #define itimerspec __kernel_itimerspec
/* timeval is only provided for users, not compatible with syscalls */ struct __timeval64 { __kernel_time64_t tv_sec; /* seconds */ __s64 tv_usec; /* microseconds */ }; /* override the 32bit version of struct timeval in linux/time.h */ #define timeval __timeval64
/* itimerval is only provided for users, not compatible with syscalls */ struct __itimerval64 { struct __timeval64 it_interval; /* timer interval */ struct __timeval64 it_value; /* current value */ }; /* override the 32bit version of struct itimerval in linux/time.h */ #define itimerval __itimerval64
* Use __NR_*time64 for all 32bit platforms * Use __NR_pselect6/ppoll/clock_gettime only for 64bit platforms * New sizeof tests added to verify off_t, time_t, timespec, itimerspec...
CASE_TEST(sizeof_time_t); EXPECT_EQ(1, 8, sizeof(time_t)); break; CASE_TEST(sizeof_timespec); EXPECT_EQ(1, 16, sizeof(struct timespec)); break; #ifdef NOLIBC CASE_TEST(sizeof_itimerspec); EXPECT_EQ(1, 32, sizeof(struct itimerspec)); break; #endif CASE_TEST(sizeof_timeval); EXPECT_EQ(1, 16, sizeof(struct timeval)); break; CASE_TEST(sizeof_itimerval); EXPECT_EQ(1, 32, sizeof(struct itimerval)); break; CASE_TEST(sizeof_off_t); EXPECT_EQ(1, 8, sizeof(off_t)); break;
@Arnd, the above timeval/itimerval definitions are used to override the ones from linux/time.h to avoid such error:
error: redefinition of ‘struct timeval’
nolibc/sysroot/riscv/include/types.h:225:8: error: redefinition of ‘struct timeval’ 225 | struct timeval { | ^~~~~~~ In file included from nolibc/sysroot/riscv/include/types.h:11, from nolibc/sysroot/riscv/include/nolibc.h:98, from nolibc/sysroot/riscv/include/errno.h:26, from nolibc/sysroot/riscv/include/stdio.h:14, from tools/testing/selftests/nolibc/nolibc-test.c:12: nolibc/sysroot/riscv/include/linux/time.h:16:8: note: originally defined here 16 | struct timeval {
@Arnd, As you commented in another reply, is it time for us to update include/uapi/linux/time.h together and let it provide time64 timeval/itimerval instead of the old ones? perhaps some libc's are still using them.
Or perhaps we can add a switch like __ARCH_WANT_TIME32_SYSCALLS, add a __ARCH_WANT_TIME32_STRUCTS and simply bind it with __ARCH_WANT_TIME32_SYSCALLS?
About the above ugly override code, What's your suggestion in v2? ;-)
Best regards, Zhangjin
Arnd
On Sun, May 28, 2023, at 10:25, Zhangjin Wu wrote:
On Fri, May 26, 2023, at 09:15, Thomas Wei=C3=9Fschuh wrote:
Use __NR_*time64 for all 32bit platforms
Use __NR_pselect6/ppoll/clock_gettime only for 64bit platforms
New sizeof tests added to verify off_t, time_t, timespec, itimerspec...
CASE_TEST(sizeof_time_t); EXPECT_EQ(1, 8,
sizeof(time_t)); break; CASE_TEST(sizeof_timespec); EXPECT_EQ(1, 16, sizeof(struct timespec)); break; #ifdef NOLIBC CASE_TEST(sizeof_itimerspec); EXPECT_EQ(1, 32, sizeof(struct itimerspec)); break; #endif CASE_TEST(sizeof_timeval); EXPECT_EQ(1, 16, sizeof(struct timeval)); break; CASE_TEST(sizeof_itimerval); EXPECT_EQ(1, 32, sizeof(struct itimerval)); break; CASE_TEST(sizeof_off_t); EXPECT_EQ(1, 8, sizeof(off_t)); break;
@Arnd, the above timeval/itimerval definitions are used to override the ones from linux/time.h to avoid such error:
error: redefinition of ‘struct timeval’ nolibc/sysroot/riscv/include/types.h:225:8: error: redefinition of
‘struct timeval’ 225 | struct timeval { | ^~~~~~~ In file included from nolibc/sysroot/riscv/include/types.h:11, from nolibc/sysroot/riscv/include/nolibc.h:98, from nolibc/sysroot/riscv/include/errno.h:26, from nolibc/sysroot/riscv/include/stdio.h:14, from tools/testing/selftests/nolibc/nolibc-test.c:12: nolibc/sysroot/riscv/include/linux/time.h:16:8: note: originally defined here 16 | struct timeval {
@Arnd, As you commented in another reply, is it time for us to update include/uapi/linux/time.h together and let it provide time64 timeval/itimerval instead of the old ones? perhaps some libc's are still using them.
It's hard to know if anything is using it until we try. On the other hand, we also know that anything still relying on it is going to be increasingly broken on 32-bit architectures over as we get closer to y2038, so we can just try removing these here to see what happens.
Or perhaps we can add a switch like __ARCH_WANT_TIME32_SYSCALLS, add a __ARCH_WANT_TIME32_STRUCTS and simply bind it with __ARCH_WANT_TIME32_SYSCALLS?
I don't like that idea: __ARCH_WANT_TIME32_SYSCALLS tells us that an architeture still provides those syscalls for compatibility, so that is architecture specific, but __ARCH_WANT_TIME32_STRUCTS is not something that is an architecture specific decision at all, it's only needed for old source code.
About the above ugly override code, What's your suggestion in v2? ;-)
Can you maybe just remove the "#include <linux/time.h>" from all include/uapi/ and nolibc headers so it just never gets seen?
Unfortunately I see the header contains a few other definitions, which might have to get moved out of the way, possibly to linux/time_types.h.
Arnd
Hi Zhangjin,
On Sun, May 28, 2023 at 04:25:09PM +0800, Zhangjin Wu wrote:
Just a status update ...
I'm working on the pure time64 and 64bit off_t syscalls support, it almost worked (tested on rv32/64, arm32/64), thanks very much for your suggestions.
It includes:
Based on linux/types.h and
- Use 64bit off_t
- Use 64bit time_t
- the new std.h looks like this
typedef uint32_t __kernel_dev_t; typedef __kernel_dev_t dev_t; typedef __kernel_ulong_t ino_t; typedef __kernel_mode_t mode_t; typedef __kernel_pid_t pid_t; typedef __kernel_uid32_t uid_t; typedef __kernel_gid32_t gid_t; typedef __kernel_loff_t off_t; typedef __kernel_time64_t time_t; typedef uint32_t nlink_t; typedef uint64_t blksize_t; typedef uint64_t blkcnt_t;
Use __kernel_timespec as timespec
Use 64bit time_t based struct timeval
- Disable gettimeofday syscall completely for 32bit platforms
- And disable the gettimeofday_bad1/2 test case too
When you say "disable", you mean "remap", right ? Or do you mean "break in 2023 code that was expected to break only in 2038 after the hardware supporting it no longer exists" ?
Thanks, Willy
On Sun, May 28, 2023, at 12:29, Willy Tarreau wrote:
On Sun, May 28, 2023 at 04:25:09PM +0800, Zhangjin Wu wrote:
- Use __kernel_timespec as timespec
- Use 64bit time_t based struct timeval
- Disable gettimeofday syscall completely for 32bit platforms
- And disable the gettimeofday_bad1/2 test case too
When you say "disable", you mean "remap", right ? Or do you mean "break in 2023 code that was expected to break only in 2038 after
clock_gettime() has been supported for a very long time, so both time() and gettimeofday() can be trivial wrappers around that.
Nothing really should be using the timezone argument, so I'd just ignore that in nolibc. (it's a little trickier for /sbin/init setting the initial timezone, but I hope we can ignore that here).
clock_gettime() as a function call that takes a timespec argument in turn should be a wrapper around either sys_clock_gettime64 (on 32-bit architectures) or sys_clock_gettime_old() (on 64-bit architectures, or as a fallback on old 32-bit kernels after clock_gettime64 fails).
On normal libc implementations, the low-level sys_clock_gettime64() and sys_clock_gettime_old(), whatever they are named, would call vdso first and then fall back to the syscall, but I don't think that's necessary for nolibc.
I'd define them the same as the kernel, with sys_clock_gettime64() taking a __kernel_timespec, and sys_clock_gettime_old() takeing a __kernel_old_timespec.
Arnd
On Sun, May 28, 2023 at 12:55:47PM +0200, Arnd Bergmann wrote:
On Sun, May 28, 2023, at 12:29, Willy Tarreau wrote:
On Sun, May 28, 2023 at 04:25:09PM +0800, Zhangjin Wu wrote:
- Use __kernel_timespec as timespec
- Use 64bit time_t based struct timeval
- Disable gettimeofday syscall completely for 32bit platforms
- And disable the gettimeofday_bad1/2 test case too
When you say "disable", you mean "remap", right ? Or do you mean "break in 2023 code that was expected to break only in 2038 after
clock_gettime() has been supported for a very long time, so both time() and gettimeofday() can be trivial wrappers around that.
OK, that's what I wanted to clarify. I understood "drop" in the sense of, well, "drop" :-)
Nothing really should be using the timezone argument, so I'd just ignore that in nolibc. (it's a little trickier for /sbin/init setting the initial timezone, but I hope we can ignore that here).
Yes I'm fine with this approach.
clock_gettime() as a function call that takes a timespec argument in turn should be a wrapper around either sys_clock_gettime64 (on 32-bit architectures) or sys_clock_gettime_old() (on 64-bit architectures, or as a fallback on old 32-bit kernels after clock_gettime64 fails).
Sounds good to me.
On normal libc implementations, the low-level sys_clock_gettime64() and sys_clock_gettime_old(), whatever they are named, would call vdso first and then fall back to the syscall, but I don't think that's necessary for nolibc.
Indeed, we don't exploit the VDSO here since it's essentially useful for performance and that's not what we're seeking.
I'd define them the same as the kernel, with sys_clock_gettime64() taking a __kernel_timespec, and sys_clock_gettime_old() takeing a __kernel_old_timespec.
Sounds good, thanks Arnd! Willy
The ppoll and ppoll_time64 syscalls have 5 arguments, but we only provide 4, align with kernel and add the missing sigsetsize argument.
Because the sigmask is NULL, the last sigsetsize argument is ignored, keep it as 0 here is safe enough.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/include/nolibc/sys.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 08d38175bd7b..c0335a84f880 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -935,7 +935,7 @@ int sys_poll(struct pollfd *fds, int nfds, int timeout) t.tv_sec = timeout / 1000; t.tv_nsec = (timeout % 1000) * 1000000; } - return my_syscall4(__NR_ppoll, fds, nfds, (timeout >= 0) ? &t : NULL, NULL); + return my_syscall5(__NR_ppoll, fds, nfds, (timeout >= 0) ? &t : NULL, NULL, 0); #elif defined(__NR_poll) return my_syscall3(__NR_poll, fds, nfds, timeout); #else
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_pselect6 after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_pselect6_time64 instead.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/include/nolibc/sys.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index c0335a84f880..00c7197dcd50 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -1041,8 +1041,13 @@ int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeva struct timeval *t; } arg = { .n = nfds, .r = rfds, .w = wfds, .e = efds, .t = timeout }; return my_syscall1(__NR_select, &arg); -#elif defined(__ARCH_WANT_SYS_PSELECT6) && defined(__NR_pselect6) +#elif defined(__ARCH_WANT_SYS_PSELECT6) && (defined(__NR_pselect6) || defined(__NR_pselect6_time64)) +#ifdef __NR_pselect6 struct timespec t; +#else + struct timespec64 t; +#define __NR_pselect6 __NR_pselect6_time64 +#endif
if (timeout) { t.tv_sec = timeout->tv_sec;
On 2023-05-25 01:59:55+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_pselect6 after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_pselect6_time64 instead.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/sys.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index c0335a84f880..00c7197dcd50 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -1041,8 +1041,13 @@ int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeva struct timeval *t; } arg = { .n = nfds, .r = rfds, .w = wfds, .e = efds, .t = timeout }; return my_syscall1(__NR_select, &arg); -#elif defined(__ARCH_WANT_SYS_PSELECT6) && defined(__NR_pselect6) +#elif defined(__ARCH_WANT_SYS_PSELECT6) && (defined(__NR_pselect6) || defined(__NR_pselect6_time64)) +#ifdef __NR_pselect6 struct timespec t; +#else
- struct timespec64 t;
+#define __NR_pselect6 __NR_pselect6_time64
Wouldn't this #define leak to the users of nolibc and lead to calls to pselect6_time64 with the ABI of the __NR_pselect6 if userspace is doing its own raw syscalls?
+#endif if (timeout) { t.tv_sec = timeout->tv_sec; -- 2.25.1
Hi, Thomas
On 2023-05-25 01:59:55+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_pselect6 after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_pselect6_time64 instead.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/sys.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index c0335a84f880..00c7197dcd50 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -1041,8 +1041,13 @@ int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeva struct timeval *t; } arg = { .n = nfds, .r = rfds, .w = wfds, .e = efds, .t = timeout }; return my_syscall1(__NR_select, &arg); -#elif defined(__ARCH_WANT_SYS_PSELECT6) && defined(__NR_pselect6) +#elif defined(__ARCH_WANT_SYS_PSELECT6) && (defined(__NR_pselect6) || defined(__NR_pselect6_time64)) +#ifdef __NR_pselect6 struct timespec t; +#else
- struct timespec64 t;
+#define __NR_pselect6 __NR_pselect6_time64
Wouldn't this #define leak to the users of nolibc and lead to calls to pselect6_time64 with the ABI of the __NR_pselect6 if userspace is doing its own raw syscalls?
Yeah, it would break the user-side raw __NR_pselect6 syscall for nolibc is a header-only libc, so, it is not safe to use such method like glibc.
Something like this will let the syscall call to pselect6_time64 instead of the user-required __NR_pselect6 and pass the wrong type of argument.
#include "nolibc.h" // If no __NR_pselect6 defined, __NR_pselect6 = __NR_pselect6_time64
#ifdef __NR_pselect6 struct timespec t; // come here for __NR_pselect6_time64, but t is not timespec64, broken syscall(__NR_pselect6, nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL); #else struct timespec64 t; syscall(__NR_pselect6, nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL); #endif
I have used something like __NR_pselect6_time3264 locally, before sending the patchset, I found a cleaner method already used in sys.h:
#ifndef __NR__newselect #define __NR__newselect __NR_select #endif
But I forgot the arguments mixing issue, __NR__newselect and __NR_select share the same type of arguments, but __NR_pselect6 and __NR_pselect6_time64 not, so, I will use back the old method but still need to find a better string, just like __NR__newselect, __NR__pselect6 may be used in kernel space in the future, and __NR_pselect6_time3264 is too long, what about this?
#ifdef __NR_pselect6 struct timespec t; #define __NR_pselect6__ __NR_pselect6 #else struct timespec64 t; #define __NR_pselect6__ __NR_pselect6_time64 #endif
Or even ___NR_pselect6?
The same issue is in this patch:
[PATCH 13/13] tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64
will solve it with the same method.
Thanks, Zhangjin
+#endif if (timeout) { t.tv_sec = timeout->tv_sec; -- 2.25.1
On 2023-05-25 15:10:21+0800, Zhangjin Wu wrote:
Hi, Thomas
On 2023-05-25 01:59:55+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_pselect6 after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_pselect6_time64 instead.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/sys.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index c0335a84f880..00c7197dcd50 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -1041,8 +1041,13 @@ int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeva struct timeval *t; } arg = { .n = nfds, .r = rfds, .w = wfds, .e = efds, .t = timeout }; return my_syscall1(__NR_select, &arg); -#elif defined(__ARCH_WANT_SYS_PSELECT6) && defined(__NR_pselect6) +#elif defined(__ARCH_WANT_SYS_PSELECT6) && (defined(__NR_pselect6) || defined(__NR_pselect6_time64)) +#ifdef __NR_pselect6 struct timespec t; +#else
- struct timespec64 t;
+#define __NR_pselect6 __NR_pselect6_time64
Wouldn't this #define leak to the users of nolibc and lead to calls to pselect6_time64 with the ABI of the __NR_pselect6 if userspace is doing its own raw syscalls?
Yeah, it would break the user-side raw __NR_pselect6 syscall for nolibc is a header-only libc, so, it is not safe to use such method like glibc.
Something like this will let the syscall call to pselect6_time64 instead of the user-required __NR_pselect6 and pass the wrong type of argument.
#include "nolibc.h" // If no __NR_pselect6 defined, __NR_pselect6 = __NR_pselect6_time64 #ifdef __NR_pselect6 struct timespec t; // come here for __NR_pselect6_time64, but t is not timespec64, broken syscall(__NR_pselect6, nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL); #else struct timespec64 t; syscall(__NR_pselect6, nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL); #endif
I have used something like __NR_pselect6_time3264 locally, before sending the patchset, I found a cleaner method already used in sys.h:
#ifndef __NR__newselect #define __NR__newselect __NR_select #endif
But I forgot the arguments mixing issue, __NR__newselect and __NR_select share the same type of arguments, but __NR_pselect6 and __NR_pselect6_time64 not, so, I will use back the old method but still need to find a better string, just like __NR__newselect, __NR__pselect6 may be used in kernel space in the future, and __NR_pselect6_time3264 is too long, what about this?
#ifdef __NR_pselect6 struct timespec t; #define __NR_pselect6__ __NR_pselect6 #else struct timespec64 t; #define __NR_pselect6__ __NR_pselect6_time64 #endif
Or even ___NR_pselect6?
What about:
#ifdef __NR_pselect6 struct timespec t; const long nr_pselect = __NR_pselect6; #else struct timespec64 t; const long nr_pselect = __NR_pselect6_time64; #endif
The same issue is in this patch:
[PATCH 13/13] tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64
will solve it with the same method.
Thanks!
On 2023-05-25 15:10:21+0800, Zhangjin Wu wrote:
Hi, Thomas
On 2023-05-25 01:59:55+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_pselect6 after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_pselect6_time64 instead.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/sys.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index c0335a84f880..00c7197dcd50 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -1041,8 +1041,13 @@ int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeva struct timeval *t; } arg = { .n = nfds, .r = rfds, .w = wfds, .e = efds, .t = timeout }; return my_syscall1(__NR_select, &arg); -#elif defined(__ARCH_WANT_SYS_PSELECT6) && defined(__NR_pselect6) +#elif defined(__ARCH_WANT_SYS_PSELECT6) && (defined(__NR_pselect6) || defined(__NR_pselect6_time64)) +#ifdef __NR_pselect6 struct timespec t; +#else
- struct timespec64 t;
+#define __NR_pselect6 __NR_pselect6_time64
Wouldn't this #define leak to the users of nolibc and lead to calls to pselect6_time64 with the ABI of the __NR_pselect6 if userspace is doing its own raw syscalls?
Yeah, it would break the user-side raw __NR_pselect6 syscall for nolibc is a header-only libc, so, it is not safe to use such method like glibc.
Something like this will let the syscall call to pselect6_time64 instead of the user-required __NR_pselect6 and pass the wrong type of argument.
#include "nolibc.h" // If no __NR_pselect6 defined, __NR_pselect6 = __NR_pselect6_time64 #ifdef __NR_pselect6 struct timespec t; // come here for __NR_pselect6_time64, but t is not timespec64, broken syscall(__NR_pselect6, nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL); #else struct timespec64 t; syscall(__NR_pselect6, nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL); #endif
I have used something like __NR_pselect6_time3264 locally, before sending the patchset, I found a cleaner method already used in sys.h:
#ifndef __NR__newselect #define __NR__newselect __NR_select #endif
But I forgot the arguments mixing issue, __NR__newselect and __NR_select share the same type of arguments, but __NR_pselect6 and __NR_pselect6_time64 not, so, I will use back the old method but still need to find a better string, just like __NR__newselect, __NR__pselect6 may be used in kernel space in the future, and __NR_pselect6_time3264 is too long, what about this?
#ifdef __NR_pselect6 struct timespec t; #define __NR_pselect6__ __NR_pselect6 #else struct timespec64 t; #define __NR_pselect6__ __NR_pselect6_time64 #endif
Or even ___NR_pselect6?
What about:
#ifdef __NR_pselect6 struct timespec t; const long nr_pselect = __NR_pselect6; #else struct timespec64 t; const long nr_pselect = __NR_pselect6_time64; #endif
It looks better and cleaner, will apply this method, thanks!
The same issue is in this patch:
[PATCH 13/13] tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64
will solve it with the same method.
And also this one:
[PATCH 09/13] tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64
Have tested all of them, will send a v2 later.
Best regards, Zhangjin
Thanks!
On Wed, May 24, 2023, at 19:59, Zhangjin Wu wrote:
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index c0335a84f880..00c7197dcd50 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -1041,8 +1041,13 @@ int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeva struct timeval *t; } arg = { .n = nfds, .r = rfds, .w = wfds, .e = efds, .t = timeout }; return my_syscall1(__NR_select, &arg); -#elif defined(__ARCH_WANT_SYS_PSELECT6) && defined(__NR_pselect6) +#elif defined(__ARCH_WANT_SYS_PSELECT6) && (defined(__NR_pselect6) || defined(__NR_pselect6_time64)) +#ifdef __NR_pselect6 struct timespec t; +#else
- struct timespec64 t;
+#define __NR_pselect6 __NR_pselect6_time64 +#endif
Overriding the __NR_pselect6 constant seems wrong here, this can easily lead to more bugs, as pselect6 and pselect6_time64 are not compatible because of the different arguments, so anything using the constant after including sys.h will be broken.
I would just use __kernel_timespec64 unconditionally and then use __NR_pselect6_time64 on all 32-bit architectures here, but use __NR_pselect6 on 32-bit architectures.
I think we can also kill off the oldselect and newselect cases, using pselect6/pselect6_time64 unconditionally here, and no longer care about building against pre-5.1 kernel headers, at least for the copy of nolibc that ships with the kernel.
Arnd
Hi, Arnd
On Wed, May 24, 2023, at 19:59, Zhangjin Wu wrote:
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index c0335a84f880..00c7197dcd50 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -1041,8 +1041,13 @@ int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeva struct timeval *t; } arg = { .n = nfds, .r = rfds, .w = wfds, .e = efds, .t = timeout }; return my_syscall1(__NR_select, &arg); -#elif defined(__ARCH_WANT_SYS_PSELECT6) && defined(__NR_pselect6) +#elif defined(__ARCH_WANT_SYS_PSELECT6) && (defined(__NR_pselect6) || defined(__NR_pselect6_time64)) +#ifdef __NR_pselect6 struct timespec t; +#else
- struct timespec64 t;
+#define __NR_pselect6 __NR_pselect6_time64 +#endif
Overriding the __NR_pselect6 constant seems wrong here, this can easily lead to more bugs, as pselect6 and pselect6_time64 are not compatible because of the different arguments, so anything using the constant after including sys.h will be broken.
Yes, thanks, Thomas also pointed out this issue in another reply of this message thread, it has been fixed locally with his suggestion, it looks like this:
#ifdef __NR_pselect6 struct timespec t; const long nr_pselect = __NR_pselect6; #else struct timespec64 t; const long nr_pselect = __NR_pselect6_time64; #endif
if (timeout) { t.tv_sec = timeout->tv_sec; t.tv_nsec = timeout->tv_usec * 1000; } return my_syscall6(nr_pselect, nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL);
I have applied this method for ppoll_time64 and clock_gettime64 too, this method can save several duplicated lines for us, I have prepared v2 patches locally for them.
I would just use __kernel_timespec64 unconditionally and then use __NR_pselect6_time64 on all 32-bit architectures here, but use __NR_pselect6 on 32-bit architectures.
The 2nd 32-bit you mean is 64-bit?
This is related to the timespec64/time64_t definitions as you commented in another reply. I will learn how to use the one from <linux/time_types.h>, it may require to clean up the existing files in tools/include/nolibc/ at first.
I think we can also kill off the oldselect and newselect cases, using pselect6/pselect6_time64 unconditionally here, and no longer care about building against pre-5.1 kernel headers, at least for the copy of nolibc that ships with the kernel.
As Willy commented in another reply, users may want to copy the recent one and use them with an old kernel, even if want to drop them, a standalone patch may be preferable.
Thanks very much, Zhangjin
Arnd
On Fri, May 26, 2023, at 13:00, Zhangjin Wu wrote:
On Wed, May 24, 2023, at 19:59, Zhangjin Wu wrote:
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h
I have applied this method for ppoll_time64 and clock_gettime64 too, this method can save several duplicated lines for us, I have prepared v2 patches locally for them.
Ok, that addresses my concern about the possible bugs.
I would just use __kernel_timespec64 unconditionally and then use __NR_pselect6_time64 on all 32-bit architectures here, but use __NR_pselect6 on 32-bit architectures.
The 2nd 32-bit you mean is 64-bit?
Yes, sorry for the typo.
This is related to the timespec64/time64_t definitions as you commented in another reply. I will learn how to use the one from <linux/time_types.h>, it may require to clean up the existing files in tools/include/nolibc/ at first.
Ok, thanks.
I think we can also kill off the oldselect and newselect cases, using pselect6/pselect6_time64 unconditionally here, and no longer care about building against pre-5.1 kernel headers, at least for the copy of nolibc that ships with the kernel.
As Willy commented in another reply, users may want to copy the recent one and use them with an old kernel, even if want to drop them, a standalone patch may be preferable.
Fair enough. I checked the old versions and see that 5.1 in May 2019 was the first one to include the time64 system call definitions, though 5.6 from March 2020 was the first version that I consider fully working with time64.
I don't know how common it is to copy nolibc into other projects, but requiring a three year old kernel might be a little too aggressive here. They could copy from 6.1-stable to keep the fallback (and miss rv32) if we do this, but a better cutoff may be Dec 2025 when linux-5.4 has its "projected EOL" date and one last LTS with the fallback (linux-6.16 or so) gets released.
Arnd
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_wait4 after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_waitid instead.
This code is based on sysdeps/unix/sysv/linux/wait4.c of glibc.
Notes: The kernel wait4 syscall has the 'pid == INT_MIN' path and returns -ESRCH, but the kernel waitid syscall has no such path, to let this __NR_waitid based sys_wait4 has the same return value and pass the 'waitpid_min' test, we emulate such path in our new nolibc __NR_waitid branch.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/include/nolibc/sys.h | 54 ++++++++++++++++++++++++++++++++++++ tools/include/nolibc/types.h | 15 +++++++++- 2 files changed, 68 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 00c7197dcd50..2642b380c6aa 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -12,6 +12,7 @@
/* system includes */ #include <asm/unistd.h> +#include <asm/siginfo.h> /* for siginfo_t */ #include <asm/signal.h> /* for SIGCHLD */ #include <asm/ioctls.h> #include <asm/mman.h> @@ -1333,7 +1334,60 @@ int unlink(const char *path) static __attribute__((unused)) pid_t sys_wait4(pid_t pid, int *status, int options, struct rusage *rusage) { +#ifdef __NR_wait4 return my_syscall4(__NR_wait4, pid, status, options, rusage); +#elif defined(__NR_waitid) + siginfo_t infop; + int idtype = P_PID; + int ret; + + /* emulate the 'pid == INT_MIN' path of wait4 */ + if (pid == INT_MIN) + return -ESRCH; + + if (pid < -1) { + idtype = P_PGID; + pid *= -1; + } else if (pid == -1) { + idtype = P_ALL; + } else if (pid == 0) { + idtype = P_PGID; + } + + options |= WEXITED; + + ret = my_syscall5(__NR_waitid, idtype, pid, &infop, options, rusage); + if (ret < 0) + return ret; + + if (status) { + switch (infop.si_code) { + case CLD_EXITED: + *status = W_EXITCODE(infop.si_status, 0); + break; + case CLD_DUMPED: + *status = __WCOREFLAG | infop.si_status; + break; + case CLD_KILLED: + *status = infop.si_status; + break; + case CLD_TRAPPED: + case CLD_STOPPED: + *status = W_STOPCODE(infop.si_status); + break; + case CLD_CONTINUED: + *status = __W_CONTINUED; + break; + default: + *status = 0; + break; + } + } + + return infop.si_pid; +#else +#error Neither __NR_wait4 nor __NR_waitid defined, cannot implement sys_wait4() +#endif }
static __attribute__((unused)) diff --git a/tools/include/nolibc/types.h b/tools/include/nolibc/types.h index ee914391439c..c4f95c267607 100644 --- a/tools/include/nolibc/types.h +++ b/tools/include/nolibc/types.h @@ -92,8 +92,21 @@ #define WTERMSIG(status) ((status) & 0x7f) #define WIFSIGNALED(status) ((status) - 1 < 0xff)
-/* waitpid() flags */ +/* waitpid() and waitid() flags */ #define WNOHANG 1 +#define WEXITED 0x00000004 + +/* first argument for waitid() */ +#define P_ALL 0 +#define P_PID 1 +#define P_PGID 2 +#define P_PIDFD 3 + +/* Macros used on waitid's status setting */ +#define W_EXITCODE(ret, sig) ((ret) << 8 | (sig)) +#define W_STOPCODE(sig) ((sig) << 8 | 0x7f) +#define __W_CONTINUED 0xffff +#define __WCOREFLAG 0x80
/* standard exit() codes */ #define EXIT_SUCCESS 0
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_gettimeofday and __NR_clock_gettime after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_clock_gettime64 instead.
This code is based on src/time/gettimeofday.c of musl and sysdeps/unix/sysv/linux/clock_gettime.c of glibc.
Both __NR_clock_gettime and __NR_clock_gettime64 are added for sys_gettimeofday() for they share most of the code.
Notes:
* Both tv and tz are not directly passed to kernel clock_gettime* syscalls, so, it isn't able to check the pointer automatically with the get_user/put_user helpers just like kernel gettimeofday syscall does. instead, we emulate (but not completely) such checks in our new __NR_clock_gettime* branch of nolibc.
* kernel clock_gettime* syscalls can not get tz info, just like musl and glibc do, we set tz to zero to avoid a random number.
Signed-off-by: Zhangjin Wu falcon@tinylab.org --- tools/include/nolibc/sys.h | 46 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 2642b380c6aa..ad38cc3856be 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -26,6 +26,7 @@
#include "arch.h" #include "errno.h" +#include "string.h" #include "types.h"
@@ -51,6 +52,11 @@ * should not be placed here. */
+/* + * This is the first address past the end of the text segment (the program code). + */ + +extern char etext;
/* * int brk(void *addr); @@ -554,7 +560,47 @@ long getpagesize(void) static __attribute__((unused)) int sys_gettimeofday(struct timeval *tv, struct timezone *tz) { +#ifdef __NR_gettimeofday return my_syscall2(__NR_gettimeofday, tv, tz); +#elif defined(__NR_clock_gettime) || defined(__NR_clock_gettime64) +#ifdef __NR_clock_gettime + struct timespec ts; +#else + struct timespec64 ts; +#define __NR_clock_gettime __NR_clock_gettime64 +#endif + int ret; + + /* make sure tv pointer is at least after code segment */ + if (tv != NULL && (char *)tv <= &etext) + return -EFAULT; + + /* set tz to zero to avoid random number */ + if (tz != NULL) { + if ((char *)tz > &etext) + memset(tz, 0, sizeof(struct timezone)); + else + return -EFAULT; + } + + if (tv == NULL) + return 0; + + ret = my_syscall2(__NR_clock_gettime, CLOCK_REALTIME, &ts); + if (ret) + return ret; + + tv->tv_sec = (time_t) ts.tv_sec; +#ifdef __NR_clock_gettime64 + if (tv->tv_sec != ts.tv_sec) + return -EOVERFLOW; +#endif + + tv->tv_usec = ts.tv_nsec / 1000; + return 0; +#else +#error None of __NR_gettimeofday, __NR_clock_gettime and __NR_clock_gettime64 defined, cannot implement sys_gettimeofday() +#endif }
static __attribute__((unused))
On 2023-05-25 02:03:32+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_gettimeofday and __NR_clock_gettime after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_clock_gettime64 instead.
This code is based on src/time/gettimeofday.c of musl and sysdeps/unix/sysv/linux/clock_gettime.c of glibc.
Both __NR_clock_gettime and __NR_clock_gettime64 are added for sys_gettimeofday() for they share most of the code.
Notes:
Both tv and tz are not directly passed to kernel clock_gettime* syscalls, so, it isn't able to check the pointer automatically with the get_user/put_user helpers just like kernel gettimeofday syscall does. instead, we emulate (but not completely) such checks in our new __NR_clock_gettime* branch of nolibc.
kernel clock_gettime* syscalls can not get tz info, just like musl and glibc do, we set tz to zero to avoid a random number.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/sys.h | 46 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 2642b380c6aa..ad38cc3856be 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -26,6 +26,7 @@ #include "arch.h" #include "errno.h" +#include "string.h" #include "types.h" @@ -51,6 +52,11 @@
- should not be placed here.
*/ +/*
- This is the first address past the end of the text segment (the program code).
- */
+extern char etext; /*
- int brk(void *addr);
@@ -554,7 +560,47 @@ long getpagesize(void) static __attribute__((unused)) int sys_gettimeofday(struct timeval *tv, struct timezone *tz) { +#ifdef __NR_gettimeofday return my_syscall2(__NR_gettimeofday, tv, tz); +#elif defined(__NR_clock_gettime) || defined(__NR_clock_gettime64) +#ifdef __NR_clock_gettime
- struct timespec ts;
+#else
- struct timespec64 ts;
+#define __NR_clock_gettime __NR_clock_gettime64 +#endif
- int ret;
- /* make sure tv pointer is at least after code segment */
- if (tv != NULL && (char *)tv <= &etext)
return -EFAULT;
To me the weird etext comparisions don't seem to be worth it, to be honest.
- /* set tz to zero to avoid random number */
- if (tz != NULL) {
if ((char *)tz > &etext)
memset(tz, 0, sizeof(struct timezone));
else
return -EFAULT;
- }
- if (tv == NULL)
return 0;
- ret = my_syscall2(__NR_clock_gettime, CLOCK_REALTIME, &ts);
- if (ret)
return ret;
- tv->tv_sec = (time_t) ts.tv_sec;
+#ifdef __NR_clock_gettime64
Nitpick:
While this ifdef works and is correct its logic is a bit indirect. If it is copied to some other function in the future it may be incorrect there.
Without the #ifdef the compiler should still be able to optimize the code away.
- if (tv->tv_sec != ts.tv_sec)
return -EOVERFLOW;
+#endif
- tv->tv_usec = ts.tv_nsec / 1000;
- return 0;
+#else +#error None of __NR_gettimeofday, __NR_clock_gettime and __NR_clock_gettime64 defined, cannot implement sys_gettimeofday() +#endif } static __attribute__((unused)) -- 2.25.1
Hi, Thomas, Willy
On 2023-05-25 02:03:32+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_gettimeofday and __NR_clock_gettime after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_clock_gettime64 instead.
This code is based on src/time/gettimeofday.c of musl and sysdeps/unix/sysv/linux/clock_gettime.c of glibc.
Both __NR_clock_gettime and __NR_clock_gettime64 are added for sys_gettimeofday() for they share most of the code.
Notes:
Both tv and tz are not directly passed to kernel clock_gettime* syscalls, so, it isn't able to check the pointer automatically with the get_user/put_user helpers just like kernel gettimeofday syscall does. instead, we emulate (but not completely) such checks in our new __NR_clock_gettime* branch of nolibc.
kernel clock_gettime* syscalls can not get tz info, just like musl and glibc do, we set tz to zero to avoid a random number.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/sys.h | 46 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 2642b380c6aa..ad38cc3856be 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -26,6 +26,7 @@ #include "arch.h" #include "errno.h" +#include "string.h" #include "types.h" @@ -51,6 +52,11 @@
- should not be placed here.
*/ +/*
- This is the first address past the end of the text segment (the program code).
- */
+extern char etext; /*
- int brk(void *addr);
@@ -554,7 +560,47 @@ long getpagesize(void) static __attribute__((unused)) int sys_gettimeofday(struct timeval *tv, struct timezone *tz) { +#ifdef __NR_gettimeofday return my_syscall2(__NR_gettimeofday, tv, tz); +#elif defined(__NR_clock_gettime) || defined(__NR_clock_gettime64) +#ifdef __NR_clock_gettime
- struct timespec ts;
+#else
- struct timespec64 ts;
+#define __NR_clock_gettime __NR_clock_gettime64 +#endif
- int ret;
- /* make sure tv pointer is at least after code segment */
- if (tv != NULL && (char *)tv <= &etext)
return -EFAULT;
To me the weird etext comparisions don't seem to be worth it, to be honest.
This is the issue we explained in commit message:
* Both tv and tz are not directly passed to kernel clock_gettime* syscalls, so, it isn't able to check the pointer automatically with the get_user/put_user helpers just like kernel gettimeofday syscall does. instead, we emulate (but not completely) such checks in our new __NR_clock_gettime* branch of nolibc.
but not that deeply described the direct cause, the direct cause is that the test case passes a '(void *)1' and the kernel space of gettimeofday can simply 'fixup' this issue by the get_user/put_user helpers, but our user-space tv and tz code has no such function, just emulate such 'fixup' by a stupid etext compare to at least make sure the data pointer is in data range. Welcome better solution.
CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
Without this ugly check, the above cases would get such error:
35 gettimeofday_bad1init[1]: unhandled signal 11 code 0x1 at 0x00000002 in init[10000+5000] CPU: 0 PID: 1 Comm: init Not tainted 6.4.0-rc1-00134-gf929c7b7184f-dirty #20 Hardware name: riscv-virtio,qemu (DT) epc : 00012ccc ra : 00012ca8 sp : 9d254d90 gp : 00016800 tp : 00000000 t0 : 00000000 t1 : 0000000a t2 : 00000000 s0 : 00000001 s1 : 00016008 a0 : 00000000 a1 : 9d254da8 a2 : 00000014 a3 : 00000000 a4 : 00000000 a5 : 00000000 a6 : 00000001 a7 : 00000193 s2 : 00000023 s3 : 9d254da4 s4 : 00000000 s5 : 00000000 s6 : 0000541b s7 : 00000007 s8 : 9d254dcc s9 : 000144e8 s10: 00016000 s11: 00000006 t3 : 00000000 t4 : ffffffff t5 : 00000000 t6 : 00000000 status: 00000020 badaddr: 00000002 cause: 0000000f
Will at least append this test error in the commit message of the coming new revision of this patch.
Hi, Willy, this also require your discussion, simply remove the above two test cases may be not a good idea too, the check for gettimeofday is perfectly ok.
The same 'emulate' method is used in the waitid patch, but that only requires to compare 'pid == INT_MIN', which is not that weird.
- /* set tz to zero to avoid random number */
- if (tz != NULL) {
if ((char *)tz > &etext)
memset(tz, 0, sizeof(struct timezone));
else
return -EFAULT;
- }
The same issue here.
- if (tv == NULL)
return 0;
- ret = my_syscall2(__NR_clock_gettime, CLOCK_REALTIME, &ts);
- if (ret)
return ret;
- tv->tv_sec = (time_t) ts.tv_sec;
+#ifdef __NR_clock_gettime64
Nitpick:
While this ifdef works and is correct its logic is a bit indirect. If it is copied to some other function in the future it may be incorrect there.
Without the #ifdef the compiler should still be able to optimize the code away.
Ok, will remove the #ifdef wrapper, let the compiler optimize itself.
Thanks, Zhangjin
- if (tv->tv_sec != ts.tv_sec)
return -EOVERFLOW;
+#endif
- tv->tv_usec = ts.tv_nsec / 1000;
- return 0;
+#else +#error None of __NR_gettimeofday, __NR_clock_gettime and __NR_clock_gettime64 defined, cannot implement sys_gettimeofday() +#endif } static __attribute__((unused)) -- 2.25.1
Hi, Thomas, Willy
On 2023-05-25 02:03:32+0800, Zhangjin Wu wrote:
rv32 uses the generic include/uapi/asm-generic/unistd.h and it has no __NR_gettimeofday and __NR_clock_gettime after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI"), use __NR_clock_gettime64 instead.
This code is based on src/time/gettimeofday.c of musl and sysdeps/unix/sysv/linux/clock_gettime.c of glibc.
Both __NR_clock_gettime and __NR_clock_gettime64 are added for sys_gettimeofday() for they share most of the code.
Notes:
Both tv and tz are not directly passed to kernel clock_gettime* syscalls, so, it isn't able to check the pointer automatically with the get_user/put_user helpers just like kernel gettimeofday syscall does. instead, we emulate (but not completely) such checks in our new __NR_clock_gettime* branch of nolibc.
kernel clock_gettime* syscalls can not get tz info, just like musl and glibc do, we set tz to zero to avoid a random number.
Signed-off-by: Zhangjin Wu falcon@tinylab.org
tools/include/nolibc/sys.h | 46 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 2642b380c6aa..ad38cc3856be 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -26,6 +26,7 @@ #include "arch.h" #include "errno.h" +#include "string.h" #include "types.h" @@ -51,6 +52,11 @@
- should not be placed here.
*/ +/*
- This is the first address past the end of the text segment (the program code).
- */
+extern char etext; /*
- int brk(void *addr);
@@ -554,7 +560,47 @@ long getpagesize(void) static __attribute__((unused)) int sys_gettimeofday(struct timeval *tv, struct timezone *tz) { +#ifdef __NR_gettimeofday return my_syscall2(__NR_gettimeofday, tv, tz); +#elif defined(__NR_clock_gettime) || defined(__NR_clock_gettime64) +#ifdef __NR_clock_gettime
- struct timespec ts;
+#else
- struct timespec64 ts;
+#define __NR_clock_gettime __NR_clock_gettime64 +#endif
- int ret;
- /* make sure tv pointer is at least after code segment */
- if (tv != NULL && (char *)tv <= &etext)
return -EFAULT;
To me the weird etext comparisions don't seem to be worth it, to be honest.
This is the issue we explained in commit message:
* Both tv and tz are not directly passed to kernel clock_gettime* syscalls, so, it isn't able to check the pointer automatically with the get_user/put_user helpers just like kernel gettimeofday syscall does. instead, we emulate (but not completely) such checks in our new __NR_clock_gettime* branch of nolibc.
but not that deeply described the direct cause, the direct cause is that the test case passes a '(void *)1' and the kernel space of gettimeofday can simply 'fixup' this issue by the get_user/put_user helpers, but our user-space tv and tz code has no such function, just emulate such 'fixup' by a stupid etext compare to at least make sure the data pointer is in data range. Welcome better solution.
CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
Without this ugly check, the above cases would get such error:
35 gettimeofday_bad1init[1]: unhandled signal 11 code 0x1 at 0x00000002 in init[10000+5000] CPU: 0 PID: 1 Comm: init Not tainted 6.4.0-rc1-00134-gf929c7b7184f-dirty #20 Hardware name: riscv-virtio,qemu (DT) epc : 00012ccc ra : 00012ca8 sp : 9d254d90 gp : 00016800 tp : 00000000 t0 : 00000000 t1 : 0000000a t2 : 00000000 s0 : 00000001 s1 : 00016008 a0 : 00000000 a1 : 9d254da8 a2 : 00000014 a3 : 00000000 a4 : 00000000 a5 : 00000000 a6 : 00000001 a7 : 00000193 s2 : 00000023 s3 : 9d254da4 s4 : 00000000 s5 : 00000000 s6 : 0000541b s7 : 00000007 s8 : 9d254dcc s9 : 000144e8 s10: 00016000 s11: 00000006 t3 : 00000000 t4 : ffffffff t5 : 00000000 t6 : 00000000 status: 00000020 badaddr: 00000002 cause: 0000000f
Will at least append this test error in the commit message of the coming new revision of this patch.
Hi, Willy, this also require your discussion, simply remove the above two test cases may be not a good idea too, the check for gettimeofday is perfectly ok.
What about this? Just like Willy did in 1da02f51088 ("selftests/nolibc: support glibc as well"), Let's only limit the test case under the __NR_gettimeofday #ifdef:
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index 702bf449f8d7..d52f3720918e 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -563,7 +563,7 @@ int run_syscall(int min, int max) CASE_TEST(getdents64_root); EXPECT_SYSNE(1, test_getdents64("/"), -1); break; CASE_TEST(getdents64_null); EXPECT_SYSER(1, test_getdents64("/dev/null"), -1, ENOTDIR); break; CASE_TEST(gettimeofday_null); EXPECT_SYSZR(1, gettimeofday(NULL, NULL)); break; -#ifdef NOLIBC +#if defined(NOLIBC) && defined(__NR_gettimeofday) CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; #endif
With the above change, we can simply remove the ugly etext check like this:
diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index d1d26da306b7..ebe8ed018db6 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -572,17 +572,9 @@ int sys_gettimeofday(struct timeval *tv, struct timezone *tz) #endif int ret;
- /* make sure tv pointer is at least after code segment */ - if (tv != NULL && (char *)tv <= &etext) - return -EFAULT; - /* set tz to zero to avoid random number */ - if (tz != NULL) { - if ((char *)tz > &etext) - memset(tz, 0, sizeof(struct timezone)); - else - return -EFAULT; - } + if (tz != NULL) + memset(tz, 0, sizeof(struct timezone));
if (tv == NULL) return 0;
If agree, will apply this method in the next revision.
The same 'emulate' method is used in the waitid patch, but that only requires to compare 'pid == INT_MIN', which is not that weird.
- /* set tz to zero to avoid random number */
- if (tz != NULL) {
if ((char *)tz > &etext)
memset(tz, 0, sizeof(struct timezone));
else
return -EFAULT;
- }
The same issue here.
And the one for waitid may work like this:
@@ -1390,10 +1382,6 @@ pid_t sys_wait4(pid_t pid, int *status, int options, struct rusage *rusage) int idtype = P_PID; int ret;
- /* emulate the 'pid == INT_MIN' path of wait4 */ - if (pid == INT_MIN) - return -ESRCH; - if (pid < -1) { idtype = P_PGID; pid *= -1; @@ -593,7 +593,9 @@ int run_syscall(int min, int max) CASE_TEST(unlink_root); EXPECT_SYSER(1, unlink("/"), -1, EISDIR); break; CASE_TEST(unlink_blah); EXPECT_SYSER(1, unlink("/proc/self/blah"), -1, ENOENT); break; CASE_TEST(wait_child); EXPECT_SYSER(1, wait(&tmp), -1, ECHILD); break; +#ifdef __NR_wait4 CASE_TEST(waitpid_min); EXPECT_SYSER(1, waitpid(INT_MIN, &tmp, WNOHANG), -1, ESRCH); break; +#endif CASE_TEST(waitpid_child); EXPECT_SYSER(1, waitpid(getpid(), &tmp, WNOHANG), -1, ECHILD); break; CASE_TEST(write_badf); EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break; CASE_TEST(write_zero); EXPECT_SYSZR(1, write(1, &tmp, 0)); break;
Best regards, Zhangjin
Hi Zhangjin,
On Sat, May 27, 2023 at 09:26:35AM +0800, Zhangjin Wu wrote:
@@ -554,7 +560,47 @@ long getpagesize(void) static __attribute__((unused)) int sys_gettimeofday(struct timeval *tv, struct timezone *tz) { +#ifdef __NR_gettimeofday return my_syscall2(__NR_gettimeofday, tv, tz); +#elif defined(__NR_clock_gettime) || defined(__NR_clock_gettime64) +#ifdef __NR_clock_gettime
- struct timespec ts;
+#else
- struct timespec64 ts;
+#define __NR_clock_gettime __NR_clock_gettime64 +#endif
- int ret;
- /* make sure tv pointer is at least after code segment */
- if (tv != NULL && (char *)tv <= &etext)
return -EFAULT;
To me the weird etext comparisions don't seem to be worth it, to be honest.
This is the issue we explained in commit message:
* Both tv and tz are not directly passed to kernel clock_gettime* syscalls, so, it isn't able to check the pointer automatically with the get_user/put_user helpers just like kernel gettimeofday syscall does. instead, we emulate (but not completely) such checks in our new __NR_clock_gettime* branch of nolibc.
but not that deeply described the direct cause, the direct cause is that the test case passes a '(void *)1' and the kernel space of gettimeofday can simply 'fixup' this issue by the get_user/put_user helpers, but our user-space tv and tz code has no such function, just emulate such 'fixup' by a stupid etext compare to at least make sure the data pointer is in data range. Welcome better solution.
CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
I also disagree with this approach. The purpose of nolibc is not to serve "nolibc-test", but to serve userland programs in the most efficient way possible in terms of code size. Nolibc-test only tries to reproduce a number of well-known success and error cases that applications might face, to detect whether or not we implemented our syscalls correctly and if something recently broke on the kernel side. In no case should we adapt the nolibc code to the tests run by nolibc-test.
What this means here is that we need to decide whether the pointer check by the syscall is important for applications, in which case we should do our best to validate it, or if we consider that we really don't care a dime since invalid values will only be sent by bogus applications we do not expect to support, and we get rid of the test. Note that reliably detecting that a pointer is valid from userland is not trivial at all, it requires to rely on other syscalls for the check and is racy in threaded environments.
I tend to think that for gettimeofday() we don't really care about invalid pointers we could be seeing here because I can't imagine a single case where this wouldn't come from an application bug, so in my opinion it's fine if the application crashes. The problem here is for nolibc-test. But this just means that we probably need to revisit the way we validate some failures, to only perform some of them on native syscalls and not emulated ones.
One approach might consist in tagging emulated syscalls and using this for each test. Originally we only had a 1:1 mapping so this was not a question. But with all the remapping you're encountering we might have no other choice. For example for each syscall we could have:
#define _NOLIBC_sys_blah_native 0 // implemented but emulated syscall #define _NOLIBC_sys_blah_native 1 // implemented and native syscall
And our macros in nolibc-test could rely on this do skip some tests (just skip the whole test if _NOLIBC_sys_blah_native is not defined, and skip some error tests if it's 0).
Overall what I'm seeing is that rv32 integration requires significant changes to the existing nolibc-test infrastructure due to the need to remap many syscalls, and that this will result in much cleaner and more maintainable code than forcefully inserting it there. Now that we're getting a cleaner picture of what the difficulties are, we'd rather work on these as a priority.
Regards, Willy
Hi,
Just to mention, the 3rd one is missing in the riscv-linux mailing list [1], but it is ok in the other two [2], [3], it was sent with the same command ;-(
[1]: https://lore.kernel.org/linux-riscv/cover.1684949267.git.falcon@tinylab.org/... [2]: https://lore.kernel.org/lkml/cover.1684949267.git.falcon@tinylab.org/T/#m1c2... [3]: https://lore.kernel.org/linux-kselftest/cover.1684949267.git.falcon@tinylab....
If required, do we need to resend the 3rd to riscv-linux only?
Thanks, Zhangjin
Hi, Willy
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
Introduction
This series is based on the 20230524-nolibc-rv32+stkp4 branch of [3], it includes 3 parts, they work together to add full rv32 support:
Reverts two old out-of-day patches
- Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
- Revert "selftests/nolibc: Fix up compile error for rv32"
(these two and the reverted ones:
- commit 606343b7478c ("selftests/nolibc: Fix up compile error for rv32")
- commit d2c3acba6d66 ("tools/nolibc: riscv: Support __NR_llseek for rv32")
can be removed from the git repo completely, there are two new ones to replace them)
Compile and test support patches
- selftests/nolibc: print name instead of number for EOVERFLOW
- selftests/nolibc: syscall_args: use __NR_statx for rv32
- --> replace the old one 606343b7478, use statx instead of read
- selftests/nolibc: riscv: customize makefile for rv32
- selftests/nolibc: allow specify a bios for qemu
- selftests/nolibc: remove the duplicated gettimeofday_bad2
Fix up some missing syscalls, mainly time32 syscalls
- tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
- --> replace the old one d2c3acba6d66, cleaned up
- tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
- tools/nolibc: ppoll/ppoll_time64: Add a missing argument
- tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
- tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
- tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for rv32
Compile
For rv64:
$ make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- nolibc-test $ file nolibc-test nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ... $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test $ file nolibc-test nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
For rv32:
$ make ARCH=riscv CONFIG_32BIT=1 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test $ file nolibc-test nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ... $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test $ file nolibc-test nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
Testing
Environment:
// gcc toolchain $ riscv64-linux-gnu-gcc --version riscv64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. // glibc >= 2.33 required, for older glibc, must upgrade include/bits/wordsize.h $ dpkg -l | grep libc6-dev | grep riscv ii libc6-dev-riscv64-cross 2.31-0ubuntu7cross1 // glibc include/bits/wordsize.h: manually upgraded to >= 2.33 // without this, can not build tools/testing/selftests/nolibc/nolibc-test.c $ cat /usr/riscv64-linux-gnu/include/bits/wordsize.h #if __riscv_xlen == (__SIZEOF_POINTER__ * 8) # define __WORDSIZE __riscv_xlen #else # error unsupported ABI #endif # define __WORDSIZE_TIME64_COMPAT32 1 #if __WORDSIZE == 32 # define __WORDSIZE32_SIZE_ULONG 0 # define __WORDSIZE32_PTRDIFF_LONG 0 #endif // higher qemu version is better, latest version is v8.0.0+ $ qemu-system-riscv64 --version QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.18) Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers // opensbi version, higher is better, must match kernel version and qemu version // rv64: used version is 1.2, latest is 1.2 $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1 OpenSBI v1.2-116-g7919530 // rv32: used version is v0.9, latest is 1.2 $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1 OpenSBI v0.9-152-g754d511
For rv64:
$ pwd /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- defconfig $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv64/virt/bsp/bios/opensbi/generic/fw_jump.elf run MKDIR sysroot/riscv/include make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc' make[2]: Entering directory '/labs/linux-lab/src/linux-stable' make[2]: Leaving directory '/labs/linux-lab/src/linux-stable' make[2]: Entering directory '/labs/linux-lab/src/linux-stable' INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include make[2]: Leaving directory '/labs/linux-lab/src/linux-stable' make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc' CC nolibc-test MKDIR initramfs INSTALL initramfs/init make[1]: Entering directory '/labs/linux-lab/src/linux-stable' ... LD vmlinux NM System.map SORTTAB vmlinux OBJCOPY arch/riscv/boot/Image Kernel: arch/riscv/boot/Image is ready make[1]: Leaving directory '/labs/linux-lab/src/linux-stable' 135 test(s) passed. $ file ../../../../vmlinux ../../../../vmlinux: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=b8e1cea5122b04bce540b4022f0d6f171ffe615a, not stripped
For rv32:
$ pwd /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- defconfig $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv32/virt/bsp/bios/opensbi/generic/fw_jump.elf run MKDIR sysroot/riscv/include make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc' make[2]: Entering directory '/labs/linux-lab/src/linux-stable' make[2]: Leaving directory '/labs/linux-lab/src/linux-stable' make[2]: Entering directory '/labs/linux-lab/src/linux-stable' INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include make[2]: Leaving directory '/labs/linux-lab/src/linux-stable' make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc' CC nolibc-test MKDIR initramfs INSTALL initramfs/init make[1]: Entering directory '/labs/linux-lab/src/linux-stable' CALL scripts/checksyscalls.sh GEN usr/initramfs_data.cpio COPY usr/initramfs_inc_data AS usr/initramfs_data.o AR usr/built-in.a GEN security/selinux/flask.h security/selinux/av_permissions.h CC security/selinux/avc.o CC security/selinux/hooks.o CC security/selinux/selinuxfs.o CC security/selinux/nlmsgtab.o CC security/selinux/netif.o CC security/selinux/netnode.o CC security/selinux/netport.o CC security/selinux/status.o CC security/selinux/ss/services.o AR security/selinux/built-in.a AR security/built-in.a AR built-in.a AR vmlinux.a LD vmlinux.o OBJCOPY modules.builtin.modinfo GEN modules.builtin MODPOST vmlinux.symvers UPD include/generated/utsversion.h CC init/version-timestamp.o LD .tmp_vmlinux.kallsyms1 NM .tmp_vmlinux.kallsyms1.syms KSYMS .tmp_vmlinux.kallsyms1.S AS .tmp_vmlinux.kallsyms1.S LD .tmp_vmlinux.kallsyms2 NM .tmp_vmlinux.kallsyms2.syms KSYMS .tmp_vmlinux.kallsyms2.S AS .tmp_vmlinux.kallsyms2.S LD vmlinux NM System.map SORTTAB vmlinux OBJCOPY arch/riscv/boot/Image Kernel: arch/riscv/boot/Image is ready make[1]: Leaving directory '/labs/linux-lab/src/linux-stable' 135 test(s) passed. $ file ../../../../vmlinux ../../../../vmlinux: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=bad4c1f3899f47355d2a2010bade56972fd94b9d, not stripped
The full rv64 testing result (run.out) is uploaded at [4]. The full rv32 testing result (run.out) is uploaded at [5].
That's all, thanks!
Best regards, Zhangjin Wu
Zhangjin Wu (13): Revert "tools/nolibc: riscv: Support __NR_llseek for rv32" Revert "selftests/nolibc: Fix up compile error for rv32" selftests/nolibc: print name instead of number for EOVERFLOW selftests/nolibc: syscall_args: use __NR_statx for rv32 selftests/nolibc: riscv: customize makefile for rv32 selftests/nolibc: allow specify a bios for qemu selftests/nolibc: remove the duplicated gettimeofday_bad2 tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32 tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32 tools/nolibc: ppoll/ppoll_time64: Add a missing argument tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32 tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32 tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for rv32
tools/include/nolibc/std.h | 1 + tools/include/nolibc/sys.h | 135 +++++++++++++++++-- tools/include/nolibc/types.h | 21 ++- tools/testing/selftests/nolibc/Makefile | 14 +- tools/testing/selftests/nolibc/nolibc-test.c | 15 ++- 5 files changed, 167 insertions(+), 19 deletions(-)
-- 2.25.1
Hi Zhangjin,
On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
Hi, Willy
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
(...)
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Regardless, in order to clean the things up and relieve you from the non-rv32 stuff, I've just reverted the two patches that your series reverts (1 & 2), and added the EOVERFLOW one (3). I'm pushing this to branch 20230528-nolibc-rv32+stkp5.
Regards, Willy
On 2023-05-28 09:59:55+0200, Willy Tarreau wrote:
On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
(...)
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Regardless, in order to clean the things up and relieve you from the non-rv32 stuff, I've just reverted the two patches that your series reverts (1 & 2), and added the EOVERFLOW one (3). I'm pushing this to branch 20230528-nolibc-rv32+stkp5.
If you are fine with pushing more stuff to this branch, picking up the fix for the duplicated test gettimeofday_bad2 (7) would be nice, too.
Thomas
On 2023-05-28 10:42:39+0200, Thomas Weißschuh wrote:
On 2023-05-28 09:59:55+0200, Willy Tarreau wrote:
On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
(...)
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Regardless, in order to clean the things up and relieve you from the non-rv32 stuff, I've just reverted the two patches that your series reverts (1 & 2), and added the EOVERFLOW one (3). I'm pushing this to branch 20230528-nolibc-rv32+stkp5.
If you are fine with pushing more stuff to this branch, picking up the fix for the duplicated test gettimeofday_bad2 (7) would be nice, too.
And the ppoll() argument cleanup (10) for that matter.
Zhangjin:
IMO it would be more convenient to move generic cleanup patches to the beginning of the series. When the reviewers are focussing on the real changes they won't be interrupted by the cleanups. Also the maintainer can more easily pick them up independently, so they are dealt with and nobody has to worry about them anymore.
Thomas
On Sun, May 28, 2023 at 11:41:53AM +0200, Thomas Weißschuh wrote:
If you are fine with pushing more stuff to this branch, picking up the fix for the duplicated test gettimeofday_bad2 (7) would be nice, too.
And the ppoll() argument cleanup (10) for that matter.
OK now both done, thank you.
IMO it would be more convenient to move generic cleanup patches to the beginning of the series. When the reviewers are focussing on the real changes they won't be interrupted by the cleanups. Also the maintainer can more easily pick them up independently, so they are dealt with and nobody has to worry about them anymore.
Agreed!
Thanks! Willy
Hi, Willy
Hi Zhangjin,
On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
Hi, Willy
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
(...)
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Yes, user-space signal() may be the right direction, we just need to let user-space not crash the kernel, what about this 'solution' for current stage (consider the pure time64 support too):
#if defined(NOLIBC) && defined(__NR_gettimeofday) && __SIZEOF_LONG__ == 8 CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; #endif
This idea is from your commit 1da02f51088 ("selftests/nolibc: support glibc as well") for glibc, but the difference is of course glibc not crashes the kernel.
Btw, since the gettimeofday_null case may be optimized by compiler and not trigger such errors:
// rv32 nolibc-test.c:(.text.run_syscall+0x8c0): undefined reference to `__divdi3'
// arm32 nolibc-test.c:(.text.run_syscall+0x820): undefined reference to `__aeabi_ldivmod'
The above errors have been hidden after the disabling of the gettimeofday_bad1 test case, so, still need to solve it before sending v2.
The method used by musl may work, but the high bits may be lost (from long long to int)?
tv->tv_usec = (int)ts.tv_nsec / 1000;
Perhaps we really need to add the missing __divdi3 and __aeabi_ldivmod and the ones for the other architectures, or get one from lib/math/div64.c.
Will add such new test cases to detect the above issues:
CASE_TEST(gettimeofday_tv); EXPECT_SYSZR(1, gettimeofday(&tv, NULL)); break; CASE_TEST(gettimeofday_tz); EXPECT_SYSZR(1, gettimeofday(NULL, &tz)); break; CASE_TEST(gettimeofday_tv_tz);EXPECT_SYSZR(1, gettimeofday(&tv, &tz)); break;
May still require to add 'used' attribute to 'struct timeval tv' and 'struct timeval tz' to let compiler not optimize them away.
For the waitid syscall based waitpid INT_MIN test case, I have prepared such code:
#define IF_TEST(name) \ if (strcmp(test, #name) == 0)
const int _errorno(const char *test) { #ifdef __NR_wait4 IF_TEST(waitpid_min); return ESRCH; #else /* __NR_waitid */ IF_TEST(waitpid_min); return EINVAL; #endif return 0; }
#define errorno(test) _errorno(#test)
CASE_TEST(waitpid_min); EXPECT_SYSER(1, waitpid(INT_MIN, &tmp, WNOHANG), -1, errorno(waitpid_min)); break;
Instead of simply disabling this case, the above code allows to return different values for different syscalls.
is a standalone errorno_waitpid_min() better? I just want to let future test cases share some code, but it may be slower ;-)
Regardless, in order to clean the things up and relieve you from the non-rv32 stuff, I've just reverted the two patches that your series reverts (1 & 2), and added the EOVERFLOW one (3). I'm pushing this to branch 20230528-nolibc-rv32+stkp5.
Thanks very much and I have seen another two have been pushed too, will rebase everything on this new branch.
Based on the other suggestions from you and Thomas, I plan to send some generic and independent changes at first, and then the left hard parts, It may simplify the whole progress.
Best regards, Zhangjin
Regards, Willy
On Sun, May 28, 2023 at 06:39:57PM +0800, Zhangjin Wu wrote:
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Yes, user-space signal() may be the right direction, we just need to let user-space not crash the kernel, what about this 'solution' for current stage (consider the pure time64 support too):
#if defined(NOLIBC) && defined(__NR_gettimeofday) && __SIZEOF_LONG__ == 8 CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; #endif
This idea is from your commit 1da02f51088 ("selftests/nolibc: support glibc as well") for glibc, but the difference is of course glibc not crashes the kernel.
Well, I was imagining implementing an EXPECT_EFAULT() macro that would rely on whatever other macros we'd set to indicate that a syscall got remapped. But I had another check grepping for EFAULT:
CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; CASE_TEST(poll_fault); EXPECT_SYSER(1, poll((void *)1, 1, 0), -1, EFAULT); break; CASE_TEST(prctl); EXPECT_SYSER(1, prctl(PR_SET_NAME, (unsigned long)NULL, 0, 0, 0), -1, EFAULT); break; CASE_TEST(select_fault); EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break; CASE_TEST(stat_fault); EXPECT_SYSER(1, stat(NULL, &stat_buf), -1, EFAULT); break; CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
In short, they're very few, and several of these could simply be dropped as irrelevant once we know that the libc is able to remap them and dereference the arguments itself.
I'd be fine with dropping the two gettimeofday_bad ones, poll_fault, select_fault and stat_fault. These ones already have at least one or two other tests. These ones were initially added because they were easy to implement, but if they're not relevant we can drop them and stop wondering how to hack around the tests.
If that's OK for you as well I can do that.
Btw, since the gettimeofday_null case may be optimized by compiler and not trigger such errors:
// rv32 nolibc-test.c:(.text.run_syscall+0x8c0): undefined reference to `__divdi3' // arm32 nolibc-test.c:(.text.run_syscall+0x820): undefined reference to `__aeabi_ldivmod'
The above errors have been hidden after the disabling of the gettimeofday_bad1 test case, so, still need to solve it before sending v2.
Sorry, I don't understand what you mean, I'm not seeing such a divide in the code. Or maybe you're speaking about what you got after some of your proposed changes ?
The method used by musl may work, but the high bits may be lost (from long long to int)? tv->tv_usec = (int)ts.tv_nsec / 1000;
Yes, and it would be even cleaner to use a uint here since tv_nsec is always positive. This will simply result in a multiplication and a shift on most platforms. Of course that's the type of thing you normally don't want on a fast path for some small systems but here code compacity counts more and that's fine.
Perhaps we really need to add the missing __divdi3 and __aeabi_ldivmod and the ones for the other architectures, or get one from lib/math/div64.c.
No, these ones come from the compiler via libgcc_s, we must not try to reimplement them. And we should do our best to avoid depending on them to avoid the error you got above.
Will add such new test cases to detect the above issues:
CASE_TEST(gettimeofday_tv); EXPECT_SYSZR(1, gettimeofday(&tv, NULL)); break; CASE_TEST(gettimeofday_tz); EXPECT_SYSZR(1, gettimeofday(NULL, &tz)); break; CASE_TEST(gettimeofday_tv_tz);EXPECT_SYSZR(1, gettimeofday(&tv, &tz)); break;
May still require to add 'used' attribute to 'struct timeval tv' and 'struct timeval tz' to let compiler not optimize them away.
Maybe, or turn them to volatile as well.
For the waitid syscall based waitpid INT_MIN test case, I have prepared such code:
#define IF_TEST(name) \ if (strcmp(test, #name) == 0) const int _errorno(const char *test) { #ifdef __NR_wait4 IF_TEST(waitpid_min); return ESRCH; #else /* __NR_waitid */ IF_TEST(waitpid_min); return EINVAL; #endif return 0; } #define errorno(test) _errorno(#test) CASE_TEST(waitpid_min); EXPECT_SYSER(1, waitpid(INT_MIN, &tmp, WNOHANG), -1, errorno(waitpid_min)); break;
Instead of simply disabling this case, the above code allows to return different values for different syscalls.
I don't like this, it gets particularly complicated to follow, especially since it doesn't rely on the underlying syscall but on which ones are defined, and supposes that the underlying implementation will use exactly these ones. Do not forget that we're not trying to verify that the tests provoke a specific syscall return, but that our syscall implementation returns the errno the application expects. If we see that one of them breaks, it means either that our test is wrong or undefined, or that our mapping of the syscall is incorrect. But in either case it indicates that an application relying on a specific errno would see a different value.
Many syscalls can return various values among a set, depending on which error is tested first. If that's the case for the ones above, I'd largely prefer to have EXPECT_SYSER2() that accepts any errno among a set of two (and maybe layer EXPECT_SYSER3() if 3 errno are possible).
Also there's something to keep in mind: nolibc-test is just one userland application among others. This means that every time you need to modify it to shut up an error that pops up after a change to nolibc, it means that you're possibly breaking one application living on an edge case and explicitly checking for that errno value. It is not necessarily dramatic but that's still something to keep in mind. We've all seen some of our code fail after a syscall started to report a new errno value we didn't expect, so it's important to still be cautious here and not to rely too much on the ease of adapting error handling in nolibc-test.
Thanks very much and I have seen another two have been pushed too, will rebase everything on this new branch.
OK.
Based on the other suggestions from you and Thomas, I plan to send some generic and independent changes at first, and then the left hard parts, It may simplify the whole progress.
Yes, thank you! As a general rule of thumb (which makes the handling easier for everyone including you), the least controversial changes should be proposed first. This often allows to merge the first half of the patches at once and saves you from having to reorder what's left.
Willy
On Sun, May 28, 2023 at 06:39:57PM +0800, Zhangjin Wu wrote:
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Yes, user-space signal() may be the right direction, we just need to let user-space not crash the kernel, what about this 'solution' for current stage (consider the pure time64 support too):
#if defined(NOLIBC) && defined(__NR_gettimeofday) && __SIZEOF_LONG__ == 8 CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; #endif
This idea is from your commit 1da02f51088 ("selftests/nolibc: support glibc as well") for glibc, but the difference is of course glibc not crashes the kernel.
Well, I was imagining implementing an EXPECT_EFAULT() macro that would rely on whatever other macros we'd set to indicate that a syscall got remapped. But I had another check grepping for EFAULT:
CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; CASE_TEST(poll_fault); EXPECT_SYSER(1, poll((void *)1, 1, 0), -1, EFAULT); break; CASE_TEST(prctl); EXPECT_SYSER(1, prctl(PR_SET_NAME, (unsigned long)NULL, 0, 0, 0), -1, EFAULT); break; CASE_TEST(select_fault); EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break; CASE_TEST(stat_fault); EXPECT_SYSER(1, stat(NULL, &stat_buf), -1, EFAULT); break; CASE_TEST(syscall_args); EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
In short, they're very few, and several of these could simply be dropped as irrelevant once we know that the libc is able to remap them and dereference the arguments itself.
I'd be fine with dropping the two gettimeofday_bad ones, poll_fault, select_fault and stat_fault. These ones already have at least one or two other tests. These ones were initially added because they were easy to implement, but if they're not relevant we can drop them and stop wondering how to hack around the tests.
If that's OK for you as well I can do that.
The dropping of the others is not required, since currently, we only found these two gettimeofday test cases have such issues when we implement them with clock_gettime/time64, because there is a "timespec to timeval" conversion in user-space, if the data pointer could be passed to kernel space, there would be no such issues (kernel will transfer data via put_user() helpers).
Btw, since the gettimeofday_null case may be optimized by compiler and not trigger such errors:
// rv32 nolibc-test.c:(.text.run_syscall+0x8c0): undefined reference to `__divdi3' // arm32 nolibc-test.c:(.text.run_syscall+0x820): undefined reference to `__aeabi_ldivmod'
The above errors have been hidden after the disabling of the gettimeofday_bad1 test case, so, still need to solve it before sending v2.
Sorry, I don't understand what you mean, I'm not seeing such a divide in the code. Or maybe you're speaking about what you got after some of your proposed changes ?
The method used by musl may work, but the high bits may be lost (from long long to int)? tv->tv_usec = (int)ts.tv_nsec / 1000;
Yes, and it would be even cleaner to use a uint here since tv_nsec is always positive. This will simply result in a multiplication and a shift on most platforms. Of course that's the type of thing you normally don't want on a fast path for some small systems but here code compacity counts more and that's fine.
Ok, will use uint here.
Perhaps we really need to add the missing __divdi3 and __aeabi_ldivmod and the ones for the other architectures, or get one from lib/math/div64.c.
No, these ones come from the compiler via libgcc_s, we must not try to reimplement them. And we should do our best to avoid depending on them to avoid the error you got above.
Will add such new test cases to detect the above issues:
CASE_TEST(gettimeofday_tv); EXPECT_SYSZR(1, gettimeofday(&tv, NULL)); break; CASE_TEST(gettimeofday_tz); EXPECT_SYSZR(1, gettimeofday(NULL, &tz)); break; CASE_TEST(gettimeofday_tv_tz);EXPECT_SYSZR(1, gettimeofday(&tv, &tz)); break;
May still require to add 'used' attribute to 'struct timeval tv' and 'struct timeval tz' to let compiler not optimize them away.
Maybe, or turn them to volatile as well.
Yeah, volatile is better.
For the waitid syscall based waitpid INT_MIN test case, I have prepared such code:
#define IF_TEST(name) \ if (strcmp(test, #name) == 0) const int _errorno(const char *test) { #ifdef __NR_wait4 IF_TEST(waitpid_min); return ESRCH; #else /* __NR_waitid */ IF_TEST(waitpid_min); return EINVAL; #endif return 0; } #define errorno(test) _errorno(#test) CASE_TEST(waitpid_min); EXPECT_SYSER(1, waitpid(INT_MIN, &tmp, WNOHANG), -1, errorno(waitpid_min)); break;
Instead of simply disabling this case, the above code allows to return different values for different syscalls.
I don't like this, it gets particularly complicated to follow, especially since it doesn't rely on the underlying syscall but on which ones are defined, and supposes that the underlying implementation will use exactly these ones. Do not forget that we're not trying to verify that the tests provoke a specific syscall return, but that our syscall implementation returns the errno the application expects. If we see that one of them breaks, it means either that our test is wrong or undefined, or that our mapping of the syscall is incorrect. But in either case it indicates that an application relying on a specific errno would see a different value.
Many syscalls can return various values among a set, depending on which error is tested first. If that's the case for the ones above, I'd largely prefer to have EXPECT_SYSER2() that accepts any errno among a set of two (and maybe layer EXPECT_SYSER3() if 3 errno are possible).
Ok.
Also there's something to keep in mind: nolibc-test is just one userland application among others. This means that every time you need to modify it to shut up an error that pops up after a change to nolibc, it means that you're possibly breaking one application living on an edge case and explicitly checking for that errno value. It is not necessarily dramatic but that's still something to keep in mind. We've all seen some of our code fail after a syscall started to report a new errno value we didn't expect, so it's important to still be cautious here and not to rely too much on the ease of adapting error handling in nolibc-test.
Thanks very much and I have seen another two have been pushed too, will rebase everything on this new branch.
OK.
Based on the other suggestions from you and Thomas, I plan to send some generic and independent changes at first, and then the left hard parts, It may simplify the whole progress.
Yes, thank you! As a general rule of thumb (which makes the handling easier for everyone including you), the least controversial changes should be proposed first. This often allows to merge the first half of the patches at once and saves you from having to reorder what's left.
That's true.
Thanks, Zhangjin
Willy
Hi Zhangjin,
May 28, 2023 12:40:31 Zhangjin Wu falcon@tinylab.org:
Hi, Willy
Hi Zhangjin,
On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
Hi, Willy
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
(...)
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Yes, user-space signal() may be the right direction, we just need to let user-space not crash the kernel, what about this 'solution' for current stage (consider the pure time64 support too):
If you did manage to crash the actual kernel than that would be a bug in the kernel that needs to be fixed. Feel free to describe how it happened and I'll take a look.
Thomas
May 28, 2023 12:40:31 Zhangjin Wu falcon@tinylab.org:
Hi, Willy
Hi Zhangjin,
On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
Hi, Willy
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
(...)
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Yes, user-space signal() may be the right direction, we just need to let user-space not crash the kernel, what about this 'solution' for current stage (consider the pure time64 support too):
If you did manage to crash the actual kernel than that would be a bug in the kernel that needs to be fixed. Feel free to describe how it happened and I'll take a look.
Sorry, my description above is not really right, the sigsegv (11) signal will be sent to our program when it tries to write something to the address: (void *)1 for this test case tries to do/test so:
CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
In the original gettimeofday syscall based implementation, the kernel side tries to use put_user to copy timespec data to user space timeval:
kernel/time/time.c:
if (put_user(ts.tv_sec, &tv->tv_sec) || put_user(ts.tv_nsec / 1000, &tv->tv_usec)) return -EFAULT;
The put_user() in arch/riscv/include/asm/uaccess.h will trigger the 'fixup' logic and return -EFAULT and let the other test cases continue.
But if add our clock_gettime/time64 syscalls based implementation, we must get timespec from kernel space and then convert them to timeval in user space, the address of timespec can be handled by kernel space too, but we must write them to the address of timeval in user-space:
tv->tv_sec = ts.tv_sec; tv->tv_usec = (unsigned int)ts.tv_nsec / 1000;
In above test case, tv above is something like (void *)1, it is invalid, kernel will prevent writing and force send a sigsegv and stop the program:
35 gettimeofday_bad1init[1]: unhandled signal 11 code 0x1 at 0x00000002 in init[10000+5000] CPU: 0 PID: 1 Comm: init Not tainted 6.4.0-rc1-00137-gfdc311fa22ed-dirty #60 Hardware name: riscv-virtio,qemu (DT) epc : 00012c90 ra : 00012c6c sp : 9d097d90 gp : 00016800 tp : 00000000 t0 : 00000000 t1 : 0000000a t2 : 00000000 s0 : 00000001 s1 : 00016008 a0 : 00000000 a1 : 9d097da8 a2 : 00000014 a3 : 00000000 a4 : 00000000 a5 : 00000000 a6 : 00000001 a7 : 00000193 s2 : 00000023 s3 : 00000000 s4 : 9d097da4 s5 : 00000000 s6 : 0000541b s7 : 00000007 s8 : 9d097dcc s9 : 00014474 s10: 00016000 s11: 00000006 t3 : 00000000 t4 : ffffffff t5 : 00000000 t6 : 00000000 status: 00000020 badaddr: 00000002 cause: 0000000f Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Because our test run nolibc-test as init of initramfs on qemu, when init exit but not reboot as normally, then it 'crashes' the kernel (kernel panic above).
If we have sigaction()/sigsetjmp/siglongjump support, then, we can call 'reboot()' in sigsegv signal handler, and event let it continue the other test cases. sigaction seems only work to trigger when to call siglongjump, siglongjump ask sigsetjmp to do the real recover action.
I did find some useful urls, and wrote such an exception restore logic, not completely, not support NOLIBC_TEST environment variables yet.
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index 001ea789fa39..e7d488722e14 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -110,6 +110,47 @@ const char *errorname(int err) } }
+#if !defined(NOLIBC) +#include <setjmp.h> +int test_base = 0; +int test_number = 0; +int test_llen = 0; +sigjmp_buf mark; +typedef int (*func_t)(int min, int max); +func_t test_func = NULL; +int test_idx = 0; + +static int pad_spc(int llen, int cnt, const char *fmt, ...); +static const struct test test_names[]; + +void continue_run(void) +{ + int idx; + int err; + int ret; + int min = 0; + int max = INT_MAX; + + test_llen += printf(" = %d ", -1); + pad_spc(test_llen, 64, "[FAIL] (continued by sigaction/siglongjmp/sigsetjmp)\n"); + test_func = test_names[test_idx].func; + test_func(test_number - test_base + 1, 1000); + + for (idx = test_idx + 1; test_names[idx].name; idx++) { + printf("Running test '%s'\n", test_names[idx].name); + err = test_names[idx].func(min, max); + ret += err; + printf("Errors during this test: %d\n\n", err); + } +} + +void action(int sig, siginfo_t *si, void *p) +{ + if (sig != SIGKILL) + siglongjmp(mark, -1); +} +#endif + #define IF_TEST(name) \ if (strcmp(test, #name) == 0)
@@ -447,8 +488,13 @@ static int expect_strne(const char *expr, int llen, const char *cmp)
/* declare tests based on line numbers. There must be exactly one test per line. */ +#if !defined(NOLIBC) +#define CASE_TEST(name) \ + case __LINE__: test_number = __LINE__; if (strcmp(#name, "getpid") == 0) { test_base = test_number; } llen += printf("%d %s", test, #name); test_llen = llen; +#else #define CASE_TEST(name) \ case __LINE__: llen += printf("%d %s", test, #name); +#endif
/* used by some syscall tests below */ int test_getdents64(const char *dir) @@ -582,7 +628,7 @@ int run_syscall(int min, int max) CASE_TEST(gettimeofday_tv); EXPECT_SYSZR(1, gettimeofday(&tv, NULL)); break; CASE_TEST(gettimeofday_tz); EXPECT_SYSZR(1, gettimeofday(NULL, &tz)); break; CASE_TEST(gettimeofday_tv_tz);EXPECT_SYSZR(1, gettimeofday(&tv, &tz)); break; -#ifdef NOLIBC +#if 1 CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break; CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break; #endif @@ -952,6 +998,22 @@ int main(int argc, char **argv, char **envp) if (getpid() == 1) prepare();
+#if !defined(NOLIBC) + struct sigaction sa = {0}; + sa.sa_sigaction = action; + sa.sa_flags = SA_SIGINFO; + ret = sigaction(SIGSEGV, &sa, NULL); + if (ret == -1) { + perror("sigaction"); + exit(1); + } + + if (sigsetjmp(mark, 1) != 0) { + continue_run(); + exit(0); + } +#endif + /* the definition of a series of tests comes from either argv[1] or the * "NOLIBC_TEST" environment variable. It's made of a comma-delimited * series of test names and optional ranges: @@ -1008,6 +1070,9 @@ int main(int argc, char **argv, char **envp)
/* now's time to call the test */ printf("Running test '%s'\n", test_names[idx].name); +#if !defined(NOLIBC) + test_idx = idx; +#endif err = test_names[idx].func(min, max); ret += err; printf("Errors during this test: %d\n\n", err); @@ -1021,6 +1086,9 @@ int main(int argc, char **argv, char **envp) /* no test mentioned, run everything */ for (idx = 0; test_names[idx].name; idx++) { printf("Running test '%s'\n", test_names[idx].name); +#if !defined(NOLIBC) + test_idx = idx; +#endif err = test_names[idx].func(min, max); ret += err; printf("Errors during this test: %d\n\n", err);
usage:
$ gcc -o nolibc-test tools/testing/selftests/nolibc/nolibc-test.c $ ./nolibc-test ... 35 gettimeofday_tz = 0 [OK] 36 gettimeofday_tv_tz = 0 [OK] 37 gettimeofday_bad1 = -1 [FAIL] (continued by sigaction/siglongjmp/sigsetjmp) 38 gettimeofday_bad2 = -1 [FAIL] (continued by sigaction/siglongjmp/sigsetjmp) 39 getpagesize = 0 [OK] 40 ioctl_tiocinq = 0 [OK] 41 ioctl_tiocinq = 0 [OK] ...
It did work as expected, but for nolibc, we still need to add sigaction/siglongjump/sigsetjmp support.
Will send a patch based on Willy's latest branch, perhaps this may help us to verify the future sigaction/siglongjump/sigsetjmp for nolibc.
ref: https://www.ibm.com/docs/en/i/7.1?topic=ssw_ibm_i_71/apis/sigsetj.html https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-siglongjmp-restore-sta...
Best regards, Zhangjin
Thomas
Hi Zhangjin,
On 2023-05-29 02:39:06+0800, Zhangjin Wu wrote:
May 28, 2023 12:40:31 Zhangjin Wu falcon@tinylab.org:
On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
Hi, Willy
Thanks very mush for your kindly review, discuss and suggestion, now we get full rv32 support ;-)
In the first series [1], we have fixed up the compile errors about _start and __NR_llseek for rv32, but left compile errors about tons of time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use latest system call ABI")) and the missing fstat in nolibc-test.c [2], now we have fixed up all of them.
(...)
I have read the comments that others made on the series and overall agree. I've seen that you intend to prepare a v2. I think we must first decide how to better deal with emulated syscalls as I said in an earlier message. Probably that we should just add a specific test case for EFAULT in nolibc-test since it's the only one (I think) that risks to trigger crashes with emulated syscalls. We could also imagine dealing with the signal ourselves but I'm not that keen on going to implement signal() & longjmp() for now :-/
Yes, user-space signal() may be the right direction, we just need to let user-space not crash the kernel, what about this 'solution' for current stage (consider the pure time64 support too):
If you did manage to crash the actual kernel than that would be a bug in the kernel that needs to be fixed. Feel free to describe how it happened and I'll take a look.
Sorry, my description above is not really right, the sigsegv (11) signal will be sent to our program when it tries to write something to the address: (void *)1 for this test case tries to do/test so:
CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
<snip>
35 gettimeofday_bad1init[1]: unhandled signal 11 code 0x1 at 0x00000002 in init[10000+5000] CPU: 0 PID: 1 Comm: init Not tainted 6.4.0-rc1-00137-gfdc311fa22ed-dirty #60 Hardware name: riscv-virtio,qemu (DT) epc : 00012c90 ra : 00012c6c sp : 9d097d90 gp : 00016800 tp : 00000000 t0 : 00000000 t1 : 0000000a t2 : 00000000 s0 : 00000001 s1 : 00016008 a0 : 00000000 a1 : 9d097da8 a2 : 00000014 a3 : 00000000 a4 : 00000000 a5 : 00000000 a6 : 00000001 a7 : 00000193 s2 : 00000023 s3 : 00000000 s4 : 9d097da4 s5 : 00000000 s6 : 0000541b s7 : 00000007 s8 : 9d097dcc s9 : 00014474 s10: 00016000 s11: 00000006 t3 : 00000000 t4 : ffffffff t5 : 00000000 t6 : 00000000 status: 00000020 badaddr: 00000002 cause: 0000000f Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Because our test run nolibc-test as init of initramfs on qemu, when init exit but not reboot as normally, then it 'crashes' the kernel (kernel panic above).
This makes sense, thanks. I just wanted to make sure no kernel bugs were going unhandeld.
If we have sigaction()/sigsetjmp/siglongjump support, then, we can call 'reboot()' in sigsegv signal handler, and event let it continue the other test cases. sigaction seems only work to trigger when to call siglongjump, siglongjump ask sigsetjmp to do the real recover action.
I did find some useful urls, and wrote such an exception restore logic, not completely, not support NOLIBC_TEST environment variables yet.
<lots of implementation>
usage:
$ gcc -o nolibc-test tools/testing/selftests/nolibc/nolibc-test.c $ ./nolibc-test ... 35 gettimeofday_tz = 0 [OK] 36 gettimeofday_tv_tz = 0 [OK] 37 gettimeofday_bad1 = -1 [FAIL] (continued by sigaction/siglongjmp/sigsetjmp) 38 gettimeofday_bad2 = -1 [FAIL] (continued by sigaction/siglongjmp/sigsetjmp) 39 getpagesize = 0 [OK] 40 ioctl_tiocinq = 0 [OK] 41 ioctl_tiocinq = 0 [OK] ...
It did work as expected, but for nolibc, we still need to add sigaction/siglongjump/sigsetjmp support.
Will send a patch based on Willy's latest branch, perhaps this may help us to verify the future sigaction/siglongjump/sigsetjmp for nolibc.
ref: https://www.ibm.com/docs/en/i/7.1?topic=ssw_ibm_i_71/apis/sigsetj.html https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-siglongjmp-restore-sta...
This seems very complicated for fairly limited gain to be honest.
If we really want to keep the current testcase we could also ensure that the pointer does not fall into the first page, as the first page is not mapped under Linux:
0 <= addr < PAGE_SIZE
Or instead of PAGE_SIZE just hardcode 4096, as that should be the minimum size and and does not require a lookup.
Thomas
Hi Thomas,
On Mon, May 29, 2023 at 10:45:40AM +0200, Thomas Weißschuh wrote:
<lots of implementation>
usage:
$ gcc -o nolibc-test tools/testing/selftests/nolibc/nolibc-test.c $ ./nolibc-test ... 35 gettimeofday_tz = 0 [OK] 36 gettimeofday_tv_tz = 0 [OK] 37 gettimeofday_bad1 = -1 [FAIL] (continued by sigaction/siglongjmp/sigsetjmp) 38 gettimeofday_bad2 = -1 [FAIL] (continued by sigaction/siglongjmp/sigsetjmp) 39 getpagesize = 0 [OK] 40 ioctl_tiocinq = 0 [OK] 41 ioctl_tiocinq = 0 [OK] ...
It did work as expected, but for nolibc, we still need to add sigaction/siglongjump/sigsetjmp support.
Will send a patch based on Willy's latest branch, perhaps this may help us to verify the future sigaction/siglongjump/sigsetjmp for nolibc.
ref: https://www.ibm.com/docs/en/i/7.1?topic=ssw_ibm_i_71/apis/sigsetj.html https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-siglongjmp-restore-sta...
This seems very complicated for fairly limited gain to be honest.
I agree as well. I'm not denying the fact that one day we may want to support signal, longjmp and friends but I'm not convinced we want to go through that just to make a few uncertain tests succeed.
If we really want to keep the current testcase we could also ensure that the pointer does not fall into the first page, as the first page is not mapped under Linux:
0 <= addr < PAGE_SIZE
Or instead of PAGE_SIZE just hardcode 4096, as that should be the minimum size and and does not require a lookup.
I would not even do that. It brings nothing to the application layer and inflates the code. I'd rather just get rid of the EFAULT test cases that rely on an unreliable syscall (i.e. one that may either be a real syscall or an emulated one). The value brought by these tests is extremely low and they were implemented only because they were easy to do. If they're causing pain, let's just drop them.
Willy
Hi, Thomas, Willy
Hi Thomas,
On Mon, May 29, 2023 at 10:45:40AM +0200, Thomas Wei�schuh wrote:
<lots of implementation>
usage:
$ gcc -o nolibc-test tools/testing/selftests/nolibc/nolibc-test.c $ ./nolibc-test ... 35 gettimeofday_tz = 0 [OK] 36 gettimeofday_tv_tz = 0 [OK] 37 gettimeofday_bad1 = -1 [FAIL] (continued by sigaction/siglongjmp/sigsetjmp) 38 gettimeofday_bad2 = -1 [FAIL] (continued by sigaction/siglongjmp/sigsetjmp) 39 getpagesize = 0 [OK] 40 ioctl_tiocinq = 0 [OK] 41 ioctl_tiocinq = 0 [OK] ...
It did work as expected, but for nolibc, we still need to add sigaction/siglongjump/sigsetjmp support.
Will send a patch based on Willy's latest branch, perhaps this may help us to verify the future sigaction/siglongjump/sigsetjmp for nolibc.
ref: https://www.ibm.com/docs/en/i/7.1?topic=ssw_ibm_i_71/apis/sigsetj.html https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-siglongjmp-restore-sta...
This seems very complicated for fairly limited gain to be honest.
I agree as well. I'm not denying the fact that one day we may want to support signal, longjmp and friends but I'm not convinced we want to go through that just to make a few uncertain tests succeed.
I agree too, I'm just interested in whether it is able to restore the whole test after a user-space invalid memory access ;-)
If we really want to keep the current testcase we could also ensure that the pointer does not fall into the first page, as the first page is not mapped under Linux:
0 <= addr < PAGE_SIZE
Or instead of PAGE_SIZE just hardcode 4096, as that should be the minimum size and and does not require a lookup.
I would not even do that. It brings nothing to the application layer and inflates the code. I'd rather just get rid of the EFAULT test cases that rely on an unreliable syscall (i.e. one that may either be a real syscall or an emulated one). The value brought by these tests is extremely low and they were implemented only because they were easy to do. If they're causing pain, let's just drop them.
Thanks, one of the sent v2 patches has dropped both of them.
yesterday, based on the demo code pasted in this email thread, I went further to implement a cleaner user-space 'efault' handler, with this handler, it is able to continue next test, and without this handler, just skip the test, so, it can be used to add future test cases for sigaction/sigsetjmp/siglongjmp.
besides, a multiple 'run' support is added too, will share the new code as a new standalone patchset later but I'm not expecting it is mergeable.
Thanks, Zhangjin
Willy
linux-kselftest-mirror@lists.linaro.org