Hi Zhangjin,
On Sat, May 27, 2023 at 09:26:35AM +0800, Zhangjin Wu wrote:
> @@ -554,7 +560,47 @@ long getpagesize(void)
>  static __attribute__((unused))
>  int sys_gettimeofday(struct timeval *tv, struct timezone *tz)
>  {
> +#ifdef __NR_gettimeofday
>  	return my_syscall2(__NR_gettimeofday, tv, tz);
> +#elif defined(__NR_clock_gettime) || defined(__NR_clock_gettime64)
> +#ifdef __NR_clock_gettime
> +	struct timespec ts;
> +#else
> +	struct timespec64 ts;
> +#define __NR_clock_gettime __NR_clock_gettime64
> +#endif
> +	int ret;
> +
> +	/* make sure tv pointer is at least after code segment */
> +	if (tv != NULL && (char *)tv <= &etext)
> +		return -EFAULT;
> > To me the weird etext comparisons don't seem to be worth it, to be honest.
> This is the issue we explained in the commit message:
>
>     * Both tv and tz are not passed directly to the kernel clock_gettime*
>       syscalls, so we cannot have the pointers checked automatically by the
>       get_user/put_user helpers the way the kernel's gettimeofday syscall
>       does; instead, we emulate (though not completely) such checks in our
>       new __NR_clock_gettime* branch of nolibc.
>
> but it did not describe the direct cause in much depth. The direct cause is
> that the test cases pass a '(void *)1':
>
>     CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
>     CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
>
> The kernel-side gettimeofday can simply "fix up" such an access through the
> get_user/put_user helpers, but our user-space tv and tz code has no such
> facility, so we emulate that fixup with a crude etext comparison to at
> least make sure the data pointer lies within the data range. A better
> solution is welcome.
I also disagree with this approach. The purpose of nolibc is not to serve "nolibc-test", but to serve userland programs in the most efficient way possible in terms of code size. Nolibc-test only tries to reproduce a number of well-known success and error cases that applications might face, to detect whether or not we implemented our syscalls correctly and if something recently broke on the kernel side. In no case should we adapt the nolibc code to the tests run by nolibc-test.
What this means here is that we need to decide whether the pointer check by the syscall is important for applications, in which case we should do our best to validate it, or if we consider that we really don't care a dime since invalid values will only be sent by bogus applications we do not expect to support, and we get rid of the test. Note that reliably detecting that a pointer is valid from userland is not trivial at all, it requires to rely on other syscalls for the check and is racy in threaded environments.
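To illustrate why such a check is non-trivial, here is a hypothetical sketch (not something proposed for nolibc) of the usual trick: let another syscall perform the access check for us. A write() into a pipe forces the kernel to copy from the buffer, so it fails with EFAULT on an invalid pointer; but as said above, this costs extra syscalls and is racy in threaded environments:

```c
#include <unistd.h>

/* Probe whether 'p' points to at least one readable byte by asking the
 * kernel to copy from it: write() into a pipe returns -1/EFAULT when
 * the source buffer is invalid. Racy: another thread may map or unmap
 * the page between this probe and the actual use of the pointer. */
static int ptr_is_readable(const void *p)
{
	int fds[2];
	int ok;

	if (pipe(fds) < 0)
		return -1;
	ok = (write(fds[1], p, 1) == 1);
	close(fds[0]);
	close(fds[1]);
	return ok;
}
```

Beyond the race, this costs three syscalls per check, which defeats the whole point of a lightweight gettimeofday() wrapper.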
I tend to think that for gettimeofday() we don't really care about invalid pointers we could be seeing here because I can't imagine a single case where this wouldn't come from an application bug, so in my opinion it's fine if the application crashes. The problem here is for nolibc-test. But this just means that we probably need to revisit the way we validate some failures, to only perform some of them on native syscalls and not emulated ones.
One approach might consist in tagging emulated syscalls and using this for each test. Originally we only had a 1:1 mapping so this was not a question. But with all the remapping you're encountering we might have no other choice. For example for each syscall we could have:
    #define _NOLIBC_sys_blah_native 0   // implemented, but emulated syscall
    #define _NOLIBC_sys_blah_native 1   // implemented as a native syscall
And our macros in nolibc-test could rely on this to skip some tests (just skip the whole test if _NOLIBC_sys_blah_native is not defined, and skip some error tests if it's 0).
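A minimal sketch of what that gating could look like on the nolibc-test side; the _NOLIBC_sys_*_native tags follow the proposal above and do not exist in nolibc yet:

```c
#include <stdio.h>

/* Illustrative tags: 1 = native syscall, 0 = implemented but emulated. */
#define _NOLIBC_sys_gettimeofday_native 0
#define _NOLIBC_sys_select_native       1

/* Run strict error-path tests (e.g. EFAULT on bad pointers) only when
 * the syscall is native, since an emulated wrapper cannot reproduce the
 * kernel's get_user/put_user checks. Returns the number of skipped
 * groups so the caller can report them. */
static int run_demo(void)
{
	int skipped = 0;

	if (_NOLIBC_sys_gettimeofday_native)
		printf("run:  gettimeofday EFAULT tests\n");
	else {
		printf("skip: gettimeofday EFAULT tests (emulated)\n");
		skipped++;
	}

	if (_NOLIBC_sys_select_native)
		printf("run:  select EFAULT tests\n");
	else {
		printf("skip: select EFAULT tests (emulated)\n");
		skipped++;
	}
	return skipped;
}
```

Success-path tests would keep running in both cases; only the kernel-side error checks would be conditioned on the tag.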
Overall what I'm seeing is that rv32 integration requires significant changes to the existing nolibc-test infrastructure due to the need to remap many syscalls, and that this will result in much cleaner and more maintainable code than forcefully inserting it there. Now that we're getting a cleaner picture of what the difficulties are, we'd rather work on these as a priority.
Regards, Willy