On Mon, Aug 02, 2021 at 03:19:39PM +0100, Mark Brown wrote:
On Mon, Aug 02, 2021 at 01:37:50PM +0100, Dave Martin wrote:
On Mon, Aug 02, 2021 at 12:33:30PM +0100, Mark Brown wrote:
That really doesn't seem like a good idea - it's just asking for fragility if a signal gets delivered to the parent process or something. Even if almost all the time there will only be one trip through the loop we should still have the loop there for those few cases where it triggers.
This concern only applies when the program actually registers signal handlers.
wait() can't return for any other reason, and it mustn't, precisely because historically software would have made this assumption. This is one reason why wait3() etc. are separate functions.
That's great for the reader with a detailed knowledge of exactly what error handling can be skipped and how standards-conforming Linux is, but less good for the reader who is merely aware of best practices. TBH I am not clear what problem is solved by removing the loop here - to me it just makes it less obvious that we've handled everything.
Ok, leave it as is then.
(It would be good to collect some best-practice guidance on how to actually use syscalls, but that's clearly way out of scope here...)
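For reference, the defensive pattern being argued for above looks roughly like the sketch below. It is not the code from the patch: run_child_and_wait() is a made-up helper, and the child simply exits where the real test would exec(). waitpid() is retried whenever it fails with EINTR, which can only happen if a signal handler is installed, so the loop normally runs once.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch under review): fork a child
 * that exits with child_status, then reap it, retrying if waitpid() is
 * interrupted by a signal handler.
 * Returns the child's exit status, or -1 on error or abnormal exit.
 */
int run_child_and_wait(int child_status)
{
	pid_t pid, ret;
	int status;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(child_status);	/* child: stand-in for exec() */

	/* Usually one trip through the loop; EINTR forces a retry */
	do {
		ret = waitpid(pid, &status, 0);
	} while (ret < 0 && errno == EINTR);

	if (ret != pid || !WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status);
}
```

The loop costs two lines and means the reader does not need to know whether any handlers are registered.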
That aside though, can't we use popen(3)?
I tend to forget about popen because it is "boring" to use it, but it looks like it fits this case quite well. Then it would be libc's problem how to fork and wait safely.
popen() appears to break the _SET_VL_ONEXEC test. Between a lack of strace in my test filesystem and not spotting anything obvious in the glibc sources I can't tell exactly where it's doing something different, though it does feel like it should be a separate testcase if it's anything interesting. I do think there is value in having exactly what's done to start the child process be clear in the test program, and that coverage of anything interesting from popen() could be done incrementally.
Ah, dang, popen() will run the target program via a shell, so there will actually be two fork-exec()s, with the VL being reset to default by the second exec.
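A quick way to see the intermediate shell is to ask popen()'s child what it is. This is only an illustrative sketch: popen_interpreter_name() is a made-up name, and the command is a shell builtin, so it shows that the shell is present rather than the second exec itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: ask the process that popen() starts for its own
 * name. "$0" is expanded by the shell, so getting an answer at all
 * shows that a shell sits between us and the command we asked for.
 * Returns 0 on success with the name in buf, -1 on failure.
 */
int popen_interpreter_name(char *buf, size_t len)
{
	FILE *f;
	int ok;

	f = popen("echo \"$0\"", "r");
	if (!f)
		return -1;

	ok = fgets(buf, (int)len, f) != NULL;
	if (pclose(f) != 0 || !ok)
		return -1;

	buf[strcspn(buf, "\n")] = '\0';	/* strip trailing newline */
	return 0;
}
```

Anything applied "on exec" by the caller is therefore consumed by the exec of the shell, not by the exec of the program the shell goes on to run.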
Using PR_SVE_SET_VL with popen() still makes sense, but if you want the target program to get the new VL (not just the shell) then you'd need PR_SVE_VL_INHERIT. Then we would get confused later when trying to test the !PR_SVE_VL_INHERIT case. The way to "fix" this would be to have the shell invoke something like vlset, but that will blur the test in a different way, adding even more confusion.
So Ack, we can't test all the variations using the popen() method, so we probably shouldn't use it here at all.
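By contrast, an explicit fork() plus a single exec() keeps the deferred VL change attached to the program under test. The following is a rough sketch under assumptions, not the code from the series: run_with_vl_onexec() and the exit codes 126/127 are invented for illustration, and the #ifndef fallbacks are only there for older userspace headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallbacks for older headers; values as in the kernel's uapi prctl.h */
#ifndef PR_SVE_SET_VL
#define PR_SVE_SET_VL		50
#endif
#ifndef PR_SVE_SET_VL_ONEXEC
#define PR_SVE_SET_VL_ONEXEC	(1 << 18)
#endif

/*
 * Hypothetical helper: exec argv[0] directly (no shell) with the given
 * SVE vector length applied at exec time.
 * Returns the child's exit status: 126 if prctl() failed (for example
 * because SVE is not available), 127 if exec failed; -1 on other errors.
 */
int run_with_vl_onexec(char *const argv[], unsigned long vl)
{
	pid_t pid, ret;
	int status;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Deferred: takes effect at the next exec() in this task */
		if (prctl(PR_SVE_SET_VL, vl | PR_SVE_SET_VL_ONEXEC) < 0)
			_exit(126);
		execv(argv[0], argv);
		_exit(127);
	}

	do {
		ret = waitpid(pid, &status, 0);
	} while (ret < 0 && errno == EINTR);

	if (ret != pid || !WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status);
}
```

With only one exec between the prctl() and the target, both the PR_SVE_VL_INHERIT and !PR_SVE_VL_INHERIT cases stay testable.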
This is the kind of reason why I tend not to go for it, I guess -- it looks convenient, but it's just that little bit overcooked as an API. *sigh*
I'll review your final version of the series, but I guess we're all good.
Cheers
---Dave