On Mon, Jun 03, 2024 at 05:27:52PM +0100, Mark Brown wrote:
On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
This is now in mainline and appears to be causing several tests (at least the ptrace vmaccess global_attach test on arm64, possibly also some of the epoll tests) that previously were timed out by the harness to to hang instead. A bisect seems to point at this patch in particular, there was a bunch of discussion of the fallout of these patches but I'm afraid I lost track of it, is there something in flight for this? -next is affected as well from the looks of it.
FWIW I'm still seeing this on -rc2...
AFAICT this is due to the switch to using clone3() with CLONE_VFORK to start the test which means we never even call alarm() to set up the timeout for the test, let alone have the signal for it delivered. I'm a confused about how this could ever work, with clone_vfork() the parent shouldn't run until the child execs (which won't happen here) or exits. Since we don't call alarm() until after we started the child we never actually get that far, but even if we reorder things we'll not get the signal for the alarm if the child messes up since the parent is suspended.
I'm not clear what the original race being fixed here was but it seems like we should revert this since the timeout functionality is pretty important?