On Mon, 1 Apr 2024 at 22:17, John Stultz jstultz@google.com wrote:
On Thu, Mar 16, 2023 at 5:30 AM Marco Elver elver@google.com wrote:
From: Dmitry Vyukov dvyukov@google.com
POSIX timers using the CLOCK_PROCESS_CPUTIME_ID clock prefer the main thread of a thread group for signal delivery. However, this has a significant downside: it requires waking up a potentially idle thread.
Instead, prefer to deliver signals to the current thread (in the same thread group) if SIGEV_THREAD_ID is not set by the user. This does not change guaranteed semantics, since POSIX process CPU time timers have never guaranteed that signal delivery is to a specific thread (without SIGEV_THREAD_ID set).
The effect is that we no longer wake up potentially idle threads, and the kernel is no longer biased towards delivering the timer signal to any particular thread (which better distributes the timer signals esp. when multiple timers fire concurrently).
Signed-off-by: Dmitry Vyukov dvyukov@google.com
Suggested-by: Oleg Nesterov oleg@redhat.com
Reviewed-by: Oleg Nesterov oleg@redhat.com
Signed-off-by: Marco Elver elver@google.com
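For reference, the kind of timer the quoted patch applies to can be set up from userspace roughly as below. This is only a minimal single-threaded sketch: the helper name count_cputime_signals(), the use of SIGRTMIN, the 10ms period and the 10 second safety bound are all assumptions made for illustration, and none of it comes from the patch or the selftest. With a single thread it cannot show which thread receives the signal; it only shows a CLOCK_PROCESS_CPUTIME_ID timer armed without SIGEV_THREAD_ID, which is the case where the kernel picks the target thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Number of timer signals seen so far; incremented only by the handler. */
static volatile sig_atomic_t timer_signals;

static void on_timer(int sig)
{
	(void)sig;
	timer_signals++;
}

/*
 * Hypothetical helper: arm a periodic process CPU-time timer without
 * SIGEV_THREAD_ID, burn CPU until `wanted` signals have arrived (or about
 * 10 seconds of wall time have passed), and return the number seen.
 * Returns -1 if setup fails.
 */
static int count_cputime_signals(int wanted)
{
	struct sigaction sa;
	struct sigevent sev;
	struct itimerspec its;
	struct timespec start, now;
	timer_t id;
	int seen;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_timer;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGRTMIN, &sa, NULL) < 0)
		return -1;

	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_SIGNAL;	/* note: no SIGEV_THREAD_ID */
	sev.sigev_signo = SIGRTMIN;
	if (timer_create(CLOCK_PROCESS_CPUTIME_ID, &sev, &id) < 0)
		return -1;

	timer_signals = 0;
	memset(&its, 0, sizeof(its));
	its.it_value.tv_nsec = 10 * 1000 * 1000;	/* 10ms of process CPU time */
	its.it_interval = its.it_value;			/* periodic */
	if (timer_settime(id, 0, &its, NULL) < 0) {
		timer_delete(id);
		return -1;
	}

	/* Spin so the process accumulates CPU time and the timer expires. */
	clock_gettime(CLOCK_MONOTONIC, &start);
	while (timer_signals < wanted) {
		clock_gettime(CLOCK_MONOTONIC, &now);
		if (now.tv_sec - start.tv_sec > 10)
			break;	/* safety bound so the sketch cannot hang */
	}

	seen = timer_signals;
	timer_delete(id);
	return seen;
}
```

Calling count_cputime_signals(3) from main() and printing the result is enough to try it. In a multi-threaded program, recording the thread ID inside the handler would show which threads actually receive the signal.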
Apologies for dredging up this old thread.
I wanted to ask if anyone had objections to including this in the -stable trees?
After this and the follow-on patch e797203fb3ba ("selftests/timers/posix_timers: Test delivery of signals across threads") landed, folks testing older kernels with the latest selftests started to see the new test, which checks for this behavior, stall. Thomas did submit an adjustment to the test to avoid the stall: https://lore.kernel.org/lkml/20230606142031.071059989@linutronix.de/. It doesn't seem to have landed, though, and in any case it would only make the test fail instead of hang.
This change does seem to cherry-pick cleanly back to at least stable/linux-5.10.y, so it looks simple to pull this change back. But I wanted to make sure there wasn't anything subtle I was missing before sending patches.
I don't have objections per se. But I wonder how other tests deal with such situations. This should come up for any test of new functionality. Can we do the same thing other tests are doing?
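To make the question concrete, one way a test could cope with an older kernel is to compare the running kernel's release against a minimum version and report a skip instead of running the check. The sketch below is purely hypothetical and is not taken from any existing selftest: the helper names release_at_least() and running_kernel_at_least() are made up here, and the minimum version is left as a parameter because the thread does not say which release first carried the change. KSFT_SKIP (4) is the exit code kselftests use to report a skipped test.

```c
#include <stdio.h>
#include <sys/utsname.h>

#define KSFT_SKIP 4	/* kselftest exit code for a skipped test */

/*
 * Hypothetical helper: return 1 if a release string such as "6.8.0-rc1"
 * is at least major.minor, 0 if it is older, -1 if it cannot be parsed.
 */
static int release_at_least(const char *release,
			    unsigned int major, unsigned int minor)
{
	unsigned int maj, min;

	if (sscanf(release, "%u.%u", &maj, &min) != 2)
		return -1;
	return maj > major || (maj == major && min >= minor);
}

/* Hypothetical helper: the same check against the running kernel. */
static int running_kernel_at_least(unsigned int major, unsigned int minor)
{
	struct utsname u;

	if (uname(&u) < 0)
		return -1;
	return release_at_least(u.release, major, minor);
}
```

A test's main() could then return KSFT_SKIP when running_kernel_at_least() reports an older kernel. One caveat with gating on the version number: a stable kernel that gets the change backported would still be skipped, so this interacts with the -stable question above.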