[RFC PATCH] hrtimers: system-wide and per-task hrtimer slacks
mtk.manpages at gmail.com
Tue Apr 24 22:06:19 UTC 2012
On Fri, Apr 6, 2012 at 9:14 PM, Dmitry Antipov
<dmitry.antipov at linaro.org> wrote:
> On 04/05/2012 04:10 AM, Andrew Morton wrote:
>> Well.. there are some back-incompatibilities here.
>> prctl(PR_SET_TIMERSLACK, -1) used to restore current's slack setting to
>> whatever-we-inherited-at-fork, but that has been removed. What are the
>> implications of this, and did we need to do it?
> It seems you're looking at the previous version of this patch
> (http://lkml.org/lkml/2012/2/20/55). Latest proposal is
> http://lwn.net/Articles/484162/, which defines PR_SET_TIMERSLACK
> action as:
>     case PR_SET_TIMERSLACK:
>             if (arg2 <= 0)
>                     current->timer_slack_ns =
>             else if (arg2 <= HRTIMER_MAX_SLACK)
>                     current->timer_slack_ns = arg2;
>             else
>                     error = -EINVAL;
>> If we do make changes in this area then the prctl manpage should be
>> updated, please. And if
>> http://www.spinics.net/lists/linux-man/msg01149.html represents the
>> current state of that manpage then it should be updated anyway - that
>> entry doesn't say anything about the (arg2<= 0) case.
> I sent a patch for man pages too, it should be one of the recent posts
> at http://www.spinics.net/lists/linux-man/index.html.
Your response didn't actually address Andrew's point. Your patch
changes user-visible semantics that have been in place since kernel
2.6.28:
* The meaning of prctl(PR_SET_TIMERSLACK, n) changes
  for the n <= 0 case (formerly, this reverted the timer slack
  to the per-process "default"; with the proposed patch, it
  reverts the timer slack to a system-wide default).
* The semantics of setting the timer slack of a new thread also change.
Perhaps these changes are warranted/necessary, but they *are* ABI
changes, and so should be carefully explained and well justified.
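To make the first point concrete, here is a minimal user-space sketch of the two behaviours. It is a model only, not kernel code: struct task_model, model_fork() and the two model_set_slack_*() helpers are hypothetical names invented for illustration, the system-wide default is passed in as a parameter because its value is an assumption, and the upper-bound check against HRTIMER_MAX_SLACK is left out. The fork step follows the existing rule in both cases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one thread's timer slack state (not kernel code). */
struct task_model {
    long default_slack_ns;   /* per-thread "default" value */
    long current_slack_ns;   /* value that governs timer grouping */
};

/* Existing rule at thread creation: both values of the new thread
 * are copied from the creator's "current" value. */
struct task_model model_fork(const struct task_model *parent)
{
    assert(parent != NULL);
    struct task_model child = {
        parent->current_slack_ns,
        parent->current_slack_ns
    };
    return child;
}

/* Existing semantics: arg2 <= 0 reverts to the per-thread "default". */
void model_set_slack_existing(struct task_model *t, long arg2)
{
    assert(t != NULL);
    t->current_slack_ns = (arg2 > 0) ? arg2 : t->default_slack_ns;
}

/* Proposed semantics as described above: arg2 <= 0 reverts to a
 * system-wide default (value supplied by the caller; an assumption). */
void model_set_slack_proposed(struct task_model *t, long arg2,
                              long system_default_ns)
{
    assert(t != NULL);
    t->current_slack_ns = (arg2 > 0) ? arg2 : system_default_ns;
}
```

In this model, a child created by a parent whose "current" slack is 1,000,000 ns gets that value back after model_set_slack_existing(&child, 0), whereas model_set_slack_proposed(&child, 0, 50000) leaves it at 50,000 ns. An application relying on the first behaviour would silently see a different slack.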
PS As background to the discussion, here's the current draft of some
text I plan to add to prctl(2) that explains the current semantics,
which would change with Dmitry's patch:
PR_SET_TIMERSLACK (since Linux 2.6.28)
Set the timer slack for the calling thread to the value in
arg2. The timer slack is a value, expressed in nanoseconds,
that is used by the kernel to group timer expirations for
this thread that are close to one another; as a consequence,
timer expirations for this thread may be up to the specified
number of nanoseconds late (but will never expire early).
Grouping timer expirations can help reduce system power
consumption by minimizing CPU wake-ups.
The timer expirations affected by timer slack are those set
by select(2), pselect(2), poll(2), ppoll(2), epoll_wait(2),
epoll_pwait(2), clock_nanosleep(2), nanosleep(2), and
futex(2) (and thus the library functions implemented via
futexes: pthread_cond_timedwait(3),
pthread_rwlock_timedrdlock(3), pthread_rwlock_timedwrlock(3),
and sem_timedwait(3)).
Each thread has two associated timer slack values: a
"default" value, and a "current" value. The "current" value
is the one that governs grouping of timer expirations. When
a new thread is created, the two timer slack values are made
the same as the "current" value of the creating thread.
Thereafter, a thread can adjust its timer slack value via
PR_SET_TIMERSLACK: if arg2 is greater than zero, then it
specifies a new value for the "current" timer slack for the
calling thread; if arg2 is less than or equal to zero, then
the "current" timer slack is set to the "default" value.
The timer slack value of init (PID 1), the ancestor of all
threads, is 50,000 nanoseconds (50 microseconds).
* The "default" timer slack of the child is set to the value of
the "current" timer slack of the parent. (See the description
of PR_SET_TIMERSLACK on prctl(2).)
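The existing semantics described in the draft text can be exercised from user space with a short sketch like the one below. It uses only prctl(2) with PR_SET_TIMERSLACK and PR_GET_TIMERSLACK; the helper names (get_timer_slack_ns, set_timer_slack_ns, timer_slack_roundtrip) are made up for this example. It assumes a Linux system and a thread whose "current" slack still equals its "default" when the helper is called.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/prctl.h>

/* Return the calling thread's "current" timer slack in nanoseconds,
 * or -1 on error. */
long get_timer_slack_ns(void)
{
    return prctl(PR_GET_TIMERSLACK, 0UL, 0UL, 0UL, 0UL);
}

/* Set the calling thread's "current" timer slack. Passing 0 asks the
 * kernel to reset it to the thread's "default" value.
 * Returns 0 on success, -1 on error. */
int set_timer_slack_ns(unsigned long ns)
{
    return prctl(PR_SET_TIMERSLACK, ns, 0UL, 0UL, 0UL);
}

/* Raise the slack to new_ns, then reset it with an argument of 0.
 * Returns 0 if the raised value was observed and the reset restored
 * the value seen on entry, -1 otherwise. */
int timer_slack_roundtrip(unsigned long new_ns)
{
    assert(new_ns > 0);   /* 0 would mean "reset", not "set" */

    long initial = get_timer_slack_ns();
    if (initial < 0 || set_timer_slack_ns(new_ns) != 0)
        return -1;
    long raised = get_timer_slack_ns();

    if (set_timer_slack_ns(0) != 0)
        return -1;
    long restored = get_timer_slack_ns();

    printf("initial=%ld raised=%ld restored=%ld\n",
           initial, raised, restored);
    return (raised == (long)new_ns && restored == initial) ? 0 : -1;
}
```

Calling timer_slack_roundtrip(200000UL) from main() prints the three values. Under the existing semantics the restored value matches the inherited one; under the proposed patch it would instead be whatever the system-wide default is.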
Michael Kerrisk Linux man-pages maintainer;
Author of "The Linux Programming Interface", http://blog.man7.org/