[PATCH 21/32] nohz/cpuset: Flush cputime on threads in nohz cpusets when waiting leader
gilad at benyossef.com
Tue Mar 27 14:23:14 UTC 2012
On Tue, Mar 27, 2012 at 4:10 PM, Gilad Ben-Yossef <gilad at benyossef.com> wrote:
> On Wed, Mar 21, 2012 at 3:58 PM, Frederic Weisbecker <fweisbec at gmail.com> wrote:
>> When we wait for a zombie task, flush the cputimes on nohz cpusets
>> in case we are waiting for a group leader that has threads running
>> in nohz CPUs. This way thread_group_times() doesn't report stale values.
>> If I understood the code correctly, by the time we call thread_group_times(),
>> we may have children that are still running, so this is necessary.
>> But I need to check deeper.
>> diff --git a/kernel/exit.c b/kernel/exit.c
>> index 4b4042f..c194662 100644
>> --- a/kernel/exit.c
>> +++ b/kernel/exit.c
>> @@ -52,6 +52,7 @@
>> #include <linux/hw_breakpoint.h>
>> #include <linux/oom.h>
>> #include <linux/writeback.h>
>> +#include <linux/cpuset.h>
>> #include <asm/uaccess.h>
>> #include <asm/unistd.h>
>> @@ -1712,6 +1713,13 @@ repeat:
>> (!wo->wo_pid || hlist_empty(&wo->wo_pid->tasks[wo->wo_type])))
>> goto notask;
>> + /*
>> + * For cputime in sub-threads before adding them.
>> + * Must be called outside tasklist_lock lock because write lock
>> + * can be acquired under irqs disabled.
>> + */
>> + cpuset_nohz_flush_cputimes();
>> tsk = current;
> I believe this patch is not needed because after this point we call
> do_wait_thread()/ptrace_do_wait(), which both call wait_consider_task(),
> which calls wait_task_stopped/zombie/continued, which all eventually
> call getrusage(), which calls k_getrusage(), where you added a call to
> cpuset_nohz_flush_cputimes() in another patch :-)
OK, I now see that wait_task_zombie() actually calls
thread_group_times() directly, unlike the other wait_task_* functions,
so what I wrote above does not apply.
It does result in more than one IPI for each isolated core (something
like 3, really) in the other cases, though:
one from this patch and the rest from the k_getrusage() calls.
I wonder what would be a better way to do it. In theory we can send
the IPI only to nohz cpuset cores that actually
run tasks from the thread group. Finding which ones is not trivial though...
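
To make the idea concrete, here is a rough userspace sketch in C of what
"only the nohz cores that run tasks from the thread group" could mean.
It is a model, not kernel code: struct model_thread, nohz_flush_targets()
and the 64-bit CPU mask are all hypothetical names invented for this
illustration, and it works on a static snapshot of thread placement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_thread {
	int tgid;	/* thread group the thread belongs to */
	int cpu;	/* CPU it currently runs on, or -1 if not running */
};

/*
 * Return a bitmask of the CPUs that would need a cputime flush IPI:
 * CPUs that are in a nohz cpuset (bit set in nohz_mask) and currently
 * run at least one thread of the thread group tgid.
 */
static uint64_t nohz_flush_targets(const struct model_thread *threads,
				   size_t nr, int tgid, uint64_t nohz_mask)
{
	uint64_t targets = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		int cpu = threads[i].cpu;

		/* Skip other thread groups and threads that are not running. */
		if (threads[i].tgid != tgid || cpu < 0)
			continue;

		assert(cpu < 64);	/* the model only covers 64 CPUs */

		/* Only CPUs inside a nohz cpuset need the IPI. */
		targets |= (UINT64_C(1) << cpu) & nohz_mask;
	}

	return targets;
}
```

The hard part that this sketch skips is exactly the non-trivial bit:
in the kernel the threads can migrate or get scheduled out while the
mask is being built, so the snapshot may already be stale when the IPIs
are sent.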
Chief Coffee Drinker
gilad at benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
-- Jean-Baptiste Queru