Re: [QUERY]: Is using CPU hotplug right for isolating CPUs?

13 Feb 2014


On Tue, Feb 11, 2014 at 02:22:43PM +0530, Viresh Kumar wrote:
...
> On 28 January 2014 18:53, Frederic Weisbecker fweisbec@gmail.com wrote:
...
> > No, when a single task is running on a full dynticks CPU, the tick is supposed to run
> > every second. I'm actually surprised it doesn't happen in your traces, did you tweak
> > something specific?
>
> Why do we need this 1 second tick currently? And what will happen if I hotunplug that
> CPU and get it back? Would the timer for the tick move away from the CPU in question?
> I see that when I have changed this 1sec stuff to 300 seconds. But what would be the
> impact of that? Will things still work normally?

So the problem resides in the gazillion bits of accounting maintained in scheduler_tick() and
current->sched_class->task_tick().
The scheduler's correctness depends on these being updated regularly. If you deactivate the
tick or stretch the delay to very high values, the result is unpredictable. Just expect that
at least some scheduler features will behave randomly, like load balancing for example, or
that you will simply hit local fairness issues.
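
To make the dependency on regular updates concrete, here is a minimal userspace sketch. It is a toy model, not kernel code: it assumes a task that runs continuously from t = 0 and a runtime counter that is refreshed only when a tick fires. The names accounted_runtime_ns() and accounting_lag_ns() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Toy model (assumption, not kernel code): the task has been running
 * since t = 0 and its runtime counter is refreshed only on a tick,
 * which fires every period_ns nanoseconds. The counter therefore holds
 * the time of the most recent tick.
 */
static uint64_t accounted_runtime_ns(uint64_t now_ns, uint64_t period_ns)
{
    assert(period_ns > 0);
    return now_ns - (now_ns % period_ns);
}

/* How far the counter lags behind the task's real runtime at now_ns. */
static uint64_t accounting_lag_ns(uint64_t now_ns, uint64_t period_ns)
{
    return now_ns - accounted_runtime_ns(now_ns, period_ns);
}
```

In this model the lag is always below one tick period: under 1 second with the 1 second tick, but up to 300 seconds if the delay is raised to 300 seconds. Anything that reads the counter between ticks, such as a load balancing decision, would be working from data that stale.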

So we cap the tick deferment at 1 second (a 1 Hz tick), which makes sure that things keep
moving forward while keeping a rate that should still be nice for HPC workloads. But we
certainly want to find a way to remove the need for any tick altogether for extreme
real-time workloads, which need guarantees rather than just optimizations.

I see two potential solutions for that:

1) Rework the scheduler accounting such that it is safe against full dynticks. That
was the initial plan but it's scary. The scheduler accounting is a huge maze, and I'm not
sure it's actually worth the complication.

2) Offload the accounting. For example, we could imagine that the timekeeping CPU handles
the task_tick() calls on behalf of the full dynticks CPUs, at a low rate like 1 Hz.
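
Option 2 could be sketched roughly as follows. This is a userspace toy under stated assumptions, not a proposed kernel patch: struct toy_cpu, toy_task_tick() and toy_offload_ticks() are hypothetical names, and toy_task_tick() merely stands in for the accounting that scheduler_tick() and task_tick() perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-CPU state for this model; not a kernel structure. */
struct toy_cpu {
    bool full_dynticks;        /* CPU runs without its own periodic tick */
    uint64_t last_tick_ns;     /* last time its accounting was refreshed */
    unsigned int remote_ticks; /* ticks performed on its behalf */
};

/* Stand-in for the accounting done by scheduler_tick()/task_tick(). */
static void toy_task_tick(struct toy_cpu *cpu, uint64_t now_ns)
{
    cpu->last_tick_ns = now_ns;
    cpu->remote_ticks++;
}

/*
 * Meant to run periodically on a housekeeping CPU: refresh the
 * accounting of every full dynticks CPU whose last update is at least
 * one second old. Returns the number of CPUs serviced.
 */
static size_t toy_offload_ticks(struct toy_cpu *cpus, size_t ncpus,
                                uint64_t now_ns)
{
    size_t serviced = 0;

    assert(cpus != NULL || ncpus == 0);
    for (size_t i = 0; i < ncpus; i++) {
        if (!cpus[i].full_dynticks)
            continue; /* this CPU still ticks for itself */
        if (now_ns - cpus[i].last_tick_ns < NSEC_PER_SEC)
            continue; /* accounting is still fresh enough */
        toy_task_tick(&cpus[i], now_ns);
        serviced++;
    }
    return serviced;
}
```

The isolated CPUs are never interrupted in this model; the cost of the 1 Hz update moves to the housekeeping CPU. What the sketch leaves out is the hard part in a real kernel: updating another CPU's scheduler state safely while that CPU is running.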
