On Wed, 2015-04-22 at 23:56 +0200, Thomas Gleixner wrote:
> -int get_nohz_timer_target(int pinned)
> +int get_nohz_timer_target(void)
>  {
> -	int cpu = smp_processor_id();
> -	int i;
> +	int i, cpu = smp_processor_id();
>  	struct sched_domain *sd;
>
> -	if (pinned || !get_sysctl_timer_migration() || !idle_cpu(cpu))
> +	if (!idle_cpu(cpu))
>  		return cpu;
Maybe also test in_serving_softirq() ?
	if (in_serving_softirq() || !idle_cpu(cpu))
		return cpu;
There is a fundamental problem with networking load: many CPUs appear to be idle from the scheduler's perspective because no user/kernel task is running.
CPUs servicing NIC queues can be very busy handling thousands of packets per second, yet have no user/kernel task running.
idle_cpu() then reports: this CPU is idle. Hmmmm, really?
Those CPUs are busy, *and* they have to access alien data/locks to activate timers that hardly ever fire anyway.
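
To make the difference concrete, here is a rough userspace sketch of the two checks. It is only a toy model, not kernel code: struct toy_cpu and all toy_* names are made up for illustration, and the linear scan stands in for the real sched-domain walk, which this sketch does not try to reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-CPU state; hypothetical, NOT a kernel structure. */
struct toy_cpu {
	bool task_running;    /* a user/kernel task is running here */
	bool serving_softirq; /* busy in softirq context, e.g. NIC RX */
};

/* Models what idle_cpu() reports: only tasks count, softirq work does not. */
static bool toy_idle_cpu(const struct toy_cpu *c)
{
	return !c->task_running;
}

/* Assumption: a plain scan for another CPU the scheduler sees as busy.
 * Falls back to the current CPU when none is found. */
static int toy_find_busy_cpu(const struct toy_cpu *cpus, size_t ncpus, int self)
{
	for (size_t i = 0; i < ncpus; i++) {
		if ((int)i != self && !toy_idle_cpu(&cpus[i]))
			return (int)i;
	}
	return self;
}

/* Check as in the quoted patch: stay local only if a task runs here. */
static int toy_target_patch(const struct toy_cpu *cpus, size_t ncpus, int self)
{
	assert(self >= 0 && (size_t)self < ncpus);
	if (!toy_idle_cpu(&cpus[self]))
		return self;
	return toy_find_busy_cpu(cpus, ncpus, self);
}

/* Suggested check: also stay local while serving a softirq. */
static int toy_target_softirq_aware(const struct toy_cpu *cpus, size_t ncpus,
				    int self)
{
	assert(self >= 0 && (size_t)self < ncpus);
	if (cpus[self].serving_softirq || !toy_idle_cpu(&cpus[self]))
		return self;
	return toy_find_busy_cpu(cpus, ncpus, self);
}
```

With CPU 0 serving a softirq but running no task, and CPU 1 running a task, toy_target_patch(cpus, 2, 0) picks CPU 1 (the timer lands on an alien CPU), while toy_target_softirq_aware(cpus, 2, 0) keeps it on CPU 0.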
When idle_cpu() finally gives the right indication, it is too late: ksoftirqd might be running on the wrong CPU. Innocent CPUs are overwhelmed by a sudden timer load and locked into a service loop.
This cannot withstand a DoS, and even with non-malicious traffic, the overhead is high.
Thanks.