On Tue, 2012-11-27 at 19:18 +0530, Viresh Kumar wrote:
On 27 November 2012 18:56, Steven Rostedt <rostedt@goodmis.org> wrote:
A couple of things. The sched_select_cpu() is not cheap. It has a double loop of domains/cpus looking for a non-idle CPU. If we have 1024 CPUs, and we are CPU 1023 and all other CPUs happen to be idle, we could be searching 1023 CPUs before we come up with our own.
Not sure if you missed the first check in sched_select_cpu():
+int sched_select_cpu(unsigned int sd_flags)
+{
+	/* If Current cpu isn't idle, don't migrate anything */
+	if (!idle_cpu(cpu))
+		return cpu;
We aren't going to search if we aren't idle.
OK, we are idle, but CPU 1022 isn't. We still need a large search. But, heh, we are idle, so we can spin. But then why go through this in the first place ;-)
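To put numbers on the search cost being argued about here, the following is a minimal userspace sketch, not the patch itself. It assumes the domain/cpu double loop can be flattened into one linear scan over an array of idle flags. The name select_cpu_model() and the probe counter are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. The real sched_select_cpu() walks sched domains;
 * here the domains are flattened into a single array of idle flags.
 * select_cpu_model() and *probes are hypothetical, for illustration.
 */
static int select_cpu_model(const bool *idle, int nr_cpus, int this_cpu,
			    int *probes)
{
	int cpu;

	assert(this_cpu >= 0 && this_cpu < nr_cpus);
	*probes = 0;

	/* Fast path from the patch: a busy current CPU keeps the work. */
	if (!idle[this_cpu])
		return this_cpu;

	/* Slow path: examine every other CPU, looking for a non-idle one. */
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu == this_cpu)
			continue;
		(*probes)++;
		if (!idle[cpu])
			return cpu;
	}

	/* Every CPU is idle, so fall back to the current one. */
	return this_cpu;
}
```

In this model, a busy current CPU costs zero probes. With 1024 CPUs, calling from an idle CPU 1023 costs 1023 probes both when every other CPU is idle and when only CPU 1022 is busy, which are the two cases described above.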
Also, I really don't like this as a default behavior. It seems that this solution is for a very special case, and this can become very intrusive for the normal case.
We tried a KCONFIG option for it, which Tejun rejected.
Yeah, I saw that. I don't like adding KCONFIG options either. Best is to get something working that doesn't add any regressions. If you can get this to work without making *any* regressions in the normal case then I'm totally fine with that. But if this adds any issues with the normal case, then it's a show stopper.
To be honest, I'm uncomfortable with this approach. It seems to be fighting a symptom and not the disease. I'd rather find a way to keep work from being queued on the wrong CPU. If it is a timer, find a way to move the timer. If it is something else, let's work to fix that. Doing searches of possibly all CPUs (unlikely, but it is there), just seems wrong to me.
As Vincent pointed out, on big.LITTLE systems we just don't want to service work on big cores. That would waste too much power, especially if we are going to wake up big cores to do it.
It would be difficult to restrict the source driver (which queues the work) to little cores. We thought that if somebody wants to queue work on the current CPU, then they must use queue_work_on().
As Tejun has mentioned earlier, are there any assumptions anywhere that expect an unbounded work queue to not migrate? Places where per cpu variables might be used? Tejun had a good idea of forcing this to migrate the work *every* time, so that work never runs on the same CPU that it was queued on. If it can survive that, then it is probably OK. Maybe add a config option that forces this? That way, anyone can test that this isn't an issue.
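The kind of hidden assumption such a forced-migration test would flush out can be sketched in userspace. This is only a model under stated assumptions: the array stands in for a per-CPU variable, and queue_work_model(), fragile_handler(), robust_handler(), and NR_CPUS_MODEL are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4	/* hypothetical, kept small for illustration */

/* Stand-in for a per-CPU variable: one slot per CPU. */
static int percpu_counter_model[NR_CPUS_MODEL];

struct work_model {
	int queued_on;	/* CPU that queued the work */
	int runs_on;	/* CPU the work ends up running on */
};

/*
 * Model of the suggested test mode: with force_migrate set, the work
 * never runs on the CPU that queued it.
 */
static void queue_work_model(struct work_model *w, int this_cpu,
			     bool force_migrate)
{
	w->queued_on = this_cpu;
	w->runs_on = force_migrate ? (this_cpu + 1) % NR_CPUS_MODEL : this_cpu;
}

/* Fragile handler: silently assumes it runs where it was queued. */
static void fragile_handler(const struct work_model *w)
{
	percpu_counter_model[w->runs_on]++;	/* "this CPU" at run time */
}

/* Robust handler: carries the CPU it cares about explicitly. */
static void robust_handler(const struct work_model *w)
{
	percpu_counter_model[w->queued_on]++;
}
```

Without forced migration both handlers update the same slot, so the fragile one looks correct. With it, the fragile handler updates the neighbouring CPU's slot, and that difference is the breakage a forced-migration option would expose.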
-- Steve