On 14/03/14 17:16, Jon Medhurst (Tixy) wrote:
On Fri, 2014-03-14 at 16:09 +0000, Chris Redpath wrote:
On 14/03/14 13:58, Jon Medhurst (Tixy) wrote:
On Thu, 2014-03-13 at 17:19 +0000, Chris Redpath wrote:
[...]
@@ -7038,16 +7079,13 @@ static void hmp_force_up_migration(int this_cpu)
		curr = hmp_get_heaviest_task(curr, 1);
		p = task_of(curr);
		if (hmp_up_migration(cpu, &target_cpu, curr)) {
if (!target->active_balance) {
get_task_struct(p);
target->push_cpu = target_cpu;
target->migrate_task = p;
got_target = 1;
trace_sched_hmp_migrate(p, target->push_cpu, HMP_MIGRATE_FORCE);
hmp_next_up_delay(&p->se, target->push_cpu);
}
cpu_rq(target_cpu)->wake_for_idle_pull = 1;
raw_spin_unlock_irqrestore(&target->lock, flags);
spin_unlock(&hmp_force_migration);
smp_send_reschedule(target_cpu);
return;
If I've followed the code correctly, does this change now mean that a CPU stops considering up migrations after it has found this one, whereas before it would continue to look for others as well? Do we still want to do that? I guess it might be too complicated now that we're letting the other CPU pull things in its own time. Or, if we were previously in the situation where two tasks needed up-migrating, would the new pull method actually find both of those to pull anyway, so the behaviour is the same as before?
I intended this to work exactly as it is now, relying on the frequent calls to force_up_migration to cover all the CPUs, because that was easier than handling the fight over runqueue access which happens when we wake multiple big CPUs and ask each of them to find the biggest task to pull.
After thinking about your comment, perhaps it's less optimal than it could be, and I can probably do better.
As a first solution, I could instead count the number of eligible tasks and wake MIN(num_of_big_cpus, num_of_eligible_tasks) big CPUs. I can change idle pull to wait for the migration spinlock rather than exiting if it can't take it. That will allow the big CPUs to synchronise themselves and each task to be pulled.
My gut reaction is that having CPUs waiting on spinlocks doesn't sound too efficient. In practice, how likely is needing to migrate more than one task anyway?
Yes, waiting is not great but it's not that likely with mobile workloads. IMO it's one of those cases where you can construct a test case but you're unlikely to hit it in real life.
I don't have a good grasp of the big picture, and only tend to think of these things once a month or so when I see new patches arrive, so my first thoughts are usually just to ask more questions, here's some more... :-)
If you woke up multiple cores, how would you get each core to pick up the right task? Could you not have a situation where the little cores saw these possible migrations:
Task 1 (affine to both CPU A and B) --> CPU A
Task 2 (affine to CPU B)            --> CPU B
But CPU B ran first, picked Task 1, and then when CPU A gets the lock it only sees Task 2 to migrate, but its affinity doesn't allow that.
Indeed, this is one of the reasons I wanted to leave it alone in the initial patch :) There are other potentially awkward scenarios as well.
Come to think of it, is it possible for big CPUs to be busy with other things and the little cores to go through the migrate process twice before the big CPU has acted on the first request? Does wake_for_idle_pull need to be a counter?
CPUs won't idle pull if they are busy, the request will be ignored and the little core will be left to get on with its business.
Or does none of this really matter, as this migration is best effort, and the odd missed migration opportunity is OK as it will be retried again soon anyway if the task still needs migration?
This is a pretty good description of the situation - usually we'll get another opportunity fairly soon, often in <10ms on another CPU in the cluster.
It's also the case for any migration that there could be loads of reasons why you can't complete the migration when you come to do it - the task could have moved elsewhere when you get to run the stopper, it could have gone to sleep or died. Really all migrations are only best-effort and eventually the system is balanced :)
I think previously we might have attempted to set up multiple migrations, but we couldn't actually have moved multiple tasks from a single CPU to different destinations at the same time even if we had set that up.
If the latter, perhaps the code is OK as it is in these patches, i.e. only finding one task at a time to migrate and waking one big CPU.
I'm still enjoying thinking about it :) One at a time is much simpler, and we do need the bugfix to avoid a pinned task obscuring an unpinned one. I'm tempted to send you a respin that keeps one-at-a-time idle pulls, but I haven't had testing time yet so it will have to be next week.
--Chris