Hi Vikram,
On Tue, Nov 15, 2016 at 03:18:39PM -0800, Vikram Mulukutla wrote:
[...]
/*
 * detach_task() -- detach the task for the migration specified in env
 */
static void detach_task(struct task_struct *p, struct lb_env *env)
{
	lockdep_assert_held(&env->src_rq->lock);

	deactivate_task(env->src_rq, p, 0);
	p->on_rq = TASK_ON_RQ_MIGRATING;
	double_lock_balance(env->src_rq, env->dst_rq);
	set_task_cpu(p, env->dst_cpu);
	double_unlock_balance(env->src_rq, env->dst_rq);
}
I'm a bit worried that this unlocking business as part of the double-locking in detach_task() is not safe.
If the code looks anything like mainline in detach_tasks() (notice the s), I don't think it is safe.
I do agree that the double_lock_balance is fragile in that new code, such as Leo's, may (rightfully) never consider that the rq lock is being dropped. The LKML implementation that Patrick pointed out does indeed eliminate double locking and should be quite easy to port to WALT on android-common/LSK. I don't think there's a good intermediate solution otherwise (sorry Leo!). We should move over to that implementation wherever WALT is being used with EAS for upcoming development, to at the very least not have to think about the lock being dropped.
Thanks for the confirmation. I will try to port your latest patches for this.
For 3.18/4.4, the code paths involved were audited when the double-locking was added, back when we first upgraded to 3.18 in 2014. Given the sheer number of hours of commercial testing, I daresay that dropping the rq lock has been safe so far from a stability perspective, with both the HMP scheduler and the EAS scheduler (I think because the lock is dropped after the task is dequeued in all cases). Neither scheduler messes with these LB code paths to an extent that would mask a potential problem. So android/common kernel-3.18/kernel-4.4 should be safe for now.
Yeah. I reviewed rq::cfs_tasks and the rb tree for vruntime; neither of them dereferences the same pointer variable both before and after detach_task(), so they should be safe.
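To make the pattern I checked for concrete, here is a minimal userspace sketch. It is not kernel code: toy_rq, toy_detach() and toy_detach_all() are hypothetical stand-ins, and the "lock" is just a flag. The point is only that the loop re-reads the list head on every iteration and never carries a pointer into the list across the window where the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration, not the kernel's types. */
struct node {
	struct node *next;
	int id;
};

struct toy_rq {
	struct node *head;	/* singly linked list of queued tasks */
	bool locked;		/* models holding the rq lock */
};

/* Models detach_task(): dequeue p first, then drop and retake the lock. */
void toy_detach(struct toy_rq *rq, struct node *p)
{
	assert(rq->locked);
	assert(rq->head == p);

	rq->head = p->next;	/* p leaves the list before the lock is dropped */
	p->next = NULL;

	rq->locked = false;	/* window in which another CPU could edit the list */
	rq->locked = true;
}

/*
 * Detach up to max nodes and push them onto *out.
 * Returns the number of nodes detached.
 */
int toy_detach_all(struct toy_rq *rq, struct node **out, int max)
{
	int n = 0;

	assert(rq->locked);
	while (rq->head && n < max) {
		struct node *p = rq->head;	/* fresh read every iteration */

		toy_detach(rq, p);
		p->next = *out;			/* p is private to us by now */
		*out = p;
		n++;
	}
	return n;
}
```

A loop that saved p->next before calling toy_detach() and used it afterwards would be the unsafe variant, since that saved pointer could be stale once the lock has been dropped.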
Thanks,
Leo Yan