On 27/06/14 09:44, Jon Medhurst (Tixy) wrote:
On Fri, 2014-06-27 at 09:06 +0800, Alex Shi wrote:
On 06/27/2014 02:51 AM, Naresh Kamboju wrote:
> What do you think?
In today's 1:1, I just talked with Mark, it is not a upstream quality fix for this issue. and better solution would ask more cpuidle part coordinate. :) So, I thought it maybe accepted for a quick fix before a final solution coming out. So I added more comments in commit log and remove the 'cpuidle_driver_unref()' which omitted in v1 patch.
below patch applied and tested. And noticed that, did not found deadlock and warning. Boot log attached and pasted. https://pastebin.linaro.org/view/56e64bde
-Naresh Kamboju
thanks Naresh!
Tixy, Will you pick up this patch and then we merge from your tree?
Chris, Mark, you both seemed to agree with this approach in your other replies, but it would be good to get confirmation that the specific patch is what you were agreeing with, i.e. the one at the end of: http://lists.linaro.org/pipermail/linaro-kernel/2014-June/015860.html
If so, I'll add it to my MP tree so LSK can pull it.
I'm happy for you to go ahead with the patch below, which is the one you referenced at http://lists.linaro.org/pipermail/linaro-kernel/2014-June/015860.html
The first one hadn't removed the driver deref which was incorrect.
--Chris
From: Alex Shi <alex.shi at linaro.org> Date: Thu, 19 Jun 2014 13:47:35 +0800 Subject: [PATCH] HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull
Using per cpu cpuidle driver to fix deadlock in hmp_idle_pull. Otherwise a deadlock happened when do bl_idle_init.
[ 113.878664] other info that might help us debug this: [ 113.878667] Possible unsafe locking scenario: [ 113.878667] [ 113.878670] CPU0 [ 113.878673] ---- [ 113.878681] lock(cpuidle_driver_lock); [ 113.878684] <Interrupt> [ 113.878691] lock(cpuidle_driver_lock); [ 113.878693] [ 113.878693] *** DEADLOCK *** [ 113.878693] [ 113.878697] 1 lock held by ksoftirqd/4/28: [ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508
This patch is just a quick/cheap workaround for cpuidle_driver_lock deadlock. It works for TC2 and any other platform where the idle driver cannot be changed at runtime.
Signed-off-by: Alex Shi <alex.shi at linaro.org> --- kernel/sched/fair.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 71da724..0269273 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3587,10 +3587,12 @@ static enum hrtimer_restart hmp_cpu_keepalive_notify(struct hrtimer *hrtimer) * If there are any, set ns_delay to * ('target_residency of state with shortest too-big latency' - 1) * 1000. */ -static void hmp_keepalive_delay(unsigned int *ns_delay) +static void hmp_keepalive_delay(int cpu, unsigned int *ns_delay) { + struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu); struct cpuidle_driver *drv; - drv = cpuidle_driver_ref(); + + drv = cpuidle_get_cpu_driver(dev); if (drv) { unsigned int us_delay = UINT_MAX; unsigned int us_max_delay = *ns_delay / 1000; @@ -3609,7 +3611,6 @@ static void hmp_keepalive_delay(unsigned int *ns_delay) else *ns_delay = 1000 * (us_delay - 1); } - cpuidle_driver_unref(); }
static void hmp_cpu_keepalive_trigger(void) @@ -3623,7 +3624,7 @@ static void hmp_cpu_keepalive_trigger(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); keepalive->timer.function = hmp_cpu_keepalive_notify;
- hmp_keepalive_delay(&ns_delay); + hmp_keepalive_delay(cpu, &ns_delay); keepalive->delay = ns_to_ktime(ns_delay); keepalive->init = true; }