deadlock in HMP

List overview All Threads
Download

newer

older

arm-soc boot: 64 boots: 60 pass, 3...

mainline boot: 64 boots: 62 pass,...

Alex Shi

18 Jun 2014 18 Jun '14

12:50 p.m.

when lock checking option enabled in kernel config, TC2 catch a deadlock bug as following,

It looks hmp_idle_pull --> hmp_keepalive_delay --> cpuidle_driver_ref interrupt into cpuidle_register_driver.

Since what we needs of cpuidle_driver in hmp_keepalive_delay are exit_latency/target_residency, they should be fixed for this board.

So, could we avoid the cpuidle_driver_ref calling by giving a fixed value? That also can save a cpuidle_get_cpu_driver?

=============== [ 113.878488] [ INFO: inconsistent lock state ] [ 113.878493] 3.10.42+ #4 Not tainted [ 113.878496] --------------------------------- [ 113.878501] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 113.878507] ksoftirqd/4/28 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 113.878531] (cpuidle_driver_lock){+.?...}, at: [<c035bded>] cpuidle_driver_ref+0x15/0x58 [ 113.878534] {SOFTIRQ-ON-W} state was registered at: [ 113.878545] [<c006eedd>] __lock_acquire+0x539/0x8c8 [ 113.878554] [<c006f913>] lock_acquire+0x113/0x164 [ 113.878565] [<c047e24b>] _raw_spin_lock+0x33/0x40 [ 113.878573] [<c035bc73>] cpuidle_register_driver+0x37/0xf4 [ 113.878583] [<c06bab1d>] bl_idle_init+0x85/0x128 [ 113.878592] [<c0008665>] do_one_initcall+0x99/0x11c [ 113.878603] [<c0695a5f>] kernel_init_freeable+0x10b/0x188 [ 113.878611] [<c046f96d>] kernel_init+0x19/0x108 [ 113.878619] [<c000d4a9>] ret_from_fork+0x11/0x1c [ 113.878622] irq event stamp: 28 [ 113.878633] hardirqs last enabled at (28): [<c047e48f>] _raw_spin_unlock_irqrestore+0x2f/0x44 [ 113.878642] hardirqs last disabled at (27): [<c047e335>] _raw_spin_lock_irqsave+0x19/0x4c [ 113.878652] softirqs last enabled at (0): [<c001d658>] copy_process+0x3c4/0xe34 [ 113.878661] softirqs last disabled at (17): [<c002578b>] run_ksoftirqd+0x2f/0x5c [ 113.878664] [ 113.878664] other info that might help us debug this: [ 113.878667] Possible unsafe locking scenario: [ 113.878667] [ 113.878670] CPU0 [ 113.878673] ---- [ 113.878681] lock(cpuidle_driver_lock); [ 113.878684] <Interrupt> [ 113.878691] lock(cpuidle_driver_lock); [ 113.878693] [ 113.878693] *** DEADLOCK *** [ 113.878693] [ 113.878697] 1 lock held by ksoftirqd/4/28: [ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508 [ 113.878722]

-- Thanks Alex

Show replies by date

Jon Medhurst (Tixy)

18 Jun 18 Jun

1:18 p.m.

On Wed, 2014-06-18 at 20:50 +0800, Alex Shi wrote:

...

when lock checking option enabled in kernel config, TC2 catch a deadlock bug as following,

This was reported last month by Naresh and was discussed again when I found the problem when testing this month's HMP patches from ARM, so this is somewhat old news. (You we're on CC to all those emails too)

...

It looks hmp_idle_pull --> hmp_keepalive_delay --> cpuidle_driver_ref interrupt into cpuidle_register_driver.

Since what we needs of cpuidle_driver in hmp_keepalive_delay are exit_latency/target_residency, they should be fixed for this board.

So, could we avoid the cpuidle_driver_ref calling by giving a fixed value?

Where do we get the fixed value from? These are properties of cpuidle, and as cpuidle is not integrated into the scheduler, asking the cpuidle driver for them seems like the reasonable thing to do. Though if they are fixed and not dependent on current state, then perhaps the idle driver can be hacked to call into the scheduler at probe time to provide the values for the scheduler to stash somewhere.

Chris did say he was going to look at a solution, I don't know what his thoughts are.

-- Tixy

Alex Shi

19 Jun 19 Jun

6:17 a.m.

On 06/18/2014 09:18 PM, Jon Medhurst (Tixy) wrote:

...

On Wed, 2014-06-18 at 20:50 +0800, Alex Shi wrote:

...
when lock checking option enabled in kernel config, TC2 catch a deadlock bug as following,

This was reported last month by Naresh and was discussed again when I found the problem when testing this month's HMP patches from ARM, so this is somewhat old news. (You we're on CC to all those emails too)

Yes, but is a very severe bug for HMP. It could bring much trouble to LSK user. And it looks not too hard to be resolved in time. So I try to raise this issue again. :)

...

...
It looks hmp_idle_pull --> hmp_keepalive_delay --> cpuidle_driver_ref interrupt into cpuidle_register_driver.

Since what we needs of cpuidle_driver in hmp_keepalive_delay are exit_latency/target_residency, they should be fixed for this board.

So, could we avoid the cpuidle_driver_ref calling by giving a fixed value?

Where do we get the fixed value from? These are properties of cpuidle, and as cpuidle is not integrated into the scheduler, asking the cpuidle driver for them seems like the reasonable thing to do. Though if they are fixed and not dependent on current state, then perhaps the idle driver can be hacked to call into the scheduler at probe time to provide the values for the scheduler to stash somewhere

Chris did say he was going to look at a solution, I don't know what his thoughts are.

To avoid the lock using, could we try per_cpu cpuidle driver?

Naresh, could you like to test this patch?

...

From c938115997899014148b401ab91ce67fb94a0f51 Mon Sep 17 00:00:00 2001

From: Alex Shi alex.shi@linaro.org Date: Thu, 19 Jun 2014 13:47:35 +0800 Subject: [PATCH] HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

Using per cpu cpuidle driver to fix deadlock in hmp_idle_pull. Otherwise a deadlock happened when do bl_idle_init.

[ 113.878664] other info that might help us debug this: [ 113.878667] Possible unsafe locking scenario: [ 113.878667] [ 113.878670] CPU0 [ 113.878673] ---- [ 113.878681] lock(cpuidle_driver_lock); [ 113.878684] <Interrupt> [ 113.878691] lock(cpuidle_driver_lock); [ 113.878693] [ 113.878693] *** DEADLOCK *** [ 113.878693] [ 113.878697] 1 lock held by ksoftirqd/4/28: [ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508

Signed-off-by: Alex Shi alex.shi@linaro.org --- kernel/sched/fair.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 71da724..25c6162 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3587,10 +3587,12 @@ static enum hrtimer_restart hmp_cpu_keepalive_notify(struct hrtimer *hrtimer) * If there are any, set ns_delay to * ('target_residency of state with shortest too-big latency' - 1) * 1000. */ -static void hmp_keepalive_delay(unsigned int *ns_delay) +static void hmp_keepalive_delay(int cpu, unsigned int *ns_delay) { + struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu); struct cpuidle_driver *drv; - drv = cpuidle_driver_ref(); + + drv = cpuidle_get_cpu_driver(dev); if (drv) { unsigned int us_delay = UINT_MAX; unsigned int us_max_delay = *ns_delay / 1000; @@ -3623,7 +3625,7 @@ static void hmp_cpu_keepalive_trigger(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); keepalive->timer.function = hmp_cpu_keepalive_notify;

- hmp_keepalive_delay(&ns_delay); + hmp_keepalive_delay(cpu, &ns_delay); keepalive->delay = ns_to_ktime(ns_delay); keepalive->init = true; }

-- 1.9.1

Naresh Kamboju

23 Jun 23 Jun

7:30 a.m.

On 19 June 2014 11:47, Alex Shi alex.shi@linaro.org wrote:

...

On 06/18/2014 09:18 PM, Jon Medhurst (Tixy) wrote:

...
On Wed, 2014-06-18 at 20:50 +0800, Alex Shi wrote:

...
when lock checking option enabled in kernel config, TC2 catch a deadlock bug as following,

This was reported last month by Naresh and was discussed again when I found the problem when testing this month's HMP patches from ARM, so this is somewhat old news. (You we're on CC to all those emails too)

Yes, but is a very severe bug for HMP. It could bring much trouble to LSK user. And it looks not too hard to be resolved in time. So I try to raise this issue again. :)

...
...
It looks hmp_idle_pull --> hmp_keepalive_delay --> cpuidle_driver_ref interrupt into cpuidle_register_driver.

Since what we needs of cpuidle_driver in hmp_keepalive_delay are exit_latency/target_residency, they should be fixed for this board.

So, could we avoid the cpuidle_driver_ref calling by giving a fixed value?

Where do we get the fixed value from? These are properties of cpuidle, and as cpuidle is not integrated into the scheduler, asking the cpuidle driver for them seems like the reasonable thing to do. Though if they are fixed and not dependent on current state, then perhaps the idle driver can be hacked to call into the scheduler at probe time to provide the values for the scheduler to stash somewhere

Chris did say he was going to look at a solution, I don't know what his thoughts are.

To avoid the lock using, could we try per_cpu cpuidle driver?

Naresh, could you like to test this patch?

Sure Alex.

I will test and post you results.

Thanks -Naresh

...

From c938115997899014148b401ab91ce67fb94a0f51 Mon Sep 17 00:00:00 2001 From: Alex Shi alex.shi@linaro.org Date: Thu, 19 Jun 2014 13:47:35 +0800 Subject: [PATCH] HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

Using per cpu cpuidle driver to fix deadlock in hmp_idle_pull. Otherwise a deadlock happened when do bl_idle_init.

[ 113.878664] other info that might help us debug this: [ 113.878667] Possible unsafe locking scenario: [ 113.878667] [ 113.878670] CPU0 [ 113.878673] ---- [ 113.878681] lock(cpuidle_driver_lock); [ 113.878684] <Interrupt> [ 113.878691] lock(cpuidle_driver_lock); [ 113.878693] [ 113.878693] *** DEADLOCK *** [ 113.878693] [ 113.878697] 1 lock held by ksoftirqd/4/28: [ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508

Signed-off-by: Alex Shi alex.shi@linaro.org

kernel/sched/fair.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 71da724..25c6162 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3587,10 +3587,12 @@ static enum hrtimer_restart hmp_cpu_keepalive_notify(struct hrtimer *hrtimer)

If there are any, set ns_delay to

('target_residency of state with shortest too-big latency' - 1) * 1000.

*/ -static void hmp_keepalive_delay(unsigned int *ns_delay) +static void hmp_keepalive_delay(int cpu, unsigned int *ns_delay) {
  struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);
  struct cpuidle_driver *drv;
  drv = cpuidle_driver_ref();
  drv = cpuidle_get_cpu_driver(dev);
  if (drv) {
          unsigned int us_delay = UINT_MAX;
          unsigned int us_max_delay = *ns_delay / 1000;
@@ -3623,7 +3625,7 @@ static void hmp_cpu_keepalive_trigger(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); keepalive->timer.function = hmp_cpu_keepalive_notify;
          hmp_keepalive_delay(&ns_delay);
          hmp_keepalive_delay(cpu, &ns_delay);
          keepalive->delay = ns_to_ktime(ns_delay);
          keepalive->init = true;
  }
-- 1.9.1

Alex Shi

24 Jun 24 Jun

7:56 a.m.

On 06/23/2014 03:30 PM, Naresh Kamboju wrote:

...

...
...
To avoid the lock using, could we try per_cpu cpuidle driver?

Naresh, could you like to test this patch?

Sure Alex.

I will test and post you results.

Hi, Naresh, did you test this patch?

Tixy, Chris,

Any comments for this idea?

-- Thanks Alex

Chris Redpath

8:42 a.m.

On 24/06/14 08:56, Alex Shi wrote:

...

On 06/23/2014 03:30 PM, Naresh Kamboju wrote:

...
...
...
To avoid the lock using, could we try per_cpu cpuidle driver?

Naresh, could you like to test this patch?

Sure Alex.

I will test and post you results.

Hi, Naresh, did you test this patch?

Tixy, Chris,

Any comments for this idea?

Hi Alex,

I've had a look at the patch, I'm not sure if it will make any difference as the potential deadlock is on the same CPU.

I will try to have a better look at it this morning.

--Chris

Chris Redpath

9:46 a.m.

On 24/06/14 09:42, Chris Redpath wrote:

...

On 24/06/14 08:56, Alex Shi wrote:

...
On 06/23/2014 03:30 PM, Naresh Kamboju wrote:

...
...
...
To avoid the lock using, could we try per_cpu cpuidle driver?

Naresh, could you like to test this patch?

Sure Alex.

I will test and post you results.

Hi, Naresh, did you test this patch?

Tixy, Chris,

Any comments for this idea?

Hi Alex,

I've had a look at the patch, I'm not sure if it will make any difference as the potential deadlock is on the same CPU.

I will try to have a better look at it this morning.

--Chris

Hi Alex,

I've had a better look - the per-cpu driver interface doesn't use locking at all so any potential deadlock will of course not occur.

The reason I used cpuidle_driver_ref was to ensure that driver data did not go away while I was reading it, as there isn't any notification for driver changes.

Is it possible to change the cpuidle driver after it has been setup? I didn't find anything which said it was not allowed.

--Chris

...

linaro-kernel mailing list linaro-kernel@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-kernel

Daniel Lezcano

10:23 a.m.

On 06/24/2014 11:46 AM, Chris Redpath wrote:

...

On 24/06/14 09:42, Chris Redpath wrote:

...
On 24/06/14 08:56, Alex Shi wrote:

...
On 06/23/2014 03:30 PM, Naresh Kamboju wrote:

...
...
...
To avoid the lock using, could we try per_cpu cpuidle driver?

Naresh, could you like to test this patch?

Sure Alex.

I will test and post you results.

Hi, Naresh, did you test this patch?

Tixy, Chris,

Any comments for this idea?

Hi Alex,

I've had a look at the patch, I'm not sure if it will make any difference as the potential deadlock is on the same CPU.

I will try to have a better look at it this morning.

--Chris

Hi Alex,

I've had a better look - the per-cpu driver interface doesn't use locking at all so any potential deadlock will of course not occur.

The reason I used cpuidle_driver_ref was to ensure that driver data did not go away while I was reading it, as there isn't any notification for driver changes.

Hi,

theoretically nothing prevent the user to unload the module. In practice, that does not happen because the drivers are usually compiled in. But I wouldn't rely on that.

Also, in the code path, the different idle states are inspected to find the one fitting the ns_delay. On some architecture (eg. x86), one or several idle states can be removed at runtime if they switch to battery<->AC [1]

But the bug you are facing is probably coming from the same cpu is taking the same lock twice because of an interrupt.

-- Daniel

[1] https://git.linaro.org/people/daniel.lezcano/linux.git/commit/e484471d65c1af...

...

--Chris

...

linaro-kernel mailing list linaro-kernel@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-kernel

linaro-kernel mailing list linaro-kernel@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-kernel

-- http://www.linaro.org/ Linaro.org │ Open source software for ARM SoCs Follow Linaro: http://www.facebook.com/pages/Linaro Facebook | http://twitter.com/#!/linaroorg Twitter | http://www.linaro.org/linaro-blog/ Blog

Alex Shi

11:01 a.m.

...

...
Hi Alex,

I've had a better look - the per-cpu driver interface doesn't use locking at all so any potential deadlock will of course not occur.

The reason I used cpuidle_driver_ref was to ensure that driver data did not go away while I was reading it, as there isn't any notification for driver changes.

Hi,

theoretically nothing prevent the user to unload the module. In practice, that does not happen because the drivers are usually compiled in. But I wouldn't rely on that.

I saw a call chain: cpu_idle_loop -> cpu_idle_poll -> cpuidle_get_cpu_driver() So looks nohz is don't worry on this point too.

...

Also, in the code path, the different idle states are inspected to find the one fitting the ns_delay. On some architecture (eg. x86), one or several idle states can be removed at runtime if they switch to battery<->AC [1]

yes, it is a another concern on other architecture. but since hmp is just on arm now...

...

But the bug you are facing is probably coming from the same cpu is taking the same lock twice because of an interrupt.

yes, the patch is ok for this scenario.

-- Thanks Alex

Naresh Kamboju

26 Jun 26 Jun

9:34 a.m.

On 19 June 2014 11:47, Alex Shi alex.shi@linaro.org wrote:

...

On 06/18/2014 09:18 PM, Jon Medhurst (Tixy) wrote:

...
On Wed, 2014-06-18 at 20:50 +0800, Alex Shi wrote:

...
when lock checking option enabled in kernel config, TC2 catch a deadlock bug as following,

This was reported last month by Naresh and was discussed again when I found the problem when testing this month's HMP patches from ARM, so this is somewhat old news. (You we're on CC to all those emails too)

Yes, but is a very severe bug for HMP. It could bring much trouble to LSK user. And it looks not too hard to be resolved in time. So I try to raise this issue again. :)

...
...
It looks hmp_idle_pull --> hmp_keepalive_delay --> cpuidle_driver_ref interrupt into cpuidle_register_driver.

Since what we needs of cpuidle_driver in hmp_keepalive_delay are exit_latency/target_residency, they should be fixed for this board.

So, could we avoid the cpuidle_driver_ref calling by giving a fixed value?

Where do we get the fixed value from? These are properties of cpuidle, and as cpuidle is not integrated into the scheduler, asking the cpuidle driver for them seems like the reasonable thing to do. Though if they are fixed and not dependent on current state, then perhaps the idle driver can be hacked to call into the scheduler at probe time to provide the values for the scheduler to stash somewhere

Chris did say he was going to look at a solution, I don't know what his thoughts are.

To avoid the lock using, could we try per_cpu cpuidle driver?

Naresh, could you like to test this patch?

From c938115997899014148b401ab91ce67fb94a0f51 Mon Sep 17 00:00:00 2001 From: Alex Shi alex.shi@linaro.org Date: Thu, 19 Jun 2014 13:47:35 +0800 Subject: [PATCH] HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

Using per cpu cpuidle driver to fix deadlock in hmp_idle_pull. Otherwise a deadlock happened when do bl_idle_init.

[ 113.878664] other info that might help us debug this: [ 113.878667] Possible unsafe locking scenario: [ 113.878667] [ 113.878670] CPU0 [ 113.878673] ---- [ 113.878681] lock(cpuidle_driver_lock); [ 113.878684] <Interrupt> [ 113.878691] lock(cpuidle_driver_lock); [ 113.878693] [ 113.878693] *** DEADLOCK *** [ 113.878693] [ 113.878697] 1 lock held by ksoftirqd/4/28: [ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508

Signed-off-by: Alex Shi alex.shi@linaro.org

kernel/sched/fair.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 71da724..25c6162 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3587,10 +3587,12 @@ static enum hrtimer_restart hmp_cpu_keepalive_notify(struct hrtimer *hrtimer)

If there are any, set ns_delay to

('target_residency of state with shortest too-big latency' - 1) * 1000.

*/ -static void hmp_keepalive_delay(unsigned int *ns_delay) +static void hmp_keepalive_delay(int cpu, unsigned int *ns_delay) {
  struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);
  struct cpuidle_driver *drv;
  drv = cpuidle_driver_ref();
  drv = cpuidle_get_cpu_driver(dev);
  if (drv) {
          unsigned int us_delay = UINT_MAX;
          unsigned int us_max_delay = *ns_delay / 1000;
@@ -3623,7 +3625,7 @@ static void hmp_cpu_keepalive_trigger(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); keepalive->timer.function = hmp_cpu_keepalive_notify;
          hmp_keepalive_delay(&ns_delay);
          hmp_keepalive_delay(cpu, &ns_delay);
          keepalive->delay = ns_to_ktime(ns_delay);
          keepalive->init = true;
  }
-- 1.9.1

After applying above patch. Boot tested on vexpress-tc2 and noticed that this deadlock bug stopped reporting.

Boot logs attached: - deadlock-on-hmp-with-alex-fix-patch-on-tc2.log - deadlock-on-hmp-without-alex-fix-patch-on-tc2.log

Bug updated: https://bugs.launchpad.net/linaro-stable-kernel/+bug/1330392

-Naresh

Alex Shi

9:37 a.m.

Thanks for testing, Naresh!

于 6/26/14, 17:34, Naresh Kamboju 写道:

...

...
...
After applying above patch. Boot tested on vexpress-tc2 and noticed that this deadlock bug stopped reporting.

Boot logs attached:

deadlock-on-hmp-with-alex-fix-patch-on-tc2.log

deadlock-on-hmp-without-alex-fix-patch-on-tc2.log

Bug updated: https://bugs.launchpad.net/linaro-stable-kernel/+bug/1330392

Chris Redpath

10:13 a.m.

On 26/06/14 10:37, Alex Shi wrote:

...

Thanks for testing, Naresh!

于 6/26/14, 17:34, Naresh Kamboju 写道:

...
...
...
After applying above patch. Boot tested on vexpress-tc2 and noticed that this deadlock bug stopped reporting.

Boot logs attached:

deadlock-on-hmp-with-alex-fix-patch-on-tc2.log

deadlock-on-hmp-without-alex-fix-patch-on-tc2.log

Bug updated: https://bugs.launchpad.net/linaro-stable-kernel/+bug/1330392

Yes, the warning goes away but what you've done is to just ignore locking this resource. It's not guaranteed by anything in CPUidle that the driver won't be changed while we are doing this. On all our platforms, the cpuidle driver is built-in rather than loaded as a module so it's extremely unlikely to happen (I think it's actually impossible) but I don't think we should write kernel code that works due to circumstance - we should be honoring the contract for accessing this resource. We didn't have the warning checks working properly in our integrated test and I didn't see this before but the root cause is that I should not be taking that lock in potentially re-entrant code without guarding it.

To properly fix this specific implementation will probably require hacking a new API into the cpuidle framework (something like try_cpuidle_driver_ref) which will make future backporting a wee bit more complicated but not a lot. The patch would have to be added to the big.LITTLE MP patch set.

Perhaps we could also add a lock into the timer hack which we can trylock before attempting to access the cpuidle driver, this would at least stop us deadlocking ourselves.

An alternative might be to re-engineer the solution to provide a low-resume-latency request interface which the scheduler can use instead of a timer hack to keep it alive - the existing QoS interface isn't suitable because it's global and wakes all the CPUs up to get the new latency requirement in effect. That would not strictly be a big.LITTLE feature, but as its the only client it would likely need to be carried around in the same patch set.

There is already a suitable state disable API in cpuidle, but determining which states to disable still requires reading the exit latency which is only available in this driver information.

In summary, I don't like this fix although I accept that it works and it'll continue to work for TC2 and any other platform where the idle driver cannot be changed at runtime.

What do you think?

--Chris

Mark Brown

12:09 p.m.

On 26 June 2014 11:13, Chris Redpath Chris.Redpath@arm.com wrote:

An alternative might be to re-engineer the solution to provide a

...

low-resume-latency request interface which the scheduler can use instead of a timer hack to keep it alive - the existing QoS interface isn't suitable because it's global and wakes all the CPUs up to get the new latency requirement in effect. That would not strictly be a big.LITTLE feature, but as its the only client it would likely need to be carried around in the same patch set.

There is already a suitable state disable API in cpuidle, but determining which states to disable still requires reading the exit latency which is only available in this driver information.

In summary, I don't like this fix although I accept that it works and it'll continue to work for TC2 and any other platform where the idle driver cannot be changed at runtime.

What do you think?

...

From my perspective the request interface seems prettiest, though as you

say it's a bit invasive. I don't know if this is an interface that the upstream scheduler work is likely to use (ie, something that would ever get upstreamed), it'd be good if it was from the point of view of driver authors.

However as a matter of expediency just not locking seems viable; I would be surprised if anyone was doing anything which caused problems with that in practical big.LITTLE applications. Is it worth putting this in as a quick workaround if a proper fix isn't going to be available quickly?

Chris Redpath

2:55 p.m.

On 26/06/14 13:09, Mark Brown wrote:

...

On 26 June 2014 11:13, Chris Redpath <Chris.Redpath@arm.com mailto:Chris.Redpath@arm.com> wrote:
An alternative might be to re-engineer the solution to provide a
low-resume-latency request interface which the scheduler can use
instead of a timer hack to keep it alive - the existing QoS
interface isn't suitable because it's global and wakes all the CPUs
up to get the new latency requirement in effect. That would not
strictly be a big.LITTLE feature, but as its the only client it
would likely need to be carried around in the same patch set.

There is already a suitable state disable API in cpuidle, but
determining which states to disable still requires reading the exit
latency which is only available in this driver information.

In summary, I don't like this fix although I accept that it works
and it'll continue to work for TC2 and any other platform where the
idle driver cannot be changed at runtime.

What do you think?
From my perspective the request interface seems prettiest, though as you say it's a bit invasive. I don't know if this is an interface that the upstream scheduler work is likely to use (ie, something that would ever get upstreamed), it'd be good if it was from the point of view of driver authors.

However as a matter of expediency just not locking seems viable; I would be surprised if anyone was doing anything which caused problems with that in practical big.LITTLE applications. Is it worth putting this in as a quick workaround if a proper fix isn't going to be available quickly?

Hi Mark,

Just realised I didn't reply to you earlier. I agree about the expediency, I don't mind at all if we use Alex' workaround so long as we all know that it is a workaround and it depends on the driver not being modified at the wrong time.

--Chris

Alex Shi

12:57 p.m.

On 06/26/2014 06:13 PM, Chris Redpath wrote:> On 26/06/14 10:37, Alex Shi wrote:

...

...
Thanks for testing, Naresh!

于 6/26/14, 17:34, Naresh Kamboju 写道:

...
...
...
After applying above patch. Boot tested on vexpress-tc2 and noticed that this deadlock bug stopped reporting.

Boot logs attached:

deadlock-on-hmp-with-alex-fix-patch-on-tc2.log

deadlock-on-hmp-without-alex-fix-patch-on-tc2.log

Bug updated: https://bugs.launchpad.net/linaro-stable-kernel/+bug/1330392

Yes, the warning goes away but what you've done is to just ignore locking this resource. It's not guaranteed by anything in CPUidle that the driver won't be changed while we are doing this. On all our platforms, the cpuidle driver is built-in rather than loaded as a module so it's extremely unlikely to happen (I think it's actually impossible) but I don't think we should write kernel code that works due to circumstance - we should be honoring the contract for accessing this resource. We didn't have the warning checks working properly in our integrated test and I didn't see this before but the root cause is that I should not be taking that lock in potentially re-entrant code without guarding it.

To properly fix this specific implementation will probably require hacking a new API into the cpuidle framework (something like try_cpuidle_driver_ref) which will make future backporting a wee bit more complicated but not a lot. The patch would have to be added to the big.LITTLE MP patch set.

Perhaps we could also add a lock into the timer hack which we can trylock before attempting to access the cpuidle driver, this would at least stop us deadlocking ourselves.

An alternative might be to re-engineer the solution to provide a low-resume-latency request interface which the scheduler can use instead of a timer hack to keep it alive - the existing QoS interface isn't suitable because it's global and wakes all the CPUs up to get the new latency requirement in effect. That would not strictly be a big.LITTLE feature, but as its the only client it would likely need to be carried around in the same patch set.

There is already a suitable state disable API in cpuidle, but determining which states to disable still requires reading the exit latency which is only available in this driver information.

In summary, I don't like this fix although I accept that it works and it'll continue to work for TC2 and any other platform where the idle driver cannot be changed at runtime.

What do you think?

In today's 1:1, I just talked with Mark, it is not a upstream quality fix for this issue. and better solution would ask more cpuidle part coordinate. :) So, I thought it maybe accepted for a quick fix before a final solution coming out. So I added more comments in commit log and remove the 'cpuidle_driver_unref()' which omitted in v1 patch.

==============

...

From 49cdbf3b800933bfb25a49ef1cbad918a072574e Mon Sep 17 00:00:00 2001

From: Alex Shi alex.shi@linaro.org Date: Thu, 19 Jun 2014 13:47:35 +0800 Subject: [PATCH] HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

Using per cpu cpuidle driver to fix deadlock in hmp_idle_pull. Otherwise a deadlock happened when do bl_idle_init.

This patch is just a quick/cheap workaround for cpuidle_driver_lock deadlock. It works for TC2 and any other platform where the idle driver cannot be changed at runtime.

Signed-off-by: Alex Shi alex.shi@linaro.org --- kernel/sched/fair.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 71da724..0269273 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3587,10 +3587,12 @@ static enum hrtimer_restart hmp_cpu_keepalive_notify(struct hrtimer *hrtimer) * If there are any, set ns_delay to * ('target_residency of state with shortest too-big latency' - 1) * 1000. */ -static void hmp_keepalive_delay(unsigned int *ns_delay) +static void hmp_keepalive_delay(int cpu, unsigned int *ns_delay) { + struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu); struct cpuidle_driver *drv; - drv = cpuidle_driver_ref(); + + drv = cpuidle_get_cpu_driver(dev); if (drv) { unsigned int us_delay = UINT_MAX; unsigned int us_max_delay = *ns_delay / 1000; @@ -3609,7 +3611,6 @@ static void hmp_keepalive_delay(unsigned int *ns_delay) else *ns_delay = 1000 * (us_delay - 1); } - cpuidle_driver_unref(); }

static void hmp_cpu_keepalive_trigger(void) @@ -3623,7 +3624,7 @@ static void hmp_cpu_keepalive_trigger(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); keepalive->timer.function = hmp_cpu_keepalive_notify;

- hmp_keepalive_delay(&ns_delay); + hmp_keepalive_delay(cpu, &ns_delay); keepalive->delay = ns_to_ktime(ns_delay); keepalive->init = true; }

-- 1.9.1 -- Thanks Alex

Chris Redpath

1:38 p.m.

Hi Alex,

On 26/06/14 13:57, Alex Shi wrote:

...

On 06/26/2014 06:13 PM, Chris Redpath wrote:> On 26/06/14 10:37, Alex Shi wrote:

...
...
Thanks for testing, Naresh!

于 6/26/14, 17:34, Naresh Kamboju 写道:

...
...
...
After applying above patch. Boot tested on vexpress-tc2 and noticed that this deadlock bug stopped reporting.

Boot logs attached: - deadlock-on-hmp-with-alex-fix-patch-on-tc2.log - deadlock-on-hmp-without-alex-fix-patch-on-tc2.log

Bug updated: https://bugs.launchpad.net/linaro-stable-kernel/+bug/1330392

Yes, the warning goes away but what you've done is to just ignore locking this resource. It's not guaranteed by anything in CPUidle that the driver won't be changed while we are doing this. On all our platforms, the cpuidle driver is built-in rather than loaded as a module so it's extremely unlikely to happen (I think it's actually impossible) but I don't think we should write kernel code that works due to circumstance - we should be honoring the contract for accessing this resource. We didn't have the warning checks working properly in our integrated test and I didn't see this before but the root cause is that I should not be taking that lock in potentially re-entrant code without guarding it.

To properly fix this specific implementation will probably require hacking a new API into the cpuidle framework (something like try_cpuidle_driver_ref) which will make future backporting a wee bit more complicated but not a lot. The patch would have to be added to the big.LITTLE MP patch set.

Perhaps we could also add a lock into the timer hack which we can trylock before attempting to access the cpuidle driver, this would at least stop us deadlocking ourselves.

An alternative might be to re-engineer the solution to provide a low-resume-latency request interface which the scheduler can use instead of a timer hack to keep it alive - the existing QoS interface isn't suitable because it's global and wakes all the CPUs up to get the new latency requirement in effect. That would not strictly be a big.LITTLE feature, but as its the only client it would likely need to be carried around in the same patch set.

There is already a suitable state disable API in cpuidle, but determining which states to disable still requires reading the exit latency which is only available in this driver information.

In summary, I don't like this fix although I accept that it works and it'll continue to work for TC2 and any other platform where the idle driver cannot be changed at runtime.

What do you think?

In today's 1:1, I just talked with Mark, it is not a upstream quality fix for this issue. and better solution would ask more cpuidle part coordinate. :) So, I thought it maybe accepted for a quick fix before a final solution coming out. So I added more comments in commit log and remove the 'cpuidle_driver_unref()' which omitted in v1 patch.

============== From 49cdbf3b800933bfb25a49ef1cbad918a072574e Mon Sep 17 00:00:00 2001 From: Alex Shi alex.shi@linaro.org Date: Thu, 19 Jun 2014 13:47:35 +0800 Subject: [PATCH] HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

Using per cpu cpuidle driver to fix deadlock in hmp_idle_pull. Otherwise a deadlock happened when do bl_idle_init.

[ 113.878664] other info that might help us debug this: [ 113.878667] Possible unsafe locking scenario: [ 113.878667] [ 113.878670] CPU0 [ 113.878673] ---- [ 113.878681] lock(cpuidle_driver_lock); [ 113.878684] <Interrupt> [ 113.878691] lock(cpuidle_driver_lock); [ 113.878693] [ 113.878693] *** DEADLOCK *** [ 113.878693] [ 113.878697] 1 lock held by ksoftirqd/4/28: [ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508

This patch is just a quick/cheap workaround for cpuidle_driver_lock deadlock. It works for TC2 and any other platform where the idle driver cannot be changed at runtime.

Signed-off-by: Alex Shi alex.shi@linaro.org

kernel/sched/fair.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 71da724..0269273 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3587,10 +3587,12 @@ static enum hrtimer_restart hmp_cpu_keepalive_notify(struct hrtimer *hrtimer)

If there are any, set ns_delay to

('target_residency of state with shortest too-big latency' - 1) * 1000.

*/ -static void hmp_keepalive_delay(unsigned int *ns_delay) +static void hmp_keepalive_delay(int cpu, unsigned int *ns_delay) {

struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu); struct cpuidle_driver *drv;

drv = cpuidle_driver_ref();

drv = cpuidle_get_cpu_driver(dev); if (drv) { unsigned int us_delay = UINT_MAX; unsigned int us_max_delay = *ns_delay / 1000;

@@ -3609,7 +3611,6 @@ static void hmp_keepalive_delay(unsigned int *ns_delay) else *ns_delay = 1000 * (us_delay - 1); }

cpuidle_driver_unref(); }

static void hmp_cpu_keepalive_trigger(void)

@@ -3623,7 +3624,7 @@ static void hmp_cpu_keepalive_trigger(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); keepalive->timer.function = hmp_cpu_keepalive_notify;
hmp_keepalive_delay(&ns_delay);
hmp_keepalive_delay(cpu, &ns_delay);
keepalive->delay = ns_to_ktime(ns_delay); keepalive->init = true; }

That looks OK to me, I'm still thinking about replacing it but realistically this is the kind of thing scheduler-cpuidle integration will solve for us.

--Chris

Alex Shi

2:25 p.m.

On 06/26/2014 09:38 PM, Chris Redpath wrote:

...

...
That looks OK to me, I'm still thinking about replacing it but realistically this is the kind of thing scheduler-cpuidle integration will solve for us.

Definitely.

-- Thanks Alex

Naresh Kamboju

6:51 p.m.

Hi Alex,

On 26 June 2014 18:27, Alex Shi alex.shi@linaro.org wrote:

...

On 06/26/2014 06:13 PM, Chris Redpath wrote:> On 26/06/14 10:37, Alex Shi wrote:

...
...
Thanks for testing, Naresh!

于 6/26/14, 17:34, Naresh Kamboju 写道:

...
...
...
After applying above patch. Boot tested on vexpress-tc2 and noticed that this deadlock bug stopped reporting.

Boot logs attached:

deadlock-on-hmp-with-alex-fix-patch-on-tc2.log

deadlock-on-hmp-without-alex-fix-patch-on-tc2.log

Bug updated: https://bugs.launchpad.net/linaro-stable-kernel/+bug/1330392

Yes, the warning goes away but what you've done is to just ignore locking this resource. It's not guaranteed by anything in CPUidle that the driver won't be changed while we are doing this. On all our platforms, the cpuidle driver is built-in rather than loaded as a module so it's extremely unlikely to happen (I think it's actually impossible) but I don't think we should write kernel code that works due to circumstance - we should be honoring the contract for accessing this resource. We didn't have the warning checks working properly in our integrated test and I didn't see this before but the root cause is that I should not be taking that lock in potentially re-entrant code without guarding it.

To properly fix this specific implementation will probably require hacking a new API into the cpuidle framework (something like try_cpuidle_driver_ref) which will make future backporting a wee bit more complicated but not a lot. The patch would have to be added to the big.LITTLE MP patch set.

Perhaps we could also add a lock into the timer hack which we can trylock before attempting to access the cpuidle driver, this would at least stop us deadlocking ourselves.

An alternative might be to re-engineer the solution to provide a low-resume-latency request interface which the scheduler can use instead of a timer hack to keep it alive - the existing QoS interface isn't suitable because it's global and wakes all the CPUs up to get the new latency requirement in effect. That would not strictly be a big.LITTLE feature, but as its the only client it would likely need to be carried around in the same patch set.

There is already a suitable state disable API in cpuidle, but determining which states to disable still requires reading the exit latency which is only available in this driver information.

In summary, I don't like this fix although I accept that it works and it'll continue to work for TC2 and any other platform where the idle driver cannot be changed at runtime.

What do you think?

In today's 1:1, I just talked with Mark, it is not a upstream quality fix for this issue. and better solution would ask more cpuidle part coordinate. :) So, I thought it maybe accepted for a quick fix before a final solution coming out. So I added more comments in commit log and remove the 'cpuidle_driver_unref()' which omitted in v1 patch.

below patch applied and tested. And noticed that, did not found deadlock and warning. Boot log attached and pasted. https://pastebin.linaro.org/view/56e64bde

-Naresh Kamboju

...

============== From 49cdbf3b800933bfb25a49ef1cbad918a072574e Mon Sep 17 00:00:00 2001 From: Alex Shi alex.shi@linaro.org Date: Thu, 19 Jun 2014 13:47:35 +0800 Subject: [PATCH] HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

Using per cpu cpuidle driver to fix deadlock in hmp_idle_pull. Otherwise a deadlock happened when do bl_idle_init.

[ 113.878664] other info that might help us debug this: [ 113.878667] Possible unsafe locking scenario: [ 113.878667] [ 113.878670] CPU0 [ 113.878673] ---- [ 113.878681] lock(cpuidle_driver_lock); [ 113.878684] <Interrupt> [ 113.878691] lock(cpuidle_driver_lock); [ 113.878693] [ 113.878693] *** DEADLOCK *** [ 113.878693] [ 113.878697] 1 lock held by ksoftirqd/4/28: [ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508

This patch is just a quick/cheap workaround for cpuidle_driver_lock deadlock. It works for TC2 and any other platform where the idle driver cannot be changed at runtime.

Signed-off-by: Alex Shi alex.shi@linaro.org

kernel/sched/fair.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 71da724..0269273 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3587,10 +3587,12 @@ static enum hrtimer_restart hmp_cpu_keepalive_notify(struct hrtimer *hrtimer)

If there are any, set ns_delay to

('target_residency of state with shortest too-big latency' - 1) * 1000.

*/ -static void hmp_keepalive_delay(unsigned int *ns_delay) +static void hmp_keepalive_delay(int cpu, unsigned int *ns_delay) {
  struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);
  struct cpuidle_driver *drv;
  drv = cpuidle_driver_ref();
  drv = cpuidle_get_cpu_driver(dev);
  if (drv) {
          unsigned int us_delay = UINT_MAX;
          unsigned int us_max_delay = *ns_delay / 1000;
@@ -3609,7 +3611,6 @@ static void hmp_keepalive_delay(unsigned int *ns_delay) else *ns_delay = 1000 * (us_delay - 1); }
  cpuidle_driver_unref();
}

static void hmp_cpu_keepalive_trigger(void) @@ -3623,7 +3624,7 @@ static void hmp_cpu_keepalive_trigger(void) CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); keepalive->timer.function = hmp_cpu_keepalive_notify;
          hmp_keepalive_delay(&ns_delay);
          hmp_keepalive_delay(cpu, &ns_delay);
          keepalive->delay = ns_to_ktime(ns_delay);
          keepalive->init = true;
  }
-- 1.9.1

-- Thanks Alex

Alex Shi

27 Jun 27 Jun

1:06 a.m.

On 06/27/2014 02:51 AM, Naresh Kamboju wrote:

...

...
...
...
...
What do you think?

In today's 1:1, I just talked with Mark, it is not a upstream quality fix for this issue. and better solution would ask more cpuidle part coordinate. :) So, I thought it maybe accepted for a quick fix before a final solution coming out. So I added more comments in commit log and remove the 'cpuidle_driver_unref()' which omitted in v1 patch.

below patch applied and tested. And noticed that, did not found deadlock and warning. Boot log attached and pasted. https://pastebin.linaro.org/view/56e64bde

-Naresh Kamboju

thanks Naresh!

Tixy, Will you pick up this patch and then we merge from your tree?

-- Thanks Alex

Jon Medhurst (Tixy)

8:44 a.m.

On Fri, 2014-06-27 at 09:06 +0800, Alex Shi wrote:

...

On 06/27/2014 02:51 AM, Naresh Kamboju wrote:

...
...
...
...
...
What do you think?

In today's 1:1, I just talked with Mark, it is not a upstream quality fix for this issue. and better solution would ask more cpuidle part coordinate. :) So, I thought it maybe accepted for a quick fix before a final solution coming out. So I added more comments in commit log and remove the 'cpuidle_driver_unref()' which omitted in v1 patch.

below patch applied and tested. And noticed that, did not found deadlock and warning. Boot log attached and pasted. https://pastebin.linaro.org/view/56e64bde

-Naresh Kamboju

thanks Naresh!

Tixy, Will you pick up this patch and then we merge from your tree?

Chris, Mark, you both seemed to agree with this approach in your other replies, but it would be good to get confirmation that the specific patch is what you were agreeing with, i.e. the one at the end of: http://lists.linaro.org/pipermail/linaro-kernel/2014-June/015860.html

If so, I'll add it to my MP tree so LSK can pull it.

-- Tixy

Chris Redpath

9:10 a.m.

On 27/06/14 09:44, Jon Medhurst (Tixy) wrote:

...

On Fri, 2014-06-27 at 09:06 +0800, Alex Shi wrote:

...
On 06/27/2014 02:51 AM, Naresh Kamboju wrote:

...
...
...
...
> What do you think?

In today's 1:1, I just talked with Mark, it is not a upstream quality fix for this issue. and better solution would ask more cpuidle part coordinate. :) So, I thought it maybe accepted for a quick fix before a final solution coming out. So I added more comments in commit log and remove the 'cpuidle_driver_unref()' which omitted in v1 patch.

below patch applied and tested. And noticed that, did not found deadlock and warning. Boot log attached and pasted. https://pastebin.linaro.org/view/56e64bde

-Naresh Kamboju

thanks Naresh!

Tixy, Will you pick up this patch and then we merge from your tree?

Chris, Mark, you both seemed to agree with this approach in your other replies, but it would be good to get confirmation that the specific patch is what you were agreeing with, i.e. the one at the end of: http://lists.linaro.org/pipermail/linaro-kernel/2014-June/015860.html

If so, I'll add it to my MP tree so LSK can pull it.

I'm happy for you to go ahead with the patch below, which is the one you referenced at http://lists.linaro.org/pipermail/linaro-kernel/2014-June/015860.html

The first one hadn't removed the driver deref which was incorrect.

--Chris

From: Alex Shi <alex.shi at linaro.org> Date: Thu, 19 Jun 2014 13:47:35 +0800 Subject: [PATCH] HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

Using per cpu cpuidle driver to fix deadlock in hmp_idle_pull. Otherwise a deadlock happened when do bl_idle_init.

This patch is just a quick/cheap workaround for cpuidle_driver_lock deadlock. It works for TC2 and any other platform where the idle driver cannot be changed at runtime.

Signed-off-by: Alex Shi <alex.shi at linaro.org> --- kernel/sched/fair.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)

- hmp_keepalive_delay(&ns_delay); + hmp_keepalive_delay(cpu, &ns_delay); keepalive->delay = ns_to_ktime(ns_delay); keepalive->init = true; }

-- 1.9.1

Jon Medhurst (Tixy)

9:26 a.m.

On Fri, 2014-06-27 at 10:10 +0100, Chris Redpath wrote:

...

On 27/06/14 09:44, Jon Medhurst (Tixy) wrote:

...
On Fri, 2014-06-27 at 09:06 +0800, Alex Shi wrote:

[...]

...

...
...
Tixy, Will you pick up this patch and then we merge from your tree?

Chris, Mark, you both seemed to agree with this approach in your other replies, but it would be good to get confirmation that the specific patch is what you were agreeing with, i.e. the one at the end of: http://lists.linaro.org/pipermail/linaro-kernel/2014-June/015860.html

If so, I'll add it to my MP tree so LSK can pull it.

I'm happy for you to go ahead with the patch below, which is the one you referenced at http://lists.linaro.org/pipermail/linaro-kernel/2014-June/015860.html

The first one hadn't removed the driver deref which was incorrect.

Thanks Chris.

Mark, Alex, I've added the patch to my MP tree and you can pull it from the usual place...

The following changes since commit 4378062f289e67259f017f6b176ee385dc974836:

sched: hmp: fix out-of-range CPU possible (2014-06-11 15:08:35 +0100)

are available in the git repository at:

git://git.linaro.org/arm/big.LITTLE/mp.git for-lsk

for you to fetch changes up to 65abdc9b50378783981ed2f3453a0aae090404e4:

HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull (2014-06-27 10:18:30 +0100)

---------------------------------------------------------------- Alex Shi (1): HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

kernel/sched/fair.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)

Mark Brown

10:06 a.m.

On 27 June 2014 10:26, Jon Medhurst (Tixy) tixy@linaro.org wrote:

...

are available in the git repository at:

git://git.linaro.org/arm/big.LITTLE/mp.git for-lsk

for you to fetch changes up to 65abdc9b50378783981ed2f3453a0aae090404e4:

HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull (2014-06-27 10:18:30 +0100)

Alex Shi (1): HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull

kernel/sched/fair.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)

Pulled & pushed now thanks.

Viresh Kumar

18 Jun 18 Jun

1:21 p.m.

On 18 June 2014 18:20, Alex Shi alex.shi@linaro.org wrote:

...

when lock checking option enabled in kernel config, TC2 catch a deadlock bug as following,

Hey, I just bought a HMP phone: Note 3 Neo. Is it going to blow up anytime ? :)

Chris Redpath

1:29 p.m.

On 18/06/14 14:21, Viresh Kumar wrote:

...

On 18 June 2014 18:20, Alex Shi alex.shi@linaro.org wrote:

...
when lock checking option enabled in kernel config, TC2 catch a deadlock bug as following,

Hey, I just bought a HMP phone: Note 3 Neo. Is it going to blow up anytime ? :)

Hey Viresh,

I don't think you'll have the problematic idle-pull-replacing-forced-up-migration patches, so probably not.

I don't want to hardcode the values, but I'm not averse to adding a callback from cpuidle once a driver is configured.

--Chris

4033

days inactive

4042

days old

linaro-kernel@lists.linaro.org

24 comments

participants

tags (0)

participants (7)

Alex Shi
Chris Redpath
Daniel Lezcano
Jon Medhurst (Tixy)
Mark Brown
Naresh Kamboju
Viresh Kumar