Dear Patrick,
We agree with your suggestion. We are already working on this implementation and aim to complete it by 8/27.
-----Original Message-----
From: Patrick Bellasi [mailto:patrick.bellasi@arm.com]
Sent: 24 August 2015 23:08
To: Jon Medhurst (Tixy)
Cc: Rahul Khandelwal (Rahul Khandelwal); vincent.guittot@linaro.org; Chris Redpath; Morten Rasmussen; Alex.Shi@linaro.org; khilman@linaro.org; linaro-kernel@lists.linaro.org; Sanjeev Yadav (Sanjeev Kumar Yadav); Gaurav Jindal (Gaurav Jindal)
Subject: Re: [PATCH 1/1] HMP: Do not send IPI if core already waked up
On Wed, Aug 19, 2015 at 01:27:27PM +0100, Jon Medhurst (Tixy) wrote:
Adding Patrick Bellasi to the 'to' list as he's been working on HMP with Chris Redpath....
On Wed, 2015-08-19 at 05:18 +0000, Rahul Khandelwal (Rahul Khandelwal) wrote:
Dear All,
Please consider this patch. It is related to HMP force up migration. It avoids sending unnecessary interrupts to CPUs in the faster domain and hence increases performance.
----------------------------
From 2d48749ac30a2c0a2ef77132f303d69605c3dd3f Mon Sep 17 00:00:00 2001
From: rahulkhandelwal rahul.khandelwal@spreadtrum.com
Date: Fri, 14 Aug 2015 16:36:17 +0800
Subject: [PATCH 1/1] HMP: Do not send IPI if core already waked up
It is possible that we send an IPI to a CPU in the faster domain which has already been woken up by another CPU in the slower domain.
HMP selects the idle CPU using hmp_domain_min_load and, based on that, sends an IPI to the idle CPU in the faster domain. The core may take some time to wake up and set wake_for_idle_pull = 0. In the meantime, the next slower CPU checks again for an idle CPU in the faster domain and sends an IPI to the CPU that has already been woken up.
For example: in an octa-core system, CPUs 0-3 are the slower CPUs and CPUs 4-7 are the faster CPUs. CPU0 and CPU1 have heavy tasks and CPU4 is idle. CPU0 executes hmp_force_up_migration, finds CPU4 idle, sends an IPI to CPU4 and returns. After that, CPU1 gets the chance to run hmp_force_up_migration; it again finds CPU4 idle and sends an IPI to CPU4, which is not required.
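To make the scenario above concrete, here is a minimal user-space sketch, not kernel code. It assumes a toy model in which the target selection always returns CPU4 and the IPI is represented by a counter; `pick_idle_big_cpu`, `force_up_migration` and `run_scenario` are hypothetical names used only for illustration. Only `wake_for_idle_pull` is taken from the patch, and locking is left out entirely.

```c
#include <stdbool.h>

#define NR_CPUS 8
#define FIRST_BIG_CPU 4

/* Toy stand-ins for the per-runqueue flag and for IPI delivery. */
static int wake_for_idle_pull[NR_CPUS];
static int ipis_sent[NR_CPUS];

/* Hypothetical helper: CPU4 is idle and is always the chosen big CPU. */
static int pick_idle_big_cpu(void)
{
	return FIRST_BIG_CPU;
}

/*
 * Models one little CPU running the force-up-migration path.
 * With check_flag set, the IPI is skipped if the target was
 * already notified. Returns true if an IPI was sent.
 */
static bool force_up_migration(bool check_flag)
{
	int target = pick_idle_big_cpu();

	if (check_flag && wake_for_idle_pull[target] == 1)
		return false;	/* target already notified: no second IPI */

	wake_for_idle_pull[target] = 1;
	ipis_sent[target]++;	/* stands in for sending the IPI */
	return true;
}

/* Runs the CPU0-then-CPU1 sequence; returns the IPIs sent to CPU4. */
int run_scenario(bool check_flag)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		wake_for_idle_pull[cpu] = 0;
		ipis_sent[cpu] = 0;
	}
	force_up_migration(check_flag);	/* CPU0 */
	force_up_migration(check_flag);	/* CPU1, before CPU4 clears the flag */
	return ipis_sent[FIRST_BIG_CPU];
}
```

In this model `run_scenario(false)` counts two IPIs to CPU4, while `run_scenario(true)` counts one, which is the saving the patch is after.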
This patch pinpoints another potential optimization. Not only is this IPI unnecessary but, in the end, once CPU4 finally wakes up, it will pull just one of the two big tasks which are currently running on the little domain.
Thus, if we really want to increase performance, we should also try to select a better big CPU to wake up. For example, in the described scenario, we should send the IPI to CPU5, or to the next idle big core, if any.
Signed-off-by: rahulkhandelwal rahul.khandelwal@spreadtrum.com
 kernel/sched/fair.c | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1baf641..388836c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7200,6 +7200,11 @@ static void hmp_force_up_migration(int this_cpu)
 		}
 		p = task_of(curr);
 		if (hmp_up_migration(cpu, &target_cpu, curr)) {
+			if (cpu_rq(target_cpu)->wake_for_idle_pull == 1) {
We should try to move this check into hmp_up_migration and return a valid target CPU only if we have verified that this CPU has not yet been notified and is thus a suitable up migration target.
This update should have two benefits:
1. send IPIs only to big CPUs which have not yet been notified
2. reduce the latency of moving multiple newly big tasks to big CPUs.
The impact from an energy standpoint remains to be verified. Since the cluster will be on because one of the big CPUs is already going to be woken up, an (unlikely) false wakeup of a second big CPU should not have a big effect.
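The suggestion above, doing the check during target selection so that a second little CPU gets a different big CPU, could look roughly like the following user-space sketch. It is only an illustration under stated assumptions: `struct toy_rq`, `select_up_migration_target` and `reset_big_cpus_idle` are hypothetical names, the real hmp_up_migration also considers load and affinity, and the locking around the flag is omitted.

```c
#include <stdbool.h>

#define NR_CPUS 8
#define FIRST_BIG_CPU 4
#define NO_TARGET (-1)

/* Toy runqueue: just the two bits of state this sketch needs. */
struct toy_rq {
	bool idle;
	int wake_for_idle_pull;
};

struct toy_rq toy_rqs[NR_CPUS];

/* Marks every big CPU as idle and not yet notified. */
void reset_big_cpus_idle(void)
{
	for (int cpu = FIRST_BIG_CPU; cpu < NR_CPUS; cpu++) {
		toy_rqs[cpu].idle = true;
		toy_rqs[cpu].wake_for_idle_pull = 0;
	}
}

/*
 * Returns the first idle big CPU that has not been notified yet and
 * marks it as notified, or NO_TARGET if every idle big CPU already
 * has an IPI pending.
 */
int select_up_migration_target(void)
{
	for (int cpu = FIRST_BIG_CPU; cpu < NR_CPUS; cpu++) {
		if (!toy_rqs[cpu].idle)
			continue;	/* busy: not an idle-pull target */
		if (toy_rqs[cpu].wake_for_idle_pull)
			continue;	/* already notified: try the next one */
		toy_rqs[cpu].wake_for_idle_pull = 1;
		return cpu;
	}
	return NO_TARGET;
}
```

With all big CPUs idle, two consecutive calls return CPU4 and then CPU5, so each of the two heavy tasks gets its own big CPU instead of both little CPUs targeting CPU4.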
+				raw_spin_unlock_irqrestore(&target->lock, flags);
+				spin_unlock(&hmp_force_migration);
+				return;
+			}
 			cpu_rq(target_cpu)->wake_for_idle_pull = 1;
 			raw_spin_unlock_irqrestore(&target->lock, flags);
 			spin_unlock(&hmp_force_migration);
1.7.9.5
-- #include <best/regards.h>
Patrick Bellasi