On Wed, 2014-08-13 at 10:09 +0100, Chris Redpath wrote:
On 12/08/14 17:26, Jon Medhurst (Tixy) wrote:
On Tue, 2014-08-12 at 14:50 +0100, Chris Redpath wrote:
Frequently in HMP, the big CPUs are only active with one task per CPU and there may be idle CPUs in the big cluster. This patch avoids triggering an idle balance in situations where none of the active CPUs in the current HMP domain have > 1 tasks running.
When packing is enabled, only enforce this behaviour when we are not in the smallest domain - there we idle balance whenever a CPU is over the up_threshold regardless of tasks in case one needs to be moved.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
This looks sane to me, though I have one comment about the implementation, see inline comment below.
 kernel/sched/fair.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 90c8a81..41d0cbd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6537,16 +6537,16 @@ static int nohz_test_cpu(int cpu)
 /*
  * Decide if the tasks on the busy CPUs in the
  * littlest domain would benefit from an idle balance
  */
-static int hmp_packing_ilb_needed(int cpu)
+static int hmp_packing_ilb_needed(int cpu, int ilb_needed)
 {
 	struct hmp_domain *hmp;
-	/* always allow ilb on non-slowest domain */
+	/* allow previous decision on non-slowest domain */
 	if (!hmp_cpu_is_slowest(cpu))
-		return 1;
+		return ilb_needed;

 	/* if disabled, use normal ILB behaviour */
 	if (!hmp_packing_enabled)
-		return 1;
+		return ilb_needed;

 	hmp = hmp_cpu_domain(cpu);
 	for_each_cpu_and(cpu, &hmp->cpus, nohz.idle_cpus_mask) {
@@ -6558,19 +6558,34 @@ static int hmp_packing_ilb_needed(int cpu)
 }
 #endif

+DEFINE_PER_CPU(cpumask_var_t, ilb_tmpmask);
+
 static inline int find_new_ilb(int call_cpu)
 {
 	int ilb = cpumask_first(nohz.idle_cpus_mask);
 #ifdef CONFIG_SCHED_HMP
-	int ilb_needed = 1;
+	int ilb_needed = 0;
+	int cpu;
+	struct cpumask *tmp = per_cpu(ilb_tmpmask, smp_processor_id());
Why do we need a per-cpu static variable ilb_tmpmask? It seems to be used only once, here in this function, so could we not instead just have a local, stack-based temporary variable like:
struct cpumask tmp;
or have I missed something?
We could do that, but this is called during sched tick so I wanted to avoid creating the cpumask on the stack. Do you think that's a reasonable thing to do or do you think it'd be just as quick as calling smp_processor_id()?
Well, that depends on whether you think HMP will be used on an enormous supercluster with gazillions of cores. I thought it was just being used for mobile devices running Android? In which case, presumably those kernels are compiled with NR_CPUS <= 32, a cpumask will be a single 32-bit word, and it probably won't use any extra stack compared to what you have. Then again, in the scheme of things it probably doesn't make much difference, so you might as well stick with what you have.
The per-cpu variable stems from having a cached mask and not wanting to share it between CPUs whose ticks may run concurrently.
You say 'cached' mask, which makes me think the value is being saved for use later, but it's not, is it? It's a temporary value (the name 'tmp' is a clue) that is calculated in find_new_ilb() and used immediately, and nothing else ever looks at the calculated value again (it will get overwritten by the next call to find_new_ilb() by that cpu).
Anyway, I can see nothing wrong with the patch, and if no-one else raises issues we can ask for it to be added to LSK once ARM have finished their testing and give me the go-ahead...