On 29/08/17 23:49, Thara Gopinath wrote:
On 06/30/2017 06:30 AM, Dietmar Eggemann wrote:

Hi Dietmar,

Sorry for the delay in posting the reply.
On 06/23/2017 03:37 PM, Thara Gopinath wrote:
[...]
This implementation can still hit corner scenarios with respect to misfit tasks. For example, consider a sched group with n CPUs and n+1 tasks, each ~70% utilized. Ideally this is a case where load balancing should happen.
I mentioned in my v2 review (Feb 2017) that this is only valid for #cpu in a cluster >=7.
Yes. But don't you think it is still worth mentioning?
I agree that it is still worth mentioning your example, but you should add that it is only valid for #cpus in the cluster >= 7.
[...]
Do you have test results for this patch running on a big.LITTLE platform? It would be really helpful to have, let's say, a set of synthetic (rt-app based) workloads like:
(1) Create an over-utilized scenario for one CPU
(2) Create an over-utilized scenario for one cluster
(3) Create a misfit scenario
(4) Create a combined (over-utilized and misfit) scenario
(5) Create an anti-test pattern (where the over-utilization is not detectable)
...
With a couple of extra trace events (some of which we already have in our product kernel, e.g. indicating over-utilized/not over-utilized sections), you could compare the test results between the vanilla eas/next integration (20170522_1454) and the one with your patch, in terms of how much time the system/a cluster was over-utilized or not.
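For (1), something along these lines could be a starting point. This is only a rough rt-app sketch from memory (the field names like "instance", "run", "timer", "cpus" and the exact units should be double-checked against the rt-app documentation); two ~70% periodic tasks pinned to CPU 0 should over-utilize that single CPU:

```json
{
	"global" : {
		"duration" : 10,
		"default_policy" : "SCHED_OTHER"
	},
	"tasks" : {
		"thread0" : {
			"instance" : 2,
			"loop" : -1,
			"cpus" : [0],
			"run" : 14000,
			"timer" : { "ref" : "tick", "period" : 20000 }
		}
	}
}
```

run/period = 14000/20000 usec gives a ~70% duty cycle per task, so the two instances together demand ~140% of one CPU.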
I guess these results are necessary before we can take the next step, i.e. integrating this patch into the EAS product kernel.
I am working on the test cases. I will not be posting another version of this patch till we align on the testing, test cases and results.
Agreed. It would be helpful to post these test results with the new patch version then.
[...]
@@ -4800,6 +4823,7 @@ static void
 enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 {
 	struct cfs_rq *cfs_rq;
+	struct sched_domain *sd;
 	struct sched_entity *se = &p->se;
 	int task_new = !(flags & ENQUEUE_WAKEUP);

@@ -4843,9 +4867,12 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	if (!se) {
 		add_nr_running(rq, 1);
-		if (!task_new && !rq->rd->overutilized &&
-		    cpu_overutilized(rq->cpu))
-			rq->rd->overutilized = true;
+		rcu_read_lock();
+		sd = rcu_dereference(rq->sd);
+		if (!task_new && !is_sd_overutilized(sd) &&
Already raised in v2: IMHO, this is problematic. is_sd_overutilized() returns false if the sd is not overutilized or sd == NULL.
Shouldn't we change all of these occurrences where we fetch the sd from rcu_dereference() to:
if (... && sd && !is_sd_overutilized(sd) && ...)
I prefer the check being in a single place. But you are right, there is a problem in the is_sd_overutilized() function. I can change this function from bool to int with multiple return values. We are only interested in !is_sd_overutilized(). So I can return -1 in case of !sd.
As long as you can distinguish between !sd and an sd which is not over-utilized, that's fine with me.
I do not see the problem with set_sd_overutilized and clear_sd_overutilized.
There is no issue with these functions. I just meant that if you test for sd in the condition, you can get rid of the if (sd) check in those functions.
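A minimal sketch of the tri-state variant discussed above (the struct layout and SD_OVERUTILIZED flag here are stand-ins for illustration, not the patch's actual definitions):

```c
#include <stddef.h>

/* Stand-in for the relevant sched_domain bits; illustration only. */
struct sched_domain {
	unsigned int flags;
};
#define SD_OVERUTILIZED	0x1

/*
 * Tri-state: -1 if there is no sd, otherwise 0/1 for not/over-utilized.
 * Callers that only test !is_sd_overutilized(sd) keep working: !(-1)
 * evaluates to 0, so a NULL sd no longer looks "not over-utilized".
 */
static int is_sd_overutilized(struct sched_domain *sd)
{
	if (!sd)
		return -1;
	return !!(sd->flags & SD_OVERUTILIZED);
}
```

The alternative from the review, checking sd explicitly at the call site (if (... && sd && !is_sd_overutilized(sd) && ...)), avoids touching the helper at the cost of repeating the NULL check at each caller.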
[...]