This patch series optimizes both power and performance on big.LITTLE
systems. Most of the optimization methodology is the same as in the
RFCv2 version, so please refer to the detailed description in [1].
In this patch series, the new performance enhancement is to spread
tasks across more clusters when the highest-capacity cores are detected
to be busy. The criterion for "big core busy" is simply that the big
core is not idle. This works because the patch "sched/fair: avoid small
task to migrate to higher capacity CPU" filters migrations so that only
relatively big tasks move to a big core; so if a task is running on a
big core, that core's utilization is not small. If all big cores have
tasks running, the system is usually quite busy, so we fall back to
selecting the idlest CPU instead of taking the "want_affine" path. This
is mainly done by the patches "sched/fair: avoid migrate single task to
busy big CPU" and "sched/fair: select idle CPUs when big cluster is
busy".
This patch series also optimizes power for both PELT and WALT signals;
this is done by the patch "sched/fair: save power for when use walt
signals".
[1] https://lists.linaro.org/pipermail/eas-dev/2016-July/000522.html
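
As a rough illustration of the "big core busy" criterion described above,
the following standalone C sketch treats a big core as busy whenever it is
not idle, and falls back to an idlest-CPU search only when every big core
is busy. The struct, enum and function names are hypothetical and are not
taken from the patches; the real logic lives in kernel/sched/fair.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified per-CPU view; not a kernel structure. */
struct cpu_info {
	bool is_big;	/* belongs to the highest-capacity cluster */
	bool is_idle;	/* no task is currently running */
};

/* Hypothetical names for the two wake-up placement strategies. */
enum wake_path { WAKE_AFFINE, WAKE_IDLEST };

/* True only if at least one big core exists and none of them is idle. */
static bool big_cluster_busy(const struct cpu_info *cpus, int nr_cpus)
{
	int nr_big = 0;

	assert(cpus && nr_cpus > 0);
	for (int i = 0; i < nr_cpus; i++) {
		if (!cpus[i].is_big)
			continue;
		if (cpus[i].is_idle)
			return false;	/* an idle big core can still take work */
		nr_big++;
	}
	return nr_big > 0;
}

/* Fall back to the idlest-CPU search only when the big cluster is busy. */
static enum wake_path select_wake_path(const struct cpu_info *cpus, int nr_cpus)
{
	return big_cluster_busy(cpus, nr_cpus) ? WAKE_IDLEST : WAKE_AFFINE;
}
```

In this sketch a single idle big core is enough to keep the affine path,
which mirrors the "not idle" criterion rather than any utilization
threshold.
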
Leo Yan (18):
sched/fair: optimize to more chance to select previous CPU
sched/fair: select CPU based on using lowest capacity
sched/fair: support to spread task in lowest schedule domain
sched/fair: add path to migrate to higher capacity CPU
sched/fair: force idle balance when busiest group is overloaded
sched/fair: avoid small task to migrate to higher capacity CPU
sched/fair: set imbalance for too many tasks on rq
sched/fair: kick nohz idle balance for misfit task
sched/fair: consider over utilized only for CPU is not idle
sched/fair: filter task for energy aware path
sched/fair: replace capacity_of by capacity_orig_of
sched/fair: refine when task is allowed only run one CPU
Documentation: EAS performance tunning for sysfs
sched/fair: avoid migrate single task to busy big CPU
sched/fair: fix building error for schedtune_task_margin
sched/fair: save power for when use walt signals
sched/fair: check task boosted value on destination CPU
sched/fair: select idle CPUs when big cluster is busy
Documentation/scheduler/sched-energy.txt | 87 ++++++++++
kernel/sched/fair.c | 286 ++++++++++++++++++++++++++++---
2 files changed, 348 insertions(+), 25 deletions(-)
--
1.9.1
This patch series implements an alternative window assisted load tracking
mechanism in lieu of PELT based cpu utilization tracking. Testing has
shown that a window based non-decaying metric such as WALT guiding cpu
frequency and task placement decisions can improve performance/power
especially when running workloads more commonly found on mobile devices.
The aim of this series is to incorporate WALT accounting into the
scheduler and feed WALT statistics to schedutil in order to guide cpu
frequency selection. The implementation is detailed in the commit text
of Patch 1. The eventual goal is to also guide placement decisions
based on WALT statistics.
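
To make the contrast with PELT concrete, below is a minimal sketch of a
window-based, non-decaying busy-time metric. It only illustrates the
general idea: the 20 ms window, the five-window history, the
max-over-history policy and all identifiers are assumptions made for this
example, not the accounting implemented in Patch 1.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_NS	20000000ULL	/* assumed window length: 20 ms */
#define NR_HIST		5		/* assumed number of past windows kept */

struct win_stats {
	uint64_t window_start;	/* start of the current window, in ns */
	uint64_t curr_busy;	/* busy ns accumulated in the current window */
	uint64_t hist[NR_HIST];	/* busy ns of completed windows, newest first */
};

/* Close every window that has fully elapsed by 'now'. */
static void roll_windows(struct win_stats *ws, uint64_t now)
{
	assert(now >= ws->window_start);
	while (now - ws->window_start >= WINDOW_NS) {
		for (int i = NR_HIST - 1; i > 0; i--)
			ws->hist[i] = ws->hist[i - 1];
		ws->hist[0] = ws->curr_busy;
		ws->curr_busy = 0;
		ws->window_start += WINDOW_NS;
	}
}

/* Account a busy interval [start, end), splitting it at window boundaries. */
static void account_busy(struct win_stats *ws, uint64_t start, uint64_t end)
{
	assert(start <= end);
	while (start < end) {
		roll_windows(ws, start);
		uint64_t win_end = ws->window_start + WINDOW_NS;
		uint64_t chunk_end = end < win_end ? end : win_end;

		ws->curr_busy += chunk_end - start;
		start = chunk_end;
	}
}

/* Demand on a 0..1024 scale: the busiest of the recorded windows. */
static unsigned int window_demand(const struct win_stats *ws)
{
	uint64_t max = 0;

	for (int i = 0; i < NR_HIST; i++)
		if (ws->hist[i] > max)
			max = ws->hist[i];
	return (unsigned int)(max * 1024 / WINDOW_NS);
}
```

In this sketch a fully busy window keeps the demand at its peak until that
window drops out of the history; nothing decays geometrically in between,
which is the property the cover letter contrasts with PELT.
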
WALT has existed in out-of-tree kernels for ARM/ARM64 commercialized
devices for a few years. This is an effort to bring WALT to mainline
as well as to test on multiple architectures and with varied workloads.
This RFC version is mainly to preview what the code will look like on
mainline. Future RFC revisions will include a theoretical discussion and
benchmark results.
Tested on an Intel x86_64 machine (on top of 4.7-rc6). (Benchmark
results will be sent out separately and as part of this message in the
next RFC version).
Patch 1: Adds WALT tracking to the scheduler
Patches 2-3: Temporary patches to bring in EAS/sched-freq like capacity
table and to use Intel PMC counters for more accurate
frequency invariant load tracking on X86. Included for
completeness but not meant for merging.
include/linux/sched.h | 35 ++++++++++
include/linux/sched/sysctl.h | 2 +
include/trace/events/sched.h | 76 +++++++++++++++++++++
init/Kconfig | 9 +++
kernel/sched/Makefile | 1 +
kernel/sched/core.c | 29 ++++++++-
kernel/sched/cpufreq_schedutil.c | 44 ++++++++++++-
kernel/sched/cputime.c | 11 +++-
kernel/sched/debug.c | 10 +++
kernel/sched/fair.c | 7 +-
kernel/sched/sched.h | 13 ++++
kernel/sched/walt.c | 580 ++++++++++++++++++++++++++++++++++
kernel/sched/walt.h | 75 +++++++++++++++++++++
kernel/sysctl.c | 18 +++++
14 files changed, 904 insertions(+), 6 deletions(-)
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
This patch series optimizes performance and refines the patches
according to review comments.
- Patch 0001 gives a task more chance to select its previous CPU, to
benefit from a hot cache;
- In the EAS code, the critical path is task wake-up, handled by the
function energy_aware_wake_cpu(); its purpose is to select the
candidate target CPU that saves the most energy. It therefore covers
two underlying functions: the first is to select the most
power-efficient CPU for the task within one cluster; the other is to
migrate the task from a big core to a little core if the little core
can meet the performance requirement.
For the first function, selecting the most power-efficient CPU within
a cluster, EAS prefers a non-idle CPU, and as a result it packs tasks
onto one CPU as far as possible. This is not optimal for two reasons:
first, it introduces long scheduling latency once multiple tasks are
on the same rq; second, it easily ends up packing small tasks onto one
CPU running at a higher operating point. This is the foremost issue we
observed: when there are multiple tasks, neither power nor performance
reaches an optimal result.
So patch 0002 addresses this issue by trying to select a CPU that can
stay at the lowest possible OPP.
- The current code has no mechanism to spread tasks throughout the
little cluster, so tasks are packed on one CPU as long as that CPU is
not "over-utilized". In this case, one CPU is very busy while the other
CPUs in the same cluster are idle.
Patch 0003 spreads tasks in the lowest schedule domain (at cluster
level) by adding an intermediate state named "half-utilized". This may
be a temporary solution, since a better solution is likely to unify it
with the "over-utilized" flag.
- In CFS, PELT signals take a long time to ramp up to a high value and
to decay to a small value; on the other hand, EAS does not take the
load_avg value (runnable time) into account and only focuses on the
util_avg value (running time). So these issues really depend on the
fundamental signals.
So we hope to have a more advanced method to accelerate PELT signals
and remove the issue introduced by long runnable time. Patch 0004 can
be taken as a temporary solution: when there is a big difference
between load_avg and util_avg, it switches to the inflated value, which
also reflects runnable time.
Patch 0004 also has a side effect on the misfit flag. If any CPU has a
"misfit" task on it, EAS sets the imbalance value to the CPU
capacity and migrates that load from the little core to a big core. So
"misfit" works well when there is only one big task on the
little CPU and the CPU cannot meet the task's performance requirement
as checked by "task_fits_max(p, rq->cpu)"; but if there are two
tasks on the little CPU, each task's utilization is only about
half of the CPU capacity, so EAS concludes that the CPU can meet
the tasks' requirements. With patch 0004, misfit is set to true more
easily: rq->misfit_task = !task_fits_max(p, rq->cpu)
- In energy_aware_wake_cpu(), it is possible to migrate a task directly
from a little core to a big core, but the conditions are rigid:
condition 1 is that the CPU capacity cannot meet the task's
requirement; condition 2 is that the source CPU is "over-utilized". If
condition 2 does not hold, then even though the little CPU cannot meet
the task's requirement, EAS compares CPU energy and in the end still
selects the previous little CPU.
Patch 0005 adds an extra path to migrate a task directly from a little
core to a big core.
- For very heavy multi-threaded workloads, we observed that tasks are
not migrated within the big cluster, and that tasks are hard to migrate
from the big cluster to the little cluster even when the little cluster
has idle CPUs available. So EAS needs to be optimized to handle this
case, most likely by falling back to CFS behaviour.
Patches 0006 and 0008 fix these related issues.
- SMP load balance may migrate a small task onto a big core, although at
that point we are usually only expecting big tasks to migrate; this
hurts both power and performance. So patch 0007 prevents small tasks
from migrating to a higher capacity CPU, which gives real big tasks
more chance to migrate to a higher capacity CPU.
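
The misfit limitation described for patch 0004 can be reproduced with a
small standalone sketch. task_fits() below is a hypothetical stand-in for
a task_fits_max()-style check; the 1280/1024 margin and the capacity value
used in the note that follows are assumptions for illustration, not values
taken from this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed headroom factor: a task must stay below 1024/1280 (80%) of capacity. */
#define CAPACITY_MARGIN	1280UL

/* Hypothetical stand-in for a task_fits_max()-style check. */
static bool task_fits(unsigned long task_util, unsigned long cpu_capacity)
{
	assert(cpu_capacity > 0);
	return cpu_capacity * 1024UL > task_util * CAPACITY_MARGIN;
}

/* A run queue is flagged misfit when the checked task does not fit. */
static bool rq_misfit(unsigned long task_util, unsigned long cpu_capacity)
{
	return !task_fits(task_util, cpu_capacity);
}
```

With an assumed little-CPU capacity of 446, a single task with utilization
440 is flagged as misfit, but two tasks sharing that CPU at about 220 each
both pass the check even though the CPU is saturated. That is the case a
purely util_avg-based check misses.
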
Leo Yan (8):
sched/fair: optimize to more chance to select previous CPU
sched/fair: select CPU based on using lowest capacity
sched/fair: support to spread task in lowest schedule domain
sched/fair: use load metrics to replace util when have big difference
sched/fair: add path to migrate to higher capacity CPU
sched/fair: force idle balance when busiest group is overloaded
sched/fair: avoid small task to migrate to higher capacity CPU
sched/fair: set imbalance for too many tasks on rq
kernel/sched/fair.c | 193 ++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 173 insertions(+), 20 deletions(-)
--
1.9.1
This patch series optimizes performance.
Patch 0001 optimizes the CPU selection flow so that a task has more
chance to stay on its previous CPU. Patch 0002 is actually a big change
to EAS's CPU selection policy: it tries to select an idle CPU whenever
possible. Profiling results show that 0002 is effective at spreading
tasks out when many tasks are running at the same time.
Patches 0003~0004 optimize the single-thread scenario.
In this case, the thread has a relatively high utilization value, but the
value cannot easily cross the tipping point. So patch 0004 sets a
criterion under which, in some conditions, load_avg is used rather than
util_avg to boost the single thread.
Patch 0005 optimizes the flow for spreading tasks within the big
cluster.
Patches 0006~0007 fix the avg_load signal.
Leo Yan (8):
sched/fair: optimize to more chance to select previous CPU
sched/fair: select idle CPU for waken up task
sched/fair: add path to migrate to higher capacity CPU
sched/fair: use load to replace util when have big difference
sched/fair: spread tasks in cluster when over tipping point
sched/fair: correct avg_load as CPU average load
sched/fair: fix to calculate average load cross cluster
sched/fair: set imbn to 1 for too many tasks on rq
include/linux/sched.h | 1 +
kernel/sched/fair.c | 93 +++++++++++++++++++++++++++++++++++++++++++++------
2 files changed, 84 insertions(+), 10 deletions(-)
--
1.9.1
Hi Patrick,
[ + eas-dev ]
I have a general question about how to define the schedTune threshold
array for payoff. Basically I want to check the questions below:
- Every CGroup has its own perf_boost_idx for the PB region and
perf_constrain_idx for the PC region. Do you have any suggestions or
guidelines for defining these indexes?
And for different CGroups such as "background", "foreground" or
"performance", will every CGroup have its own dedicated index, or can
the platform share the same index value?
- How should the array values for "threshold_gains" be defined?
IIUC this array is platform dependent, but what is a
reasonable method to generate this table? Is there any suggested
testing for generating it?
Or is my understanding wrong and this array is fixed, so that just
adjusting perf_boost_idx/perf_constrain_idx for the platform is enough?
- So far we cannot set these payoff parameters (including
perf_boost_idx/perf_constrain_idx and threshold_gains) dynamically from
sysfs, so how can we initialize these values for a specific platform?
I suppose for now we can only set these values during the kernel's
init flow, right?
Thanks,
Leo Yan
In the current code, the CPU's idle state cpufreq_pstates::idle is
initialized to '-1'; only when the first "cpu_idle" event for the CPU is
parsed is the idle state set to '0' or '1', corresponding to active or
idle. This causes an error in the P-state statistics: from the beginning
of the trace to the first "cpu_idle" event, the CPU's idle state is '-1',
so cpu_change_pstate() always treats the call as the first update and
discards the time accumulated so far.
This introduces a very big error if the CPU is always running and never
enters an idle state. So this patch fixes the issue by initializing each
CPU's corresponding C-state and P-state:
- First, gather every CPU's starting frequency and time stamp;
- Then gather each CPU's idle state according to its first cpu_idle log:
If the CPU's first cpu_idle state is '-1', the CPU has been in an idle
state from the beginning;
If the CPU's first cpu_idle state is any other value, the CPU is active.
- With this info, finally initialize every CPU's C-state and P-state
before analysing the trace logs.
One thing to note: when a CPU is idle at the beginning, we don't know
its exact idle state, so we just assume the CPU is in idle state 0. This
does not affect the statistics much, because idlestat usually wakes up
all CPUs at the beginning, so it introduces only a very small deviation.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
tracefile_idlestat.c | 123 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 123 insertions(+)
diff --git a/tracefile_idlestat.c b/tracefile_idlestat.c
index 3430693..d0cd366 100644
--- a/tracefile_idlestat.c
+++ b/tracefile_idlestat.c
@@ -152,6 +152,127 @@ int load_text_data_line(char *buffer, struct cpuidle_datas *datas, char *format,
return get_wakeup_irq(datas, buffer);
}
+
+/**
+ * init_cpu_idle_state - Init CPU's idle state according to first cpu_idle log.
+ * If a CPU's first cpu_idle event has state '-1', the CPU has been in an idle
+ * state from the beginning; otherwise the CPU is active.
+ * So initialize the per-CPU idle flag to get more accurate time.
+ *
+ * @datas: structure for P-state and C-state's statistics
+ * @f: the file handle of the idlestat trace file
+ */
+void init_cpu_idle_state(struct cpuidle_datas *datas, FILE *f)
+{
+ char buffer[BUFSIZE];
+ int state, cpu;
+ double time;
+ struct cpufreq_pstates *ps;
+
+ unsigned long *cpu_start_idle;
+ int *cpu_start_freq;
+ double cpu_start_time;
+
+ fseek(f, 0, SEEK_SET);
+
+ cpu_start_freq = malloc(sizeof(int) * datas->nrcpus);
+ for (cpu = 0; cpu < datas->nrcpus; cpu++)
+ cpu_start_freq[cpu] = 0xdeadbeef;
+
+ /*
+ * Find the start time stamp and the CPU's frequency at beginning;
+ * So we can use these info to add dummy info.
+ */
+ while (fgets(buffer, BUFSIZE, f)) {
+
+ if (strstr(buffer, "cpu_frequency")) {
+ if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu)
+ != 3) {
+ fprintf(stderr, "warning: Unrecognized cpuidle "
+ "record. The result of analysis might "
+ "be wrong.\n");
+ return;
+ }
+ } else
+ continue;
+
+ if (cpu_start_freq[cpu] != 0xdeadbeef)
+ continue;
+
+ if (cpu == 0)
+ cpu_start_time = time;
+
+ cpu_start_freq[cpu] = state;
+
+ break;
+ }
+
+ /* After traverse file, reset offset */
+ fseek(f, 0, SEEK_SET);
+
+ /*
+ * Find the CPU's idle state at beginning
+ */
+ cpu_start_idle = malloc(sizeof(long) * datas->nrcpus);
+ for (cpu = 0; cpu < datas->nrcpus; cpu++)
+ cpu_start_idle[cpu] = 0xdeadbeef;
+
+ while (fgets(buffer, BUFSIZE, f)) {
+
+ if (strstr(buffer, "cpu_idle")) {
+ if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu)
+ != 3) {
+ fprintf(stderr, "warning: Unrecognized cpuidle "
+ "record. The result of analysis might "
+ "be wrong.\n");
+ return;
+ }
+ } else
+ continue;
+
+ /* CPU's state has been initialized, skip it */
+ if (cpu_start_idle[cpu] != 0xdeadbeef)
+ continue;
+
+ /*
+ * If the CPU's first cpu_idle state is '-1', the CPU stays in an
+ * idle state and does not exit it until the first cpu_idle event.
+ * Otherwise, the CPU is active at the beginning.
+ */
+ if (state == -1)
+ cpu_start_idle[cpu] = 0;
+ else
+ cpu_start_idle[cpu] = 4294967295;
+ }
+
+ /* After traverse file, reset offset */
+ fseek(f, 0, SEEK_SET);
+
+ /* Initialize every CPU's cstate and pstate */
+ for (cpu = 0; cpu < datas->nrcpus; cpu++) {
+
+ ps = &(datas->pstates[cpu]);
+
+ if (cpu_start_idle[cpu] == 0) {
+ /*
+ * CPU is idle at beginning, init cstate;
+ *
+ * we don't know the exact idle state here, so just assume
+ * the CPU is in idle state 0; this will not impact the
+ * statistics much, because idlestat usually wakes up all
+ * CPUs at the beginning.
+ */
+ ps->idle = 1;
+ store_data(cpu_start_time, 0, cpu, datas);
+ } else {
+ /* CPU is busy at beginning, init pstate */
+ ps->idle = 0;
+ cpu_change_pstate(datas, cpu, cpu_start_freq[cpu],
+ cpu_start_time);
+ }
+ }
+}
+
void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
{
double begin = 0, end = 0;
@@ -159,6 +280,8 @@ void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
setup_topo_states(datas);
+ init_cpu_idle_state(datas, f);
+
do {
if (load_text_data_line(buffer, datas, TRACE_FORMAT,
&begin, &end, &start) != -1) {
--
1.9.1
In the current code, the CPU's idle state cpufreq_pstates::idle is
initialized to '-1'; only when the first "cpu_idle" event for the CPU is
parsed is the idle state set to '0' or '1', corresponding to active or
idle. This causes an error in the P-state statistics: from the beginning
of the trace to the first "cpu_idle" event, the CPU's idle state is '-1',
so cpu_change_pstate() always treats the call as the first update and
discards the time accumulated so far.
This introduces a very big error if the CPU is always running and never
enters an idle state. So this patch fixes the issue by initializing each
CPU's corresponding C-state and P-state:
- First, gather every CPU's starting frequency and time stamp;
- Then gather each CPU's idle state according to its first cpu_idle log:
If the CPU's first cpu_idle state is '-1', the CPU has been in an idle
state from the beginning;
If the CPU's first cpu_idle state is any other value, the CPU is active.
- With this info, finally initialize every CPU's C-state and P-state
before analysing the trace logs.
One thing to note: when a CPU is idle at the beginning, we don't know
its exact idle state, so we just assume the CPU is in idle state 0. This
does not affect the statistics much, because idlestat usually wakes up
all CPUs at the beginning, so it introduces only a very small deviation.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
tracefile_idlestat.c | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 152 insertions(+)
diff --git a/tracefile_idlestat.c b/tracefile_idlestat.c
index 3430693..2674478 100644
--- a/tracefile_idlestat.c
+++ b/tracefile_idlestat.c
@@ -152,6 +152,153 @@ int load_text_data_line(char *buffer, struct cpuidle_datas *datas, char *format,
return get_wakeup_irq(datas, buffer);
}
+
+/**
+ * init_cpu_idle_state - Init CPU's idle state according to first cpu_idle log.
+ * If a CPU's first cpu_idle event has state '-1', the CPU has been in an idle
+ * state from the beginning; otherwise the CPU is active.
+ * So initialize the per-CPU idle flag to get more accurate time.
+ *
+ * @datas: structure for P-state and C-state's statistics
+ * @f: the file handle of the idlestat trace file
+ */
+int init_cpu_idle_state(struct cpuidle_datas *datas, FILE *f)
+{
+ char buffer[BUFSIZE];
+ int state, cpu;
+ double time;
+ struct cpufreq_pstates *ps;
+
+ unsigned long *cpu_start_idle;
+ int *cpu_start_freq;
+ double cpu_start_time;
+
+ int ret;
+
+ ret = fseek(f, 0, SEEK_SET);
+ if (ret < 0) {
+ fprintf(stderr, "failed to set the start file position\n");
+ return ret;
+ }
+
+ cpu_start_freq = malloc(sizeof(int) * datas->nrcpus);
+ if (!cpu_start_freq) {
+ fprintf(stderr, "failed to alloc for start frequency states\n");
+ return -1;
+ }
+
+ for (cpu = 0; cpu < datas->nrcpus; cpu++)
+ cpu_start_freq[cpu] = 0xdeadbeef;
+
+ /*
+ * Find the start time stamp and the CPU's frequency at beginning;
+ * So we can use these info to add dummy info.
+ */
+ while (fgets(buffer, BUFSIZE, f)) {
+
+ if (strstr(buffer, "cpu_frequency")) {
+ if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu)
+ != 3) {
+ fprintf(stderr, "warning: Unrecognized cpuidle "
+ "record. The result of analysis might "
+ "be wrong.\n");
+ return -1;
+ }
+ } else
+ continue;
+
+ if (cpu_start_freq[cpu] != 0xdeadbeef)
+ continue;
+
+ if (cpu == 0)
+ cpu_start_time = time;
+
+ cpu_start_freq[cpu] = state;
+
+ break;
+ }
+
+ /* After traverse file, reset offset */
+ ret = fseek(f, 0, SEEK_SET);
+ if (ret < 0) {
+ fprintf(stderr, "failed to set the start file position\n");
+ return ret;
+ }
+
+ /*
+ * Find the CPU's idle state at beginning
+ */
+ cpu_start_idle = malloc(sizeof(long) * datas->nrcpus);
+ if (!cpu_start_idle) {
+ fprintf(stderr, "failed to alloc for start idle states\n");
+ return -1;
+ }
+
+ for (cpu = 0; cpu < datas->nrcpus; cpu++)
+ cpu_start_idle[cpu] = 0xdeadbeef;
+
+ while (fgets(buffer, BUFSIZE, f)) {
+
+ if (strstr(buffer, "cpu_idle")) {
+ if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu)
+ != 3) {
+ fprintf(stderr, "warning: Unrecognized cpuidle "
+ "record. The result of analysis might "
+ "be wrong.\n");
+ return -1;
+ }
+ } else
+ continue;
+
+ /* CPU's state has been initialized, skip it */
+ if (cpu_start_idle[cpu] != 0xdeadbeef)
+ continue;
+
+ /*
+ * If the CPU's first cpu_idle state is '-1', the CPU stays in an
+ * idle state and does not exit it until the first cpu_idle event.
+ * Otherwise, the CPU is active at the beginning.
+ */
+ if (state == -1)
+ cpu_start_idle[cpu] = 0;
+ else
+ cpu_start_idle[cpu] = 4294967295;
+ }
+
+ /* After traverse file, reset offset */
+ ret = fseek(f, 0, SEEK_SET);
+ if (ret < 0) {
+ fprintf(stderr, "failed to set the start file position\n");
+ return ret;
+ }
+
+ /* Initialize every CPU's cstate and pstate */
+ for (cpu = 0; cpu < datas->nrcpus; cpu++) {
+
+ ps = &(datas->pstates[cpu]);
+
+ if (cpu_start_idle[cpu] == 0) {
+ /*
+ * CPU is idle at beginning, init cstate;
+ *
+ * we don't know the exact idle state here, so just assume
+ * the CPU is in idle state 0; this will not impact the
+ * statistics much, because idlestat usually wakes up all
+ * CPUs at the beginning.
+ */
+ ps->idle = 1;
+ store_data(cpu_start_time, 0, cpu, datas);
+ } else {
+ /* CPU is busy at beginning, init pstate */
+ ps->idle = 0;
+ cpu_change_pstate(datas, cpu, cpu_start_freq[cpu],
+ cpu_start_time);
+ }
+ }
+
+ return 0;
+}
+
void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
{
double begin = 0, end = 0;
@@ -159,6 +306,11 @@ void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
setup_topo_states(datas);
+ if (init_cpu_idle_state(datas, f) < 0) {
+ fprintf(stderr, "failed to initialize cpu states\n");
+ exit(-1);
+ }
+
do {
if (load_text_data_line(buffer, datas, TRACE_FORMAT,
&begin, &end, &start) != -1) {
--
1.9.1
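
The "first cpu_idle event per CPU" rule from the commit message above can
be sketched in isolation as follows. The trace line layout used here
("<time> cpu_idle state=<n> cpu_id=<n>") is a simplified, hypothetical
format rather than idlestat's TRACE_FORMAT, and the enum and function names
are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-CPU starting state; not an idlestat type. */
enum start_state { START_UNKNOWN = 0, START_IDLE, START_ACTIVE };

/*
 * Scan trace lines once and record, for each CPU, what its first cpu_idle
 * event implies: state -1 is an idle exit, so the CPU started out idle;
 * any other value is an idle entry, so the CPU started out active.
 * Returns the number of CPUs whose starting state was determined.
 */
static int init_start_states(const char *const *lines, int nr_lines,
			     enum start_state *start, int nr_cpus)
{
	int found = 0;

	assert(lines && start && nr_cpus > 0);
	for (int cpu = 0; cpu < nr_cpus; cpu++)
		start[cpu] = START_UNKNOWN;

	for (int i = 0; i < nr_lines && found < nr_cpus; i++) {
		double time;
		int state, cpu;

		/* Lines for other events fail to match and are skipped. */
		if (sscanf(lines[i], "%lf cpu_idle state=%d cpu_id=%d",
			   &time, &state, &cpu) != 3)
			continue;
		/* Only the first event of each valid CPU counts. */
		if (cpu < 0 || cpu >= nr_cpus || start[cpu] != START_UNKNOWN)
			continue;

		start[cpu] = (state == -1) ? START_IDLE : START_ACTIVE;
		found++;
	}
	return found;
}
```

Unlike the patch, this sketch works on an in-memory array of lines, stops
as soon as every CPU has been classified, and leaves CPUs with no cpu_idle
event marked as unknown so the caller can decide how to treat them.
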
If we enable CONFIG_SCHED_TUNE without CONFIG_CGROUPS we get the following
errors:
kernel/sched/fair.c: In function 'energy_diff_evaluate':
kernel/sched/fair.c:4795:2: error: implicit declaration of function 'schedtune_normalize_energy' [-Werror=implicit-function-declaration]
nrg_delta = schedtune_normalize_energy(eenv->nrg.diff);
^
kernel/sched/fair.c:4798:2: error: implicit declaration of function 'schedtune_accept_deltas' [-Werror=implicit-function-declaration]
eenv->payoff = schedtune_accept_deltas(
^
Fix this by making sure the dummy versions of these functions are
defined if the real ones aren't.
Signed-off-by: Jon Medhurst <tixy(a)linaro.org>
---
This is another build fix for the 3.18 backport of EAS [1] but I'm not
sure if the missing functions are actually meant to do something in the
case when we have SCHED_TUNE without CGROUPS?
[1] http://git.linaro.org/arm/eas/kernel.git/shortlog/refs/heads/linux-3.18-eas…
kernel/sched/tune.h | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/kernel/sched/tune.h b/kernel/sched/tune.h
index da1f7b2..3410a1d 100644
--- a/kernel/sched/tune.h
+++ b/kernel/sched/tune.h
@@ -1,6 +1,3 @@
-
-#ifdef CONFIG_SCHED_TUNE
-
#ifdef CONFIG_CGROUP_SCHEDTUNE
int schedtune_cpu_boost(int cpu);
@@ -13,14 +10,7 @@ int schedtune_normalize_energy(int energy);
int schedtune_accept_deltas(int nrg_delta, int cap_delta,
struct task_struct *task);
-#else /* CONFIG_CGROUP_SCHEDTUNE */
-
-#define schedtune_enqueue_task(task, cpu) do { } while (0)
-#define schedtune_dequeue_task(task, cpu) do { } while (0)
-
-#endif /* CONFIG_CGROUP_SCHEDTUNE */
-
-#else /* CONFIG_SCHED_TUNE */
+#else
#define schedtune_enqueue_task(task, cpu) do { } while (0)
#define schedtune_dequeue_task(task, cpu) do { } while (0)
@@ -28,4 +18,4 @@ int schedtune_accept_deltas(int nrg_delta, int cap_delta,
#define schedtune_normalize_energy(energy) energy
#define schedtune_accept_deltas(nrg_delta, cap_delta, task) nrg_delta
-#endif /* CONFIG_SCHED_TUNE */
+#endif
--
2.1.4
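
The stub pattern this fix relies on can be shown in isolation. In the
sketch below, FEATURE_TUNE is a hypothetical stand-in for
CONFIG_CGROUP_SCHEDTUNE and the tune_*() names are invented for the
example; when the feature is compiled out, callers still build because the
macros pass their first argument straight through.

```c
/*
 * FEATURE_TUNE is a hypothetical stand-in for CONFIG_CGROUP_SCHEDTUNE;
 * the tune_*() names are likewise made up for this example.
 */
#ifdef FEATURE_TUNE
int tune_normalize_energy(int energy);
int tune_accept_deltas(int nrg_delta, int cap_delta);
#else
/* Stubs: callers compile unchanged, the energy value passes through. */
#define tune_normalize_energy(energy)			(energy)
#define tune_accept_deltas(nrg_delta, cap_delta)	(nrg_delta)
#endif

/* A caller that never needs its own #ifdef. */
static int evaluate_energy_diff(int nrg_diff, int cap_diff)
{
	int nrg_delta = tune_normalize_energy(nrg_diff);

	return tune_accept_deltas(nrg_delta, cap_diff);
}
```

The sketch parenthesizes the macro bodies, which the stubs in the patch do
not; that only matters when a stub is used inside a larger expression.
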
Hi
I found some bugs when integrating the 3.18 EAS backport [1] into
the LSK 3.18 based kernel I look after for ARM's Juno and Versatile
Express boards. These patches are my fixes for those bugs; I don't know
whether they are useful or relevant to other versions of EAS. If they
are OK, I guess I should at least add them to [1]?
----------------------------------------------------------------
Jon Medhurst (3):
arm: Fix build error "conflicting types for 'scale_cpu_capacity'"
arm: Fix #if/#ifdef mixup in topology.c
sched/tune: Avoid null pointer dereference in schedtune_add_cluster_nrg
arch/arm/include/asm/topology.h | 1 +
arch/arm/kernel/topology.c | 2 +-
kernel/sched/tune.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
[1] http://git.linaro.org/arm/eas/kernel.git/shortlog/refs/heads/linux-3.18-eas…
On 2016-05-26 17:44, eas-dev-request(a)lists.linaro.org wrote:
>
> Today's Topics:
>
> 1. Re: [PATCH] sched/tune: fix payoff calculation for
> performance constraint region (Patrick Bellasi)
> 2. Re: [PATCH] sched/tune: fix payoff calculation for
> performance constraint region (Leo Yan)
> 3. Re: [PATCH] sched/tune: fix payoff calculation for
> performance constraint region (Patrick Bellasi)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 25 May 2016 18:53:28 +0100
> From: Patrick Bellasi <patrick.bellasi(a)arm.com>
> To: Leo Yan <leo.yan(a)linaro.org>
> Cc: eas-dev(a)lists.linaro.org
> Subject: Re: [Eas-dev] [PATCH] sched/tune: fix payoff calculation for
> performance constraint region
> Message-ID: <20160525175328.GA15730@e105326-lin>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Leo,
> thanks a lot for reviewing SchedTune and pointing out this issue.
>
> Actually, going through the code I've noticed another big issue
> related to the definition of the acceptable regions, which I comment
> on inline below. Basically, with the current implementation
> we were getting a correct C region only "by chance", while the
> acceptance for the B region was completely wrong.
>
> Attached is a new version of the patch; please have a look and let
> me know if you have any doubts and/or suggestions.
> If the patch is ok for you, lemme also know if it's ok for you to add
> your sign-off.
>
> Cheers Patrick
>
> On 23-May 21:47, Leo Yan wrote:
>> On Fri, May 20, 2016 at 12:24:49AM +0800, Leo Yan wrote:
>>> When calculating the payoff criteria for the performance constraint
>>> region, the inequality formula is wrong:
>>>
>>> cap_delta / nrg_delta > cap_gain / nrg_gain
>>>
>>> Here nrg_delta < 0, so multiplying both sides by nrg_delta
>>> inverts the inequality:
>>>
>>> nrg_delta * cap_gain > cap_delta * nrg_gain
>>>
>>> So finally we get a unified formula for both the performance constraint
>>> region and the performance boost region. This patch unifies the
>>> calculation after fixing the inequality formula.
>>>
>>> Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
>>> ---
>>> kernel/sched/tune.c | 54 ++++++++++++++++++++++++++++-------------------------
>>> 1 file changed, 29 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/kernel/sched/tune.c b/kernel/sched/tune.c
>>> index 9d8eeb4..1da85a8 100644
>>> --- a/kernel/sched/tune.c
>>> +++ b/kernel/sched/tune.c
>>> @@ -58,36 +58,40 @@ __schedtune_accept_deltas(int nrg_delta, int cap_delta,
>>> int perf_boost_idx, int perf_constrain_idx)
>>> {
>>> int payoff = -INT_MAX;
>>> + int idx = -1;
>>>
>>> /* Performance Boost (B) region */
>>> - if (nrg_delta > 0 && cap_delta > 0) {
>>> - /*
>>> - * Evaluate "Performance Boost" vs "Energy Increase"
>>> - * payoff criteria:
>>> - * cap_delta / nrg_delta < cap_gain / nrg_gain
>>> - * which is:
>>> - * nrg_delta * cap_gain > cap_delta * nrg_gain
>>> - */
>>> - payoff = nrg_delta * threshold_gains[perf_boost_idx].cap_gain;
>>> - payoff -= cap_delta * threshold_gains[perf_boost_idx].nrg_gain;
>>> - return payoff;
>>> - }
>>> -
>>> + if (nrg_delta > 0 && cap_delta > 0)
>
> Looking more closely at the caller, I think it's worth also accepting
> points on the P axis, thus:
>
> if (nrg_delta >= 0 && cap_delta > 0)
>
>
>>> + idx = perf_boost_idx;
>>> /* Performance Constraint (C) region */
>>> - if (nrg_delta < 0 && cap_delta < 0) {
>>> - /*
>>> - * Evaluate "Performance Boost" vs "Energy Increase"
>>> - * payoff criteria:
>>> - * cap_delta / nrg_delta > cap_gain / nrg_gain
>>> - * which is:
>>> - * cap_delta * nrg_gain > nrg_delta * cap_gain
>>> - */
>>> - payoff = cap_delta * threshold_gains[perf_constrain_idx].nrg_gain;
>>> - payoff -= nrg_delta * threshold_gains[perf_constrain_idx].cap_gain;
>>> - return payoff;
>>> - }
>>> + else if (nrg_delta < 0 && cap_delta < 0)
>
> For the same reason we should also accept points on the E
> axis, thus:
>
> else if (nrg_delta < 0 && cap_delta <= 0)
>
>>> + idx = perf_constrain_idx;
>>>
>>> /* Default: reject schedule candidate */
>>> + if (idx == -1)
>>> + return payoff;
>>> +
>>> + /*
>>> + * Evaluate "Performance Boost" vs "Energy Increase"
>>> + *
>>> + * - Performance Boost (B) region
>>> + *
>>> + * Condition: nrg_delta > 0 && cap_delta > 0
>>> + * Payoff criteria:
>>> + * cap_delta / nrg_delta < cap_gain / nrg_gain =
>
It would be better to handle the condition "nrg_delta == 0" or
"cap_delta == 0" in the function "accept_deltas", to avoid taking the
RCU lock and making further function calls, thus:
/* Optimal (O) region */
if ((nrg_delta < 0 && cap_delta >= 0) || (nrg_delta <= 0 && cap_delta > 0)) {
	trace_sched_tune_filter(nrg_delta, cap_delta, 0, 0, 1, 0);
	return INT_MAX;
}
> Here the inequality has the wrong direction.
> The schedule candidates acceptable in the B region are those for which:
>
> cap_gain / nrg_gain < cap_delta / nrg_delta
>
> which represents points in the "upper cut".
>
> Thus:
>>> + * nrg_delta * cap_gain > cap_delta * nrg_gain
>
> has to be:
> cap_gain * nrg_delta < cap_delta * nrg_gain
>
> Which results into a "positive accept" payoff defined as:
>
> payoff = (cap_delta * nrg_gain) - (cap_gain * nrg_delta)
>
>>> + * (note: nrg_delta > 0, nrg_gain > 0)
>>> + *
>>> + * - Performance Constraint (C) region
>>> + *
>>> + * Condition: nrg_delta < 0 && cap_delta < 0
>>> + * payoff criteria:
>>> + * cap_delta / nrg_delta > cap_gain / nrg_gain =
>>> + * nrg_delta * cap_gain > cap_delta * nrg_gain
>
>
> In the C region we have both a wrong definition and the sign error you
> reported, which turned out to provide a "by chance" correct
> implementation, which is:
>
> cap_gain / nrg_gain > cap_delta / nrg_delta =
> cap_gain * nrg_delta < cap_delta * nrg_gain
>
> Which results into a "positive accept" payoff defined as:
>
> payoff = (cap_delta * nrg_gain) - (cap_gain * nrg_delta)
>
> The same as for the B region...
>
>>> + * (note: nrg_delta < 0, nrg_gain > 0)
>>> + */
>>> + payoff = nrg_delta * threshold_gains[perf_boost_idx].cap_gain;
>>> + payoff -= cap_delta * threshold_gains[perf_boost_idx].nrg_gain;
>>
>> Sorry, here should be:
>> + payoff = nrg_delta * threshold_gains[idx].cap_gain;
>> + payoff -= cap_delta * threshold_gains[idx].nrg_gain;
>
> ... which means that these two operations have to be inverted:
>
> payoff = cap_delta * threshold_gains[gain_idx].nrg_gain;
> payoff -= nrg_delta * threshold_gains[gain_idx].cap_gain;
>
>>
>>> return payoff;
>>> }
>>>
>>> --
>>> 1.9.1
>>>
>>
>
Hi Dietmar,
[ + eas-dev ]
On Sun, May 22, 2016 at 06:06:47PM +0100, Dietmar Eggemann wrote:
> On 05/21/2016 06:25 AM, Leo Yan wrote:
> >On Fri, May 20, 2016 at 08:08:16PM +0100, Dietmar Eggemann wrote:
> >>Hi Leo,
> >>
> >>try the attached testcase in LISA which runs the wl's in '/tg_1/tg_11'
> >>You can create different tg level-hierarchies in rfc_tg.config if you wish.
> >>
> >>08:05:17 DEBUG : sudo -- sh -c '/root/devlib-target/bin/shutils cgroups_run_into /tg_1/tg_11 '\''/root/devlib-target/bin/rt-app /root/devlib-target/run_dir/06_pct_00.json'\'''
> >
> >Much appreciated, thanks for the test case.
>
> I thought about this again and maybe you want to test task migration of
> a task running in a task group? This would make much more sense than
> only running task in a task group in case you want to test the pelt signals.
After enabling EAS, I can see that a task running in a task group is
migrated between different CPUs when the task is woken up.
> I added some functionality to rt-app which lets you restrict the cpu
> affinity of a task per phase of its run so you can create a task inside
> a task group which alternates between two cpus while running. This
> migration is done by the running task itself (so it's
> sched_setaffinity()->__set_cpus_allowed_ptr()->stop_one_cpu(...,
> migration_cpu_stop, ...)->__migrate_task()->move_queued_task()).
>
> So if you are interested in this just ask me on eas-dev so I can share the
> rt-app functionality and a how-to build rt-app on the list for a broader
> audience.
Yes, this is another path we should test for task migration. Could
you share this on the mailing list? We can also consider integrating
this into rt-app's repo.
Thanks,
Leo Yan
In the current code the CPU's idle state cpufreq_pstates::idle is
initialized to '-1'; it is only set to '0' or '1' (active or idle) once the
first "cpu_idle" event for that CPU has been parsed. This causes an error in
the P-state statistics: from the beginning of the trace until the first
"cpu_idle" event the CPU's idle state is '-1', so cpu_change_pstate()
always treats the event as the first update and discards the time
accumulated so far.
This introduces a very big error if the CPU is always running and never
enters an idle state. This patch fixes the issue by initializing the CPU's
idle state before parsing the P-state and C-state times. The idle state is
initialized according to the first cpu_idle log:
- If the CPU's first cpu_idle state is '-1', the CPU has been in an idle
state from the beginning;
- If the CPU's first cpu_idle state is any other value, the CPU was active.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
tracefile_idlestat.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 51 insertions(+)
diff --git a/tracefile_idlestat.c b/tracefile_idlestat.c
index 3430693..fb0b3d8 100644
--- a/tracefile_idlestat.c
+++ b/tracefile_idlestat.c
@@ -152,6 +152,55 @@ int load_text_data_line(char *buffer, struct cpuidle_datas *datas, char *format,
return get_wakeup_irq(datas, buffer);
}
+/**
+ * init_cpu_idle_state - Init CPU's idle state according to first cpu_idle log.
+ * If a CPU's first cpu_idle event has state '-1', the CPU has been in idle
+ * state from the beginning; otherwise the CPU was active.
+ * So initialize the per-CPU idle flag to get more accurate times.
+ *
+ * @datas: structure for P-state and C-state's statistics
+ * @f: the file handle of the idlestat trace file
+ */
+void init_cpu_idle_state(struct cpuidle_datas *datas, FILE *f)
+{
+	struct cpufreq_pstates *ps;
+	unsigned int state, cpu;
+	double time;
+	char buffer[BUFSIZE];
+
+	while (fgets(buffer, BUFSIZE, f)) {
+		/* Only cpu_idle records carry the initial idle state */
+		if (!strstr(buffer, "cpu_idle"))
+			continue;
+
+		if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu) != 3) {
+			fprintf(stderr, "warning: Unrecognized cpuidle "
+				"record. The result of analysis might "
+				"be wrong.\n");
+			break;
+		}
+
+		ps = &(datas->pstates[cpu]);
+
+		/* CPU's state has been initialized, skip it */
+		if (ps->idle != -1)
+			continue;
+
+		/*
+		 * If the CPU's first cpu_idle state is '-1', the CPU was idle
+		 * from the start and only exits idle at this event.
+		 * Otherwise the CPU was active at the beginning.
+		 */
+		if (state == (unsigned int)-1)
+			ps->idle = 1;
+		else
+			ps->idle = 0;
+	}
+
+	/* After traversing the file, reset the offset */
+	fseek(f, 0, SEEK_SET);
+}
+
void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
{
double begin = 0, end = 0;
@@ -159,6 +208,8 @@ void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
setup_topo_states(datas);
+ init_cpu_idle_state(datas, f);
+
do {
if (load_text_data_line(buffer, datas, TRACE_FORMAT,
&begin, &end, &start) != -1) {
--
1.9.1
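The idea of the patch above, taking each CPU's initial idle flag from its first cpu_idle record, can be tried outside idlestat with a small standalone sketch. Everything here is an assumption made for illustration: `init_idle_from_trace()`, `MAX_CPUS` and `IDLE_UNKNOWN` are hypothetical names, and the record layout `cpu_idle: state=<n> cpu_id=<n>` is a simplified form and not idlestat's TRACE_FORMAT.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CPUS	8	/* hypothetical limit for this sketch */
#define IDLE_UNKNOWN	(-1)

/*
 * Set idle[cpu] from the first cpu_idle record seen for each CPU:
 * state == (unsigned int)-1 is an idle exit, so the CPU started idle (1);
 * any other state is an idle entry, so the CPU started active (0).
 * Returns 0 on success, -1 if a cpu_idle record cannot be parsed.
 */
static int init_idle_from_trace(FILE *f, int idle[MAX_CPUS])
{
	char line[256];
	int ret = 0;

	assert(f != NULL);

	for (int i = 0; i < MAX_CPUS; i++)
		idle[i] = IDLE_UNKNOWN;

	while (fgets(line, sizeof(line), f)) {
		unsigned int state, cpu;
		const char *ev = strstr(line, "cpu_idle:");

		if (!ev)
			continue;	/* not a cpu_idle record */

		if (sscanf(ev, "cpu_idle: state=%u cpu_id=%u",
			   &state, &cpu) != 2) {
			ret = -1;
			break;
		}

		if (cpu >= MAX_CPUS || idle[cpu] != IDLE_UNKNOWN)
			continue;	/* out of range or already set */

		idle[cpu] = (state == (unsigned int)-1) ? 1 : 0;
	}

	/* Rewind so that the main parser can start from the top again. */
	rewind(f);
	return ret;
}
```

A CPU that never logs a cpu_idle event keeps `IDLE_UNKNOWN` in this sketch; how such a CPU should be treated is left open here.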
Hi Patrick,
[ + eas-dev ]
As a non-root user on Android, I cannot add a PID to SchedTune's cgroup.
At first I thought it was related to the permissions of the cgroup's
file node, so I used the "root" user to change the permissions with
"a+rwx"; even so, the non-root user still cannot write to the cgroup's node.
hikey:/ $ su
hikey:/ # chmod a+rwx /sys/fs/cgroup/stune/performance/cgroup.procs
hikey:/ # exit
hikey:/ $ echo 1937 > /sys/fs/cgroup/stune/performance/cgroup.procs
hikey:/ $ cat /sys/fs/cgroup/stune/performance/cgroup.procs
Do you have a suggestion for the proper method of adding a PID to
SchedTune's cgroup as a non-root user?
Thanks,
Leo Yan
When a task is migrated from CPU_A to CPU_B, the scheduler removes
the task's load/util from the cfs_rq it leaves and adds them to the
cfs_rq it migrates to. But if the kernel enables CONFIG_FAIR_GROUP_SCHED,
this cfs_rq may not be the same as the CPU's cfs_rq. As a result, after the
task has migrated to CPU_B, CPU_A still holds the task's stale load/util
values; on the other hand, CPU_B does not reflect the new load/util
introduced by the task.
So this patch also applies the task's load/util to the CPU's cfs_rq, so
that the CPU's cfs_rq really reflects the task's migration.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
kernel/sched/fair.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0fe30e6..10ca1a9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2825,12 +2825,24 @@ static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
{
struct sched_avg *sa = &cfs_rq->avg;
+ struct sched_avg *cpu_sa = NULL;
int decayed, removed = 0;
+ int cpu = cpu_of(rq_of(cfs_rq));
+
+ if (&cpu_rq(cpu)->cfs != cfs_rq)
+ cpu_sa = &cpu_rq(cpu)->cfs.avg;
if (atomic_long_read(&cfs_rq->removed_load_avg)) {
s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
sa->load_avg = max_t(long, sa->load_avg - r, 0);
sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
+
+ if (cpu_sa) {
+ cpu_sa->load_avg = max_t(long, cpu_sa->load_avg - r, 0);
+ cpu_sa->load_sum = max_t(s64,
+ cpu_sa->load_sum - r * LOAD_AVG_MAX, 0);
+ }
+
removed = 1;
}
@@ -2838,6 +2850,12 @@ static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
long r = atomic_long_xchg(&cfs_rq->removed_util_avg, 0);
sa->util_avg = max_t(long, sa->util_avg - r, 0);
sa->util_sum = max_t(s32, sa->util_sum - r * LOAD_AVG_MAX, 0);
+
+ if (cpu_sa) {
+ cpu_sa->util_avg = max_t(long, cpu_sa->util_avg - r, 0);
+ cpu_sa->util_sum = max_t(s64,
+ cpu_sa->util_sum - r * LOAD_AVG_MAX, 0);
+ }
}
decayed = __update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
@@ -2896,6 +2914,8 @@ static inline void update_load_avg(struct sched_entity *se, int update_tg)
static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
+ int cpu = cpu_of(rq_of(cfs_rq));
+
if (!sched_feat(ATTACH_AGE_LOAD))
goto skip_aging;
@@ -2919,6 +2939,13 @@ skip_aging:
cfs_rq->avg.load_sum += se->avg.load_sum;
cfs_rq->avg.util_avg += se->avg.util_avg;
cfs_rq->avg.util_sum += se->avg.util_sum;
+
+ if (&cpu_rq(cpu)->cfs != cfs_rq) {
+ cpu_rq(cpu)->cfs.avg.load_avg += se->avg.load_avg;
+ cpu_rq(cpu)->cfs.avg.load_sum += se->avg.load_sum;
+ cpu_rq(cpu)->cfs.avg.util_avg += se->avg.util_avg;
+ cpu_rq(cpu)->cfs.avg.util_sum += se->avg.util_sum;
+ }
}
static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
--
1.9.1
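The accounting idea in the patch above can be illustrated with a toy model outside the kernel. This is only a sketch under simplifying assumptions: `struct toy_avg`, `struct toy_cfs_rq`, `toy_attach()` and `toy_remove()` are hypothetical names, only the `*_avg` values are tracked, and the removal is applied immediately instead of being deferred through `removed_load_avg` / `removed_util_avg`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for sched_avg and cfs_rq. */
struct toy_avg {
	long load_avg;
	long util_avg;
};

struct toy_cfs_rq {
	struct toy_avg avg;
	struct toy_cfs_rq *root;	/* the CPU's top-level cfs_rq */
};

/* Subtract without going below zero, like max_t(long, a - b, 0). */
static long sub_floor0(long a, long b)
{
	return a > b ? a - b : 0;
}

/* Add a task's averages to its cfs_rq and, if different, to the root. */
static void toy_attach(struct toy_cfs_rq *cfs_rq, const struct toy_avg *se)
{
	assert(cfs_rq != NULL && cfs_rq->root != NULL && se != NULL);

	cfs_rq->avg.load_avg += se->load_avg;
	cfs_rq->avg.util_avg += se->util_avg;
	if (cfs_rq->root != cfs_rq) {
		cfs_rq->root->avg.load_avg += se->load_avg;
		cfs_rq->root->avg.util_avg += se->util_avg;
	}
}

/* Remove a task's averages from its cfs_rq and, if different, the root. */
static void toy_remove(struct toy_cfs_rq *cfs_rq, const struct toy_avg *se)
{
	struct toy_cfs_rq *root;

	assert(cfs_rq != NULL && cfs_rq->root != NULL && se != NULL);
	root = cfs_rq->root;

	cfs_rq->avg.load_avg = sub_floor0(cfs_rq->avg.load_avg, se->load_avg);
	cfs_rq->avg.util_avg = sub_floor0(cfs_rq->avg.util_avg, se->util_avg);
	if (root != cfs_rq) {
		root->avg.load_avg = sub_floor0(root->avg.load_avg, se->load_avg);
		root->avg.util_avg = sub_floor0(root->avg.util_avg, se->util_avg);
	}
}
```

Moving a task from a group cfs_rq on CPU_A to a group cfs_rq on CPU_B is then `toy_remove()` on the first followed by `toy_attach()` on the second; CPU_A's root drops the task's contribution and CPU_B's root gains it, which is the behaviour the commit message asks for.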
When logging a CPU's load and utilization, the CPU's cfs_rq should be
used directly for tracking. Using the task's cfs_rq may report wrong
values, because it can be the task_group's cfs_rq rather than the real
CPU's cfs_rq.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6ac9ea3..26f3f2d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2825,7 +2825,7 @@ static inline void update_load_avg(struct sched_entity *se, int update_tg)
if (entity_is_task(se))
trace_sched_load_avg_task(task_of(se), &se->avg);
- trace_sched_load_avg_cpu(cpu, cfs_rq);
+ trace_sched_load_avg_cpu(cpu, &cpu_rq(cpu)->cfs);
}
static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
--
1.9.1