This patch series optimizes EAS task placement performance in sched/fair.
Patch 0001 optimizes the CPU selection flow so that a task has a better
chance of staying on its previous CPU. Patch 0002 is a big change to
EAS's CPU selection policy: it tries to select an idle CPU whenever
possible. Profiling results show that 0002 has a good effect, spreading
tasks out when many tasks are running at the same time.
Patches 0003~0004 optimize the single-thread scenario. In this case the
thread has a relatively high utilization value, but the value cannot
easily cross the tipping point. So patch 0004 sets criteria under which
load_avg is used rather than util_avg to boost the single thread.
Patch 0005 optimizes the flow for spreading tasks within the big
cluster.
Patches 0006~0007 fix the avg_load signal.
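As a rough illustration of the placement idea behind patches 0001 and 0002 (this is not the fair.c code from the series), a wake-up path that prefers the previous CPU and otherwise looks for an idle CPU might be sketched as follows. `struct cpu_info`, `task_fits()` and `select_wake_cpu()` are hypothetical names invented for this sketch, and the exact conditions used by the patches are an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU snapshot; not a kernel structure. */
struct cpu_info {
	bool idle;              /* CPU currently has nothing to run */
	unsigned long util;     /* current utilization */
	unsigned long capacity; /* maximum capacity */
};

/* True if adding task_util keeps the CPU within its capacity. */
static bool task_fits(const struct cpu_info *c, unsigned long task_util)
{
	return c->util + task_util <= c->capacity;
}

/*
 * Pick a CPU for a waking task (assumed policy, for illustration only):
 *  1. keep prev_cpu if it is idle and the task fits;
 *  2. otherwise take the first idle CPU where the task fits;
 *  3. otherwise fall back to prev_cpu.
 */
static int select_wake_cpu(const struct cpu_info *cpus, size_t nr_cpus,
			   int prev_cpu, unsigned long task_util)
{
	size_t i;

	if (cpus[prev_cpu].idle && task_fits(&cpus[prev_cpu], task_util))
		return prev_cpu;

	for (i = 0; i < nr_cpus; i++) {
		if (cpus[i].idle && task_fits(&cpus[i], task_util))
			return (int)i;
	}

	return prev_cpu;
}
```

Checking the previous CPU first keeps the task where its cache is likely warm; the idle scan only runs when that CPU is busy or too small.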
Leo Yan (8):
sched/fair: optimize to more chance to select previous CPU
sched/fair: select idle CPU for waken up task
sched/fair: add path to migrate to higher capacity CPU
sched/fair: use load to replace util when have big difference
sched/fair: spread tasks in cluster when over tipping point
sched/fair: correct avg_load as CPU average load
sched/fair: fix to calculate average load cross cluster
sched/fair: set imbn to 1 for too many tasks on rq
include/linux/sched.h | 1 +
kernel/sched/fair.c | 93 +++++++++++++++++++++++++++++++++++++++++++++------
2 files changed, 84 insertions(+), 10 deletions(-)
--
1.9.1
Hi Patrick,
[ + eas-dev ]
I have a general question about how to define the SchedTune threshold
array for payoff. Basically, I want to check the following:
- Every CGroup has its own perf_boost_idx for the PB region and
perf_constrain_idx for the PC region. Do you have any suggestion or
guideline for defining these indexes?
For different CGroups such as "background", "foreground" or
"performance", should every CGroup have its own dedicated index, or
can the platform share the same index value across them?
- How should the array values for "threshold_gains" be defined?
IIUC this array is platform dependent, but what is a reasonable
method to generate the table? Is there any suggested testing for
generating it?
Or is my understanding wrong and the array is fixed, so that
adjusting perf_boost_idx/perf_constrain_idx per platform is enough?
- So far we cannot set these payoff parameters (perf_boost_idx,
perf_constrain_idx and threshold_gains) from sysfs dynamically, so
how can we initialize these values for a specific platform? I suppose
for now we can only set them in the kernel's init flow, right?
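To make the questions above concrete, here is a minimal sketch of how a per-group index could select a gain pair from a threshold table and turn an energy/capacity delta into a payoff. The table values, the struct layout, the sign convention (positive payoff means "accept") and the function `payoff_boost_region()` are all assumptions for illustration; they are not taken from the SchedTune sources.

```c
#include <limits.h>

/* Hypothetical gain pair; the values in the table are made up. */
struct threshold_params {
	int nrg_gain;
	int cap_gain;
};

static const struct threshold_params threshold_gains[] = {
	{ 0, 4 },
	{ 1, 4 },
	{ 2, 4 },
	{ 4, 4 },
};

/*
 * Assumed payoff for the region where both energy and capacity increase:
 *     payoff = cap_delta * nrg_gain - nrg_delta * cap_gain
 * A larger nrg_gain/cap_gain ratio tolerates more energy per unit of
 * capacity. Returns INT_MIN for an out-of-range index.
 */
static int payoff_boost_region(int nrg_delta, int cap_delta, int perf_boost_idx)
{
	const struct threshold_params *g;
	int nr = (int)(sizeof(threshold_gains) / sizeof(threshold_gains[0]));

	if (perf_boost_idx < 0 || perf_boost_idx >= nr)
		return INT_MIN;

	g = &threshold_gains[perf_boost_idx];
	return cap_delta * g->nrg_gain - nrg_delta * g->cap_gain;
}
```

In this sketch the same deltas are rejected with index 0 and accepted with index 3, which is why the choice of index per CGroup matters.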
Thanks,
Leo Yan
In the current code, the CPU's idle state cpufreq_pstates::idle is
initialized to '-1'; only when the first "cpu_idle" event for the CPU is
parsed is the idle state set to '0' or '1', corresponding to active or
idle. This causes an error in the P-state statistics: from the beginning
of the trace until the first "cpu_idle" event the idle state is '-1', so
cpu_change_pstate() always treats the call as the first update and
discards the time accumulated so far.
The error is very large if the CPU is always running and never enters an
idle state. This patch fixes the issue by initializing each CPU's C-state
and P-state up front:
- First, gather every CPU's starting frequency and time stamp;
- Then determine each CPU's idle state from its first cpu_idle log:
If the state in the CPU's first cpu_idle event is '-1', the CPU has
been idle since the beginning of the trace;
If it is any other value, the CPU was active.
- With this information, initialize every CPU's C-state and P-state
before analysing the trace logs.
Note that when a CPU is idle at the beginning we don't know the exact
idle state, so we just assume idle state 0. This has little impact on the
statistics, because idlestat usually wakes up all CPUs at the beginning,
so the deviation introduced is very small.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
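The "first cpu_idle event decides the starting state" rule described in the commit message can be sketched in isolation as below. This is a simplified illustration that works on already-parsed records rather than on the trace file; `struct idle_event`, `IDLE_EXIT` and `initial_idle_flags()` are hypothetical names, not idlestat APIs. As in the patch, a CPU with no cpu_idle event at all ends up treated as active.

```c
#include <stddef.h>

#define IDLE_EXIT (-1)	/* cpu_idle state value logged when a CPU leaves idle */

/* Hypothetical, already-parsed cpu_idle record. */
struct idle_event {
	int cpu;
	int state;
};

/*
 * Fill idle_at_start[cpu] with 1 if the CPU was idle when tracing began,
 * 0 otherwise. Only the first cpu_idle event seen for each CPU matters.
 */
static void initial_idle_flags(const struct idle_event *ev, size_t nr_events,
			       int *idle_at_start, int nr_cpus)
{
	size_t i;
	int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		idle_at_start[cpu] = -1;	/* not decided yet */

	for (i = 0; i < nr_events; i++) {
		cpu = ev[i].cpu;
		if (cpu < 0 || cpu >= nr_cpus || idle_at_start[cpu] != -1)
			continue;
		/* First event is an idle exit: the CPU must have been idle. */
		idle_at_start[cpu] = (ev[i].state == IDLE_EXIT) ? 1 : 0;
	}

	/* CPUs that never logged cpu_idle are treated as always active. */
	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (idle_at_start[cpu] == -1)
			idle_at_start[cpu] = 0;
}
```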
tracefile_idlestat.c | 123 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 123 insertions(+)
diff --git a/tracefile_idlestat.c b/tracefile_idlestat.c
index 3430693..d0cd366 100644
--- a/tracefile_idlestat.c
+++ b/tracefile_idlestat.c
@@ -152,6 +152,127 @@ int load_text_data_line(char *buffer, struct cpuidle_datas *datas, char *format,
return get_wakeup_irq(datas, buffer);
}
+
+/**
+ * init_cpu_idle_state - Init CPU's idle state according to first cpu_idle log.
+ * If a CPU's first cpu_idle event has state '-1', the CPU has been idle since
+ * the beginning of the trace; otherwise the CPU was active.
+ * So initialize the per-CPU idle flag to get more accurate time.
+ *
+ * @datas: structure for P-state and C-state's statistics
+ * @f: the file handle of the idlestat trace file
+ */
+void init_cpu_idle_state(struct cpuidle_datas *datas, FILE *f)
+{
+ char buffer[BUFSIZE];
+ int state, cpu;
+ double time;
+ struct cpufreq_pstates *ps;
+
+ unsigned long *cpu_start_idle;
+ int *cpu_start_freq;
+ double cpu_start_time;
+
+ fseek(f, 0, SEEK_SET);
+
+ cpu_start_freq = malloc(sizeof(int) * datas->nrcpus);
+ for (cpu = 0; cpu < datas->nrcpus; cpu++)
+ cpu_start_freq[cpu] = 0xdeadbeef;
+
+ /*
+ * Find the start time stamp and the CPU's frequency at beginning;
+ * So we can use these info to add dummy info.
+ */
+ while (fgets(buffer, BUFSIZE, f)) {
+
+ if (strstr(buffer, "cpu_frequency")) {
+ if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu)
+ != 3) {
+ fprintf(stderr, "warning: Unrecognized cpuidle "
+ "record. The result of analysis might "
+ "be wrong.\n");
+ return;
+ }
+ } else
+ continue;
+
+ if (cpu_start_freq[cpu] != 0xdeadbeef)
+ continue;
+
+ if (cpu == 0)
+ cpu_start_time = time;
+
+ cpu_start_freq[cpu] = state;
+
+ break;
+ }
+
+ /* After traverse file, reset offset */
+ fseek(f, 0, SEEK_SET);
+
+ /*
+ * Find the CPU's idle state at beginning
+ */
+ cpu_start_idle = malloc(sizeof(long) * datas->nrcpus);
+ for (cpu = 0; cpu < datas->nrcpus; cpu++)
+ cpu_start_idle[cpu] = 0xdeadbeef;
+
+ while (fgets(buffer, BUFSIZE, f)) {
+
+ if (strstr(buffer, "cpu_idle")) {
+ if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu)
+ != 3) {
+ fprintf(stderr, "warning: Unrecognized cpuidle "
+ "record. The result of analysis might "
+ "be wrong.\n");
+ return;
+ }
+ } else
+ continue;
+
+ /* CPU's state has been initialized, skip it */
+ if (cpu_start_idle[cpu] != 0xdeadbeef)
+ continue;
+
+ /*
+ * If the CPU's first cpu_idle state is '-1', the CPU has been
+ * idle since the beginning and only exits idle at this event.
+ * Otherwise, the CPU is active at the beginning.
+ */
+ if (state == -1)
+ cpu_start_idle[cpu] = 0;
+ else
+ cpu_start_idle[cpu] = 4294967295;
+ }
+
+ /* After traverse file, reset offset */
+ fseek(f, 0, SEEK_SET);
+
+ /* Initialize every CPU's cstate and pstate */
+ for (cpu = 0; cpu < datas->nrcpus; cpu++) {
+
+ ps = &(datas->pstates[cpu]);
+
+ if (cpu_start_idle[cpu] == 0) {
+ /*
+ * CPU is idle at beginning, init cstate;
+ *
+ * the exact idle state is unknown here, so just assume
+ * idle state 0; this has little impact on the statistics
+ * because idlestat usually wakes up all CPUs at the
+ * beginning.
+ */
+ ps->idle = 1;
+ store_data(cpu_start_time, 0, cpu, datas);
+ } else {
+ /* CPU is busy at beginning, init pstate */
+ ps->idle = 0;
+ cpu_change_pstate(datas, cpu, cpu_start_freq[cpu],
+ cpu_start_time);
+ }
+ }
+}
+
void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
{
double begin = 0, end = 0;
@@ -159,6 +280,8 @@ void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
setup_topo_states(datas);
+ init_cpu_idle_state(datas, f);
+
do {
if (load_text_data_line(buffer, datas, TRACE_FORMAT,
&begin, &end, &start) != -1) {
--
1.9.1
In the current code, the CPU's idle state cpufreq_pstates::idle is
initialized to '-1'; only when the first "cpu_idle" event for the CPU is
parsed is the idle state set to '0' or '1', corresponding to active or
idle. This causes an error in the P-state statistics: from the beginning
of the trace until the first "cpu_idle" event the idle state is '-1', so
cpu_change_pstate() always treats the call as the first update and
discards the time accumulated so far.
The error is very large if the CPU is always running and never enters an
idle state. This patch fixes the issue by initializing each CPU's C-state
and P-state up front:
- First, gather every CPU's starting frequency and time stamp;
- Then determine each CPU's idle state from its first cpu_idle log:
If the state in the CPU's first cpu_idle event is '-1', the CPU has
been idle since the beginning of the trace;
If it is any other value, the CPU was active.
- With this information, initialize every CPU's C-state and P-state
before analysing the trace logs.
Note that when a CPU is idle at the beginning we don't know the exact
idle state, so we just assume idle state 0. This has little impact on the
statistics, because idlestat usually wakes up all CPUs at the beginning,
so the deviation introduced is very small.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
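This version adds return-value checks, but as far as the hunk below shows, the early returns after the first malloc() do not release cpu_start_freq, and neither scratch array is freed on the success path. Because the caller exits on failure, only the success path matters in practice. One common way to cover every path is a single cleanup label, sketched here under the assumption that the arrays are needed only inside the function; `init_states_skeleton()` is a hypothetical stand-in, not the function from the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical skeleton: rewind the file, allocate the two per-CPU scratch
 * arrays, and release them on every exit path.
 */
static int init_states_skeleton(FILE *f, int nr_cpus)
{
	int *start_freq = NULL;
	unsigned long *start_idle = NULL;
	int ret = -1;

	if (nr_cpus <= 0 || fseek(f, 0, SEEK_SET) != 0)
		goto out;

	start_freq = malloc(sizeof(*start_freq) * (size_t)nr_cpus);
	start_idle = malloc(sizeof(*start_idle) * (size_t)nr_cpus);
	if (!start_freq || !start_idle)
		goto out;

	/* ... scan the trace and initialize per-CPU state here ... */

	ret = 0;
out:
	free(start_idle);	/* free(NULL) is a no-op */
	free(start_freq);
	return ret;
}
```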
tracefile_idlestat.c | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 152 insertions(+)
diff --git a/tracefile_idlestat.c b/tracefile_idlestat.c
index 3430693..2674478 100644
--- a/tracefile_idlestat.c
+++ b/tracefile_idlestat.c
@@ -152,6 +152,153 @@ int load_text_data_line(char *buffer, struct cpuidle_datas *datas, char *format,
return get_wakeup_irq(datas, buffer);
}
+
+/**
+ * init_cpu_idle_state - Init CPU's idle state according to first cpu_idle log.
+ * If a CPU's first cpu_idle event has state '-1', the CPU has been idle since
+ * the beginning of the trace; otherwise the CPU was active.
+ * So initialize the per-CPU idle flag to get more accurate time.
+ *
+ * @datas: structure for P-state and C-state's statistics
+ * @f: the file handle of the idlestat trace file
+ */
+int init_cpu_idle_state(struct cpuidle_datas *datas, FILE *f)
+{
+ char buffer[BUFSIZE];
+ int state, cpu;
+ double time;
+ struct cpufreq_pstates *ps;
+
+ unsigned long *cpu_start_idle;
+ int *cpu_start_freq;
+ double cpu_start_time;
+
+ int ret;
+
+ ret = fseek(f, 0, SEEK_SET);
+ if (ret < 0) {
+ fprintf(stderr, "failed to set the start file position\n");
+ return ret;
+ }
+
+ cpu_start_freq = malloc(sizeof(int) * datas->nrcpus);
+ if (!cpu_start_freq) {
+ fprintf(stderr, "failed to alloc for start frequency states\n");
+ return -1;
+ }
+
+ for (cpu = 0; cpu < datas->nrcpus; cpu++)
+ cpu_start_freq[cpu] = 0xdeadbeef;
+
+ /*
+ * Find the start time stamp and the CPU's frequency at beginning;
+ * So we can use these info to add dummy info.
+ */
+ while (fgets(buffer, BUFSIZE, f)) {
+
+ if (strstr(buffer, "cpu_frequency")) {
+ if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu)
+ != 3) {
+ fprintf(stderr, "warning: Unrecognized cpuidle "
+ "record. The result of analysis might "
+ "be wrong.\n");
+ return -1;
+ }
+ } else
+ continue;
+
+ if (cpu_start_freq[cpu] != 0xdeadbeef)
+ continue;
+
+ if (cpu == 0)
+ cpu_start_time = time;
+
+ cpu_start_freq[cpu] = state;
+
+ break;
+ }
+
+ /* After traverse file, reset offset */
+ ret = fseek(f, 0, SEEK_SET);
+ if (ret < 0) {
+ fprintf(stderr, "failed to set the start file position\n");
+ return ret;
+ }
+
+ /*
+ * Find the CPU's idle state at beginning
+ */
+ cpu_start_idle = malloc(sizeof(long) * datas->nrcpus);
+ if (!cpu_start_idle) {
+ fprintf(stderr, "failed to alloc for start idle states\n");
+ return -1;
+ }
+
+ for (cpu = 0; cpu < datas->nrcpus; cpu++)
+ cpu_start_idle[cpu] = 0xdeadbeef;
+
+ while (fgets(buffer, BUFSIZE, f)) {
+
+ if (strstr(buffer, "cpu_idle")) {
+ if (sscanf(buffer, TRACE_FORMAT, &time, &state, &cpu)
+ != 3) {
+ fprintf(stderr, "warning: Unrecognized cpuidle "
+ "record. The result of analysis might "
+ "be wrong.\n");
+ return -1;
+ }
+ } else
+ continue;
+
+ /* CPU's state has been initialized, skip it */
+ if (cpu_start_idle[cpu] != 0xdeadbeef)
+ continue;
+
+ /*
+ * If the CPU's first cpu_idle state is '-1', the CPU has been
+ * idle since the beginning and only exits idle at this event.
+ * Otherwise, the CPU is active at the beginning.
+ */
+ if (state == -1)
+ cpu_start_idle[cpu] = 0;
+ else
+ cpu_start_idle[cpu] = 4294967295;
+ }
+
+ /* After traverse file, reset offset */
+ ret = fseek(f, 0, SEEK_SET);
+ if (ret < 0) {
+ fprintf(stderr, "failed to set the start file position\n");
+ return ret;
+ }
+
+ /* Initialize every CPU's cstate and pstate */
+ for (cpu = 0; cpu < datas->nrcpus; cpu++) {
+
+ ps = &(datas->pstates[cpu]);
+
+ if (cpu_start_idle[cpu] == 0) {
+ /*
+ * CPU is idle at beginning, init cstate;
+ *
+ * the exact idle state is unknown here, so just assume
+ * idle state 0; this has little impact on the statistics
+ * because idlestat usually wakes up all CPUs at the
+ * beginning.
+ */
+ ps->idle = 1;
+ store_data(cpu_start_time, 0, cpu, datas);
+ } else {
+ /* CPU is busy at beginning, init pstate */
+ ps->idle = 0;
+ cpu_change_pstate(datas, cpu, cpu_start_freq[cpu],
+ cpu_start_time);
+ }
+ }
+
+ return 0;
+}
+
void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
{
double begin = 0, end = 0;
@@ -159,6 +306,11 @@ void load_text_data_lines(FILE *f, char *buffer, struct cpuidle_datas *datas)
setup_topo_states(datas);
+ if (init_cpu_idle_state(datas, f) < 0) {
+ fprintf(stderr, "failed to initialize cpu states\n");
+ exit(-1);
+ }
+
do {
if (load_text_data_line(buffer, datas, TRACE_FORMAT,
&begin, &end, &start) != -1) {
--
1.9.1
If we enable CONFIG_SCHED_TUNE without CONFIG_CGROUPS we get the following
errors:
kernel/sched/fair.c: In function 'energy_diff_evaluate':
kernel/sched/fair.c:4795:2: error: implicit declaration of function 'schedtune_normalize_energy' [-Werror=implicit-function-declaration]
nrg_delta = schedtune_normalize_energy(eenv->nrg.diff);
^
kernel/sched/fair.c:4798:2: error: implicit declaration of function 'schedtune_accept_deltas' [-Werror=implicit-function-declaration]
eenv->payoff = schedtune_accept_deltas(
^
Fix this by making sure the dummy versions of these functions are
defined if the real ones aren't.
Signed-off-by: Jon Medhurst <tixy(a)linaro.org>
---
This is another build fix for the 3.18 backport of EAS [1] but I'm not
sure if the missing functions are actually meant to do something in the
case when we have SCHED_TUNE without CGROUPS?
[1] http://git.linaro.org/arm/eas/kernel.git/shortlog/refs/heads/linux-3.18-eas…
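The stub pattern this patch relies on can be shown in a standalone form: when the feature is compiled out, macros with the same names stand in for the real functions, so callers build unchanged. `CONFIG_DEMO_TUNE`, the `demo_*` names and `evaluate()` are hypothetical and exist only for this sketch. The macro bodies are parenthesized here as a defensive habit, which differs slightly from the header in the patch.

```c
/* Hypothetical feature switch; leave undefined to get the stubs. */
/* #define CONFIG_DEMO_TUNE 1 */

#ifdef CONFIG_DEMO_TUNE
/* Real implementations would live in a separate .c file. */
int demo_normalize_energy(int energy);
int demo_accept_deltas(int nrg_delta, int cap_delta);
#else
/* Stubs: pass the energy value through unchanged. */
#define demo_normalize_energy(energy)			(energy)
#define demo_accept_deltas(nrg_delta, cap_delta)	(nrg_delta)
#endif

/* The caller compiles the same way whether or not the feature is enabled. */
static int evaluate(int nrg_diff, int cap_diff)
{
	int nrg_delta = demo_normalize_energy(nrg_diff);

	return demo_accept_deltas(nrg_delta, cap_diff);
}
```

With the switch undefined, `evaluate()` simply returns its first argument, which mirrors the pass-through behaviour of the dummy macros in tune.h.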
kernel/sched/tune.h | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/kernel/sched/tune.h b/kernel/sched/tune.h
index da1f7b2..3410a1d 100644
--- a/kernel/sched/tune.h
+++ b/kernel/sched/tune.h
@@ -1,6 +1,3 @@
-
-#ifdef CONFIG_SCHED_TUNE
-
#ifdef CONFIG_CGROUP_SCHEDTUNE
int schedtune_cpu_boost(int cpu);
@@ -13,14 +10,7 @@ int schedtune_normalize_energy(int energy);
int schedtune_accept_deltas(int nrg_delta, int cap_delta,
struct task_struct *task);
-#else /* CONFIG_CGROUP_SCHEDTUNE */
-
-#define schedtune_enqueue_task(task, cpu) do { } while (0)
-#define schedtune_dequeue_task(task, cpu) do { } while (0)
-
-#endif /* CONFIG_CGROUP_SCHEDTUNE */
-
-#else /* CONFIG_SCHED_TUNE */
+#else
#define schedtune_enqueue_task(task, cpu) do { } while (0)
#define schedtune_dequeue_task(task, cpu) do { } while (0)
@@ -28,4 +18,4 @@ int schedtune_accept_deltas(int nrg_delta, int cap_delta,
#define schedtune_normalize_energy(energy) energy
#define schedtune_accept_deltas(nrg_delta, cap_delta, task) nrg_delta
-#endif /* CONFIG_SCHED_TUNE */
+#endif
--
2.1.4
Hi
I found some bugs when integrating the 3.18 EAS backport [1] into
the LSK 3.18 based kernel I look after for ARM's Juno and Versatile
Express boards. These patches are my fixes for those bugs; I don't know
whether they are useful or relevant to other versions of EAS. If they
are OK, I guess I should at least add them to [1]?
----------------------------------------------------------------
Jon Medhurst (3):
arm: Fix build error "conflicting types for 'scale_cpu_capacity'"
arm: Fix #if/#ifdef mixup in topology.c
sched/tune: Avoid null pointer dereference in schedtune_add_cluster_nrg
arch/arm/include/asm/topology.h | 1 +
arch/arm/kernel/topology.c | 2 +-
kernel/sched/tune.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
[1] http://git.linaro.org/arm/eas/kernel.git/shortlog/refs/heads/linux-3.18-eas…
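For readers unfamiliar with the class of bug named in the second patch title, the difference between `#if` and `#ifdef` can be demonstrated with a macro that is defined but has the value 0. This is a generic illustration; the cover letter does not say which direction the mixup in topology.c went, and `FEATURE_A` and the `A_SEEN_BY_*` macros are hypothetical names.

```c
#define FEATURE_A 0	/* defined, but with value 0 */

#ifdef FEATURE_A
#define A_SEEN_BY_IFDEF 1	/* taken: #ifdef only asks whether the name exists */
#else
#define A_SEEN_BY_IFDEF 0
#endif

#if FEATURE_A
#define A_SEEN_BY_IF 1
#else
#define A_SEEN_BY_IF 0		/* taken: #if evaluates the value, which is 0 */
#endif

/* Small accessors so the two results can be inspected at run time. */
static int a_seen_by_ifdef(void) { return A_SEEN_BY_IFDEF; }
static int a_seen_by_if(void) { return A_SEEN_BY_IF; }
```

The reverse case also differs: a name that is not defined at all evaluates to 0 inside `#if`, so `#if NAME` silently takes the false branch instead of reporting a missing definition.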