On 10/17/2012 12:39 AM, Julius Werner wrote:
When cpuidle drivers do not supply explicit power_usage values, cpuidle/driver.c inserts dummy values instead. When a running processor dynamically gains new C-states (e.g. after ACPI events), the power_usage values of those states will stay uninitialized, and cpuidle governors will never choose to enter them.
This patch moves the dummy value initialization from cpuidle_register_driver to cpuidle_enable_device, which drivers such as acpi/processor_idle.c will call again when they add or remove C-states. Tested and verified on an Acer AC700 Chromebook, which supports the C3 state only when it switches from AC to battery power.
Signed-off-by: Julius Werner jwerner@chromium.org
This is specific to the acpi and should be handled in the processor_idle.c file instead of the cpuidle core code.
Could be the function 'acpi_processor_cst_has_changed' the right place to set a dummy power value for the power in the new C-state ?
drivers/cpuidle/cpuidle.c | 24 ++++++++++++++++++++++++ drivers/cpuidle/driver.c | 25 ------------------------- 2 files changed, 24 insertions(+), 25 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 7f15b85..bef3a31 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -298,6 +298,28 @@ static void poll_idle_init(struct cpuidle_driver *drv) static void poll_idle_init(struct cpuidle_driver *drv) {} #endif /* CONFIG_ARCH_HAS_CPU_RELAX */ +static void set_power_states(struct cpuidle_driver *drv) +{
- int i;
- /*
* cpuidle driver should set the drv->power_specified bit
* before registering if the driver provides
* power_usage numbers.
*
* If power_specified is not set,
* we fill in power_usage with decreasing values as the
* cpuidle code has an implicit assumption that state Cn
* uses less power than C(n-1).
*
* With CONFIG_ARCH_HAS_CPU_RELAX, C0 is already assigned
* an power value of -1. So we use -2, -3, etc, for other
* c-states.
*/
- for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++)
drv->states[i].power_usage = -1 - i;
+}
/**
- cpuidle_enable_device - enables idle PM for a CPU
- @dev: the CPU
@@ -330,6 +352,8 @@ int cpuidle_enable_device(struct cpuidle_device *dev) cpuidle_enter_tk : cpuidle_enter; poll_idle_init(drv);
- if (!drv->power_specified)
set_power_states(drv);
if ((ret = cpuidle_add_state_sysfs(dev))) return ret; diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index 87db387..caaed27 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -18,28 +18,6 @@ static struct cpuidle_driver *cpuidle_curr_driver; DEFINE_SPINLOCK(cpuidle_driver_lock); int cpuidle_driver_refcount; -static void set_power_states(struct cpuidle_driver *drv) -{
- int i;
- /*
* cpuidle driver should set the drv->power_specified bit
* before registering if the driver provides
* power_usage numbers.
*
* If power_specified is not set,
* we fill in power_usage with decreasing values as the
* cpuidle code has an implicit assumption that state Cn
* uses less power than C(n-1).
*
* With CONFIG_ARCH_HAS_CPU_RELAX, C0 is already assigned
* an power value of -1. So we use -2, -3, etc, for other
* c-states.
*/
- for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++)
drv->states[i].power_usage = -1 - i;
-}
/**
- cpuidle_register_driver - registers a driver
- @drv: the driver
@@ -58,9 +36,6 @@ int cpuidle_register_driver(struct cpuidle_driver *drv) return -EBUSY; }
- if (!drv->power_specified)
set_power_states(drv);
- cpuidle_curr_driver = drv;
spin_unlock(&cpuidle_driver_lock);
On 10/17/2012 12:31 PM, Daniel Lezcano wrote:
On 10/17/2012 12:39 AM, Julius Werner wrote:
When cpuidle drivers do not supply explicit power_usage values, cpuidle/driver.c inserts dummy values instead. When a running processor dynamically gains new C-states (e.g. after ACPI events), the power_usage values of those states will stay uninitialized, and cpuidle governors will never choose to enter them.
This patch moves the dummy value initialization from cpuidle_register_driver to cpuidle_enable_device, which drivers such as acpi/processor_idle.c will call again when they add or remove C-states. Tested and verified on an Acer AC700 Chromebook, which supports the C3 state only when it switches from AC to battery power.
Signed-off-by: Julius Werner jwerner@chromium.org
This is specific to the acpi and should be handled in the processor_idle.c file instead of the cpuidle core code.
Could be the function 'acpi_processor_cst_has_changed' the right place to set a dummy power value for the power in the new C-state ?
btw, good catch ! :)
This is specific to the acpi and should be handled in the processor_idle.c file instead of the cpuidle core code.
Could be the function 'acpi_processor_cst_has_changed' the right place to set a dummy power value for the power in the new C-state ?
Thanks for your feedback. I think it wouldn't be wise to split the dummy power value logic over two places, but I could submit a patch that makes set_power_states globally accessible and calls it from acpi_processor_cst_has_changed instead.
However, I do not think this should really be ACPI specific. It applies to any cpuidle driver that wants to change its idle states at runtime. Currently only the ACPI one does, but the future might bring others that would run into the same problem. I also think that set_power_states fits much better into cpuidle_enable_device conceptually anyway (right next to poll_idle_init which also does state initialization).
Let me know what you think.
On 10/17/2012 08:43 PM, Julius Werner wrote:
This is specific to the acpi and should be handled in the processor_idle.c file instead of the cpuidle core code.
Could be the function 'acpi_processor_cst_has_changed' the right place to set a dummy power value for the power in the new C-state ?
Thanks for your feedback. I think it wouldn't be wise to split the dummy power value logic over two places, but I could submit a patch that makes set_power_states globally accessible and calls it from acpi_processor_cst_has_changed instead.
No please, do not export this function. That will add more confusion on the acpi code and more generally in the cpuidle core code.
IIUC, a new state is inserted/deleted and we will set the entire array of states to setup the power.
You have the acpi_processor_cst_has_changed function calling the acpi_processor_setup_cpuidle_states function. It seems to be a good candidate to setup the power of the new state. All the states are filled again AFAICS under the lock.
However, I do not think this should really be ACPI specific. It applies to any cpuidle driver that wants to change its idle states at runtime. Currently only the ACPI one does, but the future might bring others that would run into the same problem. I also think that set_power_states fits much better into cpuidle_enable_device conceptually anyway (right next to poll_idle_init which also does state initialization).
The states are now part of the cpuidle driver and the set_power_states should remain to this file. The dynamic C-states brought some complexity in the acpi code and honestly this code is very confusing.
Maybe one day, that would make sense but until then I am in favor of keeping the arch specific bits in the drivers, especially when they are *hum* so "complex".
poll_idle_init looks hackish for me and probably move it to the arch would also make sense.
Thanks -- Daniel
When cpuidle drivers do not supply explicit power_usage values, cpuidle/driver.c inserts dummy values instead. When a running processor dynamically gains new C-states (e.g. after ACPI events), the power_usage values of those states will stay uninitialized, and cpuidle governors will never choose to enter them.
This patch ensures that the ACPI cpuidle driver sets those dummy power values itself whenever it (re-)initializes its idle states. Tested and verified on an Acer AC700 Chromebook, which supports the C3 state only when it switches from AC to battery power.
Signed-off-by: Julius Werner jwerner@chromium.org --- drivers/acpi/processor_idle.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index e8086c7..078e22f 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -1090,6 +1090,9 @@ static int acpi_processor_setup_cpuidle_states(struct acpi_processor *pr) state->exit_latency = cx->latency; state->target_residency = cx->latency * latency_factor;
+ /* reinitialize this in case new states are added after boot */ + state->power_usage = -1 - count; + state->flags = 0; switch (cx->type) { case ACPI_STATE_C1:
On 10/19/2012 11:50 PM, Julius Werner wrote:
When cpuidle drivers do not supply explicit power_usage values, cpuidle/driver.c inserts dummy values instead. When a running processor dynamically gains new C-states (e.g. after ACPI events), the power_usage values of those states will stay uninitialized, and cpuidle governors will never choose to enter them.
This patch ensures that the ACPI cpuidle driver sets those dummy power values itself whenever it (re-)initializes its idle states. Tested and verified on an Acer AC700 Chromebook, which supports the C3 state only when it switches from AC to battery power.
Signed-off-by: Julius Werner jwerner@chromium.org
I am not against this patch but I am wondering if it is not time to do some cleanup around this.
The flag 'power_specified' is never used in any driver.
And the field 'power_usage' is used only in the tegra3 driver where logically as power_specified is not set, it will be overwritten at the init (could someone could check the /sys/devices/system/cpu/cpu0/cpuidle/state1/power is different from 600 on tegra3 ?)
The drivers define their states in a power consumption descendant order making de facto the same ordering than the 'set_power_state' function in driver.c
The governor looks at the power_usage (which is always filled by 'set_power_state').
static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) [ ... ]
if (s->power_usage < power_usage) { power_usage = s->power_usage; data->last_state_idx = i; data->exit_us = s->exit_latency; }
[ ... ]
Could we just say this is always true because state[i+1] consumes less than state[i] ?
And then just remove the 'set_power_state' function, and the field 'driver->power_specified' ?
That will cleanup the code and fix this problem, no ?
Thanks -- Daniel
drivers/acpi/processor_idle.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index e8086c7..078e22f 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -1090,6 +1090,9 @@ static int acpi_processor_setup_cpuidle_states(struct acpi_processor *pr) state->exit_latency = cx->latency; state->target_residency = cx->latency * latency_factor;
/* reinitialize this in case new states are added after boot */
state->power_usage = -1 - count;
- state->flags = 0; switch (cx->type) { case ACPI_STATE_C1:
Could we just say this is always true because state[i+1] consumes less than state[i] ?
And then just remove the 'set_power_state' function, and the field 'driver->power_specified' ?
That will cleanup the code and fix this problem, no ?
I totally agree with your analysis. Even if a driver were to set proper usage values (and the power_specified bit), none of the existing governors would care about those actual numbers (and since the vast majority of drivers uses fake values anyway, this is not likely to change in the future). This seems to be a classic example of unnecessary over-engineering.
I am mostly interested in getting that bug fixed right now, but removing unnecessary code is always a good thing. If you think it would have a good chance of getting merged, I would be happy to draft up a larger patch that refactors power_usage away completely.
On 10/22/2012 07:13 PM, Julius Werner wrote:
Could we just say this is always true because state[i+1] consumes less than state[i] ?
And then just remove the 'set_power_state' function, and the field 'driver->power_specified' ?
That will cleanup the code and fix this problem, no ?
I totally agree with your analysis. Even if a driver were to set proper usage values (and the power_specified bit), none of the existing governors would care about those actual numbers (and since the vast majority of drivers uses fake values anyway, this is not likely to change in the future). This seems to be a classic example of unnecessary over-engineering.
I am mostly interested in getting that bug fixed right now, but removing unnecessary code is always a good thing. If you think it would have a good chance of getting merged, I would be happy to draft up a larger patch that refactors power_usage away completely.
I am in favor of removing the unnecessary code as it fixes a bug also but I am not a maintainer, so I can't tell if it has a good chance to be merged as a bug fix.
I think Rafael can tell us what approach he would prefer.
Thanks -- Daniel
This patch follows the discussion about reinitializing the power usage when a C-state is added/removed.
https://lkml.org/lkml/2012/10/16/518
We realized the power usage field is never filled and when it is filled for tegra, the power_specified flag is not set making all these values to be resetted when the driver is initialized with the set_power_state function.
Julius and I feel this is over-engineered and the power_specified flag could be simply removed and continue assuming the states are backward sorted.
The menu governor select function is simplified as the power is ordered. Actually the condition is always true with the current code.
The cpuidle_play_dead function is also simplified by doing a reverse lookup on the array.
The set_power_states function is removed as it does no make sense anymore.
Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org --- drivers/cpuidle/cpuidle.c | 17 ++++------------- drivers/cpuidle/driver.c | 25 ------------------------- drivers/cpuidle/governors/menu.c | 8 ++------ include/linux/cpuidle.h | 2 +- 4 files changed, 7 insertions(+), 45 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 711dd83..f983262 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -69,24 +69,15 @@ int cpuidle_play_dead(void) { struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices); struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev); - int i, dead_state = -1; - int power_usage = -1; + int i;
if (!drv) return -ENODEV;
/* Find lowest-power state that supports long-term idle */ - for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) { - struct cpuidle_state *s = &drv->states[i]; - - if (s->power_usage < power_usage && s->enter_dead) { - power_usage = s->power_usage; - dead_state = i; - } - } - - if (dead_state != -1) - return drv->states[dead_state].enter_dead(dev, dead_state); + for (i = drv->state_count; i >= CPUIDLE_DRIVER_STATE_START; i--) + if (drv->states[i].play_dead) + return drv->states[i].enter_dead(dev, i);
return -ENODEV; } diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index 3af841f..bb045f1 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -19,34 +19,9 @@ DEFINE_SPINLOCK(cpuidle_driver_lock); static void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu); static struct cpuidle_driver * __cpuidle_get_cpu_driver(int cpu);
-static void set_power_states(struct cpuidle_driver *drv) -{ - int i; - - /* - * cpuidle driver should set the drv->power_specified bit - * before registering if the driver provides - * power_usage numbers. - * - * If power_specified is not set, - * we fill in power_usage with decreasing values as the - * cpuidle code has an implicit assumption that state Cn - * uses less power than C(n-1). - * - * With CONFIG_ARCH_HAS_CPU_RELAX, C0 is already assigned - * an power value of -1. So we use -2, -3, etc, for other - * c-states. - */ - for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) - drv->states[i].power_usage = -1 - i; -} - static void __cpuidle_driver_init(struct cpuidle_driver *drv) { drv->refcnt = 0; - - if (!drv->power_specified) - set_power_states(drv); }
static int __cpuidle_register_driver(struct cpuidle_driver *drv, int cpu) diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index 2efee27..14eb11f 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -312,7 +312,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) { struct menu_device *data = &__get_cpu_var(menu_devices); int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY); - int power_usage = -1; int i; int multiplier; struct timespec t; @@ -383,11 +382,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) if (s->exit_latency * multiplier > data->predicted_us) continue;
- if (s->power_usage < power_usage) { - power_usage = s->power_usage; - data->last_state_idx = i; - data->exit_us = s->exit_latency; - } + data->last_state_idx = i; + data->exit_us = s->exit_latency; }
/* not deepest C-state chosen for low predicted residency */ diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 3711b34..24cd103 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -126,9 +126,9 @@ struct cpuidle_driver { struct module *owner; int refcnt;
- unsigned int power_specified:1; /* set to 1 to use the core cpuidle time keeping (for all states). */ unsigned int en_core_tk_irqen:1; + /* states array must be ordered in decreasing power consumption */ struct cpuidle_state states[CPUIDLE_STATE_MAX]; int state_count; int safe_state_index;
Thanks for moving this along, Daniel. I think this is the right approach... the cpuidle driver shouldn't be more complex than necessary.
Note that you are starting your loop too high in cpuidle_play_dead... states[state_count] is not an actual state anymore, it should start at state_count - 1. Also, I think you can go ahead and do the same last-to-first loop transformation with immediate return in the menu governor, for an extra tiny bit of performance.
On Mon, Nov 12, 2012 at 12:26 PM, Daniel Lezcano daniel.lezcano@linaro.org wrote:
This patch follows the discussion about reinitializing the power usage when a C-state is added/removed.
https://lkml.org/lkml/2012/10/16/518
We realized the power usage field is never filled and when it is filled for tegra, the power_specified flag is not set making all these values to be resetted when the driver is initialized with the set_power_state function.
Julius and I feel this is over-engineered and the power_specified flag could be simply removed and continue assuming the states are backward sorted.
The menu governor select function is simplified as the power is ordered. Actually the condition is always true with the current code.
The cpuidle_play_dead function is also simplified by doing a reverse lookup on the array.
The set_power_states function is removed as it does no make sense anymore.
Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org
drivers/cpuidle/cpuidle.c | 17 ++++------------- drivers/cpuidle/driver.c | 25 ------------------------- drivers/cpuidle/governors/menu.c | 8 ++------ include/linux/cpuidle.h | 2 +- 4 files changed, 7 insertions(+), 45 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 711dd83..f983262 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -69,24 +69,15 @@ int cpuidle_play_dead(void) { struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices); struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
int i, dead_state = -1;
int power_usage = -1;
int i; if (!drv) return -ENODEV; /* Find lowest-power state that supports long-term idle */
for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
struct cpuidle_state *s = &drv->states[i];
if (s->power_usage < power_usage && s->enter_dead) {
power_usage = s->power_usage;
dead_state = i;
}
}
if (dead_state != -1)
return drv->states[dead_state].enter_dead(dev, dead_state);
for (i = drv->state_count; i >= CPUIDLE_DRIVER_STATE_START; i--)
if (drv->states[i].play_dead)
return drv->states[i].enter_dead(dev, i); return -ENODEV;
} diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index 3af841f..bb045f1 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -19,34 +19,9 @@ DEFINE_SPINLOCK(cpuidle_driver_lock); static void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu); static struct cpuidle_driver * __cpuidle_get_cpu_driver(int cpu);
-static void set_power_states(struct cpuidle_driver *drv) -{
int i;
/*
* cpuidle driver should set the drv->power_specified bit
* before registering if the driver provides
* power_usage numbers.
*
* If power_specified is not set,
* we fill in power_usage with decreasing values as the
* cpuidle code has an implicit assumption that state Cn
* uses less power than C(n-1).
*
* With CONFIG_ARCH_HAS_CPU_RELAX, C0 is already assigned
* an power value of -1. So we use -2, -3, etc, for other
* c-states.
*/
for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++)
drv->states[i].power_usage = -1 - i;
-}
static void __cpuidle_driver_init(struct cpuidle_driver *drv) { drv->refcnt = 0;
if (!drv->power_specified)
set_power_states(drv);
}
static int __cpuidle_register_driver(struct cpuidle_driver *drv, int cpu) diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index 2efee27..14eb11f 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -312,7 +312,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) { struct menu_device *data = &__get_cpu_var(menu_devices); int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
int power_usage = -1; int i; int multiplier; struct timespec t;
@@ -383,11 +382,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) if (s->exit_latency * multiplier > data->predicted_us) continue;
if (s->power_usage < power_usage) {
power_usage = s->power_usage;
data->last_state_idx = i;
data->exit_us = s->exit_latency;
}
data->last_state_idx = i;
data->exit_us = s->exit_latency; } /* not deepest C-state chosen for low predicted residency */
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 3711b34..24cd103 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -126,9 +126,9 @@ struct cpuidle_driver { struct module *owner; int refcnt;
unsigned int power_specified:1; /* set to 1 to use the core cpuidle time keeping (for all states). */ unsigned int en_core_tk_irqen:1;
/* states array must be ordered in decreasing power consumption */ struct cpuidle_state states[CPUIDLE_STATE_MAX]; int state_count; int safe_state_index;
-- 1.7.5.4
On 11/12/2012 10:09 PM, Julius Werner wrote:
Thanks for moving this along, Daniel. I think this is the right approach... the cpuidle driver shouldn't be more complex than necessary.
Note that you are starting your loop too high in cpuidle_play_dead... states[state_count] is not an actual state anymore, it should start at state_count - 1.
Yep. Thanks for catching this.
Also, I think you can go ahead and do the same last-to-first loop transformation with immediate return in the menu governor, for an extra tiny bit of performance.
Yes, that makes sense.
Thanks for the review.
-- Daniel
On Mon, Nov 12, 2012 at 12:26 PM, Daniel Lezcano daniel.lezcano@linaro.org wrote:
This patch follows the discussion about reinitializing the power usage when a C-state is added/removed.
https://lkml.org/lkml/2012/10/16/518
We realized the power usage field is never filled and when it is filled for tegra, the power_specified flag is not set making all these values to be resetted when the driver is initialized with the set_power_state function.
Julius and I feel this is over-engineered and the power_specified flag could be simply removed and continue assuming the states are backward sorted.
The menu governor select function is simplified as the power is ordered. Actually the condition is always true with the current code.
The cpuidle_play_dead function is also simplified by doing a reverse lookup on the array.
The set_power_states function is removed as it does no make sense anymore.
Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org
drivers/cpuidle/cpuidle.c | 17 ++++------------- drivers/cpuidle/driver.c | 25 ------------------------- drivers/cpuidle/governors/menu.c | 8 ++------ include/linux/cpuidle.h | 2 +- 4 files changed, 7 insertions(+), 45 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 711dd83..f983262 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -69,24 +69,15 @@ int cpuidle_play_dead(void) { struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices); struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
int i, dead_state = -1;
int power_usage = -1;
int i; if (!drv) return -ENODEV; /* Find lowest-power state that supports long-term idle */
for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
struct cpuidle_state *s = &drv->states[i];
if (s->power_usage < power_usage && s->enter_dead) {
power_usage = s->power_usage;
dead_state = i;
}
}
if (dead_state != -1)
return drv->states[dead_state].enter_dead(dev, dead_state);
for (i = drv->state_count; i >= CPUIDLE_DRIVER_STATE_START; i--)
if (drv->states[i].play_dead)
return drv->states[i].enter_dead(dev, i); return -ENODEV;
} diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index 3af841f..bb045f1 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -19,34 +19,9 @@ DEFINE_SPINLOCK(cpuidle_driver_lock); static void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu); static struct cpuidle_driver * __cpuidle_get_cpu_driver(int cpu);
-static void set_power_states(struct cpuidle_driver *drv) -{
int i;
/*
* cpuidle driver should set the drv->power_specified bit
* before registering if the driver provides
* power_usage numbers.
*
* If power_specified is not set,
* we fill in power_usage with decreasing values as the
* cpuidle code has an implicit assumption that state Cn
* uses less power than C(n-1).
*
* With CONFIG_ARCH_HAS_CPU_RELAX, C0 is already assigned
* an power value of -1. So we use -2, -3, etc, for other
* c-states.
*/
for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++)
drv->states[i].power_usage = -1 - i;
-}
static void __cpuidle_driver_init(struct cpuidle_driver *drv) { drv->refcnt = 0;
if (!drv->power_specified)
set_power_states(drv);
}
static int __cpuidle_register_driver(struct cpuidle_driver *drv, int cpu) diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index 2efee27..14eb11f 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -312,7 +312,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) { struct menu_device *data = &__get_cpu_var(menu_devices); int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
int power_usage = -1; int i; int multiplier; struct timespec t;
@@ -383,11 +382,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) if (s->exit_latency * multiplier > data->predicted_us) continue;
if (s->power_usage < power_usage) {
power_usage = s->power_usage;
data->last_state_idx = i;
data->exit_us = s->exit_latency;
}
data->last_state_idx = i;
data->exit_us = s->exit_latency; } /* not deepest C-state chosen for low predicted residency */
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 3711b34..24cd103 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -126,9 +126,9 @@ struct cpuidle_driver { struct module *owner; int refcnt;
unsigned int power_specified:1; /* set to 1 to use the core cpuidle time keeping (for all states). */ unsigned int en_core_tk_irqen:1;
/* states array must be ordered in decreasing power consumption */ struct cpuidle_state states[CPUIDLE_STATE_MAX]; int state_count; int safe_state_index;
-- 1.7.5.4
Hi,
On 11/12/2012 09:26 PM, Daniel Lezcano wrote:
This patch follows the discussion about reinitializing the power usage when a C-state is added/removed.
https://lkml.org/lkml/2012/10/16/518
We realized the power usage field is never filled and when it is filled for tegra, the power_specified flag is not set making all these values to be resetted when the driver is initialized with the set_power_state function.
Julius and I feel this is over-engineered and the power_specified flag could be simply removed and continue assuming the states are backward sorted.
The menu governor select function is simplified as the power is ordered. Actually the condition is always true with the current code.
The cpuidle_play_dead function is also simplified by doing a reverse lookup on the array.
The set_power_states function is removed as it does no make sense anymore.
Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org
drivers/cpuidle/cpuidle.c | 17 ++++------------- drivers/cpuidle/driver.c | 25 ------------------------- drivers/cpuidle/governors/menu.c | 8 ++------ include/linux/cpuidle.h | 2 +- 4 files changed, 7 insertions(+), 45 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 711dd83..f983262 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -69,24 +69,15 @@ int cpuidle_play_dead(void) { struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices); struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
- int i, dead_state = -1;
- int power_usage = -1;
- int i;
if (!drv) return -ENODEV; /* Find lowest-power state that supports long-term idle */
- for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
struct cpuidle_state *s = &drv->states[i];
if (s->power_usage < power_usage && s->enter_dead) {
power_usage = s->power_usage;
dead_state = i;
}
- }
- if (dead_state != -1)
return drv->states[dead_state].enter_dead(dev, dead_state);
- for (i = drv->state_count; i >= CPUIDLE_DRIVER_STATE_START; i--)
if (drv->states[i].play_dead)
I guess you meant drv->states[i].enter_dead
return drv->states[i].enter_dead(dev, i);
-- Francesco
On 11/18/2012 09:40 AM, Francesco Lavra wrote:
Hi,
On 11/12/2012 09:26 PM, Daniel Lezcano wrote:
This patch follows the discussion about reinitializing the power usage when a C-state is added/removed.
https://lkml.org/lkml/2012/10/16/518
We realized the power usage field is never filled and when it is filled for tegra, the power_specified flag is not set making all these values to be resetted when the driver is initialized with the set_power_state function.
Julius and I feel this is over-engineered and the power_specified flag could be simply removed and continue assuming the states are backward sorted.
The menu governor select function is simplified as the power is ordered. Actually the condition is always true with the current code.
The cpuidle_play_dead function is also simplified by doing a reverse lookup on the array.
The set_power_states function is removed as it does no make sense anymore.
Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org
drivers/cpuidle/cpuidle.c | 17 ++++------------- drivers/cpuidle/driver.c | 25 ------------------------- drivers/cpuidle/governors/menu.c | 8 ++------ include/linux/cpuidle.h | 2 +- 4 files changed, 7 insertions(+), 45 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 711dd83..f983262 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -69,24 +69,15 @@ int cpuidle_play_dead(void) { struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices); struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
- int i, dead_state = -1;
- int power_usage = -1;
- int i;
if (!drv) return -ENODEV; /* Find lowest-power state that supports long-term idle */
- for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
struct cpuidle_state *s = &drv->states[i];
if (s->power_usage < power_usage && s->enter_dead) {
power_usage = s->power_usage;
dead_state = i;
}
- }
- if (dead_state != -1)
return drv->states[dead_state].enter_dead(dev, dead_state);
- for (i = drv->state_count; i >= CPUIDLE_DRIVER_STATE_START; i--)
if (drv->states[i].play_dead)
I guess you meant drv->states[i].enter_dead
Yep :)
Hi,
What is the current status of this? Daniel, do you think you have got enough feedback to submit a definitive patch for this? Rafael, would you approve of such a change?
The bug with dynamically added C-states that is tied to this still hurts the battery life for some users across all distros every day, so I think it would be valuable to get a consistent solution into the mainline soon before everyone has to roll their own.
On 11/12/2012 09:26 PM, Daniel Lezcano wrote:
This patch follows the discussion about reinitializing the power usage when a C-state is added/removed.
https://lkml.org/lkml/2012/10/16/518
We realized the power usage field is never filled and when it is filled for tegra, the power_specified flag is not set making all these values to be resetted when the driver is initialized with the set_power_state function.
Julius and I feel this is over-engineered and the power_specified flag could be simply removed and continue assuming the states are backward sorted.
The menu governor select function is simplified as the power is ordered. Actually the condition is always true with the current code.
The cpuidle_play_dead function is also simplified by doing a reverse lookup on the array.
The set_power_states function is removed as it does no make sense anymore.
Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org
drivers/cpuidle/cpuidle.c | 17 ++++------------- drivers/cpuidle/driver.c | 25 ------------------------- drivers/cpuidle/governors/menu.c | 8 ++------ include/linux/cpuidle.h | 2 +- 4 files changed, 7 insertions(+), 45 deletions(-)
On Monday, December 10, 2012 11:09:58 AM Julius Werner wrote:
Hi,
What is the current status of this? Daniel, do you think you have got enough feedback to submit a definitive patch for this? Rafael, would you approve of such a change?
I need to talk to Len about that before I give you a reliable answer.
Thanks, Rafael
The bug with dynamically added C-states that is tied to this still hurts the battery life for some users across all distros every day, so I think it would be valuable to get a consistent solution into the mainline soon before everyone has to roll their own.
On 11/12/2012 09:26 PM, Daniel Lezcano wrote:
This patch follows the discussion about reinitializing the power usage when a C-state is added/removed.
https://lkml.org/lkml/2012/10/16/518
We realized the power usage field is never filled and when it is filled for tegra, the power_specified flag is not set making all these values to be resetted when the driver is initialized with the set_power_state function.
Julius and I feel this is over-engineered and the power_specified flag could be simply removed and continue assuming the states are backward sorted.
The menu governor select function is simplified as the power is ordered. Actually the condition is always true with the current code.
The cpuidle_play_dead function is also simplified by doing a reverse lookup on the array.
The set_power_states function is removed as it does no make sense anymore.
Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org
drivers/cpuidle/cpuidle.c | 17 ++++------------- drivers/cpuidle/driver.c | 25 ------------------------- drivers/cpuidle/governors/menu.c | 8 ++------ include/linux/cpuidle.h | 2 +- 4 files changed, 7 insertions(+), 45 deletions(-)
On 12/10/2012 08:09 PM, Julius Werner wrote:
Hi,
What is the current status of this? Daniel, do you think you have got enough feedback to submit a definitive patch for this?
Yes, I have a definitive patch. I will resend it tomorrow.
Thanks -- Daniel