Commit 8e92b6605d introduced the TIME_VALID flag for the C1 state if it is an mwait state, assuming interrupts will be enabled before the end time of the idle state is read.
The changelog of this commit mentions a potential problem with the menu governor, but not a real observation, and I assume it described old code, as the commit is from 2008.
I have been digging through the code and did not find any place where interrupts are enabled before the time is read. Moreover, with the changes made in the meantime, both the time measurement and the interrupt enabling moved into the cpuidle core, which makes sure, in a single place, that the time is measured before interrupts are enabled again.
Remove this test, as the time measurement is always valid for this state.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/acpi/processor_idle.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 380b4b4..7afba40 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -985,8 +985,6 @@ static int acpi_processor_setup_cpuidle_states(struct acpi_processor *pr)
 		state->flags = 0;
 		switch (cx->type) {
 			case ACPI_STATE_C1:
-			if (cx->entry_method != ACPI_CSTATE_FFH)
-				state->flags |= CPUIDLE_FLAG_TIME_INVALID;

 			state->enter = acpi_idle_enter_c1;
 			state->enter_dead = acpi_idle_play_dead;
The time measurement is always valid for all the drivers for all the idle states. This flag is no longer needed.
Remove it as well as the code using it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/cpuidle/governors/ladder.c |  8 ++------
 drivers/cpuidle/governors/menu.c   | 25 ++++++++-----------------
 include/linux/cpuidle.h            |  2 --
 3 files changed, 10 insertions(+), 25 deletions(-)

diff --git a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c
index 37263d9..55a1fc9 100644
--- a/drivers/cpuidle/governors/ladder.c
+++ b/drivers/cpuidle/governors/ladder.c
@@ -79,12 +79,8 @@ static int ladder_select_state(struct cpuidle_driver *drv,

 	last_state = &ldev->states[last_idx];

-	if (!(drv->states[last_idx].flags & CPUIDLE_FLAG_TIME_INVALID)) {
-		last_residency = cpuidle_get_last_residency(dev) - \
-					 drv->states[last_idx].exit_latency;
-	}
-	else
-		last_residency = last_state->threshold.promotion_time + 1;
+	last_residency = cpuidle_get_last_residency(dev) - \
+			 drv->states[last_idx].exit_latency;

 	/* consider promotion */
 	if (last_idx < drv->state_count - 1 &&
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 659d7b0..2191ea0 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -395,32 +395,23 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 	 * Try to figure out how much time passed between entry to low
 	 * power state and occurrence of the wakeup event.
 	 *
-	 * If the entered idle state didn't support residency measurements,
-	 * we are basically lost in the dark how much time passed.
-	 * As a compromise, assume we slept for the whole expected time.
-	 *
 	 * Any measured amount of time will include the exit latency.
 	 * Since we are interested in when the wakeup begun, not when it
 	 * was completed, we must subtract the exit latency. However, if
 	 * the measured amount of time is less than the exit latency,
 	 * assume the state was never reached and the exit latency is 0.
 	 */
-	if (unlikely(target->flags & CPUIDLE_FLAG_TIME_INVALID)) {
-		/* Use timer value as is */
-		measured_us = data->next_timer_us;

-	} else {
-		/* Use measured value */
-		measured_us = cpuidle_get_last_residency(dev);
+	/* Use measured value */
+	measured_us = cpuidle_get_last_residency(dev);

-		/* Deduct exit latency */
-		if (measured_us > target->exit_latency)
-			measured_us -= target->exit_latency;
+	/* Deduct exit latency */
+	if (measured_us > target->exit_latency)
+		measured_us -= target->exit_latency;

-		/* Make sure our coefficients do not exceed unity */
-		if (measured_us > data->next_timer_us)
-			measured_us = data->next_timer_us;
-	}
+	/* Make sure our coefficients do not exceed unity */
+	if (measured_us > data->next_timer_us)
+		measured_us = data->next_timer_us;

 	/* Update our correction ratio */
 	new_factor = data->correction_factor[data->bucket];
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index a07e087..d697b9e 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -53,7 +53,6 @@ struct cpuidle_state {
 };

 /* Idle State Flags */
-#define CPUIDLE_FLAG_TIME_INVALID	(0x01) /* is residency time measurable? */
 #define CPUIDLE_FLAG_COUPLED	(0x02) /* state applies to multiple cpus */
 #define CPUIDLE_FLAG_TIMER_STOP (0x04) /* timer is stopped on this state */

@@ -90,7 +89,6 @@ DECLARE_PER_CPU(struct cpuidle_device, cpuidle_dev);
  * cpuidle_get_last_residency - retrieves the last state's residency time
  * @dev: the target CPU
  *
- * NOTE: this value is invalid if CPUIDLE_FLAG_TIME_INVALID is set
  */
 static inline int cpuidle_get_last_residency(struct cpuidle_device *dev)
 {
On Friday, November 21, 2014 10:29:51 AM Daniel Lezcano wrote:
> Commit 8e92b6605d introduced the TIME_VALID flag for the C1 state if it is an mwait state, assuming interrupts will be enabled before the end time of the idle state is read.
> The changelog of this commit mentions a potential problem with the menu governor, but not a real observation, and I assume it described old code, as the commit is from 2008.
> I have been digging through the code and did not find any place where interrupts are enabled before the time is read. Moreover, with the changes made in the meantime, both the time measurement and the interrupt enabling moved into the cpuidle core, which makes sure, in a single place, that the time is measured before interrupts are enabled again.
> Remove this test, as the time measurement is always valid for this state.
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Well, I need Len to have a look at this.
On 11/21/2014 04:05 PM, Rafael J. Wysocki wrote:
> On Friday, November 21, 2014 10:29:51 AM Daniel Lezcano wrote:
>> Commit 8e92b6605d introduced the TIME_VALID flag for the C1 state if it is an mwait state, assuming interrupts will be enabled before the end time of the idle state is read.
>> The changelog of this commit mentions a potential problem with the menu governor, but not a real observation, and I assume it described old code, as the commit is from 2008.
>> I have been digging through the code and did not find any place where interrupts are enabled before the time is read. Moreover, with the changes made in the meantime, both the time measurement and the interrupt enabling moved into the cpuidle core, which makes sure, in a single place, that the time is measured before interrupts are enabled again.
>> Remove this test, as the time measurement is always valid for this state.
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Well, I need Len to have a look at this.
Ok thanks.
If you have time, could you also have a look at the patchset I sent:
[PATCH V3 0/6] sched: idle: cpuidle: cleanups and fixes
and give your opinion about the poll state Peter and I were discussing?
Regards
-- Daniel
On 11/21/2014 04:01 PM, Daniel Lezcano wrote:
> On 11/21/2014 04:05 PM, Rafael J. Wysocki wrote:
>> On Friday, November 21, 2014 10:29:51 AM Daniel Lezcano wrote:
>>> Commit 8e92b6605d introduced the TIME_VALID flag for the C1 state if it is an mwait state, assuming interrupts will be enabled before the end time of the idle state is read.
>>> The changelog of this commit mentions a potential problem with the menu governor, but not a real observation, and I assume it described old code, as the commit is from 2008.
>>> I have been digging through the code and did not find any place where interrupts are enabled before the time is read. Moreover, with the changes made in the meantime, both the time measurement and the interrupt enabling moved into the cpuidle core, which makes sure, in a single place, that the time is measured before interrupts are enabled again.
>>> Remove this test, as the time measurement is always valid for this state.
>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> Well, I need Len to have a look at this.
Hi,
Any news on that?
Thanks
-- Daniel
> The commit 8e92b6605d introduced the TIME_VALID flag for the C1 state if this one is a mwait state assuming the interrupt will be enabled before reading the end time of the idle state.
> ...
> I have been digging through the code and I didn't find any place where the interrupts are enabled before reading the time.
Linux is correct as it stands, and the patch proposed here is not correct.
Note that on x86, the "STI" instruction enables interrupts:
static inline void native_safe_halt(void)
{
	asm volatile("sti; hlt": : :"memory");
}
We get here via acpi_safe_halt(), which is invoked with interrupts disabled via the cpuidle enter path. As it needs to return with interrupts disabled, it hacks them off again, but not before the actual interrupt is serviced -- which is what throws off the time-stamps in this path, and why the CPUIDLE_FLAG_TIME_INVALID exists.
/*
 * Callers should disable interrupts before the call and enable
 * interrupts after return.
 */
static void acpi_safe_halt(void)
{
	if (!tif_need_resched()) {
		safe_halt();
		local_irq_disable();
	}
}
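To make the effect on the time-stamps concrete, here is a minimal user-space sketch of one idle period. It is not kernel code: the function name, its parameters and the microsecond timeline are all made up for illustration. It only models the ordering described above, where the interrupt handler runs before the exit time-stamp can be taken.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of one idle period (hypothetical, user-space only).
 *
 * entry_us:         time-stamp taken just before entering the idle state
 * wake_us:          moment the interrupt wakes the CPU
 * irq_service_us:   time spent servicing that interrupt
 * irq_before_stamp: non-zero if the handler runs before the exit
 *                   time-stamp is read (the "sti; hlt" C1 path)
 */
static uint64_t model_measured_residency_us(uint64_t entry_us,
					     uint64_t wake_us,
					     uint64_t irq_service_us,
					     int irq_before_stamp)
{
	uint64_t exit_stamp_us = wake_us;

	/* The wakeup cannot happen before the idle state was entered. */
	assert(wake_us >= entry_us);

	/* With "sti; hlt" the handler runs before the clock is read. */
	if (irq_before_stamp)
		exit_stamp_us += irq_service_us;

	return exit_stamp_us - entry_us;
}
```

With an entry at 1000 us, a wakeup at 1500 us and a 200 us handler, the model returns 500 us when the time-stamp is read first and 700 us when the handler runs first. The difference is the interrupt service time that ends up counted as residency.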
That said... I think if ladder and menu were more clever, we could delete CPUIDLE_FLAG_TIME_INVALID... Today, we use this flag to set our last_residency to the amount of time until the predicted next timer expires. There are only two cases:
First, we could have taken and serviced an interrupt -- all before the timer would have expired. In this case, we are rounding up last_residency to the max, the duration till the next timer would have expired.
Second, we could take the timer, and we could service it for a long time, and we'd return a time that is longer than the expected time. So here we are truncating down to the expected duration till the next timer.
But in case #1, our "invalid" measurement is actually more accurate than what we are rounding up to. And in case #2, we don't need a flag to detect it -- indeed menu already checks for that case:
	/* Make sure our coefficients do not exceed unity */
	if (measured_us > data->next_timer_us)
		measured_us = data->next_timer_us;
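The two cases can be sketched in user space as follows. This is only an illustration under stated assumptions: model_measured_us() and check_two_cases() are hypothetical names, and the logic simply mirrors the exit-latency deduction and the clamp that menu already performs, applied unconditionally, with no flag involved.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the menu governor arithmetic:
 * deduct the exit latency, then clamp to the predicted timer distance.
 */
static uint32_t model_measured_us(uint32_t last_residency_us,
				  uint32_t exit_latency_us,
				  uint32_t next_timer_us)
{
	uint32_t measured_us = last_residency_us;

	/* Deduct exit latency, unless the state was never really reached */
	if (measured_us > exit_latency_us)
		measured_us -= exit_latency_us;

	/* Case #2: never report more than the time until the next timer */
	if (measured_us > next_timer_us)
		measured_us = next_timer_us;

	return measured_us;
}

/* Usage example covering both cases; all numbers are made up. */
static void check_two_cases(void)
{
	/* Case #1: early interrupt, measurement kept instead of 1000 */
	assert(model_measured_us(300, 10, 1000) == 290);

	/* Case #2: timer plus long service time, clamped to 1000 */
	assert(model_measured_us(1400, 10, 1000) == 1000);
}
```

In case #1 the measured 290 us is closer to what happened than the 1000 us the flag would have substituted, and in case #2 the existing clamp already produces the same result as the flag.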
cheers, Len Brown, Intel Open Source Technology Center
linaro-kernel@lists.linaro.org