RE: [PATCH] OMAP CPUIDLE: CPU Idle latency measurement

2 Sep 2010


-----Original Message-----
From: linaro-dev-bounces@lists.linaro.org [mailto:linaro-dev-bounces@lists.linaro.org] On Behalf Of Amit Kucheria
Sent: Thursday, September 02, 2010 1:26 PM
To: Kevin Hilman
Cc: linaro-dev@lists.linaro.org; linux-omap@vger.kernel.org
Subject: Re: [PATCH] OMAP CPUIDLE: CPU Idle latency measurement
On 10 Aug 27, Kevin Hilman wrote:
...
vishwanath.sripathy@linaro.org writes:
...
From: Vishwanath BS <vishwanath.sripathy@linaro.org>
This patch adds instrumentation code for measuring the latencies of
the various CPUIdle C-states on OMAP. The idea is to capture a
timestamp at various phases of CPU idle and then compute the SW
latency for each C-state. For OMAP, the 32k clock is chosen as the
reference clock, as it is an always-on clock. Wakeup-domain memory
(scratchpad memory) is used for storing the timestamps. The
worst-case latencies can be seen in the sysfs entries below (after
enabling CONFIG_CPU_IDLE_PROF in .config). This information can be
used to correctly configure the CPUIdle latencies for the various
C-states, after adding the HW latency to each of these SW latencies.
/sys/devices/system/cpu/cpu0/cpuidle/state<n>/actual_latency
/sys/devices/system/cpu/cpu0/cpuidle/state<n>/sleep_latency
/sys/devices/system/cpu/cpu0/cpuidle/state<n>/wkup_latency
This patch was tested on OMAP ZOOM3 using Kevin's pm branch.
Signed-off-by: Vishwanath BS <vishwanath.sripathy@linaro.org>
Cc: linaro-dev@lists.linaro.org
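The arithmetic behind the patch description above can be sketched as follows. This is only an illustration, not code from the patch: it assumes the 32k clock runs at 32768 Hz and is read through a 32-bit free-running counter, and the function names are hypothetical.

```python
# Sketch: turn 32k-clock counter reads into software latencies.
# Assumptions (not taken from the patch): 32768 Hz clock rate, 32-bit
# free-running counter, and the hypothetical helper names used here.

CLOCK_HZ = 32_768          # assumed rate of the always-on 32k clock
COUNTER_MASK = 0xFFFFFFFF  # assumed 32-bit counter width


def ticks_elapsed(start: int, end: int) -> int:
    """Ticks between two counter reads, tolerating a single wraparound."""
    return (end - start) & COUNTER_MASK


def ticks_to_us(ticks: int) -> int:
    """Convert ticks to whole microseconds, rounding up (worst case)."""
    return (ticks * 1_000_000 + CLOCK_HZ - 1) // CLOCK_HZ


def latency_us(start: int, end: int) -> int:
    """Latency in microseconds between two timestamps of one idle phase."""
    return ticks_to_us(ticks_elapsed(start, end))
```

Under these assumptions one tick is roughly 30.5 us, so any latency derived this way cannot be resolved more finely than that.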
While I have many problems with the implementation details, I won't go
into them because in general this is the wrong direction for kernel
instrumentation.
This approach adds quite a bit of overhead to the idle path itself,
with all the reads/writes from/to the scratchpad(?) and all the
multiplications and divides in every idle path, as well as the
wait-for-idlest in both the sleep and resume paths.  The additional
overhead is non-trivial.
...
Basically, I'd like to get away from custom instrumentation and measurement
coded inside the kernel itself.  This kind of code never stops growing
and morphing into ugliness, and rarely scales well when new SoCs are
added.
With ftrace/perf, we can add tracepoints at specific points and use
external tools to extract and analyze the delays, latencies, etc.
The point is to keep the minimum possible in the kernel: just the
tracepoints we're interested in.   The rest (calculations, averages,
analysis, etc.) does not need to be in the kernel and can be done more
easily and with more powerful tools outside the kernel.
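The split described above, with only tracepoints in the kernel and the calculations done outside it, could look roughly like this on the analysis side. It is a minimal sketch: the event tuple layout and the phase names are hypothetical and do not correspond to any existing tracepoint or trace file format.

```python
from collections import defaultdict


def summarize_latencies(events):
    """Pair begin/end markers and report per-(state, phase) latency stats.

    events: iterable of (timestamp_us, state, phase, edge) tuples in time
    order, where edge is "begin" or "end". This layout is hypothetical.
    """
    started = {}                 # (state, phase) -> begin timestamp
    samples = defaultdict(list)  # (state, phase) -> measured durations

    for ts, state, phase, edge in events:
        key = (state, phase)
        if edge == "begin":
            started[key] = ts
        elif edge == "end" and key in started:
            samples[key].append(ts - started.pop(key))

    # Worst case and average are computed here, outside the kernel.
    return {
        key: {"count": len(d), "max_us": max(d), "avg_us": sum(d) / len(d)}
        for key, d in samples.items()
    }
```

An "end" marker without a matching "begin" is ignored, so a trace that starts in the middle of an idle cycle does not skew the statistics.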
Kevin,
I agree. We discussed this a little in our weekly meeting. Vishwa's main
concern was the lack of ability to instrument the last bit of SRAM code.
I have a feeling that even with caches on when we enter this code, we
won't see too much variance in the latency to execute this bit of code.
Vishwa is going to confirm that now. If that hypothesis is true, we can
indeed move to using tracepoints in the idle path and use external tools
to track latency.
There will be a difference with and without caches, but the latency delta will be constant both with and without caches. Another important point is
that the lowest-level code should be profiled just once, at the worst-case CPU/bus clock speed.
...
Even if it isn't true, the rest of the idle path could still contain
tracepoints.
I also think this would be a better approach when considering a generic solution.
Regards,
Santosh
