This patch series improves timestamp handling in per-thread mode.
The current code doesn't validate the timestamp config and always returns success in per-thread mode. For a sane implementation, the first patch allows timestamp tracing to be validated in per-thread mode.
The second patch respects the timestamp option "--timestamp" or "-T": when users set this option, the tool automatically enables hardware timestamp tracing in Arm CoreSight.
Leo Yan (2):
  perf cs-etm: Validate timestamp tracing in per-thread mode
  perf cs-etm: Respect timestamp option

 tools/perf/arch/arm/util/cs-etm.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
So far, it has been impossible to validate timestamp tracing in Arm CoreSight when perf runs in per-thread mode. E.g. for the command:
  perf record -e cs_etm/timestamp/ --per-thread -- ls
The command enables the 'timestamp' config for the 'cs_etm' event in per-thread mode. In this case, the function cs_etm_validate_config() bails out directly and skips validation.
Given that the profiled process can be scheduled on any CPU in per-thread mode, this patch validates timestamp tracing for all CPUs when it detects that the CPU map is empty.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/arch/arm/util/cs-etm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index b8d6a953fd74..cf9ef9ba800b 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -205,8 +205,17 @@ static int cs_etm_validate_config(struct auxtrace_record *itr,
 	for (i = 0; i < cpu__max_cpu().cpu; i++) {
 		struct perf_cpu cpu = { .cpu = i, };

-		if (!perf_cpu_map__has(event_cpus, cpu) ||
-		    !perf_cpu_map__has(online_cpus, cpu))
+		/*
+		 * In per-cpu case, do the validation for CPUs to work with.
+		 * In per-thread case, the CPU map is empty.  Since the traced
+		 * program can run on any CPUs in this case, thus don't skip
+		 * validation.
+		 */
+		if (!perf_cpu_map__empty(event_cpus) &&
+		    !perf_cpu_map__has(event_cpus, cpu))
+			continue;
+
+		if (!perf_cpu_map__has(online_cpus, cpu))
 			continue;

 		err = cs_etm_validate_context_id(itr, evsel, i);
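The skip logic in the hunk above can be illustrated outside of perf with a small model. This is only a sketch: the 64-bit mask type and the helper names mask_has() and should_validate() are hypothetical stand-ins for the perf CPU map API, and an all-zero event mask plays the role of the empty CPU map seen in per-thread mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a perf CPU map: bit N set means CPU N is present. */
typedef uint64_t cpu_mask_t;

static bool mask_has(cpu_mask_t mask, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (mask >> cpu) & 1u;
}

/*
 * Mirrors the two checks in the patched loop:
 *  - a non-empty event map (per-cpu mode) limits validation to its CPUs;
 *  - an empty event map (per-thread mode) limits nothing;
 *  - offline CPUs are skipped in both modes.
 */
static bool should_validate(cpu_mask_t event_cpus, cpu_mask_t online_cpus,
			    int cpu)
{
	if (event_cpus != 0 && !mask_has(event_cpus, cpu))
		return false;

	return mask_has(online_cpus, cpu);
}
```

With CPUs 0-3 online, should_validate(0, 0x0F, 2) is true because the empty event map no longer short-circuits the check, while should_validate(0x3, 0x0F, 2) is false because a per-cpu session on CPUs 0-1 does not cover CPU 2.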
On 27/08/2023 14:35, Leo Yan wrote:
> So far, it has been impossible to validate timestamp tracing in Arm CoreSight when perf runs in per-thread mode. E.g. for the command:
>   perf record -e cs_etm/timestamp/ --per-thread -- ls
> The command enables the 'timestamp' config for the 'cs_etm' event in per-thread mode. In this case, the function cs_etm_validate_config() bails out directly and skips validation.
> Given that the profiled process can be scheduled on any CPU in per-thread mode, this patch validates timestamp tracing for all CPUs when it detects that the CPU map is empty.
There is an edge case where the profiled process is known by the user to be pinned to a specific CPU, rather than possibly running on all CPUs, so this isn't always true.
But I think that can be worked around by changing it to a per-cpu session to get around the new error. Given that this validation was only supposed to be best-effort information and not get in the way, you could argue against making it more restrictive.
But it's quite a small edge case so either way:
Reviewed-by: James Clark <james.clark@arm.com>
On Mon, Sep 04, 2023 at 04:23:43PM +0100, James Clark wrote:
> On 27/08/2023 14:35, Leo Yan wrote:
>> So far, it has been impossible to validate timestamp tracing in Arm CoreSight when perf runs in per-thread mode. E.g. for the command:
>>   perf record -e cs_etm/timestamp/ --per-thread -- ls
>> The command enables the 'timestamp' config for the 'cs_etm' event in per-thread mode. In this case, the function cs_etm_validate_config() bails out directly and skips validation.
>> Given that the profiled process can be scheduled on any CPU in per-thread mode, this patch validates timestamp tracing for all CPUs when it detects that the CPU map is empty.
> There is an edge case where the profiled process is known by the user to be pinned to a specific CPU, rather than possibly running on all CPUs, so this isn't always true.
Good point.
However, when a process is pinned to specific CPUs, its scheduling affinity can still be changed dynamically to any other CPUs with the taskset command or by calling sched_setaffinity(). From a perf session's perspective, it is sane to validate timestamp tracing for all online CPUs in per-thread mode.
> But I think that can be worked around by changing it to a per-cpu session to get around the new error. Given that this validation was only supposed to be best-effort information and not get in the way, you could argue against making it more restrictive.
> But it's quite a small edge case so either way:
> Reviewed-by: James Clark <james.clark@arm.com>
Thanks for the review!
Leo
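The affinity argument in the reply above can be made concrete with a short Linux-only sketch. It assumes glibc's sched_getaffinity()/sched_setaffinity() wrappers and the CPU_* macros from <sched.h>; the helper names pin_to_first_allowed_cpu() and restore_affinity() are hypothetical and not part of perf. The thread narrows its own affinity to one CPU and later widens it again, which is the kind of runtime change a per-thread perf session cannot predict.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Pin the calling thread to the lowest-numbered CPU it is currently allowed
 * to run on. The previous mask is stored in *saved so it can be restored.
 * Returns the chosen CPU number, or -1 on failure.
 */
static int pin_to_first_allowed_cpu(cpu_set_t *saved)
{
	cpu_set_t one;
	int cpu;

	assert(saved != NULL);
	if (sched_getaffinity(0, sizeof(*saved), saved) != 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, saved))
			break;
	}
	if (cpu == CPU_SETSIZE)
		return -1;

	CPU_ZERO(&one);
	CPU_SET(cpu, &one);
	if (sched_setaffinity(0, sizeof(one), &one) != 0)
		return -1;

	return cpu;
}

/* Widen the affinity back to the saved mask while the thread keeps running. */
static int restore_affinity(const cpu_set_t *saved)
{
	assert(saved != NULL);
	return sched_setaffinity(0, sizeof(*saved), saved);
}
```

Calling pin_to_first_allowed_cpu() and then restore_affinity() shows that a process pinned when recording starts may run elsewhere moments later, so validating only the initially allowed CPUs would not be sufficient.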
When users pass the option '--timestamp' or '-T' to the record command, the PERF_SAMPLE_TIME bit is set in the attributes of all events. In this case, the AUX event records the kernel timestamp, but that doesn't mean Arm CoreSight enables timestamp packets in its hardware tracing.
If the option '--timestamp' or '-T' is set, this patch always enables the Arm CoreSight timestamp. As a result, bit 28 in the event's config is set.
Before:
  # perf record -e cs_etm// --per-thread --timestamp -- ls
  # perf script --header-only
  ...
  # event : name = cs_etm//, , id = { 69 }, type = 12, size = 136,
    config = 0, { sample_period, sample_freq } = 1,
    sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST,
    disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1
  ...
After:
  # perf record -e cs_etm// --per-thread --timestamp -- ls
  # perf script --header-only
  ...
  # event : name = cs_etm//, , id = { 49 }, type = 12, size = 136,
    config = 0x10000000, { sample_period, sample_freq } = 1,
    sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST,
    disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1
  ...
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/arch/arm/util/cs-etm.c | 9 +++++++++
 1 file changed, 9 insertions(+)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index cf9ef9ba800b..58c506e9788d 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -442,6 +442,15 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
 					   "contextid", 1);
 	}

+	/*
+	 * When the option '--timestamp' or '-T' is enabled, the PERF_SAMPLE_TIME
+	 * bit is set for all events.  In this case, always enable Arm CoreSight
+	 * timestamp tracing.
+	 */
+	if (opts->sample_time_set)
+		evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
+					   "timestamp", 1);
+
 	/* Add dummy event to keep tracking */
 	err = parse_event(evlist, "dummy:u");
 	if (err)
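The "set if unset" behaviour that the hunk above relies on, and the 0x10000000 value in the "After" output, can be modelled in a few lines. This is a simplified sketch and not the perf implementation: set_bit_if_unset() and the user_bits mask are hypothetical, and the bit position is hard-coded from the commit message, whereas the patch passes the term name "timestamp" and leaves the lookup to evsel__set_config_if_unset().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position taken from the commit message: 'timestamp' maps to bit 28. */
#define TIMESTAMP_BIT 28

/*
 * Simplified model of "set if unset": apply a default to one config bit only
 * when the user did not configure that bit explicitly.
 * user_bits is a hypothetical mask of the bits the user set by hand.
 */
static uint64_t set_bit_if_unset(uint64_t config, uint64_t user_bits,
				 unsigned int bit, bool val)
{
	uint64_t mask;

	assert(bit < 64);
	mask = UINT64_C(1) << bit;

	if (user_bits & mask)
		return config;	/* the user's explicit choice wins */

	return val ? (config | mask) : (config & ~mask);
}
```

Starting from config = 0 with no user-set bits, set_bit_if_unset(0, 0, TIMESTAMP_BIT, true) yields 0x10000000, which matches the "After" header. If the user had explicitly configured the timestamp bit, the config is returned unchanged.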
On 27/08/2023 14:35, Leo Yan wrote:
> When users pass the option '--timestamp' or '-T' to the record command, the PERF_SAMPLE_TIME bit is set in the attributes of all events. In this case, the AUX event records the kernel timestamp, but that doesn't mean Arm CoreSight enables timestamp packets in its hardware tracing.
> If the option '--timestamp' or '-T' is set, this patch always enables the Arm CoreSight timestamp. As a result, bit 28 in the event's config is set.
> Before:
>   # perf record -e cs_etm// --per-thread --timestamp -- ls
>   # perf script --header-only
>   ...
>   # event : name = cs_etm//, , id = { 69 }, type = 12, size = 136,
>     config = 0, { sample_period, sample_freq } = 1,
>     sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST,
>     disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1
>   ...
> After:
>   # perf record -e cs_etm// --per-thread --timestamp -- ls
>   # perf script --header-only
>   ...
>   # event : name = cs_etm//, , id = { 49 }, type = 12, size = 136,
>     config = 0x10000000, { sample_period, sample_freq } = 1,
>     sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST,
>     disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1
>   ...
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>