From: Andrew Kilroy andrew.kilroy@arm.com
Since the size is already printed earlier in hex, print the same data using the same format, in hex.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: Andrew Kilroy andrew.kilroy@arm.com Signed-off-by: German Gomez german.gomez@arm.com --- tools/perf/util/cs-etm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index f323adb1af85..4f672f7d008c 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -537,7 +537,7 @@ static void cs_etm__dump_event(struct cs_etm_queue *etmq,
fprintf(stdout, "\n"); color_fprintf(stdout, color, - ". ... CoreSight %s Trace data: size %zu bytes\n", + ". ... CoreSight %s Trace data: size %#zx bytes\n", cs_etm_decoder__get_name(etmq->decoder), buffer->size);
do {
From: Andrew Kilroy andrew.kilroy@arm.com
Since the size is already printed earlier in hex, print the same data using the same format, in hex.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: Andrew Kilroy andrew.kilroy@arm.com Signed-off-by: German Gomez german.gomez@arm.com --- tools/perf/util/arm-spe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 58b7069c5a5f..2196291976d9 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -100,7 +100,7 @@ static void arm_spe_dump(struct arm_spe *spe __maybe_unused, const char *color = PERF_COLOR_BLUE;
color_fprintf(stdout, color, - ". ... ARM SPE data: size %zu bytes\n", + ". ... ARM SPE data: size %#zx bytes\n", len);
while (len) {
On Thu, Sep 16, 2021 at 04:46:32PM +0100, German Gomez wrote:
From: Andrew Kilroy andrew.kilroy@arm.com
Since the size is already printed earlier in hex, print the same data using the same format, in hex.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: Andrew Kilroy andrew.kilroy@arm.com Signed-off-by: German Gomez german.gomez@arm.com
Reviewed-by: Leo Yan leo.yan@linaro.org
tools/perf/util/arm-spe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 58b7069c5a5f..2196291976d9 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -100,7 +100,7 @@ static void arm_spe_dump(struct arm_spe *spe __maybe_unused, const char *color = PERF_COLOR_BLUE; color_fprintf(stdout, color,
". ... ARM SPE data: size %zu bytes\n",
". ... ARM SPE data: size %#zx bytes\n", len);
while (len) { -- 2.17.1
This patch enabled support for snapshot mode of arm_spe events, including the implementation of the necessary callbacks (excluding find_snapshot, which is to be included in a followup commit).
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com --- tools/perf/arch/arm64/util/arm-spe.c | 130 +++++++++++++++++++++++++++ 1 file changed, 130 insertions(+)
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index a4420d4df503..f8b03d164b42 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -84,6 +84,55 @@ static int arm_spe_info_fill(struct auxtrace_record *itr, return 0; }
+static void +arm_spe_snapshot_resolve_auxtrace_defaults(struct record_opts *opts, + bool privileged) +{ + /* + * The default snapshot size is the auxtrace mmap size. If neither auxtrace mmap size nor + * snapshot size is specified, then the default is 4MiB for privileged users, 128KiB for + * unprivileged users. + * + * The default auxtrace mmap size is 4MiB/page_size for privileged users, 128KiB for + * unprivileged users. If an unprivileged user does not specify mmap pages, the mmap pages + * will be reduced from the default 512KiB/page_size to 256KiB/page_size, otherwise the + * user is likely to get an error as they exceed their mlock limmit. + */ + + /* + * No size were given to '-S' or '-m,', so go with the default + */ + if (!opts->auxtrace_snapshot_size && !opts->auxtrace_mmap_pages) { + if (privileged) { + opts->auxtrace_mmap_pages = MiB(4) / page_size; + } else { + opts->auxtrace_mmap_pages = KiB(128) / page_size; + if (opts->mmap_pages == UINT_MAX) + opts->mmap_pages = KiB(256) / page_size; + } + } else if (!opts->auxtrace_mmap_pages && !privileged && opts->mmap_pages == UINT_MAX) { + opts->mmap_pages = KiB(256) / page_size; + } + + /* + * '-m,xyz' was specified but no snapshot size, so make the snapshot size as big as the + * auxtrace mmap area. + */ + if (!opts->auxtrace_snapshot_size) + opts->auxtrace_snapshot_size = opts->auxtrace_mmap_pages * (size_t)page_size; + + /* + * '-Sxyz' was specified but no auxtrace mmap area, so make the auxtrace mmap area big + * enough to fit the requested snapshot size. + */ + if (!opts->auxtrace_mmap_pages) { + size_t sz = opts->auxtrace_snapshot_size; + + sz = round_up(sz, page_size) / page_size; + opts->auxtrace_mmap_pages = roundup_pow_of_two(sz); + } +} + static int arm_spe_recording_options(struct auxtrace_record *itr, struct evlist *evlist, struct record_opts *opts) @@ -115,6 +164,36 @@ static int arm_spe_recording_options(struct auxtrace_record *itr, if (!opts->full_auxtrace) return 0;
+ /* + * we are in snapshot mode. + */ + if (opts->auxtrace_snapshot_mode) { + /* + * Command arguments '-Sxyz' and/or '-m,xyz' are missing, so fill those in with + * default values. + */ + if (!opts->auxtrace_snapshot_size || !opts->auxtrace_mmap_pages) + arm_spe_snapshot_resolve_auxtrace_defaults(opts, privileged); + + /* + * Snapshot size can't be bigger than the auxtrace area. + */ + if (opts->auxtrace_snapshot_size > opts->auxtrace_mmap_pages * (size_t)page_size) { + pr_err("Snapshot size %zu must not be greater than AUX area tracing mmap size %zu\n", + opts->auxtrace_snapshot_size, + opts->auxtrace_mmap_pages * (size_t)page_size); + return -EINVAL; + } + + /* + * Something went wrong somewhere - this shouldn't happen. + */ + if (!opts->auxtrace_snapshot_size || !opts->auxtrace_mmap_pages) { + pr_err("Failed to calculate default snapshot size and/or AUX area tracing mmap pages\n"); + return -EINVAL; + } + } + /* We are in full trace mode but '-m,xyz' wasn't specified */ if (!opts->auxtrace_mmap_pages) { if (privileged) { @@ -138,6 +217,9 @@ static int arm_spe_recording_options(struct auxtrace_record *itr, } }
+ if (opts->auxtrace_snapshot_mode) + pr_debug2("%sx snapshot size: %zu\n", ARM_SPE_PMU_NAME, + opts->auxtrace_snapshot_size);
/* * To obtain the auxtrace buffer file descriptor, the auxtrace event @@ -172,6 +254,51 @@ static int arm_spe_recording_options(struct auxtrace_record *itr, return 0; }
+static int arm_spe_parse_snapshot_options(struct auxtrace_record *itr __maybe_unused, + struct record_opts *opts, + const char *str) +{ + unsigned long long snapshot_size = 0; + char *endptr; + + if (str) { + snapshot_size = strtoull(str, &endptr, 0); + if (*endptr || snapshot_size > SIZE_MAX) + return -1; + } + + opts->auxtrace_snapshot_mode = true; + opts->auxtrace_snapshot_size = snapshot_size; + + return 0; +} + +static int arm_spe_snapshot_start(struct auxtrace_record *itr) +{ + struct arm_spe_recording *ptr = + container_of(itr, struct arm_spe_recording, itr); + struct evsel *evsel; + + evlist__for_each_entry(ptr->evlist, evsel) { + if (evsel->core.attr.type == ptr->arm_spe_pmu->type) + return evsel__disable(evsel); + } + return -EINVAL; +} + +static int arm_spe_snapshot_finish(struct auxtrace_record *itr) +{ + struct arm_spe_recording *ptr = + container_of(itr, struct arm_spe_recording, itr); + struct evsel *evsel; + + evlist__for_each_entry(ptr->evlist, evsel) { + if (evsel->core.attr.type == ptr->arm_spe_pmu->type) + return evsel__enable(evsel); + } + return -EINVAL; +} + static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused) { struct timespec ts; @@ -207,6 +334,9 @@ struct auxtrace_record *arm_spe_recording_init(int *err,
sper->arm_spe_pmu = arm_spe_pmu; sper->itr.pmu = arm_spe_pmu; + sper->itr.snapshot_start = arm_spe_snapshot_start; + sper->itr.snapshot_finish = arm_spe_snapshot_finish; + sper->itr.parse_snapshot_options = arm_spe_parse_snapshot_options; sper->itr.recording_options = arm_spe_recording_options; sper->itr.info_priv_size = arm_spe_info_priv_size; sper->itr.info_fill = arm_spe_info_fill;
On Thu, Sep 16, 2021 at 04:46:33PM +0100, German Gomez wrote:
This patch enabled support for snapshot mode of arm_spe events, including the implementation of the necessary callbacks (excluding find_snapshot, which is to be included in a followup commit).
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com
Reviewed-by: Leo Yan leo.yan@linaro.org Tested-by: Leo Yan leo.yan@linaro.org
The head pointer of the AUX buffer managed by the arm_spe_pmu.c driver is not monotonically increasing, therefore the find_snapshot callback is needed in order to find the trace data within the AUX buffer and avoid wasting space in the perf.data file.
The pointer is assumed to have wrapped if the buffer contains non-zero data at the end. If it has wrapped, the entire contents of the AUX buffer are stored in the perf.data file. Otherwise only the data up to the head pointer is stored.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com --- tools/perf/arch/arm64/util/arm-spe.c | 145 +++++++++++++++++++++++++++ 1 file changed, 145 insertions(+)
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index f8b03d164b42..56785034fc84 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -23,6 +23,7 @@ #include "../../../util/auxtrace.h" #include "../../../util/record.h" #include "../../../util/arm-spe.h" +#include <tools/libc_compat.h> // reallocarray
#define KiB(x) ((x) * 1024) #define MiB(x) ((x) * 1024 * 1024) @@ -31,6 +32,8 @@ struct arm_spe_recording { struct auxtrace_record itr; struct perf_pmu *arm_spe_pmu; struct evlist *evlist; + int wrapped_cnt; + bool *wrapped; };
static void arm_spe_set_timestamp(struct auxtrace_record *itr, @@ -299,6 +302,146 @@ static int arm_spe_snapshot_finish(struct auxtrace_record *itr) return -EINVAL; }
+static int arm_spe_alloc_wrapped_array(struct arm_spe_recording *ptr, int idx) +{ + bool *wrapped; + int cnt = ptr->wrapped_cnt, new_cnt, i; + + /* + * No need to allocate, so return early. + */ + if (idx < cnt) + return 0; + + /* + * Make ptr->wrapped as big as idx. + */ + new_cnt = idx + 1; + + /* + * Free'ed in arm_spe_recording_free(). + */ + wrapped = reallocarray(ptr->wrapped, new_cnt, sizeof(bool)); + if (!wrapped) + return -ENOMEM; + + /* + * init new allocated values. + */ + for (i = cnt; i < new_cnt; i++) + wrapped[i] = false; + + ptr->wrapped_cnt = new_cnt; + ptr->wrapped = wrapped; + + return 0; +} + +static bool arm_spe_buffer_has_wrapped(unsigned char *buffer, + size_t buffer_size, u64 head) +{ + u64 i, watermark; + u64 *buf = (u64 *)buffer; + size_t buf_size = buffer_size; + + /* + * Defensively handle the case where head might be continually increasing - if its value is + * equal or greater than the size of the ring buffer, then we can safely determine it has + * wrapped around. Otherwise, continue to detect if head might have wrapped. + */ + if (head >= buffer_size) + return true; + + /* + * We want to look the very last 512 byte (chosen arbitrarily) in the ring buffer. + */ + watermark = buf_size - 512; + + /* + * The value of head is somewhere within the size of the ring buffer. This can be that there + * hasn't been enough data to fill the ring buffer yet or the trace time was so long that + * head has numerically wrapped around. To find we need to check if we have data at the + * very end of the ring buffer. We can reliably do this because mmap'ed pages are zeroed + * out and there is a fresh mapping with every new session. + */ + + /* + * head is less than 512 byte from the end of the ring buffer. + */ + if (head > watermark) + watermark = head; + + /* + * Speed things up by using 64 bit transactions (see "u64 *buf" above) + */ + watermark /= sizeof(u64); + buf_size /= sizeof(u64); + + /* + * If we find trace data at the end of the ring buffer, head has been there and has + * numerically wrapped around at least once. + */ + for (i = watermark; i < buf_size; i++) + if (buf[i]) + return true; + + return false; +} + +static int arm_spe_find_snapshot(struct auxtrace_record *itr, int idx, + struct auxtrace_mmap *mm, unsigned char *data, + u64 *head, u64 *old) +{ + int err; + bool wrapped; + struct arm_spe_recording *ptr = + container_of(itr, struct arm_spe_recording, itr); + + /* + * Allocate memory to keep track of wrapping if this is the first + * time we deal with this *mm. + */ + if (idx >= ptr->wrapped_cnt) { + err = arm_spe_alloc_wrapped_array(ptr, idx); + if (err) + return err; + } + + /* + * Check to see if *head has wrapped around. If it hasn't only the + * amount of data between *head and *old is snapshot'ed to avoid + * bloating the perf.data file with zeros. But as soon as *head has + * wrapped around the entire size of the AUX ring buffer it taken. + */ + wrapped = ptr->wrapped[idx]; + if (!wrapped && arm_spe_buffer_has_wrapped(data, mm->len, *head)) { + wrapped = true; + ptr->wrapped[idx] = true; + } + + pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n", + __func__, idx, (size_t)*old, (size_t)*head, mm->len); + + /* + * No wrap has occurred, we can just use *head and *old. + */ + if (!wrapped) + return 0; + + /* + * *head has wrapped around - adjust *head and *old to pickup the + * entire content of the AUX buffer. + */ + if (*head >= mm->len) { + *old = *head - mm->len; + } else { + *head += mm->len; + *old = *head - mm->len; + } + + return 0; +} + static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused) { struct timespec ts; @@ -313,6 +456,7 @@ static void arm_spe_recording_free(struct auxtrace_record *itr) struct arm_spe_recording *sper = container_of(itr, struct arm_spe_recording, itr);
+ free(sper->wrapped); free(sper); }
@@ -336,6 +480,7 @@ struct auxtrace_record *arm_spe_recording_init(int *err, sper->itr.pmu = arm_spe_pmu; sper->itr.snapshot_start = arm_spe_snapshot_start; sper->itr.snapshot_finish = arm_spe_snapshot_finish; + sper->itr.find_snapshot = arm_spe_find_snapshot; sper->itr.parse_snapshot_options = arm_spe_parse_snapshot_options; sper->itr.recording_options = arm_spe_recording_options; sper->itr.info_priv_size = arm_spe_info_priv_size;
Hi German,
On Thu, Sep 16, 2021 at 04:46:34PM +0100, German Gomez wrote:
The head pointer of the AUX buffer managed by the arm_spe_pmu.c driver is not monotonically increasing, therefore the find_snapshot callback is needed in order to find the trace data within the AUX buffer and avoid wasting space in the perf.data file.
The pointer is assumed to have wrapped if the buffer contains non-zero data at the end. If it has wrapped, the entire contents of the AUX buffer are stored in the perf.data file. Otherwise only the data up to the head pointer is stored.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com
tools/perf/arch/arm64/util/arm-spe.c | 145 +++++++++++++++++++++++++++ 1 file changed, 145 insertions(+)
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index f8b03d164b42..56785034fc84 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -23,6 +23,7 @@ #include "../../../util/auxtrace.h" #include "../../../util/record.h" #include "../../../util/arm-spe.h" +#include <tools/libc_compat.h> // reallocarray #define KiB(x) ((x) * 1024) #define MiB(x) ((x) * 1024 * 1024) @@ -31,6 +32,8 @@ struct arm_spe_recording { struct auxtrace_record itr; struct perf_pmu *arm_spe_pmu; struct evlist *evlist;
- int wrapped_cnt;
- bool *wrapped;
}; static void arm_spe_set_timestamp(struct auxtrace_record *itr, @@ -299,6 +302,146 @@ static int arm_spe_snapshot_finish(struct auxtrace_record *itr) return -EINVAL; } +static int arm_spe_alloc_wrapped_array(struct arm_spe_recording *ptr, int idx) +{
- bool *wrapped;
- int cnt = ptr->wrapped_cnt, new_cnt, i;
- /*
* No need to allocate, so return early.
*/
- if (idx < cnt)
return 0;
- /*
* Make ptr->wrapped as big as idx.
*/
- new_cnt = idx + 1;
- /*
* Free'ed in arm_spe_recording_free().
*/
- wrapped = reallocarray(ptr->wrapped, new_cnt, sizeof(bool));
- if (!wrapped)
return -ENOMEM;
- /*
* init new allocated values.
*/
- for (i = cnt; i < new_cnt; i++)
wrapped[i] = false;
- ptr->wrapped_cnt = new_cnt;
- ptr->wrapped = wrapped;
- return 0;
+}
+static bool arm_spe_buffer_has_wrapped(unsigned char *buffer,
size_t buffer_size, u64 head)
+{
- u64 i, watermark;
- u64 *buf = (u64 *)buffer;
- size_t buf_size = buffer_size;
- /*
* Defensively handle the case where head might be continually increasing - if its value is
* equal or greater than the size of the ring buffer, then we can safely determine it has
* wrapped around. Otherwise, continue to detect if head might have wrapped.
*/
- if (head >= buffer_size)
return true;
- /*
* We want to look the very last 512 byte (chosen arbitrarily) in the ring buffer.
*/
- watermark = buf_size - 512;
- /*
* The value of head is somewhere within the size of the ring buffer. This can be that there
* hasn't been enough data to fill the ring buffer yet or the trace time was so long that
* head has numerically wrapped around. To find we need to check if we have data at the
* very end of the ring buffer. We can reliably do this because mmap'ed pages are zeroed
* out and there is a fresh mapping with every new session.
*/
- /*
* head is less than 512 byte from the end of the ring buffer.
*/
- if (head > watermark)
watermark = head;
- /*
* Speed things up by using 64 bit transactions (see "u64 *buf" above)
*/
- watermark /= sizeof(u64);
- buf_size /= sizeof(u64);
- /*
* If we find trace data at the end of the ring buffer, head has been there and has
* numerically wrapped around at least once.
*/
- for (i = watermark; i < buf_size; i++)
if (buf[i])
return true;
- return false;
+}
+static int arm_spe_find_snapshot(struct auxtrace_record *itr, int idx,
struct auxtrace_mmap *mm, unsigned char *data,
u64 *head, u64 *old)
+{
- int err;
- bool wrapped;
- struct arm_spe_recording *ptr =
container_of(itr, struct arm_spe_recording, itr);
- /*
* Allocate memory to keep track of wrapping if this is the first
* time we deal with this *mm.
*/
- if (idx >= ptr->wrapped_cnt) {
err = arm_spe_alloc_wrapped_array(ptr, idx);
if (err)
return err;
- }
- /*
* Check to see if *head has wrapped around. If it hasn't only the
* amount of data between *head and *old is snapshot'ed to avoid
* bloating the perf.data file with zeros. But as soon as *head has
* wrapped around the entire size of the AUX ring buffer it taken.
*/
- wrapped = ptr->wrapped[idx];
- if (!wrapped && arm_spe_buffer_has_wrapped(data, mm->len, *head)) {
wrapped = true;
ptr->wrapped[idx] = true;
- }
- pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n",
__func__, idx, (size_t)*old, (size_t)*head, mm->len);
- /*
* No wrap has occurred, we can just use *head and *old.
*/
- if (!wrapped)
return 0;
- /*
* *head has wrapped around - adjust *head and *old to pickup the
* entire content of the AUX buffer.
*/
- if (*head >= mm->len) {
*old = *head - mm->len;
- } else {
*head += mm->len;
*old = *head - mm->len;
- }
- return 0;
+}
static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused) { struct timespec ts; @@ -313,6 +456,7 @@ static void arm_spe_recording_free(struct auxtrace_record *itr) struct arm_spe_recording *sper = container_of(itr, struct arm_spe_recording, itr);
- free(sper->wrapped); free(sper);
} @@ -336,6 +480,7 @@ struct auxtrace_record *arm_spe_recording_init(int *err, sper->itr.pmu = arm_spe_pmu; sper->itr.snapshot_start = arm_spe_snapshot_start; sper->itr.snapshot_finish = arm_spe_snapshot_finish;
- sper->itr.find_snapshot = arm_spe_find_snapshot;
If I understand correctly, this patch copies the code from cs-etm for snapshot handling. About 2 months ago, we removed the Arm cs-etm's specific snapshot callback function and directly use perf's function __auxtrace_mmap__read() to handle 'head' and 'tail' pointers. Please see the commit for details: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Before I review more details for snapshot enabling in patches 03 and 04, could you confirm if Arm SPE can use the same way with cs-etm for snapshot handling? From my understanding, this is a better way to handle AUX buffer's 'head' and 'tail'.
Thanks, Leo
sper->itr.parse_snapshot_options = arm_spe_parse_snapshot_options; sper->itr.recording_options = arm_spe_recording_options; sper->itr.info_priv_size = arm_spe_info_priv_size; -- 2.17.1
On Thu, Sep 23, 2021 at 09:50:16PM +0800, Leo Yan wrote:
[...]
@@ -336,6 +480,7 @@ struct auxtrace_record *arm_spe_recording_init(int *err, sper->itr.pmu = arm_spe_pmu; sper->itr.snapshot_start = arm_spe_snapshot_start; sper->itr.snapshot_finish = arm_spe_snapshot_finish;
- sper->itr.find_snapshot = arm_spe_find_snapshot;
If I understand correctly, this patch copies the code from cs-etm for snapshot handling. About 2 months ago, we removed the Arm cs-etm's specific snapshot callback function and directly use perf's function __auxtrace_mmap__read() to handle 'head' and 'tail' pointers. Please see the commit for details: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Before I review more details for snapshot enabling in patches 03 and 04, could you confirm if Arm SPE can use the same way with cs-etm for snapshot handling? From my understanding, this is a better way to handle AUX buffer's 'head' and 'tail'.
In other words, if we can only apply patch 03 and can pass the testing in patch 05, then it would be a very neat implementation.
I will try to verify these patches and will get back result.
Thanks, Leo
Hi Leo,
On 23/09/2021 15:40, Leo Yan wrote:
On Thu, Sep 23, 2021 at 09:50:16PM +0800, Leo Yan wrote:
[...]
@@ -336,6 +480,7 @@ struct auxtrace_record *arm_spe_recording_init(int *err, sper->itr.pmu = arm_spe_pmu; sper->itr.snapshot_start = arm_spe_snapshot_start; sper->itr.snapshot_finish = arm_spe_snapshot_finish;
- sper->itr.find_snapshot = arm_spe_find_snapshot;
If I understand correctly, this patch copies the code from cs-etm for snapshot handling. About 2 months ago, we removed the Arm cs-etm's specific snapshot callback function and directly use perf's function __auxtrace_mmap__read() to handle 'head' and 'tail' pointers. Please see the commit for details: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Before I review more details for snapshot enabling in patches 03 and 04, could you confirm if Arm SPE can use the same way with cs-etm for snapshot handling? From my understanding, this is a better way to handle AUX buffer's 'head' and 'tail'.
In other words, if we can only apply patch 03 and can pass the testing in patch 05, then it would be a very neat implementation.
I will try to verify these patches and will get back result.
Thanks, Leo
The patch is indeed based on that commit. The reason behind it is that the values for *head are being wrapped in the driver side (see the macro PERF_IDX2OFF which is used at various points in /drivers/perf/arm_spe_pmu.c).
If this callback is not to be added, I believe the driver needs to be updated first so that the head pointer monotonically increases like in cs-etm. Do you think this makes sense for SPE?
(note that the patch will skip the wrap-around detection if this is the case, in order to handle both cases in the userspace perf tool).
Thanks, German
Hi German,
On Thu, Sep 30, 2021 at 01:26:15PM +0100, German Gomez wrote:
[...]
The patch is indeed based on that commit. The reason behind it is that the values for *head are being wrapped in the driver side (see the macro PERF_IDX2OFF which is used at various points in /drivers/perf/arm_spe_pmu.c).
Yes, I noted that Arm SPE driver doesn't use monotonical increasing for AUX head.
If this callback is not to be added, I believe the driver needs to be updated > first so that the head pointer monotonically increases like in cs-etm. Do you think this makes sense for SPE?
Please note, there have two cases should be handled for snapshot mode: - Wrap-around case, somehow function __auxtrace_mmap__read() has handled this case, see [1]; - It's possible that there have overrun case for snapshot mode, e.g. the kernel space receives multiple signals and take snapshot to save Arm SPE trace data into AUX buffer for multiple times, but the userspace tool cannot catch up to save AUX data into perf.data file. Thus the AUX head might be wrapped around for multiple times, for this case, I think monotonically increasing AUX head is the right solution to handle overrun issue.
So simply say, I think the head pointer monotonically increasing is the right thing to do in Arm SPE driver.
(note that the patch will skip the wrap-around detection if this is the case, in order to handle both cases in the userspace perf tool).
Almost agree, I read multiple times but have no idea what's the "both cases" in the last sentence.
Please let me know if anything is not clear.
Thanks, Leo
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool...
Hi Leo,
Many thanks for your comments.
On 04/10/2021 13:27, Leo Yan wrote:
Hi German,
On Thu, Sep 30, 2021 at 01:26:15PM +0100, German Gomez wrote:
[...]
So simply say, I think the head pointer monotonically increasing is the right thing to do in Arm SPE driver.
I will talk to James about how we can proceed on this.
(note that the patch will skip the wrap-around detection if this is the case, in order to handle both cases in the userspace perf tool).
Almost agree, I read multiple times but have no idea what's the "both cases" in the last sentence.
Apologies for the later part was not clear. What I meant to say was that in the original patch for cs-etm, it seemed to handle both cases where AUX head might be monotonically and non-monotonically increasing, so we applied the same for the Arm SPE patch.
Please let me know if anything is not clear.
Thanks, Leo
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool...
On Wed, Oct 06, 2021 at 10:35:20AM +0100, German Gomez wrote:
[...]
So simply say, I think the head pointer monotonically increasing is the right thing to do in Arm SPE driver.
I will talk to James about how we can proceed on this.
Thanks!
(note that the patch will skip the wrap-around detection if this is the case, in order to handle both cases in the userspace perf tool).
Almost agree, I read multiple times but have no idea what's the "both cases" in the last sentence.
Apologies for the later part was not clear. What I meant to say was that in the original patch for cs-etm, it seemed to handle both cases where AUX head might be monotonically and non-monotonically increasing, so we applied the same for the Arm SPE patch.
No worries, thanks for explanation.
Leo
Hi Leo,
On 06/10/2021 10:51, Leo Yan wrote:
On Wed, Oct 06, 2021 at 10:35:20AM +0100, German Gomez wrote:
[...]
So simply say, I think the head pointer monotonically increasing is the right thing to do in Arm SPE driver.
I will talk to James about how we can proceed on this.
Thanks!
I took this offline with James and, though it looks possible to patch the SPE driver to have a monotonically increasing head pointer in order to simplify the handling in the perf tool, it could be a breaking change for users of the perf_event_open syscall that currently rely on the way it works now.
An alternative way we considered to simplify the patch is to change the logic inside the find_snapshot callback so that it records the entire contents of the aux buffer every time.
What do you think?
Many thanks, German
Hi German,
On Mon, Oct 11, 2021 at 04:55:37PM +0100, German Gomez wrote:
Hi Leo,
On 06/10/2021 10:51, Leo Yan wrote:
On Wed, Oct 06, 2021 at 10:35:20AM +0100, German Gomez wrote:
[...]
So simply say, I think the head pointer monotonically increasing is the right thing to do in Arm SPE driver.
I will talk to James about how we can proceed on this.
Thanks!
I took this offline with James and, though it looks possible to patch the SPE driver to have a monotonically increasing head pointer in order to simplify the handling in the perf tool, it could be a breaking change for users of the perf_event_open syscall that currently rely on the way it works now.
Here I cannot create the connection between AUX head pointer and the breakage of calling perf_event_open().
Could you elaborate what's the reason the monotonical increasing head pointer will lead to the breakage for perf_event_open()?
An alternative way we considered to simplify the patch is to change the logic inside the find_snapshot callback so that it records the entire contents of the aux buffer every time.
What do you think?
We cannot do this way. If we send USR2 signal with very small interval, then it's possible the hardware trace data cannot fill the full of AUX buffer. You could use below commands for the testing and should can observe it produces small chunk trace data:
perf record -e arm_spe_0// -S -a -- dd if=/dev/zero of=/dev/null & PERFPID=$! sleep 1 kill -USR2 $PERFPID sleep .1 kill -USR2 $PERFPID
Thanks, Leo
On Thu, Sep 16, 2021 at 04:46:34PM +0100, German Gomez wrote:
[...]
+static int arm_spe_find_snapshot(struct auxtrace_record *itr, int idx,
struct auxtrace_mmap *mm, unsigned char *data,
u64 *head, u64 *old)
+{
- int err;
- bool wrapped;
- struct arm_spe_recording *ptr =
container_of(itr, struct arm_spe_recording, itr);
- /*
* Allocate memory to keep track of wrapping if this is the first
* time we deal with this *mm.
*/
- if (idx >= ptr->wrapped_cnt) {
err = arm_spe_alloc_wrapped_array(ptr, idx);
if (err)
return err;
- }
- /*
* Check to see if *head has wrapped around. If it hasn't only the
* amount of data between *head and *old is snapshot'ed to avoid
* bloating the perf.data file with zeros. But as soon as *head has
* wrapped around the entire size of the AUX ring buffer it taken.
*/
- wrapped = ptr->wrapped[idx];
- if (!wrapped && arm_spe_buffer_has_wrapped(data, mm->len, *head)) {
wrapped = true;
ptr->wrapped[idx] = true;
- }
- pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n",
__func__, idx, (size_t)*old, (size_t)*head, mm->len);
- /*
* No wrap has occurred, we can just use *head and *old.
*/
- if (!wrapped)
return 0;
- /*
* *head has wrapped around - adjust *head and *old to pickup the
* entire content of the AUX buffer.
*/
- if (*head >= mm->len) {
*old = *head - mm->len;
- } else {
*head += mm->len;
*old = *head - mm->len;
- }
- return 0;
+}
If run a test case (the test is pasted at the end of the reply), I can get quite different AUX trace data with passing different wait period before sending the first USR2 signal.
# sh test_arm_spe_snapshot.sh 2 Couldn't synthesize bpf events. stress: info: [5768] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 2.833 MB perf.data ]
# sh test_arm_spe_snapshot.sh 10 Couldn't synthesize bpf events. stress: info: [5776] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 24.356 MB perf.data ]
The first command passes argument '2' so the test will wait for 2 seconds before send USR2 signal for snapshot, and the perf data file is 2.833 MB (so this means the Arm SPE trace data is about 2MB) for three snapshots. In the second command, the argument '10' means it will wait for 10 seconds before sending the USR2 signals, and every time it records the trace data from the full AUX buffer (8MB), at the end it gets 24MB AUX trace data.
The issue happens in the second command, waiting for 10 seconds leads to the *full* AUX ring buffer is filled by Arm SPE, so the function arm_spe_buffer_has_wrapped() always return back true for this case. Afterwards, arm_spe_find_snapshot() doesn't respect the passed old header (from '*old') and assumes the trace data size is 'mm->len'.
To allow arm_spe_buffer_has_wrapped() to work properly, I think we need to clean up the top 8 bytes of the AUX buffer in Arm SPE driver when start the PMU event (please note, this change has an assumption that is meantioned in another email that suggests to remove redundant PERF_RECORD_AUX events so the function arm_spe_perf_aux_output_begin() is invoked only once when start PMU event, so we can use the top 8 bytes in AUX buffer to indicate trace is wrap around or not).
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index d44bcc29d99c..eb35f85d0efb 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -493,6 +493,16 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, if (limit) limit |= BIT(SYS_PMBLIMITR_EL1_E_SHIFT);
+ /* + * Cleanup the top 8 bytes for snapshot mode; these 8 bytes are + * used to indicate if trace data is wrap around if they are not + * zero. + */ + if (buf->snapshot) { + void *tail = buf->base + (buf->nr_pages << PAGE_SHIFT) - 8; + memset(tail, 0x0, 8); + } + limit += (u64)buf->base; base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); write_sysreg_s(base, SYS_PMBPTR_EL1);
Thanks, Leo
---8<---
#!/bin/sh
./perf record -e arm_spe/period=148576/u -C 0 -S -m8M,8M -- taskset --cpu-list 0 stress --cpu 1 &
PERFPID=$!
echo "sleep $1 seconds" > /sys/kernel/debug/tracing/trace_marker
# Wait for perf program sleep $1
# Send signal to snapshot trace data kill -USR2 $PERFPID sleep .03 kill -USR2 $PERFPID sleep .03 kill -USR2 $PERFPID
echo "Stop snapshot" > /sys/kernel/debug/tracing/trace_marker
kill $PERFPID wait $PERFPID
On Sun, Oct 17, 2021 at 08:05:46PM +0800, Leo Yan wrote:
[...]
To allow arm_spe_buffer_has_wrapped() to work properly, I think we need to clean up the top 8 bytes of the AUX buffer in Arm SPE driver when start the PMU event (please note, this change has an assumption that is meantioned in another email that suggests to remove redundant PERF_RECORD_AUX events so the function arm_spe_perf_aux_output_begin() is invoked only once when start PMU event, so we can use the top 8 bytes in AUX buffer to indicate trace is wrap around or not).
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index d44bcc29d99c..eb35f85d0efb 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -493,6 +493,16 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, if (limit) limit |= BIT(SYS_PMBLIMITR_EL1_E_SHIFT);
/*
* Cleanup the top 8 bytes for snapshot mode; these 8 bytes are
* used to indicate if trace data is wrap around if they are not
* zero.
*/
if (buf->snapshot) {
void *tail = buf->base + (buf->nr_pages << PAGE_SHIFT) - 8;
memset(tail, 0x0, 8);
Here need to add below code for flushing data cache:
flush_dcache_range((unsigned long)tail, (unsigned long)tail+8);
Sorry for spamming.
Leo
}
limit += (u64)buf->base; base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); write_sysreg_s(base, SYS_PMBPTR_EL1);
Hi Leo,
On 17/10/2021 13:05, Leo Yan wrote:
On Thu, Sep 16, 2021 at 04:46:34PM +0100, German Gomez wrote:
[...]
If run a test case (the test is pasted at the end of the reply), I can get quite different AUX trace data with passing different wait period before sending the first USR2 signal.
# sh test_arm_spe_snapshot.sh 2 Couldn't synthesize bpf events. stress: info: [5768] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 2.833 MB perf.data ]
# sh test_arm_spe_snapshot.sh 10 Couldn't synthesize bpf events. stress: info: [5776] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 24.356 MB perf.data ]
The first command passes argument '2' so the test will wait for 2 seconds before send USR2 signal for snapshot, and the perf data file is 2.833 MB (so this means the Arm SPE trace data is about 2MB) for three snapshots. In the second command, the argument '10' means it will wait for 10 seconds before sending the USR2 signals, and every time it records the trace data from the full AUX buffer (8MB), at the end it gets 24MB AUX trace data.
The issue happens in the second command, waiting for 10 seconds leads to the *full* AUX ring buffer is filled by Arm SPE, so the function arm_spe_buffer_has_wrapped() always return back true for this case. Afterwards, arm_spe_find_snapshot() doesn't respect the passed old header (from '*old') and assumes the trace data size is 'mm->len'.
Returning the entire contents of the buffer once the first wrap-around was detected was the intention of the patch, so I don't currently see it as wrong. What were the values you were expecting to see in the test?
If the handling of snapshot mode by the perf tool can be improved after upstreaming the changes to the driver, we could submit a followup patch after that has been fixed.
To allow arm_spe_buffer_has_wrapped() to work properly, I think we need to clean up the top 8 bytes of the AUX buffer in Arm SPE driver when start the PMU event (please note, this change has an assumption that is meantioned in another email that suggests to remove redundant PERF_RECORD_AUX events so the function arm_spe_perf_aux_output_begin() is invoked only once when start PMU event, so we can use the top 8 bytes in AUX buffer to indicate trace is wrap around or not).
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index d44bcc29d99c..eb35f85d0efb 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -493,6 +493,16 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, if (limit) limit |= BIT(SYS_PMBLIMITR_EL1_E_SHIFT);
/*
* Cleanup the top 8 bytes for snapshot mode; these 8 bytes are
* used to indicate if trace data is wrap around if they are not
* zero.
*/
if (buf->snapshot) {
void *tail = buf->base + (buf->nr_pages << PAGE_SHIFT) - 8;
memset(tail, 0x0, 8);
}
limit += (u64)buf->base; base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); write_sysreg_s(base, SYS_PMBPTR_EL1);
Thanks, Leo
I will try these and the other driver changes and discuss them with the team internally, thanks!
---8<---
#!/bin/sh
./perf record -e arm_spe/period=148576/u -C 0 -S -m8M,8M -- taskset --cpu-list 0 stress --cpu 1 &
PERFPID=$!
echo "sleep $1 seconds" > /sys/kernel/debug/tracing/trace_marker
# Wait for perf program sleep $1
# Send signal to snapshot trace data kill -USR2 $PERFPID sleep .03 kill -USR2 $PERFPID sleep .03 kill -USR2 $PERFPID
echo "Stop snapshot" > /sys/kernel/debug/tracing/trace_marker
kill $PERFPID wait $PERFPID
On Tue, Oct 19, 2021 at 06:34:24PM +0100, German Gomez wrote:
Hi Leo,
On 17/10/2021 13:05, Leo Yan wrote:
On Thu, Sep 16, 2021 at 04:46:34PM +0100, German Gomez wrote:
[...]
If run a test case (the test is pasted at the end of the reply), I can get quite different AUX trace data with passing different wait period before sending the first USR2 signal.
# sh test_arm_spe_snapshot.sh 2 Couldn't synthesize bpf events. stress: info: [5768] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 2.833 MB perf.data ]
# sh test_arm_spe_snapshot.sh 10 Couldn't synthesize bpf events. stress: info: [5776] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 24.356 MB perf.data ]
The first command passes argument '2' so the test will wait for 2 seconds before send USR2 signal for snapshot, and the perf data file is 2.833 MB (so this means the Arm SPE trace data is about 2MB) for three snapshots. In the second command, the argument '10' means it will wait for 10 seconds before sending the USR2 signals, and every time it records the trace data from the full AUX buffer (8MB), at the end it gets 24MB AUX trace data.
The issue happens in the second command, waiting for 10 seconds leads to the *full* AUX ring buffer is filled by Arm SPE, so the function arm_spe_buffer_has_wrapped() always return back true for this case. Afterwards, arm_spe_find_snapshot() doesn't respect the passed old header (from '*old') and assumes the trace data size is 'mm->len'.
Returning the entire contents of the buffer once the first wrap-around was detected was the intention of the patch, so I don't currently see it as wrong. What were the values you were expecting to see in the test?
I expect the second command takes three snapshots: the first time it should record AUX trace data with full buffer size (8MB) after waiting for 10 seconds, and later two times will take small AUX trace data since the interval (0.03s) is short and Arm SPE has not filled the full AUX buffer.
If the handling of snapshot mode by the perf tool can be improved after upstreaming the changes to the driver, we could submit a followup patch after that has been fixed.
Okay, I understand now the main concern is for kernel driver changes, this patch for perf tool is fine for me:
Reviewed-by: Leo Yan leo.yan@linaro.org Tested-by: Leo Yan leo.yan@linaro.org
[...]
I will try these and the other driver changes and discuss them with the team internally, thanks!
Thanks a lot!
Shell script test_arm_spe.sh has been added to test the recording of SPE tracing events in snapshot mode.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com --- tools/perf/tests/shell/test_arm_spe.sh | 91 ++++++++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100755 tools/perf/tests/shell/test_arm_spe.sh
diff --git a/tools/perf/tests/shell/test_arm_spe.sh b/tools/perf/tests/shell/test_arm_spe.sh new file mode 100755 index 000000000000..9ed817e76f95 --- /dev/null +++ b/tools/perf/tests/shell/test_arm_spe.sh @@ -0,0 +1,91 @@ +#!/bin/sh +# Check Arm SPE trace data recording and synthesized samples + +# Uses the 'perf record' to record trace data of Arm SPE events; +# then verify if any SPE event samples are generated by SPE with +# 'perf script' and 'perf report' commands. + +# SPDX-License-Identifier: GPL-2.0 +# German Gomez german.gomez@arm.com, 2021 + +skip_if_no_arm_spe_event() { + perf list | egrep -q 'arm_spe_[0-9]+//' && return 0 + + # arm_spe event doesn't exist + return 2 +} + +skip_if_no_arm_spe_event || exit 2 + +perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) +glb_err=0 + +cleanup_files() +{ + rm -f ${perfdata} + trap - exit term int + kill -2 $$ # Forward sigint to parent + exit $glb_err +} + +trap cleanup_files exit term int + +arm_spe_report() { + if [ $2 != 0 ]; then + echo "$1: FAIL" + glb_err=$2 + else + echo "$1: PASS" + fi +} + +perf_script_samples() { + echo "Looking at perf.data file for dumping samples:" + + # from arm-spe.c/arm_spe_synth_events() + events="(ld1-miss|ld1-access|llc-miss|lld-access|tlb-miss|tlb-access|branch-miss|remote-access|memory)" + + # Below is an example of the samples dumping: + # dd 3048 [002] 1 l1d-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so) + # dd 3048 [002] 1 tlb-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so) + # dd 3048 [002] 1 memory: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so) + perf script -F,-time -i ${perfdata} 2>&1 | \ + egrep " +$1 +[0-9]+ .* +${events}:(.*:)? +" > /dev/null 2>&1 +} + +perf_report_samples() { + echo "Looking at perf.data file for reporting samples:" + + # Below is an example of the samples reporting: + # 73.04% 73.04% dd libc-2.27.so [.] _dl_addr + # 7.71% 7.71% dd libc-2.27.so [.] getenv + # 2.59% 2.59% dd ld-2.27.so [.] strcmp + perf report --stdio -i ${perfdata} 2>&1 | \ + egrep " +[0-9]+.[0-9]+% +[0-9]+.[0-9]+% +$1 " > /dev/null 2>&1 +} + +arm_spe_snapshot_test() { + echo "Recording trace with snapshot mode $perfdata" + perf record -o ${perfdata} -e arm_spe// -S \ + -- dd if=/dev/zero of=/dev/null > /dev/null 2>&1 & + PERFPID=$! + + # Wait for perf program + sleep 1 + + # Send signal to snapshot trace data + kill -USR2 $PERFPID + + # Stop perf program + kill $PERFPID + wait $PERFPID + + perf_script_samples dd && + perf_report_samples dd + + err=$? + arm_spe_report "SPE snapshot testing" $err +} + +arm_spe_snapshot_test +exit $glb_err \ No newline at end of file
On Thu, Sep 16, 2021 at 04:46:35PM +0100, German Gomez wrote:
Shell script test_arm_spe.sh has been added to test the recording of SPE tracing events in snapshot mode.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com
tools/perf/tests/shell/test_arm_spe.sh | 91 ++++++++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100755 tools/perf/tests/shell/test_arm_spe.sh
diff --git a/tools/perf/tests/shell/test_arm_spe.sh b/tools/perf/tests/shell/test_arm_spe.sh new file mode 100755 index 000000000000..9ed817e76f95 --- /dev/null +++ b/tools/perf/tests/shell/test_arm_spe.sh @@ -0,0 +1,91 @@ +#!/bin/sh +# Check Arm SPE trace data recording and synthesized samples
+# Uses the 'perf record' to record trace data of Arm SPE events; +# then verify if any SPE event samples are generated by SPE with +# 'perf script' and 'perf report' commands.
+# SPDX-License-Identifier: GPL-2.0 +# German Gomez german.gomez@arm.com, 2021
+skip_if_no_arm_spe_event() {
- perf list | egrep -q 'arm_spe_[0-9]+//' && return 0
- # arm_spe event doesn't exist
- return 2
+}
+skip_if_no_arm_spe_event || exit 2
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) +glb_err=0
+cleanup_files() +{
- rm -f ${perfdata}
- trap - exit term int
- kill -2 $$ # Forward sigint to parent
I understand you copy this code from Arm cs-etm testing, but I found the sentence 'kill -2 $$' will cause a failure at my side with the command:
root@ubuntu:/home/leoy/linux/tools/perf# ./perf test 85 -v 85: Check Arm SPE trace data recording and synthesized samples : --- start --- test child forked, pid 29053 Recording trace with snapshot mode /tmp/__perf_test.perf.data.uughb Looking at perf.data file for dumping samples: Looking at perf.data file for reporting samples: SPE snapshot testing: PASS test child finished with -1 ---- end ---- Check Arm SPE trace data recording and synthesized samples: FAILED!
I changed to use below code and looks it works for me:
if [[ "$1" == "int" ]]; then kill -SIGINT $$ fi if [[ "$1" == "term" ]]; then kill -SIGTERM $$ fi
Thanks, Leo
- exit $glb_err
+}
+trap cleanup_files exit term int
+arm_spe_report() {
- if [ $2 != 0 ]; then
echo "$1: FAIL"
glb_err=$2
- else
echo "$1: PASS"
- fi
+}
+perf_script_samples() {
- echo "Looking at perf.data file for dumping samples:"
- # from arm-spe.c/arm_spe_synth_events()
- events="(ld1-miss|ld1-access|llc-miss|lld-access|tlb-miss|tlb-access|branch-miss|remote-access|memory)"
- # Below is an example of the samples dumping:
- # dd 3048 [002] 1 l1d-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 tlb-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 memory: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- perf script -F,-time -i ${perfdata} 2>&1 | \
egrep " +$1 +[0-9]+ .* +${events}:(.*:)? +" > /dev/null 2>&1
+}
+perf_report_samples() {
- echo "Looking at perf.data file for reporting samples:"
- # Below is an example of the samples reporting:
- # 73.04% 73.04% dd libc-2.27.so [.] _dl_addr
- # 7.71% 7.71% dd libc-2.27.so [.] getenv
- # 2.59% 2.59% dd ld-2.27.so [.] strcmp
- perf report --stdio -i ${perfdata} 2>&1 | \
egrep " +[0-9]+\.[0-9]+% +[0-9]+\.[0-9]+% +$1 " > /dev/null 2>&1
+}
+arm_spe_snapshot_test() {
- echo "Recording trace with snapshot mode $perfdata"
- perf record -o ${perfdata} -e arm_spe// -S \
-- dd if=/dev/zero of=/dev/null > /dev/null 2>&1 &
- PERFPID=$!
- # Wait for perf program
- sleep 1
- # Send signal to snapshot trace data
- kill -USR2 $PERFPID
- # Stop perf program
- kill $PERFPID
- wait $PERFPID
- perf_script_samples dd &&
- perf_report_samples dd
- err=$?
- arm_spe_report "SPE snapshot testing" $err
+}
+arm_spe_snapshot_test +exit $glb_err \ No newline at end of file -- 2.17.1
Hi Leo,
I'm unable to reproduce. I've tried on top of the most recent perf/core branch but I still get exit code 0 consistently:
$ git log commit be8ecc57f180415e8a7c1cc5620c5236be2a7e56 (grafted, origin/perf/core) Author: Tony Garnock-Jones tonyg@leastfixedpoint.com Date: Thu Sep 16 14:09:39 2021 +0200
$ ./perf test 88 -v Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 88: Check Arm SPE trace data recording and synthesized samples : --- start --- test child forked, pid 18700 Recording trace with snapshot mode /tmp/__perf_test.perf.data.xgsUt Looking at perf.data file for dumping samples: Looking at perf.data file for reporting samples: SPE snapshot testing: PASS test child finished with 0 ---- end ---- Check Arm SPE trace data recording and synthesized samples: Ok
On 20/10/2021 14:13, Leo Yan wrote:
On Thu, Sep 16, 2021 at 04:46:35PM +0100, German Gomez wrote:
Shell script test_arm_spe.sh has been added to test the recording of SPE tracing events in snapshot mode.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com
tools/perf/tests/shell/test_arm_spe.sh | 91 ++++++++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100755 tools/perf/tests/shell/test_arm_spe.sh
diff --git a/tools/perf/tests/shell/test_arm_spe.sh b/tools/perf/tests/shell/test_arm_spe.sh new file mode 100755 index 000000000000..9ed817e76f95 --- /dev/null +++ b/tools/perf/tests/shell/test_arm_spe.sh @@ -0,0 +1,91 @@ +#!/bin/sh +# Check Arm SPE trace data recording and synthesized samples
+# Uses the 'perf record' to record trace data of Arm SPE events; +# then verify if any SPE event samples are generated by SPE with +# 'perf script' and 'perf report' commands.
+# SPDX-License-Identifier: GPL-2.0 +# German Gomez german.gomez@arm.com, 2021
+skip_if_no_arm_spe_event() {
- perf list | egrep -q 'arm_spe_[0-9]+//' && return 0
- # arm_spe event doesn't exist
- return 2
+}
+skip_if_no_arm_spe_event || exit 2
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) +glb_err=0
+cleanup_files() +{
- rm -f ${perfdata}
- trap - exit term int
- kill -2 $$ # Forward sigint to parent
I understand you copy this code from Arm cs-etm testing, but I found the sentence 'kill -2 $$' will cause a failure at my side with the command:
root@ubuntu:/home/leoy/linux/tools/perf# ./perf test 85 -v 85: Check Arm SPE trace data recording and synthesized samples : --- start --- test child forked, pid 29053 Recording trace with snapshot mode /tmp/__perf_test.perf.data.uughb Looking at perf.data file for dumping samples: Looking at perf.data file for reporting samples: SPE snapshot testing: PASS test child finished with -1 ---- end ---- Check Arm SPE trace data recording and synthesized samples: FAILED!
I changed to use below code and looks it works for me:
if [[ "$1" == "int" ]]; then kill -SIGINT $$ fi if [[ "$1" == "term" ]]; then kill -SIGTERM $$ fi
Thanks, Leo
- exit $glb_err
+}
+trap cleanup_files exit term int
+arm_spe_report() {
- if [ $2 != 0 ]; then
echo "$1: FAIL"
glb_err=$2
- else
echo "$1: PASS"
- fi
+}
+perf_script_samples() {
- echo "Looking at perf.data file for dumping samples:"
- # from arm-spe.c/arm_spe_synth_events()
- events="(ld1-miss|ld1-access|llc-miss|lld-access|tlb-miss|tlb-access|branch-miss|remote-access|memory)"
- # Below is an example of the samples dumping:
- # dd 3048 [002] 1 l1d-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 tlb-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 memory: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- perf script -F,-time -i ${perfdata} 2>&1 | \
egrep " +$1 +[0-9]+ .* +${events}:(.*:)? +" > /dev/null 2>&1
+}
+perf_report_samples() {
- echo "Looking at perf.data file for reporting samples:"
- # Below is an example of the samples reporting:
- # 73.04% 73.04% dd libc-2.27.so [.] _dl_addr
- # 7.71% 7.71% dd libc-2.27.so [.] getenv
- # 2.59% 2.59% dd ld-2.27.so [.] strcmp
- perf report --stdio -i ${perfdata} 2>&1 | \
egrep " +[0-9]+\.[0-9]+% +[0-9]+\.[0-9]+% +$1 " > /dev/null 2>&1
+}
+arm_spe_snapshot_test() {
- echo "Recording trace with snapshot mode $perfdata"
- perf record -o ${perfdata} -e arm_spe// -S \
-- dd if=/dev/zero of=/dev/null > /dev/null 2>&1 &
- PERFPID=$!
- # Wait for perf program
- sleep 1
- # Send signal to snapshot trace data
- kill -USR2 $PERFPID
- # Stop perf program
- kill $PERFPID
- wait $PERFPID
- perf_script_samples dd &&
- perf_report_samples dd
- err=$?
- arm_spe_report "SPE snapshot testing" $err
+}
+arm_spe_snapshot_test +exit $glb_err \ No newline at end of file -- 2.17.1
On 20/10/2021 14:13, Leo Yan wrote:
On Thu, Sep 16, 2021 at 04:46:35PM +0100, German Gomez wrote:
Shell script test_arm_spe.sh has been added to test the recording of SPE tracing events in snapshot mode.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com
tools/perf/tests/shell/test_arm_spe.sh | 91 ++++++++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100755 tools/perf/tests/shell/test_arm_spe.sh
diff --git a/tools/perf/tests/shell/test_arm_spe.sh b/tools/perf/tests/shell/test_arm_spe.sh new file mode 100755 index 000000000000..9ed817e76f95 --- /dev/null +++ b/tools/perf/tests/shell/test_arm_spe.sh @@ -0,0 +1,91 @@ +#!/bin/sh +# Check Arm SPE trace data recording and synthesized samples
+# Uses the 'perf record' to record trace data of Arm SPE events; +# then verify if any SPE event samples are generated by SPE with +# 'perf script' and 'perf report' commands.
+# SPDX-License-Identifier: GPL-2.0 +# German Gomez german.gomez@arm.com, 2021
+skip_if_no_arm_spe_event() {
- perf list | egrep -q 'arm_spe_[0-9]+//' && return 0
- # arm_spe event doesn't exist
- return 2
+}
+skip_if_no_arm_spe_event || exit 2
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) +glb_err=0
+cleanup_files() +{
- rm -f ${perfdata}
- trap - exit term int
- kill -2 $$ # Forward sigint to parent
I understand you copy this code from Arm cs-etm testing, but I found the sentence 'kill -2 $$' will cause a failure at my side with the command:
root@ubuntu:/home/leoy/linux/tools/perf# ./perf test 85 -v 85: Check Arm SPE trace data recording and synthesized samples : --- start --- test child forked, pid 29053 Recording trace with snapshot mode /tmp/__perf_test.perf.data.uughb Looking at perf.data file for dumping samples: Looking at perf.data file for reporting samples: SPE snapshot testing: PASS test child finished with -1 ---- end ---- Check Arm SPE trace data recording and synthesized samples: FAILED!
I changed to use below code and looks it works for me:
if [[ "$1" == "int" ]]; then kill -SIGINT $$ fi if [[ "$1" == "term" ]]; then kill -SIGTERM $$ fi
Thanks, Leo
This is quite interesting. It looks like the issue is caused by the update from dash 0.5.8 on Ubuntu 18 to dash 0.5.10 on Ubuntu 20. Specifically the commit that causes the issue is:
commit 9e5cd41d9605e4caaac3aacdc0482f6ee220a298 Author: Herbert Xu herbert@gondor.apana.org.au Date: Mon May 7 00:40:34 2018 +0800
jobs - Do not block when waiting on SIGCHLD
Because of the nature of SIGCHLD, the process may have already been waited on and therefore we must be prepared for the case that wait may block. So ensure that it doesn't by using WNOHANG.
Furthermore, multiple jobs may have exited when gotsigchld is set. Therefore we need to wait until there are no zombies left.
Lastly, waitforjob needs to be called with interrupts off and the original patch broke that.
Fixes: 03876c0743a5 ("eval: Reap zombies after built-in...") Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
This also means that the Coresight shell test will not be working anymore because I added the same trap to it so that it could be run in a loop. I'm going to compare the bahaviour to bash to see which one is doing the right thing and what the correct change to make to fix it is. Or a bug needs to be reported.
Thanks James
- exit $glb_err
+}
+trap cleanup_files exit term int
+arm_spe_report() {
- if [ $2 != 0 ]; then
echo "$1: FAIL"
glb_err=$2
- else
echo "$1: PASS"
- fi
+}
+perf_script_samples() {
- echo "Looking at perf.data file for dumping samples:"
- # from arm-spe.c/arm_spe_synth_events()
- events="(ld1-miss|ld1-access|llc-miss|lld-access|tlb-miss|tlb-access|branch-miss|remote-access|memory)"
- # Below is an example of the samples dumping:
- # dd 3048 [002] 1 l1d-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 tlb-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 memory: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- perf script -F,-time -i ${perfdata} 2>&1 | \
egrep " +$1 +[0-9]+ .* +${events}:(.*:)? +" > /dev/null 2>&1
+}
+perf_report_samples() {
- echo "Looking at perf.data file for reporting samples:"
- # Below is an example of the samples reporting:
- # 73.04% 73.04% dd libc-2.27.so [.] _dl_addr
- # 7.71% 7.71% dd libc-2.27.so [.] getenv
- # 2.59% 2.59% dd ld-2.27.so [.] strcmp
- perf report --stdio -i ${perfdata} 2>&1 | \
egrep " +[0-9]+\.[0-9]+% +[0-9]+\.[0-9]+% +$1 " > /dev/null 2>&1
+}
+arm_spe_snapshot_test() {
- echo "Recording trace with snapshot mode $perfdata"
- perf record -o ${perfdata} -e arm_spe// -S \
-- dd if=/dev/zero of=/dev/null > /dev/null 2>&1 &
- PERFPID=$!
- # Wait for perf program
- sleep 1
- # Send signal to snapshot trace data
- kill -USR2 $PERFPID
- # Stop perf program
- kill $PERFPID
- wait $PERFPID
- perf_script_samples dd &&
- perf_report_samples dd
- err=$?
- arm_spe_report "SPE snapshot testing" $err
+}
+arm_spe_snapshot_test +exit $glb_err \ No newline at end of file -- 2.17.1
On 02/11/2021 14:07, James Clark wrote:
On 20/10/2021 14:13, Leo Yan wrote:
On Thu, Sep 16, 2021 at 04:46:35PM +0100, German Gomez wrote:
Shell script test_arm_spe.sh has been added to test the recording of SPE tracing events in snapshot mode.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com
tools/perf/tests/shell/test_arm_spe.sh | 91 ++++++++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100755 tools/perf/tests/shell/test_arm_spe.sh
diff --git a/tools/perf/tests/shell/test_arm_spe.sh b/tools/perf/tests/shell/test_arm_spe.sh new file mode 100755 index 000000000000..9ed817e76f95 --- /dev/null +++ b/tools/perf/tests/shell/test_arm_spe.sh @@ -0,0 +1,91 @@ +#!/bin/sh +# Check Arm SPE trace data recording and synthesized samples
+# Uses the 'perf record' to record trace data of Arm SPE events; +# then verify if any SPE event samples are generated by SPE with +# 'perf script' and 'perf report' commands.
+# SPDX-License-Identifier: GPL-2.0 +# German Gomez german.gomez@arm.com, 2021
+skip_if_no_arm_spe_event() {
- perf list | egrep -q 'arm_spe_[0-9]+//' && return 0
- # arm_spe event doesn't exist
- return 2
+}
+skip_if_no_arm_spe_event || exit 2
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) +glb_err=0
+cleanup_files() +{
- rm -f ${perfdata}
- trap - exit term int
- kill -2 $$ # Forward sigint to parent
I understand you copy this code from Arm cs-etm testing, but I found the sentence 'kill -2 $$' will cause a failure at my side with the command:
root@ubuntu:/home/leoy/linux/tools/perf# ./perf test 85 -v 85: Check Arm SPE trace data recording and synthesized samples : --- start --- test child forked, pid 29053 Recording trace with snapshot mode /tmp/__perf_test.perf.data.uughb Looking at perf.data file for dumping samples: Looking at perf.data file for reporting samples: SPE snapshot testing: PASS test child finished with -1 ---- end ---- Check Arm SPE trace data recording and synthesized samples: FAILED!
I changed to use below code and looks it works for me:
if [[ "$1" == "int" ]]; then kill -SIGINT $$ fi if [[ "$1" == "term" ]]; then kill -SIGTERM $$ fi
Thanks, Leo
This is quite interesting. It looks like the issue is caused by the update from dash 0.5.8 on Ubuntu 18 to dash 0.5.10 on Ubuntu 20. Specifically the commit that causes the issue is:
commit 9e5cd41d9605e4caaac3aacdc0482f6ee220a298 Author: Herbert Xu herbert@gondor.apana.org.au Date: Mon May 7 00:40:34 2018 +0800
jobs - Do not block when waiting on SIGCHLD
Because of the nature of SIGCHLD, the process may have already been waited on and therefore we must be prepared for the case that wait may block. So ensure that it doesn't by using WNOHANG. Furthermore, multiple jobs may have exited when gotsigchld is set. Therefore we need to wait until there are no zombies left. Lastly, waitforjob needs to be called with interrupts off and the original patch broke that. Fixes: 03876c0743a5 ("eval: Reap zombies after built-in...") Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
This also means that the Coresight shell test will not be working anymore because I added the same trap to it so that it could be run in a loop. I'm going to compare the bahaviour to bash to see which one is doing the right thing and what the correct change to make to fix it is. Or a bug needs to be reported.
Thanks James
Ok, it seems like I was relying on buggy dash behaviour for my original change. Even with this:
if [[ "$1" == "int" ]]; then kill -SIGINT $$ fi if [[ "$1" == "term" ]]; then kill -SIGTERM $$ fi
it still doesn't allow you to break out of running it in a while loop. This is only because of the exit code, rather than any kind of signal propagation. Actually it's possible to stop it with Ctrl-\ rather than Ctrl-C, and that doesn't require any extra handling in the script.
For that reason I'm happy to go with Leo's original suggestion when I first added this which was to not have any extra kill at all.
Another fix could be this, but I'm not too keen on it because I don't think any other tests behave like this:
[ "$1" = "int" ] || exit 1 [ "$1" = "term" ] || exit 1
- exit $glb_err
+}
+trap cleanup_files exit term int
+arm_spe_report() {
- if [ $2 != 0 ]; then
echo "$1: FAIL"
glb_err=$2
- else
echo "$1: PASS"
- fi
+}
+perf_script_samples() {
- echo "Looking at perf.data file for dumping samples:"
- # from arm-spe.c/arm_spe_synth_events()
- events="(ld1-miss|ld1-access|llc-miss|lld-access|tlb-miss|tlb-access|branch-miss|remote-access|memory)"
- # Below is an example of the samples dumping:
- # dd 3048 [002] 1 l1d-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 tlb-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 memory: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- perf script -F,-time -i ${perfdata} 2>&1 | \
egrep " +$1 +[0-9]+ .* +${events}:(.*:)? +" > /dev/null 2>&1
+}
+perf_report_samples() {
- echo "Looking at perf.data file for reporting samples:"
- # Below is an example of the samples reporting:
- # 73.04% 73.04% dd libc-2.27.so [.] _dl_addr
- # 7.71% 7.71% dd libc-2.27.so [.] getenv
- # 2.59% 2.59% dd ld-2.27.so [.] strcmp
- perf report --stdio -i ${perfdata} 2>&1 | \
egrep " +[0-9]+\.[0-9]+% +[0-9]+\.[0-9]+% +$1 " > /dev/null 2>&1
+}
+arm_spe_snapshot_test() {
- echo "Recording trace with snapshot mode $perfdata"
- perf record -o ${perfdata} -e arm_spe// -S \
-- dd if=/dev/zero of=/dev/null > /dev/null 2>&1 &
- PERFPID=$!
- # Wait for perf program
- sleep 1
- # Send signal to snapshot trace data
- kill -USR2 $PERFPID
- # Stop perf program
- kill $PERFPID
- wait $PERFPID
- perf_script_samples dd &&
- perf_report_samples dd
- err=$?
- arm_spe_report "SPE snapshot testing" $err
+}
+arm_spe_snapshot_test +exit $glb_err \ No newline at end of file -- 2.17.1
Hi James, Leo,
Thank you for testing the patch.
On 02/11/2021 15:37, James Clark wrote:
[...] Ok, it seems like I was relying on buggy dash behaviour for my original change. Even with this:
if [[ "$1" == "int" ]]; then kill -SIGINT $$ fi if [[ "$1" == "term" ]]; then kill -SIGTERM $$ fi
it still doesn't allow you to break out of running it in a while loop. This is only because of the exit code, rather than any kind of signal propagation. Actually it's possible to stop it with Ctrl-\ rather than Ctrl-C, and that doesn't require any extra handling in the script.
For that reason I'm happy to go with Leo's original suggestion when I first added this which was to not have any extra kill at all.
Thanks for debugging the issue, I think I will consider this fix in the re-submission.
Thanks, German
Another fix could be this, but I'm not too keen on it because I don't think any other tests behave like this:
[ "$1" = "int" ] || exit 1 [ "$1" = "term" ] || exit 1
- exit $glb_err
+}
+trap cleanup_files exit term int
+arm_spe_report() {
- if [ $2 != 0 ]; then
echo "$1: FAIL"
glb_err=$2
- else
echo "$1: PASS"
- fi
+}
+perf_script_samples() {
- echo "Looking at perf.data file for dumping samples:"
- # from arm-spe.c/arm_spe_synth_events()
- events="(ld1-miss|ld1-access|llc-miss|lld-access|tlb-miss|tlb-access|branch-miss|remote-access|memory)"
- # Below is an example of the samples dumping:
- # dd 3048 [002] 1 l1d-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 tlb-access: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- # dd 3048 [002] 1 memory: ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
- perf script -F,-time -i ${perfdata} 2>&1 | \
egrep " +$1 +[0-9]+ .* +${events}:(.*:)? +" > /dev/null 2>&1
+}
+perf_report_samples() {
- echo "Looking at perf.data file for reporting samples:"
- # Below is an example of the samples reporting:
- # 73.04% 73.04% dd libc-2.27.so [.] _dl_addr
- # 7.71% 7.71% dd libc-2.27.so [.] getenv
- # 2.59% 2.59% dd ld-2.27.so [.] strcmp
- perf report --stdio -i ${perfdata} 2>&1 | \
egrep " +[0-9]+\.[0-9]+% +[0-9]+\.[0-9]+% +$1 " > /dev/null 2>&1
+}
+arm_spe_snapshot_test() {
- echo "Recording trace with snapshot mode $perfdata"
- perf record -o ${perfdata} -e arm_spe// -S \
-- dd if=/dev/zero of=/dev/null > /dev/null 2>&1 &
- PERFPID=$!
- # Wait for perf program
- sleep 1
- # Send signal to snapshot trace data
- kill -USR2 $PERFPID
- # Stop perf program
- kill $PERFPID
- wait $PERFPID
- perf_script_samples dd &&
- perf_report_samples dd
- err=$?
- arm_spe_report "SPE snapshot testing" $err
+}
+arm_spe_snapshot_test +exit $glb_err \ No newline at end of file -- 2.17.1
On Thu, Sep 16, 2021 at 04:46:31PM +0100, German Gomez wrote:
From: Andrew Kilroy andrew.kilroy@arm.com
Since the size is already printed earlier in hex, print the same data using the same format, in hex.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: Andrew Kilroy andrew.kilroy@arm.com Signed-off-by: German Gomez german.gomez@arm.com
Reviewed-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index f323adb1af85..4f672f7d008c 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -537,7 +537,7 @@ static void cs_etm__dump_event(struct cs_etm_queue *etmq, fprintf(stdout, "\n"); color_fprintf(stdout, color,
". ... CoreSight %s Trace data: size %zu bytes\n",
". ... CoreSight %s Trace data: size %#zx bytes\n", cs_etm_decoder__get_name(etmq->decoder), buffer->size);
do { -- 2.17.1
Hi German,
On Thu, Sep 16, 2021 at 04:46:31PM +0100, German Gomez wrote:
From: Andrew Kilroy andrew.kilroy@arm.com
Since the size is already printed earlier in hex, print the same data using the same format, in hex.
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: Andrew Kilroy andrew.kilroy@arm.com Signed-off-by: German Gomez german.gomez@arm.com
tools/perf/util/cs-etm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index f323adb1af85..4f672f7d008c 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -537,7 +537,7 @@ static void cs_etm__dump_event(struct cs_etm_queue *etmq, fprintf(stdout, "\n"); color_fprintf(stdout, color,
". ... CoreSight %s Trace data: size %zu bytes\n",
". ... CoreSight %s Trace data: size %#zx bytes\n",
Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org
A couple of things to improve for your next interactions with the Linux community:
1) Using a cover letter, even for small changes, is always a good idea. 2) RB tags should be picked up publicly rather than done internally and added to a patchset. 3) Keep patches semantically grouped. Here patches 04 and 05 have nothing to do with 01, 02 and 03.
Moreover Arnaldo queues changes to the perf tools but I don't see him CC'ed to this patchset. As such he will not see your work. Ask James about how to proceed when submitting patches to the perf tools.
Thanks, Mathieu
cs_etm_decoder__get_name(etmq->decoder), buffer->size);
do { -- 2.17.1
Hi Mathieu,
Thanks for your feedback. I will keep these points in mind for future submissions.
On 23/09/2021 17:24, Mathieu Poirier wrote:
Hi German,
On Thu, Sep 16, 2021 at 04:46:31PM +0100, German Gomez wrote:
[...]
Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org
A couple of things to improve for your next interactions with the Linux community:
- Using a cover letter, even for small changes, is always a good idea.
- RB tags should be picked up publicly rather than done internally and added to
a patchset. 3) Keep patches semantically grouped. Here patches 04 and 05 have nothing to do with 01, 02 and 03.
Did you perhaps mean separating 01 and 02 from the rest? I grouped 03 to 05 because they were related to snapshot mode.
Thanks, German
Moreover Arnaldo queues changes to the perf tools but I don't see him CC'ed to this patchset. As such he will not see your work. Ask James about how to proceed when submitting patches to the perf tools.
Thanks, Mathieu
cs_etm_decoder__get_name(etmq->decoder), buffer->size);
do { -- 2.17.1
On Thu, Sep 30, 2021 at 01:09:16PM +0100, German Gomez wrote:
Hi Mathieu,
Thanks for your feedback. I will keep these points in mind for future submissions.
On 23/09/2021 17:24, Mathieu Poirier wrote:
Hi German,
On Thu, Sep 16, 2021 at 04:46:31PM +0100, German Gomez wrote:
[...]
Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org
A couple of things to improve for your next interactions with the Linux community:
- Using a cover letter, even for small changes, is always a good idea.
- RB tags should be picked up publicly rather than done internally and added to
a patchset. 3) Keep patches semantically grouped. Here patches 04 and 05 have nothing to do with 01, 02 and 03.
Did you perhaps mean separating 01 and 02 from the rest? I grouped 03 to 05 because they were related to snapshot mode.
Yes - you are correct. It should have been 01 and 02 in one set and the rest in another set.
Thanks, German
Moreover Arnaldo queues changes to the perf tools but I don't see him CC'ed to this patchset. As such he will not see your work. Ask James about how to proceed when submitting patches to the perf tools.
Thanks, Mathieu
cs_etm_decoder__get_name(etmq->decoder), buffer->size);
do {
2.17.1