This is the second revision of a set that fix miscellaneous problems preventing snapshot mode from working properly when CoreSight events are used.
It applies cleanly on the coresight next branch[1] and will be posted again when 5.2-rc1 comes out.
Best regards, Mathieu
[1]. https://git.linaro.org/kernel/coresight.git/ branch next
Changes for V2: * Drop requirement to make the perf AUX buffer the same size as the sink buffers. * Re-worked the user space algorithm to find '*head' and '*old". * Fixed typo in changelogs (Leo).
Mathieu Poirier (6): coresight: etb10: Properly set AUX buffer head in snapshot mode coresight: tmc-etr: Properly set AUX buffer head in snapshot mode coresight: tmc-etf: Properly set AUX buffer head in snapshot mode coresight: tmc-etf: Fix snapshot mode update function coresight: perf: Don't set the truncated flag in snapshot mode perf tools: Properly set the value of 'old' and 'head' in snapshot mode
drivers/hwtracing/coresight/coresight-etb10.c | 21 ++- .../hwtracing/coresight/coresight-tmc-etf.c | 28 ++-- .../hwtracing/coresight/coresight-tmc-etr.c | 19 ++- tools/perf/arch/arm/util/cs-etm.c | 124 +++++++++++++++++- 4 files changed, 165 insertions(+), 27 deletions(-)
Unify amongst sink drivers how the AUX ring buffer head is communicated to user space. That way the same algorithm in cs_etm_find_snapshot() can be used to determine where the latest data is and how much of it to access.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- drivers/hwtracing/coresight/coresight-etb10.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..60e753b1768d 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -548,13 +548,14 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER);
/* - * In snapshot mode we have to update the handle->head to point - * to the new location. + * In snapshot mode we simply increment the head by the number of byte + * that were written. User space function cs_etm_find_snapshot() will + * figure out how many bytes to get from the AUX buffer based on the + * position of the head. */ - if (buf->snapshot) { - handle->head = (cur * PAGE_SIZE) + offset; - to_read = buf->nr_pages << PAGE_SHIFT; - } + if (buf->snapshot) + handle->head += to_read; + __etb_enable_hw(drvdata); CS_LOCK(drvdata->base); out:
Hi Mathieu,
On 14/05/2019 20:40, Mathieu Poirier wrote:
Unify amongst sink drivers how the AUX ring buffer head is communicated to user space. That way the same algorithm in cs_etm_find_snapshot()
I would leave the userspace tool's function name out of the commit description and the comment below. We could instead say: "That way the same algorithm can be used by the userspace tool to determine the position and the size of the latest snapshot data."
can be used to determine where the latest data is and how much of it to access.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-etb10.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..60e753b1768d 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -548,13 +548,14 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER); /*
* In snapshot mode we have to update the handle->head to point
* to the new location.
* In snapshot mode we simply increment the head by the number of byte
* that were written. User space function cs_etm_find_snapshot() will
* figure out how many bytes to get from the AUX buffer based on the
*/* position of the head.
- if (buf->snapshot) {
handle->head = (cur * PAGE_SIZE) + offset;
to_read = buf->nr_pages << PAGE_SHIFT;
- }
- if (buf->snapshot)
handle->head += to_read;
- __etb_enable_hw(drvdata); CS_LOCK(drvdata->base); out:
Otherwise:
Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com
Good day Suzuki,
On Wed, 15 May 2019 at 03:45, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi Mathieu,
On 14/05/2019 20:40, Mathieu Poirier wrote:
Unify amongst sink drivers how the AUX ring buffer head is communicated to user space. That way the same algorithm in cs_etm_find_snapshot()
I would leave the userspace tool's function name out of the commit description and the comment below. We could instead say: "That way the same algorithm can be used by the userspace tool to determine the position and the size of the latest snapshot data."
I purposely added the name of the function there so that people can quickly find it and avoid any misunderstanding about the code in question. But I also have the same information as a comment in the code, which should be sufficient. I'll fix it.
can be used to determine where the latest data is and how much of it to access.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-etb10.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..60e753b1768d 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -548,13 +548,14 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER);
/*
* In snapshot mode we have to update the handle->head to point
* to the new location.
* In snapshot mode we simply increment the head by the number of byte
* that were written. User space function cs_etm_find_snapshot() will
* figure out how many bytes to get from the AUX buffer based on the
* position of the head. */
if (buf->snapshot) {
handle->head = (cur * PAGE_SIZE) + offset;
to_read = buf->nr_pages << PAGE_SHIFT;
}
if (buf->snapshot)
handle->head += to_read;
out:__etb_enable_hw(drvdata); CS_LOCK(drvdata->base);
Otherwise:
Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com
Is this for all the kernel space patches or just this one?
Hi Mathieu,
On 15/05/2019 15:28, Mathieu Poirier wrote:
Good day Suzuki,
On Wed, 15 May 2019 at 03:45, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi Mathieu,
On 14/05/2019 20:40, Mathieu Poirier wrote:
Unify amongst sink drivers how the AUX ring buffer head is communicated to user space. That way the same algorithm in cs_etm_find_snapshot()
I would leave the userspace tool's function name out of the commit description and the comment below. We could instead say: "That way the same algorithm can be used by the userspace tool to determine the position and the size of the latest snapshot data."
I purposely added the name of the function there so that people can quickly find it and avoid any misunderstanding about the code in question. But I also have the same information as a comment in the code, which should be sufficient. I'll fix it.
No need to resend the series as it is just the comment and description. You may fix up both before committing.
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..60e753b1768d 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -548,13 +548,14 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER);
/*
* In snapshot mode we have to update the handle->head to point
* to the new location.
* In snapshot mode we simply increment the head by the number of byte
* that were written. User space function cs_etm_find_snapshot() will
* figure out how many bytes to get from the AUX buffer based on the
* position of the head. */
if (buf->snapshot) {
handle->head = (cur * PAGE_SIZE) + offset;
to_read = buf->nr_pages << PAGE_SHIFT;
}
if (buf->snapshot)
handle->head += to_read;
out:__etb_enable_hw(drvdata); CS_LOCK(drvdata->base);
Otherwise:
Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com
Is this for all the kernel space patches or just this one?
It was initially for the first patch. But now I realize that all the other patches are of similar approach. I will add a different tag for better tracking.
Cheers Suzuki
Unify amongst sink drivers how the AUX ring buffer head is communicated to user space. That way the same algorithm in cs_etm_find_snapshot() can be used to determine where the latest data is and how much of it to access.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index df6e4b0b84e9..cc8401c76c39 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1501,14 +1501,13 @@ tmc_update_etr_buffer(struct coresight_device *csdev, tmc_etr_sync_perf_buffer(etr_perf);
/* - * Update handle->head in snapshot mode. Also update the size to the - * hardware buffer size if there was an overflow. + * In snapshot mode we simply increment the head by the number of byte + * that were written. User space function cs_etm_find_snapshot() will + * figure out how many bytes to get from the AUX buffer based on the + * position of the head. */ - if (etr_perf->snapshot) { + if (etr_perf->snapshot) handle->head += size; - if (etr_buf->full) - size = etr_buf->size; - }
lost |= etr_buf->full; out:
Unify amongst sink drivers how the AUX ring buffer head is communicated to user space. That way the same algorithm in cs_etm_find_snapshot() can be used to determine where the latest data is and how much of it to access.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- drivers/hwtracing/coresight/coresight-tmc-etf.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 2527b5d3b65e..d026bd04a6af 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -542,11 +542,15 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, } }
- /* In snapshot mode we have to update the head */ - if (buf->snapshot) { - handle->head = (cur * PAGE_SIZE) + offset; - to_read = buf->nr_pages << PAGE_SHIFT; - } + /* + * In snapshot mode we simply increment the head by the number of byte + * that were written. User space function cs_etm_find_snapshot() will + * figure out how many bytes to get from the AUX buffer based on the + * position of the head. + */ + if (buf->snapshot) + handle->head += to_read; + CS_LOCK(drvdata->base); out: spin_unlock_irqrestore(&drvdata->spinlock, flags);
When working in snapshot mode function perf_aux_output_begin() does not set the handle->size because the size is expected to be deduced by the placement of the "head" and "old" pointers in user space. As such there is no point in trying to adjust the amount of data to copy to the ring buffer.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org Reviewed-by: Leo Yan leo.yan@linaro.org Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index d026bd04a6af..31d41e2ad955 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -477,9 +477,11 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, /* * The TMC RAM buffer may be bigger than the space available in the * perf ring buffer (handle->size). If so advance the RRP so that we - * get the latest trace data. + * get the latest trace data. In snapshot mode none of that matters + * since we are expected to clobber stale data in favour of the latest + * traces. */ - if (to_read > handle->size) { + if (!buf->snapshot && to_read > handle->size) { u32 mask = 0;
/*
This patch avoids setting the truncated flag when operating in snapshot mode since the trace buffer is expected to be truncated and discontinuous from one snapshot to another. Moreover when the truncated flag is set the perf core stops enabling the event, waiting for user space to consume the data. In snapshot mode this is clearly not what we want since it results in stale data.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- drivers/hwtracing/coresight/coresight-etb10.c | 8 +++++++- drivers/hwtracing/coresight/coresight-tmc-etf.c | 8 +++++++- drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 +++++++- 3 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 60e753b1768d..516d67cd7759 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -512,7 +512,13 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, lost = true; }
- if (lost) + /* + * Don't set the TRUNCATED flag in snapshot mode because 1) the + * captured buffer is expected to be truncated and 2) a full buffer + * prevents the event from being re-enabled by the perf core, + * resulting in stale data being send to user space. + */ + if (!buf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
/* finally tell HW where we want to start reading from */ diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 31d41e2ad955..bd5f3b57eebd 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -518,7 +518,13 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, lost = true; }
- if (lost) + /* + * Don't set the TRUNCATED flag in snapshot mode because 1) the + * captured buffer is expected to be truncated and 2) a full buffer + * prevents the event from being re-enabled by the perf core, + * resulting in stale data being send to user space. + */ + if (!buf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
cur = buf->cur; diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index cc8401c76c39..1fc3db8045e1 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1511,7 +1511,13 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
lost |= etr_buf->full; out: - if (lost) + /* + * Don't set the TRUNCATED flag in snapshot mode because 1) the + * captured buffer is expected to be truncated and 2) a full buffer + * prevents the event from being re-enabled by the perf core, + * resulting in stale data being send to user space. + */ + if (!etr_perf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); return size; }
On 14/05/2019 20:40, Mathieu Poirier wrote:
This patch avoids setting the truncated flag when operating in snapshot mode since the trace buffer is expected to be truncated and discontinuous from one snapshot to another. Moreover when the truncated flag is set the perf core stops enabling the event, waiting for user space to consume the data. In snapshot mode this is clearly not what we want since it results in stale data.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
For patches 2-5:
Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com
This patch adds the necessay intelligence to properly compute the value of 'old' and 'head' when operating in snapshot mode. That way we can get the latest information in the AUX buffer and be compatible with the generic AUX ring buffer mechanic.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/arch/arm/util/cs-etm.c | 124 ++++++++++++++++++++++++++++-- 1 file changed, 119 insertions(+), 5 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 911426721170..3734e3fd18f8 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -31,6 +31,8 @@ struct cs_etm_recording { struct auxtrace_record itr; struct perf_pmu *cs_etm_pmu; struct perf_evlist *evlist; + int wrapped_cnt; + bool *wrapped; bool snapshot_mode; size_t snapshot_size; }; @@ -536,16 +538,126 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, return 0; }
-static int cs_etm_find_snapshot(struct auxtrace_record *itr __maybe_unused, +static int cs_etm_alloc_wrapped_array(struct cs_etm_recording *ptr, int idx) +{ + bool *wrapped; + int cnt = ptr->wrapped_cnt; + + /* Make @ptr->wrapped as big as @idx */ + while (cnt <= idx) + cnt++; + + /* Free'ed in cs_etm_recording_free() */ + wrapped = reallocarray(ptr->wrapped, cnt, sizeof(bool)); + if (!wrapped) + return -ENOMEM; + + ptr->wrapped_cnt = cnt; + ptr->wrapped = wrapped; + + return 0; +} + +static bool cs_etm_buffer_has_wrapped(unsigned char *buffer, + size_t buffer_size, u64 head) +{ + u64 i, watermark; + u64 *buf = (u64 *)buffer; + size_t buf_size = buffer_size; + + /* + * We want to look the very last 512 byte (chosen arbitrarily) in + * the ring buffer. + */ + watermark = buf_size - 512; + + /* + * @head is continuously increasing - if its value is equal or greater + * than the size of the ring buffer, it has wrapped around. + */ + if (head >= buffer_size) + return true; + + /* + * The value of @head is somewhere within the size of the ring buffer. + * This can be that there hasn't been enough data to fill the ring + * buffer yet or the trace time was so long that @head has numerically + * wrapped around. To find we need to check if we have data at the very + * end of the ring buffer. We can reliably do this because mmap'ed + * pages are zeroed out and there is a fresh mapping with every new + * session. + */ + + /* @head is less than 512 byte from the end of the ring buffer */ + if (head > watermark) + watermark = head; + + /* + * Speed things up by using 64 bit transactions (see "u64 *buf" above) + */ + watermark >>= 3; + buf_size >>= 3; + + /* + * If we find trace data at the end of the ring buffer, @head has + * been there and has numerically wrapped around at least once. + */ + for (i = watermark; i < buf_size; i++) + if (buf[i]) + return true; + + return false; +} + +static int cs_etm_find_snapshot(struct auxtrace_record *itr, int idx, struct auxtrace_mmap *mm, - unsigned char *data __maybe_unused, + unsigned char *data, u64 *head, u64 *old) { - pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n", + int err; + bool wrapped; + struct cs_etm_recording *ptr = + container_of(itr, struct cs_etm_recording, itr); + + /* + * Allocate memory to keep track of wrapping if this is the first + * time we deal with this *mm. + */ + if (idx >= ptr->wrapped_cnt) { + err = cs_etm_alloc_wrapped_array(ptr, idx); + if (err) + return err; + } + + /* + * Check to see if *head has wrapped around. If it hasn't only the + * amount of data between *head and *old is snapshot'ed to avoid + * bloating the perf.data file with zeros. But as soon as *head has + * wrapped around the entire size of the AUX ring buffer it taken. + */ + wrapped = ptr->wrapped[idx]; + if (!wrapped && cs_etm_buffer_has_wrapped(data, mm->len, *head)) { + wrapped = true; + ptr->wrapped[idx] = true; + } + + pr_debug3("%s: mmap index %d old head %lx new head %lx size %lx\n", __func__, idx, (size_t)*old, (size_t)*head, mm->len);
- *old = *head; - *head += mm->len; + /* No wrap has occurred, we can just use *head and *old. */ + if (!wrapped) + return 0; + + /* + * *head has wrapped around - adjust *head and *old to pickup the + * entire content of the AUX buffer. + */ + if (*head >= mm->len) { + *old = *head - mm->len; + } else { + *head += mm->len; + *old = *head - mm->len; + }
return 0; } @@ -586,6 +698,8 @@ static void cs_etm_recording_free(struct auxtrace_record *itr) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); + + zfree(&ptr->wrapped); free(ptr); }
On Tue, May 14, 2019 at 01:40:18PM -0600, Mathieu Poirier wrote:
This patch adds the necessay intelligence to properly compute the value of 'old' and 'head' when operating in snapshot mode. That way we can get the latest information in the AUX buffer and be compatible with the generic AUX ring buffer mechanic.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
tools/perf/arch/arm/util/cs-etm.c | 124 ++++++++++++++++++++++++++++-- 1 file changed, 119 insertions(+), 5 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 911426721170..3734e3fd18f8 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -31,6 +31,8 @@ struct cs_etm_recording { struct auxtrace_record itr; struct perf_pmu *cs_etm_pmu; struct perf_evlist *evlist;
- int wrapped_cnt;
- bool *wrapped; bool snapshot_mode; size_t snapshot_size;
}; @@ -536,16 +538,126 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, return 0; } -static int cs_etm_find_snapshot(struct auxtrace_record *itr __maybe_unused, +static int cs_etm_alloc_wrapped_array(struct cs_etm_recording *ptr, int idx) +{
- bool *wrapped;
- int cnt = ptr->wrapped_cnt;
- /* Make @ptr->wrapped as big as @idx */
- while (cnt <= idx)
cnt++;
- /* Free'ed in cs_etm_recording_free() */
- wrapped = reallocarray(ptr->wrapped, cnt, sizeof(bool));
- if (!wrapped)
return -ENOMEM;
- ptr->wrapped_cnt = cnt;
- ptr->wrapped = wrapped;
- return 0;
+}
+static bool cs_etm_buffer_has_wrapped(unsigned char *buffer,
size_t buffer_size, u64 head)
+{
- u64 i, watermark;
- u64 *buf = (u64 *)buffer;
- size_t buf_size = buffer_size;
- /*
* We want to look the very last 512 byte (chosen arbitrarily) in
* the ring buffer.
*/
- watermark = buf_size - 512;
- /*
* @head is continuously increasing - if its value is equal or greater
* than the size of the ring buffer, it has wrapped around.
*/
- if (head >= buffer_size)
return true;
- /*
* The value of @head is somewhere within the size of the ring buffer.
* This can be that there hasn't been enough data to fill the ring
* buffer yet or the trace time was so long that @head has numerically
* wrapped around. To find we need to check if we have data at the very
* end of the ring buffer. We can reliably do this because mmap'ed
* pages are zeroed out and there is a fresh mapping with every new
* session.
*/
- /* @head is less than 512 byte from the end of the ring buffer */
- if (head > watermark)
watermark = head;
- /*
* Speed things up by using 64 bit transactions (see "u64 *buf" above)
*/
- watermark >>= 3;
- buf_size >>= 3;
- /*
* If we find trace data at the end of the ring buffer, @head has
* been there and has numerically wrapped around at least once.
*/
- for (i = watermark; i < buf_size; i++)
if (buf[i])
I think here should be: if (buf[i << 3])
Otherwise, this patch looks good to me.
return true;
- return false;
+}
+static int cs_etm_find_snapshot(struct auxtrace_record *itr, int idx, struct auxtrace_mmap *mm,
unsigned char *data __maybe_unused,
unsigned char *data, u64 *head, u64 *old)
{
- pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n",
- int err;
- bool wrapped;
- struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
- /*
* Allocate memory to keep track of wrapping if this is the first
* time we deal with this *mm.
*/
- if (idx >= ptr->wrapped_cnt) {
err = cs_etm_alloc_wrapped_array(ptr, idx);
if (err)
return err;
- }
- /*
* Check to see if *head has wrapped around. If it hasn't only the
* amount of data between *head and *old is snapshot'ed to avoid
* bloating the perf.data file with zeros. But as soon as *head has
* wrapped around the entire size of the AUX ring buffer it taken.
*/
- wrapped = ptr->wrapped[idx];
- if (!wrapped && cs_etm_buffer_has_wrapped(data, mm->len, *head)) {
wrapped = true;
ptr->wrapped[idx] = true;
- }
- pr_debug3("%s: mmap index %d old head %lx new head %lx size %lx\n", __func__, idx, (size_t)*old, (size_t)*head, mm->len);
- *old = *head;
- *head += mm->len;
- /* No wrap has occurred, we can just use *head and *old. */
- if (!wrapped)
return 0;
- /*
* *head has wrapped around - adjust *head and *old to pickup the
* entire content of the AUX buffer.
*/
- if (*head >= mm->len) {
*old = *head - mm->len;
- } else {
*head += mm->len;
*old = *head - mm->len;
- }
return 0; } @@ -586,6 +698,8 @@ static void cs_etm_recording_free(struct auxtrace_record *itr) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr);
- zfree(&ptr->wrapped); free(ptr);
} -- 2.17.1
On Thu, 16 May 2019 at 09:00, Leo Yan leo.yan@linaro.org wrote:
On Tue, May 14, 2019 at 01:40:18PM -0600, Mathieu Poirier wrote:
This patch adds the necessay intelligence to properly compute the value of 'old' and 'head' when operating in snapshot mode. That way we can get the latest information in the AUX buffer and be compatible with the generic AUX ring buffer mechanic.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
tools/perf/arch/arm/util/cs-etm.c | 124 ++++++++++++++++++++++++++++-- 1 file changed, 119 insertions(+), 5 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 911426721170..3734e3fd18f8 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -31,6 +31,8 @@ struct cs_etm_recording { struct auxtrace_record itr; struct perf_pmu *cs_etm_pmu; struct perf_evlist *evlist;
int wrapped_cnt;
bool *wrapped; bool snapshot_mode; size_t snapshot_size;
}; @@ -536,16 +538,126 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, return 0; }
-static int cs_etm_find_snapshot(struct auxtrace_record *itr __maybe_unused, +static int cs_etm_alloc_wrapped_array(struct cs_etm_recording *ptr, int idx) +{
bool *wrapped;
int cnt = ptr->wrapped_cnt;
/* Make @ptr->wrapped as big as @idx */
while (cnt <= idx)
cnt++;
/* Free'ed in cs_etm_recording_free() */
wrapped = reallocarray(ptr->wrapped, cnt, sizeof(bool));
if (!wrapped)
return -ENOMEM;
ptr->wrapped_cnt = cnt;
ptr->wrapped = wrapped;
return 0;
+}
+static bool cs_etm_buffer_has_wrapped(unsigned char *buffer,
size_t buffer_size, u64 head)
+{
u64 i, watermark;
u64 *buf = (u64 *)buffer;
size_t buf_size = buffer_size;
/*
* We want to look the very last 512 byte (chosen arbitrarily) in
* the ring buffer.
*/
watermark = buf_size - 512;
/*
* @head is continuously increasing - if its value is equal or greater
* than the size of the ring buffer, it has wrapped around.
*/
if (head >= buffer_size)
return true;
/*
* The value of @head is somewhere within the size of the ring buffer.
* This can be that there hasn't been enough data to fill the ring
* buffer yet or the trace time was so long that @head has numerically
* wrapped around. To find we need to check if we have data at the very
* end of the ring buffer. We can reliably do this because mmap'ed
* pages are zeroed out and there is a fresh mapping with every new
* session.
*/
/* @head is less than 512 byte from the end of the ring buffer */
if (head > watermark)
watermark = head;
/*
* Speed things up by using 64 bit transactions (see "u64 *buf" above)
*/
watermark >>= 3;
buf_size >>= 3;
/*
* If we find trace data at the end of the ring buffer, @head has
* been there and has numerically wrapped around at least once.
*/
for (i = watermark; i < buf_size; i++)
if (buf[i])
I think here should be: if (buf[i << 3])
It would be if buf[] was a char *, but it is a u64 *.
Otherwise, this patch looks good to me.
return true;
return false;
+}
+static int cs_etm_find_snapshot(struct auxtrace_record *itr, int idx, struct auxtrace_mmap *mm,
unsigned char *data __maybe_unused,
unsigned char *data, u64 *head, u64 *old)
{
pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n",
int err;
bool wrapped;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
/*
* Allocate memory to keep track of wrapping if this is the first
* time we deal with this *mm.
*/
if (idx >= ptr->wrapped_cnt) {
err = cs_etm_alloc_wrapped_array(ptr, idx);
if (err)
return err;
}
/*
* Check to see if *head has wrapped around. If it hasn't only the
* amount of data between *head and *old is snapshot'ed to avoid
* bloating the perf.data file with zeros. But as soon as *head has
* wrapped around the entire size of the AUX ring buffer it taken.
*/
wrapped = ptr->wrapped[idx];
if (!wrapped && cs_etm_buffer_has_wrapped(data, mm->len, *head)) {
wrapped = true;
ptr->wrapped[idx] = true;
}
pr_debug3("%s: mmap index %d old head %lx new head %lx size %lx\n", __func__, idx, (size_t)*old, (size_t)*head, mm->len);
*old = *head;
*head += mm->len;
/* No wrap has occurred, we can just use *head and *old. */
if (!wrapped)
return 0;
/*
* *head has wrapped around - adjust *head and *old to pickup the
* entire content of the AUX buffer.
*/
if (*head >= mm->len) {
*old = *head - mm->len;
} else {
*head += mm->len;
*old = *head - mm->len;
} return 0;
} @@ -586,6 +698,8 @@ static void cs_etm_recording_free(struct auxtrace_record *itr) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr);
zfree(&ptr->wrapped); free(ptr);
}
-- 2.17.1
On Mon, May 20, 2019 at 01:53:29PM -0600, Mathieu Poirier wrote:
[...]
+static bool cs_etm_buffer_has_wrapped(unsigned char *buffer,
size_t buffer_size, u64 head)
+{
u64 i, watermark;
u64 *buf = (u64 *)buffer;
size_t buf_size = buffer_size;
/*
* We want to look the very last 512 byte (chosen arbitrarily) in
* the ring buffer.
*/
watermark = buf_size - 512;
/*
* @head is continuously increasing - if its value is equal or greater
* than the size of the ring buffer, it has wrapped around.
*/
if (head >= buffer_size)
return true;
/*
* The value of @head is somewhere within the size of the ring buffer.
* This can be that there hasn't been enough data to fill the ring
* buffer yet or the trace time was so long that @head has numerically
* wrapped around. To find we need to check if we have data at the very
* end of the ring buffer. We can reliably do this because mmap'ed
* pages are zeroed out and there is a fresh mapping with every new
* session.
*/
/* @head is less than 512 byte from the end of the ring buffer */
if (head > watermark)
watermark = head;
/*
* Speed things up by using 64 bit transactions (see "u64 *buf" above)
*/
watermark >>= 3;
buf_size >>= 3;
/*
* If we find trace data at the end of the ring buffer, @head has
* been there and has numerically wrapped around at least once.
*/
for (i = watermark; i < buf_size; i++)
if (buf[i])
I think here should be: if (buf[i << 3])
It would be if buf[] was a char *, but it is a u64 *.
You are right, I missed that. Sorry for noise.
Thanks, Leo Yan
On Tue, May 14, 2019 at 01:40:12PM -0600, Mathieu Poirier wrote:
This is the second revision of a set that fix miscellaneous problems preventing snapshot mode from working properly when CoreSight events are used.
It applies cleanly on the coresight next branch[1] and will be posted again when 5.2-rc1 comes out.
I tested this patch set and it works as expected with ETF / ETR on Juno board.
Tested-by: Leo Yan leo.yan@linaro.org
Thanks, Leo Yan