This set fix miscellaneous problems that prevented perf's snapshot mode from working properly when CoreSight events are used.
Given the late state in the cycle, it is sent out for reviewing purposes and is not intended for the coming merge window.
Applies cleanly to coresight next[1].
Regards, Mathieu
[1]. https://git.linaro.org/kernel/coresight.git/log/?h=next
Mathieu Poirier (5): coresight: Fix buffer size in snapshot mode coresight: tmc-etf: Fix snapshot mode update function coresight: perf: Don't set the truncated flag in snapshot mode perf tools: Properly set the value of 'old' in snapshot mode docs: coresight: Document snapshot mode
Documentation/trace/coresight.txt | 41 ++++++++++++++++++- drivers/hwtracing/coresight/coresight-etb10.c | 31 +++++++++++++- .../hwtracing/coresight/coresight-tmc-etf.c | 34 +++++++++++++-- .../hwtracing/coresight/coresight-tmc-etr.c | 16 ++++++-- tools/perf/arch/arm/util/cs-etm.c | 12 +++++- 5 files changed, 124 insertions(+), 10 deletions(-)
In snapshot mode the buffer used by the sink devices need to be equal to the ring buffer size in order for the user space mechanic to work properly.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- drivers/hwtracing/coresight/coresight-etb10.c | 23 +++++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etf.c | 20 ++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etr.c | 8 +++++-- 3 files changed, 49 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..0764647b92bc 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -374,7 +374,30 @@ static void *etb_alloc_buffer(struct coresight_device *csdev, int nr_pages, bool overwrite) { int node, cpu = event->cpu; + u32 capacity; struct cs_buffers *buf; + struct etb_drvdata *drvdata; + + /* + * In snapsot mode the size of the perf ring buffer needs to be equal + * to the size of the device's internal memory if we want to reuse the + * generic AUX buffer management mechanic. + * + * For example (assuming 4096 byte page size): + * + * # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp + * 0x2000 + * # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP + * + */ + drvdata = dev_get_drvdata(csdev->dev.parent); + capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS; + + if (overwrite && + ((nr_pages << PAGE_SHIFT) != capacity)) { + dev_err(&csdev->dev, "Ring buffer not equal to device buffer"); + return NULL; + }
if (cpu == -1) cpu = smp_processor_id(); diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 2527b5d3b65e..7694833b13cb 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -380,6 +380,26 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, { int node, cpu = event->cpu; struct cs_buffers *buf; + struct tmc_drvdata *drvdata; + + /* + * In snapsot mode the size of the perf ring buffer needs to be equal + * to the size of the device's internal memory if we want to reuse the + * generic AUX buffer management mechanic. + * + * For example (assuming 4096 byte page size): + * + * # cat /sys/bus/coresight/devices/20010000.etf/buffer_size + * 0x10000 + * # perf record -e cs_etm/@20010000.etf/ -S -m,16 --per-thread $APP + * + */ + drvdata = dev_get_drvdata(csdev->dev.parent); + if (overwrite && + ((nr_pages << PAGE_SHIFT) != drvdata->size)) { + dev_err(&csdev->dev, "Ring buffer not equal to device buffer"); + return NULL; + }
if (cpu == -1) cpu = smp_processor_id(); diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index df6e4b0b84e9..b9881d6d41ba 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1188,9 +1188,13 @@ alloc_etr_buf(struct tmc_drvdata *drvdata, struct perf_event *event,
/* * Try to match the perf ring buffer size if it is larger - * than the size requested via sysfs. + * than the size requested via sysfs. In snapsot mode the size + * of the perf ring buffer needs to be equal to the allocated + * size if we want to reuse the generic AUX buffer management + * mechanic. */ - if ((nr_pages << PAGE_SHIFT) > drvdata->size) { + if (snapshot || + (nr_pages << PAGE_SHIFT) > drvdata->size) { etr_buf = tmc_alloc_etr_buf(drvdata, (nr_pages << PAGE_SHIFT), 0, node, NULL); if (!IS_ERR(etr_buf))
When working in snapshot mode function perf_aux_output_begin() does not set the handle->size because the size is expected to be deduced by the placement of the "head" and "old" pointers in user space. As such there is no point in trying to adjust the amount of data to copy to the ring buffer.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 7694833b13cb..d3025634f5e6 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -497,9 +497,11 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, /* * The TMC RAM buffer may be bigger than the space available in the * perf ring buffer (handle->size). If so advance the RRP so that we - * get the latest trace data. + * get the latest trace data. In snapshot mode none of that matters + * since we are expected to clobber stale data in favour of the latest + * traces. */ - if (to_read > handle->size) { + if (!buf->snapshot && to_read > handle->size) { u32 mask = 0;
/*
This patch avoids setting the truncated flag when operaring in snapshot mode since the trace buffer is expected to be truncated and discontinuous from one snapshot to another. Moreover when the truncated flag is set the perf core stops enabling the event, waiting for user space to consume the data. In snapshot mode this is clearly not what we want since it results in stale data.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- drivers/hwtracing/coresight/coresight-etb10.c | 8 +++++++- drivers/hwtracing/coresight/coresight-tmc-etf.c | 8 +++++++- drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 +++++++- 3 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 0764647b92bc..6ff48be91f61 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -535,7 +535,13 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, lost = true; }
- if (lost) + /* + * Don't set the TRUNCATED flag in snapshot mode because 1) the + * captured buffer is expected to be truncated and 2) a full buffer + * prevents the event from being re-enabled by the perf core, + * resulting in stale data being send to user space. + */ + if (!buf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
/* finally tell HW where we want to start reading from */ diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index d3025634f5e6..8039bd389034 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -538,7 +538,13 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, lost = true; }
- if (lost) + /* + * Don't set the TRUNCATED flag in snapshot mode because 1) the + * captured buffer is expected to be truncated and 2) a full buffer + * prevents the event from being re-enabled by the perf core, + * resulting in stale data being send to user space. + */ + if (!buf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
cur = buf->cur; diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index b9881d6d41ba..718586a083af 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1516,7 +1516,13 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
lost |= etr_buf->full; out: - if (lost) + /* + * Don't set the TRUNCATED flag in snapshot mode because 1) the + * captured buffer is expected to be truncated and 2) a full buffer + * prevents the event from being re-enabled by the perf core, + * resulting in stale data being send to user space. + */ + if (!etr_perf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); return size; }
In snapshot mode the value of the 'old' pointer needs to be adjusted when 'head' has wrapped around in order to get the latest information in the buffer and be compatible with the generic AUX ring buffer mechanic.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- tools/perf/arch/arm/util/cs-etm.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 911426721170..4e73fe1a6978 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -541,11 +541,19 @@ static int cs_etm_find_snapshot(struct auxtrace_record *itr __maybe_unused, unsigned char *data __maybe_unused, u64 *head, u64 *old) { + bool wrapped; + pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n", __func__, idx, (size_t)*old, (size_t)*head, mm->len);
- *old = *head; - *head += mm->len; + /* + * If the last byte in the ring buffer isn't zero, the head has + * wrapped around. + */ + wrapped = !!(data[mm->len - 1]); + + if (wrapped) + *old = *head - mm->len;
return 0; }
Because of the resctriction imposed by the internal memory buffer of some CoreSight sink, using the framework in snapshot mode can be a little trickly. As such document the process to make user experience more enjoyable.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org --- Documentation/trace/coresight.txt | 41 ++++++++++++++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-)
diff --git a/Documentation/trace/coresight.txt b/Documentation/trace/coresight.txt index efbc832146e7..6c602ae379e2 100644 --- a/Documentation/trace/coresight.txt +++ b/Documentation/trace/coresight.txt @@ -363,7 +363,6 @@ The --itrace option controls the type and frequency of synthesized events Note that only 64-bit programs are currently supported - further work is required to support instruction decode of 32-bit Arm programs.
- Generating coverage files for Feedback Directed Optimization: AutoFDO ---------------------------------------------------------------------
@@ -394,6 +393,46 @@ sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto Bubble sorting array of 30000 elements 5806 ms
+2.2) Snapshot mode: + +Using perf's built-in snapshot mode with CoreSight tracers is supported - to +do so the '-S' command line option needs to be specified. Since current sink +devices are used in double-buffer mode when operated from the perf interface, +the size of the perf ring buffer needs to be adjusted to match the size of the +buffer used by the CoreSight sinks. From the perf command line it is possible +to specify the number of pages to use for a session using the '-m,X' option, +where X is the amount of pages. + +The system memory buffer used by ETR devices is automatically adjusted +to match the size of the perf ring buffer and as such does not need to be +modified on the perf command line. For ETB and ETF devices the perf ring +buffer size need to be adjusted to match the size of the internal buffer. + +The following examples assume a system page size of 4096 byte: + + # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp + 0x2000 + # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP + + # cat /sys/bus/coresight/devices/20010000.etf/buffer_size + 0x10000 + # perf record -e cs_etm/@20010000.etf/ -S -m,16 --per-thread $APP + + # perf record -e cs_etm/@20070000.etr/ -S --per-thread $APP + +Once an application is launched trace snapshot are collected by sending the +USR2 message to the process being monitored: + + # perf record -e cs_etm/@20070000.etr/ -S --per-thread $APP & + [1] 14808 + # kill -USR2 14808 + ... + ... + # kill -USR2 14808 + ... + ... + # kill 14808 +
How to use the STM module -------------------------
On Wed, May 01, 2019 at 11:50:48AM -0600, Mathieu Poirier wrote:
In snapshot mode the buffer used by the sink devices need to be equal to the ring buffer size in order for the user space mechanic to work properly.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-etb10.c | 23 +++++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etf.c | 20 ++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etr.c | 8 +++++-- 3 files changed, 49 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..0764647b92bc 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -374,7 +374,30 @@ static void *etb_alloc_buffer(struct coresight_device *csdev, int nr_pages, bool overwrite) { int node, cpu = event->cpu;
- u32 capacity; struct cs_buffers *buf;
- struct etb_drvdata *drvdata;
- /*
* In snapsot mode the size of the perf ring buffer needs to be equal
* to the size of the device's internal memory if we want to reuse the
* generic AUX buffer management mechanic.
*
* For example (assuming 4096 byte page size):
Here is delibrately to write as '4096 byte'? Though I think should be '4096 bytes' but I am not confident which is right ...
*
* # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
* 0x2000
* # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
*
*/
- drvdata = dev_get_drvdata(csdev->dev.parent);
- capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS;
- if (overwrite &&
((nr_pages << PAGE_SHIFT) != capacity)) {
dev_err(&csdev->dev, "Ring buffer not equal to device buffer");
return NULL;
- }
if (cpu == -1) cpu = smp_processor_id(); diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 2527b5d3b65e..7694833b13cb 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -380,6 +380,26 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, { int node, cpu = event->cpu; struct cs_buffers *buf;
- struct tmc_drvdata *drvdata;
- /*
* In snapsot mode the size of the perf ring buffer needs to be equal
* to the size of the device's internal memory if we want to reuse the
* generic AUX buffer management mechanic.
*
* For example (assuming 4096 byte page size):
*
* # cat /sys/bus/coresight/devices/20010000.etf/buffer_size
* 0x10000
* # perf record -e cs_etm/@20010000.etf/ -S -m,16 --per-thread $APP
*
*/
- drvdata = dev_get_drvdata(csdev->dev.parent);
- if (overwrite &&
((nr_pages << PAGE_SHIFT) != drvdata->size)) {
dev_err(&csdev->dev, "Ring buffer not equal to device buffer");
return NULL;
- }
if (cpu == -1) cpu = smp_processor_id(); diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index df6e4b0b84e9..b9881d6d41ba 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1188,9 +1188,13 @@ alloc_etr_buf(struct tmc_drvdata *drvdata, struct perf_event *event, /* * Try to match the perf ring buffer size if it is larger
* than the size requested via sysfs.
* than the size requested via sysfs. In snapsot mode the size
* of the perf ring buffer needs to be equal to the allocated
* size if we want to reuse the generic AUX buffer management
*/* mechanic.
- if ((nr_pages << PAGE_SHIFT) > drvdata->size) {
- if (snapshot ||
etr_buf = tmc_alloc_etr_buf(drvdata, (nr_pages << PAGE_SHIFT), 0, node, NULL); if (!IS_ERR(etr_buf))(nr_pages << PAGE_SHIFT) > drvdata->size) {
If tmc_alloc_etr_buf() returns failure then it's possible to run into the below sequence to allocate smaller buffer size for snapshot mode; which is not expected for snapshot mode.
So here if tmc_alloc_etr_buf() fails to allocate buffer for snapshot mode, should directly bail out with error code.
Thanks, Leo Yan
On Wed, May 01, 2019 at 11:50:49AM -0600, Mathieu Poirier wrote:
When working in snapshot mode function perf_aux_output_begin()
Do you mean perf_aux_output_end() rather than perf_aux_output_begin()?
I checked perf_aux_output_begin(), it will always set 'handle->size' to zero.
does not set the handle->size because the size is expected to be deduced by the placement of the "head" and "old" pointers in user space. As such there is no point in trying to adjust the amount of data to copy to the ring buffer.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
Rest looks good to me:
Reviewed-by: Leo Yan leo.yan@linaro.org
drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 7694833b13cb..d3025634f5e6 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -497,9 +497,11 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, /* * The TMC RAM buffer may be bigger than the space available in the * perf ring buffer (handle->size). If so advance the RRP so that we
* get the latest trace data.
* get the latest trace data. In snapshot mode none of that matters
* since we are expected to clobber stale data in favour of the latest
*/* traces.
- if (to_read > handle->size) {
- if (!buf->snapshot && to_read > handle->size) { u32 mask = 0;
/* -- 2.17.1
On Wed, May 01, 2019 at 11:50:50AM -0600, Mathieu Poirier wrote:
This patch avoids setting the truncated flag when operaring in snapshot
s/operaring/operating
mode since the trace buffer is expected to be truncated and discontinuous from one snapshot to another. Moreover when the truncated flag is set the perf core stops enabling the event, waiting for user space to consume the data. In snapshot mode this is clearly not what we want since it results in stale data.
Not sure if I understand correctly or not.
If set TRUNCATED flag and the user space has finished to read out the trace data, will perf not re-enable the event anymore for snapshot mode?
Seems to me, the perf core code cannot handle properly for TRUNCATED flag with snapshot mode. Sorry if introduce noise, will look into the perf core code.
Thanks, Leo Yan
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-etb10.c | 8 +++++++- drivers/hwtracing/coresight/coresight-tmc-etf.c | 8 +++++++- drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 +++++++- 3 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 0764647b92bc..6ff48be91f61 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -535,7 +535,13 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, lost = true; }
- if (lost)
- /*
* Don't set the TRUNCATED flag in snapshot mode because 1) the
* captured buffer is expected to be truncated and 2) a full buffer
* prevents the event from being re-enabled by the perf core,
* resulting in stale data being send to user space.
*/
- if (!buf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
/* finally tell HW where we want to start reading from */ diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index d3025634f5e6..8039bd389034 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -538,7 +538,13 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, lost = true; }
- if (lost)
- /*
* Don't set the TRUNCATED flag in snapshot mode because 1) the
* captured buffer is expected to be truncated and 2) a full buffer
* prevents the event from being re-enabled by the perf core,
* resulting in stale data being send to user space.
*/
- if (!buf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
cur = buf->cur; diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index b9881d6d41ba..718586a083af 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1516,7 +1516,13 @@ tmc_update_etr_buffer(struct coresight_device *csdev, lost |= etr_buf->full; out:
- if (lost)
- /*
* Don't set the TRUNCATED flag in snapshot mode because 1) the
* captured buffer is expected to be truncated and 2) a full buffer
* prevents the event from being re-enabled by the perf core,
* resulting in stale data being send to user space.
*/
- if (!etr_perf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); return size;
}
2.17.1
On Wed, May 01, 2019 at 11:50:51AM -0600, Mathieu Poirier wrote:
In snapshot mode the value of the 'old' pointer needs to be adjusted when 'head' has wrapped around in order to get the latest information in the buffer and be compatible with the generic AUX ring buffer mechanic.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
tools/perf/arch/arm/util/cs-etm.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 911426721170..4e73fe1a6978 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -541,11 +541,19 @@ static int cs_etm_find_snapshot(struct auxtrace_record *itr __maybe_unused, unsigned char *data __maybe_unused, u64 *head, u64 *old) {
- bool wrapped;
- pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n", __func__, idx, (size_t)*old, (size_t)*head, mm->len);
- *old = *head;
- *head += mm->len;
- /*
* If the last byte in the ring buffer isn't zero, the head has
* wrapped around.
*/
- wrapped = !!(data[mm->len - 1]);
This is confused for me since I can think out two cases might break this checking.
The first case is the trace data stream might be zero at the end of the buffer; the second case is that the buffer is not really wrapped around at this time but the end of buffer contains the stale data by previous time.
Could you confirm both cases will not happen?
Will do more testing for this patch set.
Thanks, Leo Yan
- if (wrapped)
*old = *head - mm->len;
return 0; } -- 2.17.1
Hi Mathieu,
On 01/05/2019 18:50, Mathieu Poirier wrote:
In snapshot mode the buffer used by the sink devices need to be equal to the ring buffer size in order for the user space mechanic to work properly.
The patch as such looks good to me. However I don't understand the need for it for ETB and ETF. We can't use the AUX buf directly anyway for these devices. We could always pretend that there was no overflow and simply copy it to the AUX buf. The decoder would know the end of trace packets. What am I missing here ?
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-etb10.c | 23 +++++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etf.c | 20 ++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etr.c | 8 +++++-- 3 files changed, 49 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..0764647b92bc 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -374,7 +374,30 @@ static void *etb_alloc_buffer(struct coresight_device *csdev, int nr_pages, bool overwrite) { int node, cpu = event->cpu;
- u32 capacity; struct cs_buffers *buf;
- struct etb_drvdata *drvdata;
- /*
* In snapsot mode the size of the perf ring buffer needs to be equal
* to the size of the device's internal memory if we want to reuse the
* generic AUX buffer management mechanic.
*
* For example (assuming 4096 byte page size):
*
* # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
* 0x2000
* # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
*
*/
If at all we need to do this, Is there a way to force the perf to use the appropriate size for AUX buf depending on the sink ? I would prefer that instead of finding this magic number and calculating it based on the page_size. If we could not do that easily, it may be a good idea to expose something like, "buffer_pages" under sysfs to help the user a bit.
Suzuki
On 01/05/2019 18:50, Mathieu Poirier wrote:
When working in snapshot mode function perf_aux_output_begin() does not set the handle->size because the size is expected to be deduced by the placement of the "head" and "old" pointers in user space. As such there is no point in trying to adjust the amount of data to copy to the ring buffer.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 7694833b13cb..d3025634f5e6 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -497,9 +497,11 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, /* * The TMC RAM buffer may be bigger than the space available in the * perf ring buffer (handle->size). If so advance the RRP so that we
* get the latest trace data.
* get the latest trace data. In snapshot mode none of that matters
* since we are expected to clobber stale data in favour of the latest
*/* traces.
- if (to_read > handle->size) {
- if (!buf->snapshot && to_read > handle->size) { u32 mask = 0;
/*
Looks good to me.
Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com
On Tue, 7 May 2019 at 02:13, Leo Yan leo.yan@linaro.org wrote:
On Wed, May 01, 2019 at 11:50:49AM -0600, Mathieu Poirier wrote:
When working in snapshot mode function perf_aux_output_begin()
Do you mean perf_aux_output_end() rather than perf_aux_output_begin()?
I checked perf_aux_output_begin(), it will always set 'handle->size' to zero.
When not operating in snapshot mode perf_aux_output_begin() sets handle->size:
https://elixir.bootlin.com/linux/latest/source/kernel/events/ring_buffer.c#L...
does not set the handle->size because the size is expected to be deduced by the placement of the "head" and "old" pointers in user space. As such there is no point in trying to adjust the amount of data to copy to the ring buffer.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
Rest looks good to me:
Reviewed-by: Leo Yan leo.yan@linaro.org
drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 7694833b13cb..d3025634f5e6 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -497,9 +497,11 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, /* * The TMC RAM buffer may be bigger than the space available in the * perf ring buffer (handle->size). If so advance the RRP so that we
* get the latest trace data.
* get the latest trace data. In snapshot mode none of that matters
* since we are expected to clobber stale data in favour of the latest
* traces. */
if (to_read > handle->size) {
if (!buf->snapshot && to_read > handle->size) { u32 mask = 0; /*
-- 2.17.1
On Tue, 7 May 2019 at 01:39, Leo Yan leo.yan@linaro.org wrote:
On Wed, May 01, 2019 at 11:50:48AM -0600, Mathieu Poirier wrote:
In snapshot mode the buffer used by the sink devices need to be equal to the ring buffer size in order for the user space mechanic to work properly.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-etb10.c | 23 +++++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etf.c | 20 ++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etr.c | 8 +++++-- 3 files changed, 49 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..0764647b92bc 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -374,7 +374,30 @@ static void *etb_alloc_buffer(struct coresight_device *csdev, int nr_pages, bool overwrite) { int node, cpu = event->cpu;
u32 capacity; struct cs_buffers *buf;
struct etb_drvdata *drvdata;
/*
* In snapsot mode the size of the perf ring buffer needs to be equal
* to the size of the device's internal memory if we want to reuse the
* generic AUX buffer management mechanic.
*
* For example (assuming 4096 byte page size):
Here is delibrately to write as '4096 byte'? Though I think should be '4096 bytes' but I am not confident which is right ...
Well, English isn't my first language either but I think it is correct since I am referring to "the" page size and "4096 byte" is an adjective of it.
*
* # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
* 0x2000
* # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
*
*/
drvdata = dev_get_drvdata(csdev->dev.parent);
capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS;
if (overwrite &&
((nr_pages << PAGE_SHIFT) != capacity)) {
dev_err(&csdev->dev, "Ring buffer not equal to device buffer");
return NULL;
} if (cpu == -1) cpu = smp_processor_id();
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 2527b5d3b65e..7694833b13cb 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -380,6 +380,26 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, { int node, cpu = event->cpu; struct cs_buffers *buf;
struct tmc_drvdata *drvdata;
/*
* In snapsot mode the size of the perf ring buffer needs to be equal
* to the size of the device's internal memory if we want to reuse the
* generic AUX buffer management mechanic.
*
* For example (assuming 4096 byte page size):
*
* # cat /sys/bus/coresight/devices/20010000.etf/buffer_size
* 0x10000
* # perf record -e cs_etm/@20010000.etf/ -S -m,16 --per-thread $APP
*
*/
drvdata = dev_get_drvdata(csdev->dev.parent);
if (overwrite &&
((nr_pages << PAGE_SHIFT) != drvdata->size)) {
dev_err(&csdev->dev, "Ring buffer not equal to device buffer");
return NULL;
} if (cpu == -1) cpu = smp_processor_id();
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index df6e4b0b84e9..b9881d6d41ba 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1188,9 +1188,13 @@ alloc_etr_buf(struct tmc_drvdata *drvdata, struct perf_event *event,
/* * Try to match the perf ring buffer size if it is larger
* than the size requested via sysfs.
* than the size requested via sysfs. In snapsot mode the size
* of the perf ring buffer needs to be equal to the allocated
* size if we want to reuse the generic AUX buffer management
* mechanic. */
if ((nr_pages << PAGE_SHIFT) > drvdata->size) {
if (snapshot ||
(nr_pages << PAGE_SHIFT) > drvdata->size) { etr_buf = tmc_alloc_etr_buf(drvdata, (nr_pages << PAGE_SHIFT), 0, node, NULL); if (!IS_ERR(etr_buf))
If tmc_alloc_etr_buf() returns failure then it's possible to run into the below sequence to allocate smaller buffer size for snapshot mode; which is not expected for snapshot mode.
So here if tmc_alloc_etr_buf() fails to allocate buffer for snapshot mode, should directly bail out with error code.
You are quite right - I need to fix this.
Thanks, Leo Yan
On Tue, 7 May 2019 at 02:29, Leo Yan leo.yan@linaro.org wrote:
On Wed, May 01, 2019 at 11:50:50AM -0600, Mathieu Poirier wrote:
This patch avoids setting the truncated flag when operaring in snapshot
s/operaring/operating
Ok
mode since the trace buffer is expected to be truncated and discontinuous from one snapshot to another. Moreover when the truncated flag is set the perf core stops enabling the event, waiting for user space to consume the data. In snapshot mode this is clearly not what we want since it results in stale data.
Not sure if I understand correctly or not.
If set TRUNCATED flag and the user space has finished to read out the trace data, will perf not re-enable the event anymore for snapshot mode?
You are correct, once user space has read truncated trace data the event is scheduled again. But here we want to continue accumulating data in the ring buffer without sending it to user space, that is the idea behind snaphsot mode. As such we will get truncated data but we want to continue accumulating it until user space decides that it wants to read a snapshot. If the TRUNCATED flag is set when the ring buffer has been filled the event is not scheduled and trace data not accumulated anymore, leading to a stale snapshot when user space requests it.
Seems to me, the perf core code cannot handle properly for TRUNCATED flag with snapshot mode. Sorry if introduce noise, will look into the perf core code.
Thanks, Leo Yan
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-etb10.c | 8 +++++++- drivers/hwtracing/coresight/coresight-tmc-etf.c | 8 +++++++- drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 +++++++- 3 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 0764647b92bc..6ff48be91f61 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -535,7 +535,13 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, lost = true; }
if (lost)
/*
* Don't set the TRUNCATED flag in snapshot mode because 1) the
* captured buffer is expected to be truncated and 2) a full buffer
* prevents the event from being re-enabled by the perf core,
* resulting in stale data being send to user space.
*/
if (!buf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); /* finally tell HW where we want to start reading from */
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index d3025634f5e6..8039bd389034 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -538,7 +538,13 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, lost = true; }
if (lost)
/*
* Don't set the TRUNCATED flag in snapshot mode because 1) the
* captured buffer is expected to be truncated and 2) a full buffer
* prevents the event from being re-enabled by the perf core,
* resulting in stale data being send to user space.
*/
if (!buf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); cur = buf->cur;
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index b9881d6d41ba..718586a083af 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1516,7 +1516,13 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
lost |= etr_buf->full;
out:
if (lost)
/*
* Don't set the TRUNCATED flag in snapshot mode because 1) the
* captured buffer is expected to be truncated and 2) a full buffer
* prevents the event from being re-enabled by the perf core,
* resulting in stale data being send to user space.
*/
if (!etr_perf->snapshot && lost) perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); return size;
}
2.17.1
On Tue, 7 May 2019 at 02:44, Leo Yan leo.yan@linaro.org wrote:
On Wed, May 01, 2019 at 11:50:51AM -0600, Mathieu Poirier wrote:
In snapshot mode the value of the 'old' pointer needs to be adjusted when 'head' has wrapped around in order to get the latest information in the buffer and be compatible with the generic AUX ring buffer mechanic.
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
tools/perf/arch/arm/util/cs-etm.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 911426721170..4e73fe1a6978 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -541,11 +541,19 @@ static int cs_etm_find_snapshot(struct auxtrace_record *itr __maybe_unused, unsigned char *data __maybe_unused, u64 *head, u64 *old) {
bool wrapped;
pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n", __func__, idx, (size_t)*old, (size_t)*head, mm->len);
*old = *head;
*head += mm->len;
/*
* If the last byte in the ring buffer isn't zero, the head has
* wrapped around.
*/
wrapped = !!(data[mm->len - 1]);
This is confused for me since I can think out two cases might break this checking.
The first case is the trace data stream might be zero at the end of the buffer;
I just realized there is a better way to do this - since "*head" is continiously incrementing I will simply compare it to mm->len. If it is equal of bigger, the head has wrapped around.
the second case is that the buffer is not really wrapped around at this time but the end of buffer contains the stale data by previous time.
That would mean the snapshots were really close together. In that case we'd simply get the tail end of the previous snapshot, which is fine since it is was close enough that we do want that data.
Could you confirm both cases will not happen?
Will do more testing for this patch set.
Thanks, Leo Yan
if (wrapped)
*old = *head - mm->len; return 0;
}
2.17.1
Hey Suzuki,
On Tue, 7 May 2019 at 02:50, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi Mathieu,
On 01/05/2019 18:50, Mathieu Poirier wrote:
In snapshot mode the buffer used by the sink devices need to be equal to the ring buffer size in order for the user space mechanic to work properly.
The patch as such looks good to me. However I don't understand the need for it for ETB and ETF. We can't use the AUX buf directly anyway for these devices. We could always pretend that there was no overflow and simply copy it to the AUX buf. The decoder would know the end of trace packets. What am I missing here ?
The problem here is to figure out how to recognised a buffer wrap has occurred in function cs_etm_find_snapshot() and how to compute the value of '*old'. But looking at patch 4/5 again I came up with a better way to proceed, one that should remove the need for this patch. I will send another revision.
Mathieu
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org
drivers/hwtracing/coresight/coresight-etb10.c | 23 +++++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etf.c | 20 ++++++++++++++++ .../hwtracing/coresight/coresight-tmc-etr.c | 8 +++++-- 3 files changed, 49 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4ee4c80a4354..0764647b92bc 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -374,7 +374,30 @@ static void *etb_alloc_buffer(struct coresight_device *csdev, int nr_pages, bool overwrite) { int node, cpu = event->cpu;
u32 capacity; struct cs_buffers *buf;
struct etb_drvdata *drvdata;
/*
* In snapsot mode the size of the perf ring buffer needs to be equal
* to the size of the device's internal memory if we want to reuse the
* generic AUX buffer management mechanic.
*
* For example (assuming 4096 byte page size):
*
* # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
* 0x2000
* # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
*
*/
If at all we need to do this, Is there a way to force the perf to use the appropriate size for AUX buf depending on the sink ? I would prefer that instead of finding this magic number and calculating it based on the page_size. If we could not do that easily, it may be a good idea to expose something like, "buffer_pages" under sysfs to help the user a bit.
Suzuki
On Wed, May 01, 2019 at 11:50:52AM -0600, Mathieu Poirier wrote:
[...]
+2.2) Snapshot mode:
+Using perf's built-in snapshot mode with CoreSight tracers is supported - to +do so the '-S' command line option needs to be specified. Since current sink +devices are used in double-buffer mode when operated from the perf interface, +the size of the perf ring buffer needs to be adjusted to match the size of the +buffer used by the CoreSight sinks. From the perf command line it is possible +to specify the number of pages to use for a session using the '-m,X' option, +where X is the amount of pages.
+The system memory buffer used by ETR devices is automatically adjusted +to match the size of the perf ring buffer and as such does not need to be +modified on the perf command line. For ETB and ETF devices the perf ring +buffer size need to be adjusted to match the size of the internal buffer.
+The following examples assume a system page size of 4096 byte:
- # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
- 0x2000
- # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
In this case it shows the usage for etb, thus should: s/20010000.etf/20010000.etb/
BTW, the user needs to convert the rdp to byte size with multiplying 4, it's good to explain for this in the doc or give related info in the driver warning log.
Thanks, Leo Yan
- # cat /sys/bus/coresight/devices/20010000.etf/buffer_size
- 0x10000
- # perf record -e cs_etm/@20010000.etf/ -S -m,16 --per-thread $APP
- # perf record -e cs_etm/@20070000.etr/ -S --per-thread $APP
+Once an application is launched trace snapshot are collected by sending the +USR2 message to the process being monitored:
- # perf record -e cs_etm/@20070000.etr/ -S --per-thread $APP &
- [1] 14808
- # kill -USR2 14808
- ...
- ...
- # kill -USR2 14808
- ...
- ...
- # kill 14808
How to use the STM module
-- 2.17.1
Hi,
On 11/05/2019 08:32, Leo Yan wrote:
On Wed, May 01, 2019 at 11:50:52AM -0600, Mathieu Poirier wrote:
[...]
+2.2) Snapshot mode:
+Using perf's built-in snapshot mode with CoreSight tracers is supported - to +do so the '-S' command line option needs to be specified. Since current sink +devices are used in double-buffer mode when operated from the perf interface, +the size of the perf ring buffer needs to be adjusted to match the size of the +buffer used by the CoreSight sinks. From the perf command line it is possible +to specify the number of pages to use for a session using the '-m,X' option, +where X is the amount of pages.
+The system memory buffer used by ETR devices is automatically adjusted +to match the size of the perf ring buffer and as such does not need to be +modified on the perf command line. For ETB and ETF devices the perf ring +buffer size need to be adjusted to match the size of the internal buffer.
+The following examples assume a system page size of 4096 byte:
- # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
- 0x2000
- # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
In this case it shows the usage for etb, thus should: s/20010000.etf/20010000.etb/
BTW, the user needs to convert the rdp to byte size with multiplying 4, it's good to explain for this in the doc or give related info in the driver warning log.
If at all we want to match the aux space size with that of the device buffer, I recommend exposing this via a new "buf_pages" attribute under the sysfs to help the user.
Cheers Suzuki
Hi Suzuki,
On Mon, May 13, 2019 at 09:37:01AM +0100, Suzuki K Poulose wrote:
Hi,
On 11/05/2019 08:32, Leo Yan wrote:
On Wed, May 01, 2019 at 11:50:52AM -0600, Mathieu Poirier wrote:
[...]
+2.2) Snapshot mode:
+Using perf's built-in snapshot mode with CoreSight tracers is supported - to +do so the '-S' command line option needs to be specified. Since current sink +devices are used in double-buffer mode when operated from the perf interface, +the size of the perf ring buffer needs to be adjusted to match the size of the +buffer used by the CoreSight sinks. From the perf command line it is possible +to specify the number of pages to use for a session using the '-m,X' option, +where X is the amount of pages.
+The system memory buffer used by ETR devices is automatically adjusted +to match the size of the perf ring buffer and as such does not need to be +modified on the perf command line. For ETB and ETF devices the perf ring +buffer size need to be adjusted to match the size of the internal buffer.
+The following examples assume a system page size of 4096 byte:
- # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
- 0x2000
- # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
In this case it shows the usage for etb, thus should: s/20010000.etf/20010000.etb/
BTW, the user needs to convert the rdp to byte size with multiplying 4, it's good to explain for this in the doc or give related info in the driver warning log.
If at all we want to match the aux space size with that of the device buffer, I recommend exposing this via a new "buf_pages" attribute under the sysfs to help the user.
Agree, I also saw you suggested in another email. Using "buf_pages" is directive and consistent for different sink devices.
Thanks, Leo Yan
On Mon, 13 May 2019 at 05:16, Leo Yan leo.yan@linaro.org wrote:
Hi Suzuki,
On Mon, May 13, 2019 at 09:37:01AM +0100, Suzuki K Poulose wrote:
Hi,
On 11/05/2019 08:32, Leo Yan wrote:
On Wed, May 01, 2019 at 11:50:52AM -0600, Mathieu Poirier wrote:
[...]
+2.2) Snapshot mode:
+Using perf's built-in snapshot mode with CoreSight tracers is supported - to +do so the '-S' command line option needs to be specified. Since current sink +devices are used in double-buffer mode when operated from the perf interface, +the size of the perf ring buffer needs to be adjusted to match the size of the +buffer used by the CoreSight sinks. From the perf command line it is possible +to specify the number of pages to use for a session using the '-m,X' option, +where X is the amount of pages.
+The system memory buffer used by ETR devices is automatically adjusted +to match the size of the perf ring buffer and as such does not need to be +modified on the perf command line. For ETB and ETF devices the perf ring +buffer size need to be adjusted to match the size of the internal buffer.
+The following examples assume a system page size of 4096 byte:
- # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
- 0x2000
- # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
In this case it shows the usage for etb, thus should: s/20010000.etf/20010000.etb/
BTW, the user needs to convert the rdp to byte size with multiplying 4, it's good to explain for this in the doc or give related info in the driver warning log.
If at all we want to match the aux space size with that of the device buffer, I recommend exposing this via a new "buf_pages" attribute under the sysfs to help the user.
Agree, I also saw you suggested in another email. Using "buf_pages" is directive and consistent for different sink devices.
I've been thinking about this. Both ETR and ETF have a 'buffer_size" entry in sysfs and ETB does not. The first thing I suggest is to add a "buffer_size" entry to the ETB driver so that all sink devices look the same. From there enhance the current information carried by "buffer_size" to include pages. For example:
$cat 20070000.etr/buffer_size 0x100000 0x100
$ cat 20010000.etf/buffer_size 0x10000 0x10
$ cat 20010000.etb/buffer_size 0x8000 0x8
That way we have the information we need without adding more entries to sysfs - all we need to do is update the documentation (which we'd have to do anyway). Let me know what you think.
Mathieu
Thanks, Leo Yan
On Mon, 13 May 2019 at 14:01, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On Mon, 13 May 2019 at 05:16, Leo Yan leo.yan@linaro.org wrote:
Hi Suzuki,
On Mon, May 13, 2019 at 09:37:01AM +0100, Suzuki K Poulose wrote:
Hi,
On 11/05/2019 08:32, Leo Yan wrote:
On Wed, May 01, 2019 at 11:50:52AM -0600, Mathieu Poirier wrote:
[...]
+2.2) Snapshot mode:
+Using perf's built-in snapshot mode with CoreSight tracers is supported - to +do so the '-S' command line option needs to be specified. Since current sink +devices are used in double-buffer mode when operated from the perf interface, +the size of the perf ring buffer needs to be adjusted to match the size of the +buffer used by the CoreSight sinks. From the perf command line it is possible +to specify the number of pages to use for a session using the '-m,X' option, +where X is the amount of pages.
+The system memory buffer used by ETR devices is automatically adjusted +to match the size of the perf ring buffer and as such does not need to be +modified on the perf command line. For ETB and ETF devices the perf ring +buffer size need to be adjusted to match the size of the internal buffer.
+The following examples assume a system page size of 4096 byte:
- # cat /sys/bus/coresight/devices/20010000.etb/mgmt/rdp
- 0x2000
- # perf record -e cs_etm/@20010000.etf/ -S -m,8 --per-thread $APP
In this case it shows the usage for etb, thus should: s/20010000.etf/20010000.etb/
BTW, the user needs to convert the rdp to byte size with multiplying 4, it's good to explain for this in the doc or give related info in the driver warning log.
If at all we want to match the aux space size with that of the device buffer, I recommend exposing this via a new "buf_pages" attribute under the sysfs to help the user.
Agree, I also saw you suggested in another email. Using "buf_pages" is directive and consistent for different sink devices.
I've been thinking about this. Both ETR and ETF have a 'buffer_size" entry in sysfs and ETB does not. The first thing I suggest is to add a "buffer_size" entry to the ETB driver so that all sink devices look the same. From there enhance the current information carried by "buffer_size" to include pages. For example:
$cat 20070000.etr/buffer_size 0x100000 0x100
$ cat 20010000.etf/buffer_size 0x10000 0x10
$ cat 20010000.etb/buffer_size 0x8000 0x8
That way we have the information we need without adding more entries to sysfs - all we need to do is update the documentation (which we'd have to do anyway). Let me know what you think.
What I did forget to mention is that my next revision of the feature does not mandate to make the ring buffer match the size of the internal buffer memories. The above reply was in the context of letting users know how to make the size of the buffers match should they want to. But so far I haven't found a use case where doing so would be beneficial.
Mathieu
Thanks, Leo Yan